Recursion tutorial

Oct. 2019

All the algorithms and tests discussed in this post are ready-to-run at codeSkulptor3. You can also download the file: recursion-tutorial.zip

Recursion is one of the most perplexing concepts in CS-101, and yet it is fundamental to the way computer scientists think. Explaining it is also fairly challenging because ‘getting’ it at the intellectual level just doesn’t cut it; we need to sit down with paper and pencil and work at it a few times. We’ll start with the recursive versions of factorial and Fibonacci, which are among the simplest recursive routines we will find. Then, we’ll add a couple more – quicksort and binary search – under the assumption that exposure to more and more examples will do the trick. Hopefully this post will help someone grasp recursion faster.

Recursive calls are an example of the technique of solving a problem by solving a smaller version of it. Hence we need two things:

  1. A ‘base case’: a case in which the problem is trivially solved,
    which we reach when a certain condition is satisfied
  2. A ‘general case’: all the other cases expressed as the solution of
    one or more smaller versions of the original problem.

Each recursive call has to take us a bit closer to the solution, a bit closer to the base case. The basic form of a recursive function is:
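A minimal sketch of such a template follows (the body is an illustrative placeholder; the line numbers in the comments are the ones referenced in the trace that follows):

```python
def foo(n):                   # line 1: the recursive function
    if n == 0:                # line 2: stop condition (base case)
        return 1              # line 3: trivial solution, no recursion
    else:                     # line 4
        result = foo(n - 1)   # line 5: recursive call on a smaller problem
        return n * result     # line 6: combine the result and return it
```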

After our initial call to foo(), foo() will take control and will call itself n times until, in call n+1, the stop condition is true:

[Figure: the chain of calls to foo(), from the initial call down to the (n+1)-th call, and the returns back up the chain]

Once the stop condition of the (n+1)-th foo() is true, in line 2, this (n+1)-th foo() returns, in line 3, to the line 5 of the n-th foo(), i.e., to the instance of foo() that called it.

The n-th foo(), which is now done with line 5, executes line 6 and goes back to line 5 of the (n-1)-th foo(), i.e., to the instance of foo() that called it. This process continues until we return, one by one, all the way back to the initial call to foo().

We can also write the general template of a recursive call without using the else statement because if the base case applies, the function returns immediately, but if it doesn’t, the execution ‘falls through’ to the statements that follow it, e.g.,
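A sketch of the same template without the else (again with a placeholder body):

```python
def foo(n):
    if n == 0:         # base case: return immediately
        return 1
    # general case: execution 'falls through' to here
    # whenever the base case does not apply
    return n * foo(n - 1)
```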

Factorial and summation

Let’s go over the factorial(n) function, which is the product of all the integers from 1 to n. Thus,

$$\mbox{factorial}(5) = 1 \times 2 \times 3 \times 4 \times 5$$

We complete the definition of the function by setting

$$\mbox{factorial}(0) = 1$$

so now factorial works for all non-negative integers:

$$
\mbox{factorial}(n) = \left\{
\begin{array}{ll}
1&n = 0\\
\mbox{factorial}(n-1) \times n&n > 0
\end{array}
\right.
$$

Let’s write factorial in an iterative way first:
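A possible iterative version (the function name is assumed):

```python
def factorial_iterative(n):
    # accumulate the product 1 * 2 * ... * n
    product = 1
    for i in range(1, n + 1):
        product = product * i
    return product
```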

In this case, we start with $%product = 1$% and multiply it by $%i$% as we move towards $%n$%, e.g.,
$$
\begin{align}
\mbox{product} &= 1\\
\mbox{product} &= 1 \times 2\\
\mbox{product} &= (1 \times 2) \times 3\\
\mbox{product} &= ((1 \times 2) \times 3) \times 4\\
\end{align}
$$

and so on.

Now let’s write it recursively. Our base condition is the case in which factorial is found trivially. For example, we know that if $%n = 0$%, then $%\mbox{factorial}(n) = 1$%. We do not need to do anything additional so that is our base case. To find the general case, we decompose the problem using a smaller version of itself. Look at the lines of ‘product’ above. We can rewrite them as:
$$
\begin{align}
\mbox{factorial} (1) &= 1\\
\mbox{factorial} (2) &= 1 \times 2\\
\mbox{factorial} (3) &= (1 \times 2) \times 3\\
\mbox{factorial} (4) &= ((1 \times 2) \times 3) \times 4\\
\end{align}
$$

or

$$
\begin{align}
\mbox{factorial} (1) &= 1\\
\mbox{factorial} (2) &= \mbox{factorial} (1) \times 2\\
\mbox{factorial} (3) &= \mbox{factorial} (2) \times 3\\
\mbox{factorial} (4) &= \mbox{factorial} (3) \times 4\\
\vdots\\
\mbox{factorial} (n) &= \mbox{factorial} (n-1) \times n\\
\end{align}
$$

Since we have our base and general cases, we can plug them into the template for recursive functions to create our recursive function:
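A possible implementation (the name factorial_recursive follows the text):

```python
def factorial_recursive(n):
    if n == 0:                             # base case
        return 1
    return factorial_recursive(n - 1) * n  # general case
```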

We are forming a chain of functions, each one waiting for the result from another call to the function before returning its own value. Eventually, the call made using $%0$% as the argument (our base case) returns a $%1$% to the function that called it which, in turn, returns $%1 \times 2$% to the function that called it which, in turn, returns $%(1 \times 2) \times 3$% to the function that called it… and so on.

Now you try it… using the template and without looking at the code of $%\mbox{factorial_recursive}()$%, write the recursive function $%\mbox{sumFirstN}(n)$% that adds the numbers from $%0$% to $%n$%. What is our base condition? What is our general case? The function to write is:

$$
\mbox{sumFirstN}(n) = \left\{
\begin{array}{ll}
0&n = 0\\
\mbox{sumFirstN}(n-1) + n&n > 0
\end{array}
\right.
$$

We should run $%\mbox{sumFirstN}(n)$% with a few values of $%n$% to make sure that we got it right. For $%n=0$%, it should return $%0$%, and for any integer $%n > 0$% it should return $%n(n+1)/2$%. So, yes… the only purpose of this particular function is to help us learn recursion; in practice, though, we should use the closed-form solution.
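One possible solution, with the checks just described (compare it against yours, only after writing your own):

```python
def sumFirstN(n):
    if n == 0:                   # base case
        return 0
    return sumFirstN(n - 1) + n  # general case

# check against the closed-form solution n * (n + 1) / 2
for n in range(10):
    assert sumFirstN(n) == n * (n + 1) // 2
```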

Fibonacci and Lucas numbers

The Fibonacci sequence is often used to demonstrate both how to do recursion and when not to use recursion. Fibonacci is expressed naturally as a recursive function:
$$
\mbox{fibo}(n) = \left\{
\begin{array}{ll}
0&\mbox{n = 0}\\
1&\mbox{n = 1}\\
\mbox{fibo}(n-1)+\mbox{fibo}(n-2)&\mbox{n > 1}
\end{array}
\right.
$$
The definition itself tells us both the base case and the general case; plugging them into our template, it becomes:
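A direct transcription of the definition (a possible implementation):

```python
def fibo(n):
    if n == 0:    # first base case
        return 0
    if n == 1:    # second base case
        return 1
    return fibo(n - 1) + fibo(n - 2)  # general case: two recursive calls
```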

This recursive function is expensive because each call to the function spawns two more calls, each of which, in turn, spawns two more, and so on. Hence, we have a geometric explosion of function calls.

Now it’s your turn. Let’s write the function $%\mbox{lucas}(n)$% that gives us the Lucas numbers, which are defined as:

$$
\mbox{lucas}(n) = \left\{
\begin{array}{ll}
2&\mbox{n = 0}\\
1&\mbox{n = 1}\\
\mbox{lucas}(n-1)+\mbox{lucas}(n-2)&\mbox{n > 1}
\end{array}
\right.
$$

Like Fibonacci, Lucas is not a good function to implement recursively because it creates a geometric explosion of calls. We can avoid the geometric explosion of calls in the recursive implementation of both Fibonacci and Lucas using memoization.
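As a sketch of the memoization idea applied to Fibonacci (the helper name fibo_memo and the dictionary cache are illustrative choices, not the only way to do it):

```python
def fibo_memo(n, memo=None):
    # memo caches already-computed values, so each fibo_memo(k)
    # is evaluated only once instead of exponentially many times
    if memo is None:
        memo = {}
    if n not in memo:
        if n < 2:
            memo[n] = n
        else:
            memo[n] = fibo_memo(n - 1, memo) + fibo_memo(n - 2, memo)
    return memo[n]
```

The same trick applies verbatim to $%\mbox{lucas}()$%; only the base-case values change.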

Multiple recursive calls – Quicksort

Some recursive problems are chains in which the execution flow moves forward and then backward along a single line of calls. These programs make only one recursive call in their bodies, like $%\mbox{factorial}(n)$%. Some others, though, use more than one recursive call, like $%\mbox{fibo}()$%. Another recursive procedure that uses more than one recursive call is quicksort, which sorts a list of numbers. Let’s look at a version that, although not the most efficient, illustrates the principles of recursion well:
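A possible version along these lines (the input list below is assumed, chosen so that it sorts to the result shown next):

```python
import random

def quicksort(a_list):
    # base case: a list with 0 or 1 elements is already sorted
    if len(a_list) <= 1:
        return a_list
    # general case: remove a random pivot, split the rest in two,
    # sort each part recursively, and concatenate
    rest = list(a_list)
    pivot = rest.pop(random.randrange(len(rest)))
    less = [x for x in rest if x < pivot]
    greater_or_equal = [x for x in rest if x >= pivot]
    return quicksort(less) + [pivot] + quicksort(greater_or_equal)

print(quicksort([3, 1, 4, 5, 8, 2, 3]))  # [1, 2, 3, 3, 4, 5, 8]
```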

We can run this code in codeSkulptor3 and verify that the result is the sorted list

$$[1, 2, 3, 3, 4, 5, 8]$$

Quicksort uses the divide-and-conquer technique; it splits the elements of the list into three parts: a given element – called the pivot – a list that contains all the elements smaller than the pivot, and a list that contains all the elements greater than or equal to the pivot. The returned solution is the concatenation of the sorted ‘less-than’ list, the pivot, and the sorted ‘greater-or-equal’ list, in that order. Let’s see how it goes about doing this.

The base case happens when the list that we are sorting has 0 or 1 elements, in which case the list is already sorted and we return it unchanged. This base case is trivial to solve; there is nothing to do, we just need to recognize it.

In the general case, we remove a random value from the list, i.e., the pivot, we create the two lists with elements smaller than and greater than or equal to the pivot, we sort each of them by calling quicksort, and finally, we form the result by concatenating the sorted ‘less’ list, the pivot, and the sorted ‘greater-or-equal’ list, in that order. The ‘less’ and ‘greater-or-equal’ lists are as valid an argument of quicksort as the list used in the initial call to quicksort; the only difference is that they are both smaller than the original list and thus closer to satisfying the stop condition, i.e., their lengths are closer to 0 or 1. Hence, each call to quicksort generates two additional calls with lists that, on average, are about half as big as the argument of the call.

I know… I know… you traced quicksort and now you fully understand how it works. Still, let’s do it, by hand, just one more time; in each call, we’ll choose the pivot at random:

[Figure: trace of the recursive calls to quicksort]

Each call to quicksort (i.e., qs()) descends one level, and it is either solved by another call to quicksort or by satisfying the base case (in black). A solution to a quicksort call, returned up to the calling function, is the concatenation of a sorted less-than-the-pivot list (in blue), the pivot (in green, selected at random), and a sorted greater-than-or-equal-to-the-pivot list (in red).

Quicksort is one of the fastest sorts known, and it is part of the standard library of many languages and operating systems.

Your turn: binary search

Let’s write down a routine that tells us whether a sorted list of numbers $%A$% contains a particular value $%x$%. The solution will be at the end of the post but, of course, the point is to work it out without looking at the solution. Although we can solve this problem by iterating over each item in turn and comparing it to $%x$%, there is a better strategy: we can check if the value in the middle of the list is greater than $%x$%; if it is, all we need to check is the first half of the list, and if it is not, all we need to check is the second half of the list. However, checking each sublist is a problem equivalent to the one we started with, so this is a good candidate for recursion:

  • What is our base case? What are the list or lists for which we can determine, trivially, whether they contain $%x$% or not?

  • What is our general case? We are going to take the index in the middle of the list, i.e., $%\mbox{mid_index} = \mbox{len}(A) // 2$%, and determine whether $%x < A[\mbox{mid_index}]$% or not. If $%x < A[\mbox{mid_index}]$%, then we need to check the list $%A[ :\mbox{mid_index}]$% (i.e., the first half of the list); otherwise, we need to check $%A[\mbox{mid_index}:]$% (i.e., the second half of the list).

Think a bit about it and then let’s continue with the same example but with more detail.

Let our value be $%x = 24$%, and our list be
$$A = [1, 2, 3, 3, 3, 6, 8, 9, 13, 13, 14, 17, 21, 22, 23, 25]$$

The index in the middle of the list is

$$\mbox{len}(A) // 2 = 16 // 2 = 8$$

so $%A[\mbox{mid_index}] = 13$% (the first of the two consecutive 13s). Since $%24$% is not smaller than $%13$%, we will check the second half of the list. We need to create a new list starting at index 8, i.e.,

$$\mbox{new_list} = [13, 13, 14, 17, 21, 22, 23, 25]$$

and call the routine again.

Since we are calling our function with ever smaller lists as arguments, we will have two base cases:

  1. when the list is empty, in which case $%24$% is not in it, and we return $%\mbox{False}$%
  2. when the list contains a single value, in which case we return $%\mbox{True}$% or $%\mbox{False}$% depending on whether this single value is $%24$%.

The following should be some of our outputs:
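So that the checks can be run here, a candidate $%\mbox{bin_search}()$% is included together with the asserts (try writing the function yourself first; a solution also appears at the end of the post):

```python
def bin_search(a_list, x):
    if len(a_list) == 0:             # base case 1: empty list
        return False
    if len(a_list) == 1:             # base case 2: single element
        return a_list[0] == x
    mid_index = len(a_list) // 2
    if x < a_list[mid_index]:
        return bin_search(a_list[:mid_index], x)  # first half
    return bin_search(a_list[mid_index:], x)      # second half

A = [1, 2, 3, 3, 3, 6, 8, 9, 13, 13, 14, 17, 21, 22, 23, 25]
assert bin_search(A, 24) == False
assert bin_search(A, 13) == True
assert bin_search(A, 1) == True
assert bin_search(A, 25) == True
assert bin_search([], 24) == False
```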

where $%\mbox{assert}$% is a Python statement, commonly used for testing programs, that does nothing if its condition is $%\mbox{True}$% but stops the program with an error if it is $%\mbox{False}$%.

Your turn again: binary search depth

If we had trouble with the previous exercise, we should get the solution at the bottom of this post and study it until we can follow it. An incredibly useful way to study an algorithm is to follow it with paper and pencil, away from the computer: we write down each variable and update its value as we follow the flow of calls by hand. This is a much better way to understand a program than following it on the screen, even with a tracer or a debugger, although, admittedly, it is only practical for small programs.

We are going to modify $%\mbox{bin_search}()$% so it will tell us how many recursive calls it made to find our answer. If we were to check every value of the list, we would need a number of comparisons proportional to the number of elements of the list, e.g., if we have $%n$% elements and $%x$% is in the list, in the average case we would need $%n/2$% comparisons to find it; this is referred to as taking $%O(n)$% time, i.e., a time proportional to $%n$%. However, in binary search we discard half of the list each time, using only a single comparison to decide whether we should look for $%x$% in the first or the second half of the list. The result is that we only need $%O(\log_2 n)$% comparisons to check the list, i.e., 4 comparisons instead of 16 for our example list. Let’s verify that with our program.

We are going to return a list in which the first element is a boolean telling us whether the value is in the list (same as before) and the second element is the depth of recursive calls. We know that when we arrive at the base case we are at the deepest level, so we return a 0 back up the chain. As we keep going backwards, we add 1 to this level, so that when we emerge from the original call we will have accumulated the total depth.
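A possible $%\mbox{bin_search_depth}()$% following this description:

```python
def bin_search_depth(a_list, x):
    if len(a_list) == 0:                  # base case: deepest level
        return [False, 0]
    if len(a_list) == 1:                  # base case: deepest level
        return [a_list[0] == x, 0]
    mid_index = len(a_list) // 2
    if x < a_list[mid_index]:
        found, depth = bin_search_depth(a_list[:mid_index], x)
    else:
        found, depth = bin_search_depth(a_list[mid_index:], x)
    return [found, depth + 1]             # add 1 on the way back up

A = [1, 2, 3, 3, 3, 6, 8, 9, 13, 13, 14, 17, 21, 22, 23, 25]
print(bin_search_depth(A, 24))  # [False, 4]
```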

We will find that, indeed, every call is solved in 4 recursive calls and, since there is only one comparison per call, in 4 comparisons, instead of the 16 needed in the worst-case scenario of the brute-force approach. I know you already figured out that we can do better; indeed, $%\log_2 n$% is actually a worst-case scenario here too. We’ll work on your solution in a bit.

Your turn again: binary search where

In $%\mbox{bin_search}()$% we determined if a number $%x$% was in a sorted list $%A$%. Here we will modify that program to also obtain where in the list we can find it. We are not asking for the first of the occurrences; all we want is that if we get an answer such as $%[\mbox{True}, i]$% then $%A[i] = x$%.

The final value of the index at which we will find $%x$% depends on the indices of all the lists in the preceding function calls. We will find $%x$% in our base case, at which point we will know its index to be 0; as we emerge from the calls, we add to this index the offset that we used to create the list. Each time we are going to return a list with two values: the first value is a boolean indicating whether the number was found in the list, and the second one will be either $%\mbox{None}$%, if the number was not found, or the index.

Think a bit about it and then let’s continue with the same example, but with more detail.

Let our value be 24 and our list be

$$A = [1, 2, 3, 3, 3, 6, 8, 9, 13, 13, 14, 17, 21, 22, 23, 25]$$

Our function now must receive the list and the value that we are searching for. In the example, $%\mbox{mid_index} = 8$% and $%A[\mbox{mid_index}] = 13$%, so we will check the second half of the list. We create
$$\mbox{new_list} = [13, 13, 14, 17, 21, 22, 23, 25]$$
and call the routine again.

  1. Our base case now returns a list of the form [False, None] if the list is empty. If the list is 1 element long, then it needs to return [False, None] if the element in the list is not 24, or [True, 0] if it is. This $%0$% is the index of $%24$% in the 1-element-long list.
  2. In the general case, we have to add to the index that we received the offset of the list that we used to call the routine recursively, i.e., either 0, if we called the function with the first half of the list, or mid_index, if we called it with the second half of the list.

The following should be some of our outputs:
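So that the checks can be run here, a candidate $%\mbox{bin_search_where}()$% is included with some example asserts (the exact outputs shown in the original may differ):

```python
def bin_search_where(a_list, x):
    if len(a_list) == 0:
        return [False, None]
    if len(a_list) == 1:
        if a_list[0] == x:
            return [True, 0]      # index of x in the 1-element list
        return [False, None]
    mid_index = len(a_list) // 2
    if x < a_list[mid_index]:
        return bin_search_where(a_list[:mid_index], x)  # offset is 0
    found, index = bin_search_where(a_list[mid_index:], x)
    if found:
        return [True, index + mid_index]  # add the slice offset
    return [False, None]

A = [1, 2, 3, 3, 3, 6, 8, 9, 13, 13, 14, 17, 21, 22, 23, 25]
assert bin_search_where(A, 24) == [False, None]
found, i = bin_search_where(A, 14)
assert found and A[i] == 14
found, i = bin_search_where(A, 13)
assert found and A[i] == 13
```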

Your turn one last time: list binary search depth 2

Let’s go back to the idea that we can find whether a value is in the list in fewer than $%\log_2 n$% comparisons. Modify $%\mbox{bin_search_depth}()$% to return immediately if the value in the middle of the list is equal to our value; after all, we need to do the comparison anyway to determine whether we should search for $%x$% in the lower or higher half of the list. This scenario is somewhat unusual because we will end up with two return statements in the bodies of both the base and the general case.

Solutions to binary search

These programs solve the problems stated above, but they are not ideal as actual routines to do the job and much less as examples of how to write in Python; their purpose is simply to illustrate the principles of recursion. We can find these programs, and all the other programs of this post, ready-to-run, at codeSkulptor3.
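For the last exercise, a possible $%\mbox{bin_search_depth2}()$% (name assumed) that returns immediately when the middle value matches:

```python
def bin_search_depth2(a_list, x):
    if len(a_list) == 0:              # base case: not found
        return [False, 0]
    if len(a_list) == 1:              # base case: single element
        return [a_list[0] == x, 0]
    mid_index = len(a_list) // 2
    if a_list[mid_index] == x:        # found it: return immediately
        return [True, 0]
    if x < a_list[mid_index]:
        found, depth = bin_search_depth2(a_list[:mid_index], x)
    else:
        # x > a_list[mid_index], so the pivot can be excluded
        found, depth = bin_search_depth2(a_list[mid_index + 1:], x)
    return [found, depth + 1]

A = [1, 2, 3, 3, 3, 6, 8, 9, 13, 13, 14, 17, 21, 22, 23, 25]
print(bin_search_depth2(A, 13))  # [True, 0]: 13 sits exactly in the middle
```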



This post is an update of the following original post:

Title: Python 101 – Unit 6 – Yet another attempt to explain recursion
Forum of “Introduction to Computer Science” by David Evans offered under Udacity, presently offered as “Introduction to Python programming”
First publication date: Feb. 2013
Last updated: Oct. 2019
Link: https://discussions.udacity.com/t/python-101-unit-6-yet-another-attempt-to-explain-recursion/83570; this post might still be reachable using a Udacity account but it is no longer public; I posted it using the screenname ‘Goldsong’.