Thinking Recursively

Of all ideas I have introduced to children, recursion stands out as the one idea that is particularly able to evoke an excited response.
-- Seymour Papert, Mindstorms

Dear Santa Claus

I realize that as fellow Pythonistas we are all consenting adults here, but children seem to grok the beauty of recursion better. So let's NOT be adults here for a moment and talk about how we can use recursion to help Santa Claus.

Have you ever wondered how Christmas presents are delivered? I sure have, and I believe Santa Claus has a list of houses he loops through. He goes to a house, drops off the presents, eats the cookies and milk and moves on to the next house on the list. Since this algorithm for delivering presents is based on an explicit loop construction, it is called an iterative algorithm.

The algorithm for iterative present delivery implemented in Python:

In [1]:
houses = ["Eric's house", "Kenny's house", "Kyle's house", "Stan's house"]

def deliver_presents_iteratively():
    for house in houses:
        print("Delivering presents to", house)
        
deliver_presents_iteratively()
Delivering presents to Eric's house
Delivering presents to Kenny's house
Delivering presents to Kyle's house
Delivering presents to Stan's house

But I feel for Santa, at his age he shouldn't have to deliver all the presents by himself. I propose an algorithm using which he can divide the work of delivering presents among his elves:

  1. Appoint an elf, assign all the work to him
  2. Title and responsibilities of an elf depend on the number of houses he has to deliver to:
    1. $(> 1)$ He is a manager and can appoint two elves and divide his work among them
    2. $(= 1)$ He is a worker and has to deliver the presents to the house assigned to him

This is the typical structure of a recursive algorithm. If the current problem represents a simple case, solve it. If not, divide it into subproblems and apply the same strategy to them.
The algorithm for recursive present delivery implemented in Python:

In [2]:
houses = ["Eric's house", "Kenny's house", "Kyle's house", "Stan's house"]

# Each function call represents an elf doing his work 
def deliver_presents_recursively(houses):
    # Worker elf doing his work
    if len(houses) == 1:
        house = houses[0]
        print("Delivering presents to", house)
    
    # Manager elf doing his work
    else:
        mid = len(houses) // 2
        first_half = houses[:mid]
        second_half = houses[mid:]
        
        # Divides his work among two elves
        deliver_presents_recursively(first_half)
        deliver_presents_recursively(second_half)

deliver_presents_recursively(houses)
Delivering presents to Eric's house
Delivering presents to Kenny's house
Delivering presents to Kyle's house
Delivering presents to Stan's house

Recursive Functions

Now that we have some intuition about recursion, let's introduce the formal definition of a recursive function. Recursive function is a function defined in terms of itself via self-referential expressions. It means that the function will continue to call itself and repeat its behavior until some condition is met to return a result. All recursive functions share a common structure made up of two parts: base case and recursive case. To demonstrate this structure let's write a recursive function for calculating $n!$ :

  1. Decompose the original problem into simpler instances of the same problem. This is the recursive case:
    $n! = n \times (n-1) \times (n-2) \times (n-3) \times \cdot \cdot \cdot \times \, 3 \times 2 \times 1$
    $n! = n \times (n-1)!$
  2. As the large problem is broken down into successively less complex ones, those subproblems must eventually become so simple that they can be solved without further subdivision. This is the base case:
    $n! = n \times (n-1)!$
    $n! = n \times (n-1) \times (n-2)!$
    $n! = n \times (n-1) \times (n-2) \times (n-3)!$
    $\cdot$
    $\cdot$
    $n! = n \times (n-1) \times (n-2) \times (n-3) \times \cdot \cdot \cdot \times \, 3!$
    $n! = n \times (n-1) \times (n-2) \times (n-3) \times \cdot \cdot \cdot \times \, 3 \times 2!$
    $n! = n \times (n-1) \times (n-2) \times (n-3) \times \cdot \cdot \cdot \times \, 3 \times 2 \times 1!$
    Here $1!$ is our base case and it equals 1.

Recursive function for calculating $n!$ implemented in Python:

In [3]:
def factorial_recursive(n):
    # Base case: 1! = 1
    if n == 1:
        return 1
    
    # Recursive case: n! = n * (n-1)!
    else:
        return n * factorial_recursive(n-1)

factorial_recursive(5)
Out[3]:
120

Behind the scenes, each recursive call adds a stack frame (containing its execution context) to the call stack until we reach the base case. Then, the stack begins to unwind as each call returns its results:

Maintaining State

When dealing with recursive functions, keep in mind that each recursive call has its own execution context so to maintain state during recursion you either have to:

  1. Thread the state through each recursive call so that the current state is part of the current call's execution context
  2. Keep the state in global scope

A demonstration should make things clearer, let's calculate $1+2+3+\cdot\cdot10$ using recursion. The state that we have to maintain is (current number we are adding, accumulated sum till now), let's do that by threading it through each recursive call i.e. passing the updated current state to each recursive call as arguments:

In [4]:
def sum_recursive(current_number, accumulated_sum):
    # Base case
    # Return the final state
    if current_number == 11:
        return accumulated_sum
    
    # Recursive case
    # Thread the state through the recursive call
    else:
        return sum_recursive(current_number + 1, accumulated_sum + current_number)

# Pass the initial state
sum_recursive(1, 0)
Out[4]:
55

The other option is to maintain the state by keeping it in global scope:

In [5]:
# Global mutable state
current_number = 1
accumulated_sum = 0


def sum_recursive():
    global current_number
    global accumulated_sum
    # Base case
    if current_number == 11:
        return accumulated_sum
    # Recursive case
    else:
        accumulated_sum = accumulated_sum + current_number
        current_number = current_number + 1
        return sum_recursive()
    
sum_recursive()
Out[5]:
55

I prefer threading the state through each recursive call because I find global mutable state to be evil, but that's a discussion for a later time.

Recursive Data Structures

A data structure is recursive if it can be defined in terms of a smaller version of itself. A list is an example of a recursive data structure, let me demonstrate, assume that you only have an empty list at your disposal, and the only operation you can perform on it is this:

In [6]:
# Return a new list which is the result of
# adding element to the head(i.e. front) of input_list
def attach_head(element, input_list):
    return [element] + input_list

Now using the empty list and the attach_head operation you can generate any list, for example let's generate [1, 46, -31, "hello"]:

In [7]:
attach_head(1,                                                  # Will return [1, 46, -31, "hello"]
            attach_head(46,                                     # Will return [46, -31, "hello"]
                        attach_head(-31,                        # Will return [-31, "hello"]
                                    attach_head("hello", [])))) # Will return ["hello"]
Out[7]:
[1, 46, -31, 'hello']

I agree that the above code is not pretty but I structured it that way to make two things explicit:

  1. Starting from an empty list you can generate any list by recursively applying the attach_head function, and thus the list data structure can be defined recursively as: $$ list = \begin{cases} attach\_head(element,\,smaller\,list)\\ empty\,list \end{cases}$$
  2. Recursion can also be seen as self-referential function composition. We apply a function to an argument, then pass that result on as an argument to a second application of the same function and so on. Repeatedly composing attach_head with itself is the same as attach_head calling itself repeatedly.

List is not the only recursive data structure, other examples include set, tree, dictionary etc.

Recursive data structures and recursive functions go together like bread and butter. The recursive function's structure can often be modeled after the definition of the recursive data structure it takes as an input. Let me demonstrate this by calculating sum of all the elements of a list recursively:

In [8]:
def list_sum_recursive(input_list):
    # Base case
    if input_list == []:
        return 0
    
    # Recursive case
    # Decompose the original problem into simpler instances of the same problem
    # by making use of the fact that the input is a recursive data structure
    # and can be defined in terms of a smaller version of itself
    else:
        head = input_list[0]
        smaller_list = input_list[1:]
        return head + list_sum_recursive(smaller_list)
    
list_sum_recursive([1, 2, 3])
Out[8]:
6

Naive Recursion is Naive

The Fibonacci numbers were originally defined by the Italian mathematician Fibonacci in the thirteenth century to model the growth of rabbit populations. Fibonacci surmised that the number of pairs of rabbits born in a given year is equal to the number of pairs of rabbits born in each of the two previous years, starting from one pair of rabbits in the first year. To count the number of rabbits born in the nth year, he defined the recurrence relation: $$F_n = F_{n-1} + F_{n-2}$$

with basis cases $F_0 = 0$ and $F_1 = 1$. Let's write a recursive function to compute the nth Fibonacci number:

In [9]:
def fibonacci_recursive(n):
    print("Calculating F", "(", n, ")", sep="", end=", ")
    
    # Base case
    if n == 0:
        return 0
    elif n ==1:
        return 1
    
    # Recursive case
    else:
        return fibonacci_recursive(n-1) + fibonacci_recursive(n-2)
    
fibonacci_recursive(5)
Calculating F(5), Calculating F(4), Calculating F(3), Calculating F(2), Calculating F(1), Calculating F(0), Calculating F(1), Calculating F(2), Calculating F(1), Calculating F(0), Calculating F(3), Calculating F(2), Calculating F(1), Calculating F(0), Calculating F(1), 
Out[9]:
5

Naively following the recursive definition of nth Fibonacci number has given us a rather inefficient result. As you can see from the output above, we are unnecessarily recomputing values. So let's try and improve fibonacci_recursive by caching the results of each Fibonacci computation F(k):

In [10]:
from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci_recursive(n):
    print("Calculating F", n, sep="", end=", ")
    
    # Base case
    if n == 0:
        return 0
    elif n ==1:
        return 1
    
    # Recursive case
    else:
        return fibonacci_recursive(n-1) + fibonacci_recursive(n-2)
    
fibonacci_recursive(5)
Calculating F5, Calculating F4, Calculating F3, Calculating F2, Calculating F1, Calculating F0, 
Out[10]:
5

lru_cache is a decorator which caches the results, thus we avoid recomputation by explicitly checking for the value before trying to compute it. One thing to keep in mind about lru_cache is that since it uses a dictionary to cache results, the positional and keyword arguments (which serve as keys in that dictionary) to the function must be hashable.

Pesky Details

Python doesn't have support for tail-call optimization which means that you can cause a stack overflow if you end up using more stack frames than:

In [11]:
import sys

# Default call stack depth
sys.getrecursionlimit()
Out[11]:
3000

Keep this limit in mind if you have a program that requires deep recursion.

Also, Python's mutable data structures don't support structural sharing so treating them like immutable data structures is going to negatively affect your space and GC (garbage collection) efficiency as you are going to end up unnecessarily copying a lot of mutable objects. For example, I have used this pattern to decompose lists and recurse over them:

In [12]:
input_list = [1, 2, 3]
head = input_list[0]
tail = input_list[1:]

print("head --", head)
print("tail --", tail)
head -- 1
tail -- [2, 3]

I did that to simplify things for the sake of understanding. Keep in mind that tail is being created by copying and recursively doing that over large lists can negatively affect your space and GC efficiency.

Fin

I was once asked to explain recursion in an interview, I took a sheet of paper and wrote 'Please turn over' on both its sides. The interviewer didn't get the joke, but now that you have read this article, hopefully you do :)

References

  1. Thinking Recursively by Eric S. Roberts
  2. The Little Schemer by Daniel P. Friedman and Matthias Felleisen
  3. Concepts, Techniques, and Models of Computer Programming by Peter Van-Roy
  4. The Algorithm Design Manual by Steven Skiena
  5. Haskell Programming from First Principles by Christopher Allen and Julie Moronuki