A Recipe for Backtracking

based on The Algorithm Design Manual by Steven S. Skiena

Backtracking is a systematic way to iterate through all the possible configurations of a search space. We must generate each possible configuration exactly once. Avoiding both repetitions and missing configurations means that we must define a systematic generation order.

We will model our combinatorial search solution as a vector $a = (a_1, a_2,..., a_n)$, where each element $a_i$ is selected from a finite ordered set $S_i$. Such a vector might represent an arrangement where $a_i$ contains the ith element of the permutation. Or, the vector might represent a given subset S, where $a_i$ is true if and only if the ith element of the universe is in S. The vector can even represent a sequence of moves in a game or a path in a graph, where $a_i$ contains the ith event in the sequence.

At each step in the backtracking algorithm, we try to extend a given partial solution $a = (a_1, a_2,..., a_k)$ by adding another element at the end. After extending it, we must test whether what we now have is a solution: if so, we should print it or count it. If not, we must check whether the partial solution is still potentially extendible to some complete solution.

Backtracking constructs a tree of partial solutions, where each vertex represents a partial solution. There is an edge from x to y if node y was created by advancing from x. This tree of partial solutions provides an alternative way to think about backtracking, for the process of constructing the solutions corresponds exactly to doing a depth-first traversal of the backtrack tree. Viewing backtracking as a depth-first search on an implicit graph yields a natural recursive implementation of the basic algorithm.

Backtrack-DFS(A,k)
if A = $(a_1, a_2,..., a_k)$ is a solution, report it.
else
  k = k + 1
  compute $S_k$
  while $S_k \neq \phi$ do
    $a_k$ = an element in $S_k$
    $S_k$ = $S_k$ − $a_k$
    Backtrack-DFS(A,k)

Implementation

The honest working backtrack code is given below:

In [1]:
# found all solutions yet?
finished = False 

def backtrack(a, data):
    if is_a_solution(a, data):
        process_solution(a, data)
    else:
        # candidates for next position
        candidate_list = construct_candidates(a, data)
        for candidate in candidate_list:
            make_move()
            backtrack(a + [candidate], data)
            unmake_move()
            # terminate early 
            if finished: return

Backtracking ensures correctness by enumerating all possibilities. It ensures efficiency by never visiting a state more than once. Study how recursion yields an elegant and easy implementation of the backtracking algorithm. Because a new candidates list candidate_list is created with each recursive procedure call, the subsets of not-yet-considered extension candidates at each position will not interfere with each other. The application-specific parts of this algorithm consists of five subroutines:

  1. is_a_solution(a, data): This Boolean function tests whether vector a is a complete solution for the given problem. The second argument, data, allows us to pass general information into the routine. We can use it to specify n—the size of a target solution. This makes sense when constructing permutations or subsets of n elements, but other data may be relevant when constructing variable-sized objects such as sequences of moves in a game.

  2. construct_candidates(a, data): This routine returns a list with the complete set of possible candidates for the next position of a, given the contents of all the previous positions. Again, data may be used to pass auxiliary information.

  3. process_solution(a, data): This routine prints, counts, or however processes a complete solution once it is constructed.

  4. make_move(): This routine enables us to modify a data structure in response to the latest move.

  5. unmake_move(): This routine enables us to clean up the data structure we modified in make_move() if we decide to take back the move. Such a data structure could be rebuilt from scratch from the solution vector as needed, but this is inefficient when each move involves incremental changes that can easily be undone.

We include a finished flag to allow for premature termination, which could be set in any application-specific routine. To really understand how backtracking works, you must see how such objects as permutations and subsets can be constructed by defining the right state spaces. Examples of several state spaces are described in sections below.

Constructing All Subsets

A critical issue when designing state spaces to represent combinatorial objects is how many objects need representing. How many subsets are there of an n-element set, say the integers {1, . . . , n}? There are exactly two subsets for n = 1, namely {} and {1}. There are four subsets for n = 2, and eight subsets for n = 3. Each new element doubles the number of possibilities, so there are $2^n$ subsets of n elements.
Each subset is described by elements that are in it. To construct all $2^n$ subsets, we set up a vector $a = (a_1, a_2,..., a_n)$, where the value of $a_i$ (true or false) signifies whether the ith item is in the given subset. In the scheme of our general backtrack algorithm, $S_k = (true, false)$ and $a$ is a solution whenever $k = n$. We can now construct all subsets with simple implementations of is_a_solution(), construct_candidates(), and process_solution(). make_move() and unmake_move() are not employed in this example and are thus stubbed out.

In [2]:
def generate_subsets(n):
    backtrack([], n)

def is_a_solution(a, data):
    return len(a) == data

def process_solution(a, data):
    print("{", end = "")
    for exists, elem in zip(a, range(1, data + 1)):
        if exists: print(elem, end = "")
    print("}", end = " ")

def construct_candidates(a, data):
    return [True, False]

def make_move():
    pass

def unmake_move():
    pass
In [3]:
generate_subsets(4)
{1234} {123} {124} {12} {134} {13} {14} {1} {234} {23} {24} {2} {34} {3} {4} {} 

Constructing All Permutations

Counting permutations of {1, . . . , n} is a necessary prerequisite to generating them. There are n distinct choices for the value of the first element of a permutation. Once we have fixed $a_1$, there are n − 1 candidates remaining for the second position, since we can have any value except $a_1$ (repetitions are forbidden in permutation). Repeating this argument yields a total of $n! = \prod_{i=1}^{n} i$ distinct permutations.

This counting argument suggests a suitable representation. Set up a vector $a = (a_1, a_2,..., a_n)$. The set of candidates for the ith position will be the set of elements that have not appeared in the $(i − 1)$ elements of the partial solution, corresponding to the first $i − 1$ elements of the permutation.
In the scheme of the general backtrack algorithm, $S_k = \{1, . . . , n\} − a$, and $a$ is a solution whenever $k = n$. We can now construct all permutations with simple implementations of is_a_solution(), construct_candidates(), and process_solution(). make_move() and unmake_move() are not employed in this example and are thus stubbed out.

In [4]:
def generate_permutations(n):
    backtrack([], n)

def is_a_solution(a, data):
    return len(a) == data

def process_solution(a, data):
    print(a, end = " ")

def construct_candidates(a, data):
    return set(range(1, data + 1)) - set(a)

def make_move():
    pass

def unmake_move():
    pass
In [5]:
generate_permutations(3)
[1, 2, 3] [1, 3, 2] [2, 1, 3] [2, 3, 1] [3, 1, 2] [3, 2, 1] 

Sudoku


What is Sudoku? In its most common form, it consists of a 9×9 grid filled with blanks and the digits 1 to 9. The puzzle is completed when every row, column, and sector (3×3 subproblems corresponding to the nine sectors of a tic-tac-toe puzzle) contain the digits 1 through 9 with no deletions or repetition.
Backtracking lends itself nicely to the problem of solving Sudoku puzzles. We will use the puzzle here to better illustrate the algorithmic technique. Our state space will be the sequence of open squares, each of which must ultimately be filled in with a number. The candidates for open squares $(i,j)$ are exactly the integers from 1 to 9 that have not yet appeared in row $i$, column $j$, or the 3 × 3 sector containing $(i,j)$. We backtrack as soon as we are out of candidates for a square. The basic data structures we need to support our solution are:

In [6]:
from collections import namedtuple

Board = namedtuple("Board", ["m","freecount"])
# matrix of board contents
# we use 0 to represent an open square
m = [[3,0,6,5,0,8,4,0,0],
     [5,2,0,0,0,0,0,0,0],
     [0,8,7,0,0,0,0,3,1],
     [0,0,3,0,1,0,0,8,0],
     [9,0,0,8,6,3,0,0,5],
     [0,5,0,0,9,0,6,0,0],
     [1,3,0,0,0,0,2,5,0],
     [0,0,0,0,0,0,0,7,4],
     [0,0,5,2,0,6,3,0,0]]

# how many open squares remain?
freecount = sum(1 for row in m for elem in row if elem == 0)
board = Board(m = m, freecount = freecount)

Constructing the candidates for the next solution position involves first picking the open square we want to fill next(next_square). We pick the first open square we encounter:

In [7]:
def next_square(board):
    for row in range(9):
        for column in range(9):
            if board.m[row][column] == 0:
                return (row, column)
    return ()

Next, we need to identify which numbers are candidates to fill that square (possible_values). The candidates to fill the open square $(i,j)$ are:

$U - ( R \cup C \cup S ) \text{ where :}$
$U = \{1, 2, 3, 4, 5, 6, 7, 8, 9\}$
$R = \{x \mid x \text{ exists in row i}\}$
$C = \{x \mid x \text{ exists in column j}\}$
$S = \{x \mid x \text{ exists in (i, j)'s sector}\}$

In [8]:
# returns the top-left corner 
# of the 3 × 3 sector containing (x, y)
def sector_top_left(x, y):
    return ((x // 3) * 3, (y // 3) * 3)

def possible_values(board, x, y):
    (s_x, s_y) = sector_top_left(x, y)
    universal_set = set(range(1,10))
    row_set = set(board.m[x])
    column_set = {board.m[i][y] for i in range(9)}
    sector_set = {board.m[i+s_x][j+s_y] for i in range(3) for j in range(3)}
    union_set = (row_set | column_set | sector_set)
    return universal_set - union_set

We must update our board data structure to reflect the effect of filling a candidate value into a square, as well as remove these changes should we backtrack away from this position. These updates are handled by make_move and unmake_move, both of which are called directly from backtrack:

In [9]:
def make_move(board, x, y, value):
    m = board.m
    freecount = board.freecount
    m[x][y] = value
    freecount -= 1
    return Board(m = m, freecount = freecount)

def unmake_move(board, x, y):
    m = board.m
    freecount = board.freecount
    m[x][y] = 0
    freecount += 1
    return Board(m = m, freecount = freecount)

One important job for these board update routines is maintaining how many free squares remain on the board. A solution is found when there are no more free squares remaining to be filled:

In [10]:
def is_a_solution(board):
    return board.freecount == 0

The rest of the program is:

In [11]:
def process_solution(board):
    for row in board.m:
        print(row)

def construct_candidates(board):
    (x, y) = next_square(board)
    value_set = possible_values(board, x, y)
    if value_set:
        return {(x, y, value) for value in value_set}
    else:
        return set()

def backtrack(board):
    if is_a_solution(board):
        process_solution(board)
    else:
        candidate_list = construct_candidates(board)
        for x, y, value in candidate_list:
            board = make_move(board, x, y, value)
            backtrack(board)
            board = unmake_move(board, x , y)

def sudoku_solver(board):
    backtrack(board)
In [12]:
sudoku_solver(board)
[3, 1, 6, 5, 7, 8, 4, 9, 2]
[5, 2, 9, 1, 3, 4, 7, 6, 8]
[4, 8, 7, 6, 2, 9, 5, 3, 1]
[2, 6, 3, 4, 1, 5, 9, 8, 7]
[9, 7, 4, 8, 6, 3, 1, 2, 5]
[8, 5, 1, 7, 9, 2, 6, 4, 3]
[1, 3, 8, 9, 4, 7, 2, 5, 6]
[6, 9, 2, 3, 5, 1, 8, 7, 4]
[7, 4, 5, 2, 8, 6, 3, 1, 9]

The N-queens Problem

The N-queens problem asks:

How can N queens be placed on an N × N chessboard so that no two queens attack each other by being in the same column, row, or diagonal?

The basic data structures we need to support our solution are:

In [13]:
from collections import namedtuple

# m -- matrix representing the chessboard
# we use 0 to represent an open square and 1 to represent a queen
# n -- the required number of queens
# nq -- the number of queens currently on the board
Board = namedtuple("Board", ["m","nq","n"])
finished = False

def nqueen(n):
    # starting off with an empty chessboard
    board = Board(m = [[0]*n for _ in range(n)], nq = 0, n = n)
    backtrack(board)

A solution is found when we have the required number of queens on the board:

In [14]:
def is_a_solution(board):
    return board.nq == board.n

We print the configuration and turn off the backtrack search by setting off the global finished flag on finding a solution:

In [15]:
def process_solution(board):
    for row in board.m:
        print(row)
    global finished
    finished = True

Constructing the candidates for the next solution position involves finding all the open squares where the queen cannot be attacked:

In [16]:
# Let (x, y) be the square we are checking for legality

def any_queen_in_row(board, x):
    return any(elem == 1 for elem in board.m[x])

def any_queen_in_column(board, y):
    return any(board.m[i][y] == 1 for i in range(board.n))

def any_queen_in_diag(board, x, y):
    xs_down = range(x + 1, board.n)    # rows below (x, y)
    xs_up = range(x - 1, -1, -1)       # rows above (x, y)
    ys_right = range(y + 1, board.n)   # columns to the right of (x, y)
    ys_left = range(y-1, -1, -1)       # columns to the left of (x, y)
    diag_1 = list(zip(xs_up, ys_right)) + list(zip(xs_down, ys_left))
    diag_2 = list(zip(xs_up, ys_left)) + list(zip(xs_down, ys_right))
    diag = diag_1 + diag_2
    return any(board.m[d_x][d_y] == 1 for (d_x, d_y) in diag)

def square_is_legal(board, x, y):
    return (not any_queen_in_row(board, x)
            and not any_queen_in_column(board, y)
            and not any_queen_in_diag(board, x, y))

def construct_candidates(board):
    candidate_list = []
    for x in range(board.n):
        for y in range(board.n):
            if square_is_legal(board, x, y):
                candidate_list.append((x, y))
    return candidate_list

We must update our board data structure to reflect the effect of putting a queen on a chessboard square, as well as remove these changes should we backtrack away from this move. These updates are handled by make_move and unmake_move, both of which are called directly from backtrack:

In [17]:
def make_move(board, x, y):
    m = board.m
    nq = board.nq
    m[x][y] = 1
    nq += 1
    return Board(m = m, nq = nq, n = board.n)

def unmake_move(board, x, y):
    m = board.m
    nq = board.nq
    m[x][y] = 0
    nq -= 1
    return Board(m = m, nq = nq, n = board.n)

def backtrack(board):
    if is_a_solution(board):
        process_solution(board)
    else:
        candidate_list = construct_candidates(board)
        for (x, y) in candidate_list:
            board = make_move(board, x, y)
            backtrack(board)
            board = unmake_move(board, x , y)
            global finished
            if finished: return
In [18]:
nqueen(8)
[1, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 1, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 1]
[0, 0, 0, 0, 0, 1, 0, 0]
[0, 0, 1, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 1, 0]
[0, 1, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 1, 0, 0, 0, 0]