Depth-first and breadth-first search in Haskell

2021-01-09

In this blog post I will discuss how to implement Depth-first search (DFS) and Breadth-first search (BFS) in the functional programming language Haskell.

Before we get started by defining types for search problems and algorithms, let us import some useful library functions.

A classic way to show the difference between DFS and BFS uses binary trees:

But search problems and algorithms are more general than binary trees.

What is a search problem?

In general, a search problem is given by:

  • a start value where we begin our search,
  • an expand method to compute possible new values,
  • a property isDone telling us whether a given value is or suffices as a goal.

Here is a very simple search problem: How can we go from 23 to 42 by repeatedly adding 2 or 5?

In order to see what is happening and in which order, we use trace:

Another example: How can we move from (0,0) to (4,4) in a two-dimensional grid? (Note that here we use trace again to print intermediate steps.)

And another example, with a binary tree:

What is a Search Algorithm?

Well, something that hopefully solves a search problem. So a search algorithm takes a search problem and then maybe returns a value:

DFS and BFS in Haskell

To summarise the definition of loop: If the current x is what we wanted to find, then stop with Just x. Otherwise, apply the expand function to x and then map the loop function over this list.

Here it is crucial that Haskell is lazy: the resulting list expand x will not be fully evaluated before the loop function is called again. Instead, loop will be applied to the first element of expand x and only if the result is Nothing the next element will be tried. (By the way, this means dfs can work even if we have an infinitely branching search problem where expand x is an infinite list.)

BFS also uses a loop. But crucially, the parameter of loop in bfs is a list xs instead of a single point x.

DFS and BFS also differ in which function that is mapped:

  • In DFS we map loop over the result of expand x.
  • In BFS we map expand over all xs and apply loop to the result.

Examples

λ> dfs findNumber
Just 42
λ> bfs findNumber
Just 42

Well, that is not very informative, both algorithms find the same result of course. But the way they find it is different, and we can see that if we use findNumer' which prints intermediate steps.

λ> dfs findNumber'
23
25
27
29
31
33
35
37
39
41
43
46
44
Just 42
λ> bfs findNumber'
23
25
28
27
30
30
33
29
32
32
35
32
35
35
38
31
34
34
37
Just 42

Looking at the grid example we can also see that our bfs implementations is very inefficient. It visits many fields in the grid multiple times.

λ> dfs grid
(0,0)
(1,0)
(2,0)
(3,0)
(4,0)
(4,1)
(4,2)
(4,3)
Just (4,4)
λ> bfs grid
(0,0)
(1,0)
(0,1)
(2,0)
(1,1)
(1,1)
(0,2)
(3,0)
(2,1)
(2,1)
(1,2)
(2,1)
(1,2)
(1,2)
(0,3)
(4,0)
(3,1)
(3,1)
(2,2)
(3,1)
(2,2)
(2,2)
(1,3)
(3,1)
(2,2)
(2,2)
(1,3)
(2,2)
(1,3)
(1,3)
(0,4)
(4,1)
(4,1)
(3,2)
(4,1)
(3,2)
(3,2)
(2,3)
(4,1)
(3,2)
(3,2)
(2,3)
(3,2)
(2,3)
(2,3)
(1,4)
(4,1)
(3,2)
(3,2)
(2,3)
(3,2)
(2,3)
(2,3)
(1,4)
(3,2)
(2,3)
(2,3)
(1,4)
(2,3)
(1,4)
(1,4)
(4,2)
(4,2)
(4,2)
(3,3)
(4,2)
(4,2)
(3,3)
(4,2)
(3,3)
(3,3)
(2,4)
(4,2)
(4,2)
(3,3)
(4,2)
(3,3)
(3,3)
(2,4)
(4,2)
(3,3)
(3,3)
(2,4)
(3,3)
(2,4)
(2,4)
(4,2)
(4,2)
(3,3)
(4,2)
(3,3)
(3,3)
(2,4)
(4,2)
(3,3)
(3,3)
(2,4)
(3,3)
(2,4)
(2,4)
(4,2)
(3,3)
(3,3)
(2,4)
(3,3)
(2,4)
(2,4)
(3,3)
(2,4)
(2,4)
(2,4)
(4,3)
Just (4,4)

We can improve this a bit with a nub to remove duplicates from the list passed to "loop".

λ> slightlyBetterBfs grid
(0,0)
(1,0)
(0,1)
(2,0)
(1,1)
(0,2)
(3,0)
(2,1)
(1,2)
(0,3)
(4,0)
(3,1)
(2,2)
(1,3)
(0,4)
(4,1)
(3,2)
(2,3)
(1,4)
(4,2)
(3,3)
(2,4)
(4,3)
Just (4,4)

Getting back to binary trees, we can see that DFS and BFS will find different answers:

λ> dfs findNeedle
Just (Node "Needle A" Empty Empty)
λ> bfs findNeedle
Just (Node "Needle B" (Node "Hay" Empty Empty) (Node "Hay" Empty Empty))

Finally, it is very easy to define search problems on which DFS will run forever.

λ> dfs (SearchProb (1::Int) (\n -> [n*3,n*2]) even)
^CInterrupted.
λ> bfs (SearchProb (1::Int) (\n -> [n*3,n*2]) even)
Just 2

The order in which expand generates new values also matters for the result.

λ> bfs (1::Int) (\\n -> [n*3,n*2]) (\\n -> even n && n > 200)
Just 486
λ> bfs (1::Int) (\\n -> [n*2,n*3]) (\\n -> even n && n > 200)
Just 216

λ> dfs (1::Int) (\\n -> [n*2,n*3]) (\\n -> even n && n > 200)
Just 256
λ> dfs (1::Int) (\\n -> [n*3,n*2]) (\\n -> even n && n > 200)
^CInterrupted.