Friday, 3 July 2009

Finding the Target

I've been steadily working my way through Paradigms of Artificial Intelligence Programming which I am thoroughly enjoying. This time round it's time to look at simple search algorithms. Norvig introduces all the traditional tree search algorithms based off a single function shown below:

From this basic algorithm we can express two simple tree search algorithms, depth first and breadth first.

For depth-first, we simply follow one branch of the tree until it matches the goal or fails to complete. The combiner function puts the next node to be searched at the front of the list of states so that we always follow paths to completion. Depths first search can have problems when the structure to be searched is infinite, as it will blindly follow a path forever. Breadth-first works by exploring all paths at the same level. Breadth-first search doesn't suffer from the same infinite search problem as depth-first search, but it's potentially slower because it is covering each and every node. Both depth and breadth based searches have the same fundamental disadvantage, they don't have any knowledge about the state space and just methodically follow the same pattern regardless of the search.

Best-first search chooses the successor nodes to search first by searching on the results of a cost-function.

This will still visit all the nodes, but the choice of what to search next is based on the cost function. If the cost function is right, then best-first-search can be significantly better than depth or breadth first searching. Beam search is an optimization of best search that forgoes the requirement of searching every node. Instead only a fixed number of alternative states (the beam width) are considered.

The advantage of this is lower memory requirements and that the nodes that the beam width covers can be searched deeper. However, if the beam width is too small, then this search algorithm can fall into the trap of endlessly searching down the wrong path.

To illustrate these search algorithms, we'll use the ridiculous example of the game of darts. Briefly, Darts is played by drinking vast quantities of alcohol and then trying to throw darts at a board. You score points between 1 and 20 (either normal, double or triple), and there's also a outer bull (25) and an inner bull (50). You start from 501, and try to get to zero as quickly as possible, with the caveat that the last dart you throw should be a double or a bulls-eye.

The code below models darts as a series of throws:

Now we can use this with the search algorithms described above.

From the descriptions, you'd expect breadth and depth first search to find the right solution, but be a bit rubbish about doing so. Because they have no idea of the means to get to the goal, they'll just chuck darts about randomly and eventually weave their way to a solution. And indeed they do.

Solving the game of darts using depth-first search from 501 is not ideal! The first branch is happens to explore is the contains 1, so the search repeatedly throws 1's until it's not a legal move. Not good!> (count (:throws (solve-darts-depth-first 501)))     

Let's see if beam search can do any better:

The cost function models it as however close we are to zero divided by the number of throws of the data. The goal should be to get to the finish as quickly as possible.> (:throws (solve-darts-beam-search 501))
[441 381 321 261 201 141 84 30 0]    

We can see that this does achieve the perfect 9 dart finish. (T20,T20,T20,T20,T20,T20,T19,T17,D15). Finishing from 1000001 will apparently take 16667 darts!

The approach described in the book is very clear. The key realization for me was that you don't have to have the entire state space in memory at once for a tree algorithm to succeed; you just need to think in terms of the next step to take.