BEER BEAR BEAD BEAK BEAT BELT WELT WILT WILE WINE
Let's write a simple program that calculates word ladders for a simplified version of the game where each change can only change a single letter at a time (rather than allow deletes and inserts too). Finding a word ladder can be thought about as a graph problem. Each node is a word, and each edge represents a connection to another valid word. The graph below is a partial representation of the graph starting with BEER. The real graph is much more complicated!
First things first, how do we build the graph? For starters we need to know if two words are neighbours or not. The following simple functions determine the difference between two strings. Remember that I'm only working on the simplest possible distance metric at the moment, that is allowing a single character to change.
neighbour :: String -> String -> Bool
  neighbour x y = difference x y == 1
                                     
  difference :: String -> String -> Int                     
  difference [] [] = 0
  difference (x:xs) (y:ys) | x == y = difference xs ys
                           | otherwise = 1 + difference xs ys
  difference _ _ = error "Two strings must be the same length"
Next thing we need to grab is a huge list of words. We'll store these words in a Set because we'll want to test for membership frequently (an O(lg(N)) operation. An alternative would be to use a perfect hash (this is an option because the dictionary is fixed). This would give O(1) lookup times, and it looks like there is (as almost always) a library on Hackage that does just that. The simple distance metric chosen means we can limit the number of words based on the size of the input word.
import qualified Data.Set as S
  type WordSet = S.Set String
  wordListPath :: String
  wordListPath = "/usr/share/dict/british-english"
  createDictionary :: Int -> IO WordSet    
  createDictionary n = do
    file <- readFile wordListPath 
    return $ S.fromList $ filter (\x -> length x == n && all isAlpha x) (map (map toLower) $ words file)
Once we've got a dictionary, all we need to do is build the graph. Since Haskell is lazy, we don't need to worry about the space complexity of the graph - we'll just build it lazily and only the bit that is explored will be resident in memory.
Each node contains the word it represents and the links to the child elements. The graph is built by starting at a root, and filling all the valid neighbours. Each time we place a word in the graph we remove it from the dictionary, otherwise we'll get cycles in the graph.
data Node = Node String [Node] deriving Show
  buildGraph :: WordSet -> String -> Node 
  buildGraph wordset top = Node top (map (buildGraph smaller) neighbours)
    where
      neighbours = S.toList (S.filter (neighbour top) smaller)
      smaller = S.delete top wordset 
The graph is *huge*, so we need to find some way to limit the search space. The most obvious way is to give a restriction on the depth of the search. A word ladder that is 10000 words rungs high is probably not much fun to complete. We can also cut the search short if the word is too many changes away given the maximum depth (for example, if 4 characters need to change in a 5 letter word and the maximum left to search is 3 then we can prune this search branch).
search :: Node -> Int -> String -> [String]
  search graph maxDepth goal = search' graph maxDepth goal []
  search' :: Node -> Int -> String -> [String] -> [String]
  search' (Node end children) maxDepth goal path 
    | end == goal    = end : path 
    | null children  = [] 
    | length path >= maxDepth = [] -- too deep
    | difference end goal >= maxDepth - length path = [] -- too much difference
    | otherwise = first
      where
        childRoutes = filter (not . null) $ 
                      map (\child -> search' child maxDepth goal (end : path)) children
        first | null childRoutes = []
              | otherwise        = head childRoutes          
        quickest | null childRoutes = []
                 | otherwise = minimumBy (comparing length) childRoutes                         
The way we search the children is important. In this case we've used
first gone for the first available route that satisfies the depth guarantee, but isn't guaranteed to be the shortest route.  quickest on the other hand calculates all child routes and finds the minimum length part.Finally, we can put this all together and write a simple search search function.
makeLadder :: Int-> String -> String -> IO [String]
  makeLadder maxDepth start end 
    | length start /= length end = error "Only two strings of equal length are currently supported."
    | otherwise = do    
        dict <- createDictionary (length start)
        if (S.member start dict && S.member end dict)
          then return $ search (buildGraph dict start) maxDepth end
          else return []
The complete code for this version available on my git hub repo here.
This version has several problems.
- It's too slow - searching for the minimal path can take considerable time
 - It's not very flexible
 
  data DistanceMetric = DistanceMetric (Word -> Word -> Int) (Word -> WordSet)
  difference :: Word -> Word -> Int                     
  difference x y 
    | length x /= length y = 999999
    | otherwise = sum $ zipWith (\c1 c2 -> if c1 == c2 then 0 else 1) x y
                
  transposeChar :: Word -> [Word] 
  transposeChar [] = []
  transposeChar (x:xs) = map (:xs) (validChars \\ [x])
                
  deleteChar :: Word -> [Word]                       
  deleteChar [] = []
  deleteChar (x:xs) = [xs]
  insertChar :: Word -> [Word]
  insertChar [] = []
  insertChar (x:xs) = map (\y -> y:x:xs) validChars
  differenceEdit :: Word -> WordSet
  differenceEdit x = edit' x [transposeChar]
    
  editDistanceEdits :: Word -> WordSet
  editDistanceEdits x = edit' x [insertChar,transposeChar,deleteChar]
  edit' :: Word -> [Word -> [Word]] -> WordSet
  edit' w fns = S.fromList $ concat $ 
                zipWith (\x y -> map (\z -> x ++ z) (concatMap (\x -> x y) fns)) 
                (inits w) (tails w)
  simple :: DistanceMetric
  simple = DistanceMetric difference differenceEdit
  edits :: DistanceMetric
  edits = DistanceMetric editDistance editDistanceEdits
This gives two distance functions and two ways of generating edits.  The Levenshtein distance  of 1 is generated by transposing, deleting and inserting characters from the original word.
This gives us the flexibility, because another distance metric could be put in place (anagrams perhaps?).  Next to performance.
  buildGraph :: DistanceMetric -> WordSet -> Word -> Node 
  buildGraph d@(DistanceMetric dist edits) wordset top = Node top (map (buildGraph d smaller) neighbours)
    where
      possibleNeighbours = edits top
      neighbours = S.toList (smaller `S.intersection` possibleNeighbours)
      smaller = S.delete top wordset 
    
  search :: DistanceMetric -> Node -> Int -> Word -> [Word]
  search (DistanceMetric dist _) graph maxDepth goal = search' graph []
    where 
      search' (Node end children) path 
        | end == goal    = end : path 
        | length path >= maxDepth = [] -- too deep
        | dist end goal >= maxDepth - length path = [] -- too much difference
        | otherwise = first
          where
            -- Find the best node to search by comparing it against the goal
            costForNextChild :: [(Int,Node)]
            costForNextChild = zip (map (\(Node x _) -> dist x goal) children) children
            bestFirst = map snd $ sortBy (comparing fst) costForNextChild
        
            -- Best first search
            childRoutes = filter (not . null) $ map (\child -> search' child (end : path)) bestFirst
      
            first | null childRoutes = []
                  | otherwise        = head childRoutes                                                                                      
Two things have changed from the original code.  The first is that the graph is built by comparing the edits against the dictionary, rather than the word against the whole dictionary.  This is the main saving and makes it *hugely* faster (thanks to jkkramer for the pointer and this post.)  The only other change is that we decide which node to search next based on how close it is to the goal (a best-first search).
With these changes it can now solve all of the problems I've tried at wordchains.com.  Neat.
The complete code is available here.
