Sunday, 14 December 2008

Recursion in Clojure

How do you write functions that don't explode the stack if Clojure doesn't have TCO?

Let's start with a bad definition of factorial:


(defn factorial [x]
(if (= x 0)
1
(* x (factorial (- x 1)))))


Using it in the REPL is fine:


user> (factorial 100)
933262154439441526816992388562667004907
159682643816214685929638952175999932299
156089414639761565182862536979208272237
582511852109168640000000000000000000000
00


Up to a point...


user> (factorial 100000)
; Evaluation aborted
java.lang.StackOverflowError (NO_SOURCE_FILE:0)
[Thrown class clojure.lang.Compiler$CompilerException]

Restarts:
0: [ABORT] Return to SLIME's top level.
1: [CAUSE] Throw cause of this exception


This is barfing because the evaluator has to keep around state for each call due to the expression (* x (factorial (- x 1))) . We need to make this function tail recursive.

recur can be thought of as the Clojure operator for looping. Think of it like a function call for the nearest enclosing let or function definition supplied with new variables. Naively we can switch over to using this by doing:


user> (defn factorial2 [x]
(if (= x 0)
1
(* x (recur (- x 1)))))


But this is a compile-time error (which in itself is pretty neat!).


java.lang.UnsupportedOperationException: Can only recur from tail position (NO_SOURCE_FILE:4)


An accumulator parameter is an extra parameter to a function that's used to gather intermediate parts of the calculation. If we do this, we can make sure that the recur call is in the tail position. Using an anonymous function we get:

(defn factorial3 [x]
((fn [x y]
(if (= x 0)
y
(recur (- x 1) (* x y)))) x 1))


Now when recur is used, it doesn't need to keep any of the previous stack frame around. This means we can finally calculate factorial 1000000, which begins with 282 and ends with lots of zeros!