## Sunday, 14 June 2009

### The Shunting Yard Algorithm (2)

In my previous post, I looked at the Shunting Yard algorithm and implemented a *really* bad version which required you to escape parenthesis. Not very Lispy!

The code below makes a very small change which converts nested expressions (via lists) into RPN.

`(defn shunting-yard  ([expr]      (shunting-yard expr []))  ([expr stack]     (if (empty? expr)       stack       (let [token (first expr)             remainder (rest expr)]         (cond           (number? token) (lazy-seq                             (cons token (shunting-yard remainder stack)))           (operator? token) (if (operator? (first stack))                               (let [stackfn (partial op-compare token)]                                 (lazy-seq                                   (concat (take-while stackfn stack)                                           (shunting-yard remainder                                                          (cons token                                                                (drop-while stackfn stack))))))                               (shunting-yard remainder (cons token stack)))                      (sequential? token) (lazy-seq                                 (concat (shunting-yard token) (shunting-yard remainder stack)))                      :else (assert false))))))`

This is much nicer as now, as the tests below show, you can write expressions in a more natural way.

`(deftest test-shuntingyard  ;; Basic operators defined  (is (= (pretty-print (shunting-yard [100 + 200])) '(100 200 +)))  (is (= (pretty-print (shunting-yard [100 * 200])) '(100 200 *)))  (is (= (pretty-print (shunting-yard [100 / 200])) '(100 200 /)))  (is (= (pretty-print (shunting-yard [100 - 200])) '(100 200 -)))  ;; Test some precedence rules  (is (= (pretty-print (shunting-yard [100 * 200 + 300])) '(100 200 * 300 +)))  (is (= (pretty-print (shunting-yard [100 + 200 * 300])) '(100 200 300 * +)))  ;; Redundant Parenthesis  (is (= (pretty-print (shunting-yard [[[[1 + 1]]]] '(1 1 +)))))  (is (= (pretty-print (shunting-yard [1 + ] '(1 1 +)))))  (is (= (pretty-print (shunting-yard [ + ] '(1 1 +)))))  ;; Bracketing of expressions  (is (= (pretty-print (shunting-yard [[1 + 2 + 3] + 4])) '(1 2 + 3 + 4 +)))   (is (= (pretty-print (shunting-yard [[1 + 2 * 3] + 4])) '(1 2 3 * + 4 +)))  (is (= (pretty-print (shunting-yard [[4 + 5] / [7 - 8]])) '(4 5 + 7 8 - /))))`

I was thinking (as was Mark) that I could write a macro in order to simplify writing expressions. The idea would be to write something to convert infix to prefix (rather than postfix as here).

On a slight tangent, in other Lisps, you can write a reader macro which would allow you to annotate an expression with a symbol that your reader would use to convert into the appropriate form. Clojure doesn't allow reader macros (there's a discussion here [see around 19:00] as to the reasons why).

Before we write this as a macro, we'll first need an evaluator loop. This is dead simple and just involves (again) a stack. This is a very basic version that assumes all operators have arity 2.

`(defn rpn-evaluator  ([expr]     (rpn-evaluator expr []))  ([expr stack]     (if (empty? expr)       (if (= 1 (count stack))         (first stack)         (assert false)) ;; unbalanced       (let [f (first expr)]         (cond              (number? f) (recur (rest expr) (cons f stack))                      ;; Should look up arity of operator!           (operator? f) (recur (rest expr)                                 (cons                                 (f (second stack) (first stack))                                 (drop 2 stack))))))))`

Having written this code, what does it buy me to write it as a macro? Not much! A macro is simply a translation of some source code to another lump of source code. In this case I'm better off writing a simple function, such as `(def infix (comp rpn-evaluator shunting-yard))` to get the job done. The main point is that the macro doesn't have access to the values in the expression (unless they are constant). If the code was `(infix [a + b])` then the macro can not evaluation a and b because the values will not be known at macro expansion time.

The one use of the macro that I can see would be for constant folding which I'm sure already occurs as part of the compilation process!

Programming Clojure gives a taxonomy of macros as:

• Conditional Evaluation
• Defining vars
• Java interop
• Postponing evaluation
• Wrapping evaluation
• Avoiding a lambda

None of these seem appropriate here to me, so I *think* writing it as a function is the right way to do it. I need to read On Lisp again since I think my macro understanding is a little off.