Saturday 13 June 2009

The Shunting Yard Algorithm

The Shunting Yard Algorithm converts expressions written in infix notation (i.e. with the operator between the operands) to Reverse Polish Notation (RPN). The algorithm was published by Dijkstra in November 1961 and the original paper can be found here.

RPN is often used in stack-based languages, such as Joy. In this respect, Clojure is to prefix notation as Joy is to stack-based notation.

The algorithm is simple and the description on Wikipedia is very complete; but it still was a great learning exercise for lazy sequences. The code implements a very simple expression parser based on a sequence of elements. Currently nesting expressions isn't supported directly, you have to use \( and \) to escape parenthesis.

;;; Shunting Yard Algorithm
(:use clojure.contrib.test-is))

(defstruct operator :name :precedence :associativity)

(def operators
#{(struct operator + '1 'left)
(struct operator - '1 'left)
(struct operator * '2 'left)
(struct operator / '2 'left)
(struct operator '^ 3 'right)})

(defn lookup-operator
(first (filter (fn [x] (= (:name x) symb)) operators)))

(defn operator?
(not (nil? (lookup-operator op))))

(defn op-compare
[op1 op2]
(let [operand1 (lookup-operator op1)
operand2 (lookup-operator op2)]
(and (= 'left (:associativity operand1)) (<= (:precedence operand1) (:precedence operand2)))
(and (= 'right (:associativity operand1)) (<= (:precedence operand1) (:precedence operand2))))))

(defn- open-bracket? [op]
(= op \())

(defn- close-bracket? [op]
(= op \)))

(defn shunting-yard
(shunting-yard expr []))
([expr stack]
(if (empty? expr)
(if-not (some (partial = \() stack)
(assert false))
(let [token (first expr)
remainder (rest expr)]

(number? token) (lazy-seq
(cons token (shunting-yard remainder stack)))

(operator? token) (if (operator? (first stack))
(concat (take-while (partial op-compare token) stack)
(shunting-yard remainder
(cons token
(drop-while (partial op-compare token) stack)))))
(shunting-yard remainder (cons token stack)))

(open-bracket? token) (shunting-yard remainder (cons token stack))

(close-bracket? token) (let [ops (take-while (comp not open-bracket?) stack)
ret (drop-while (comp not open-bracket?) stack)]
(assert (= (first ret) \())
(concat ops (shunting-yard remainder (rest ret)))))

:else (assert false))))))

(defn pretty-print
(map (fn [x]
(= x +) '+
(= x -) '-
(= x /) '/
(= x *) '*
:else x)) expr))

(deftest test-shuntingyard
(is (= (pretty-print (shunting-yard [100 + 200])) '(100 200 +)))
(is (= (pretty-print (shunting-yard [100 * 200])) '(100 200 *)))
(is (= (pretty-print (shunting-yard [100 / 200])) '(100 200 /)))
(is (= (pretty-print (shunting-yard [100 - 200])) '(100 200 -)))
(is (= (pretty-print (shunting-yard [4 + 5 + \( 6 * 7 \)] '(4 5 + 6 7 * +)))))
(is (= (pretty-print (shunting-yard [\( \( \( 6 * 7 \) \) \)] '(6 7 *)))))
(is (= (pretty-print (shunting-yard [3 + 4 * 5])) '(3 4 5 * +))))

Most of the entries on my blog could be considered Coding Dojo type tasks. I'm basically after problems I can solve in a few hours that I can learn something from!


  1. Neat!

    I was recently wondering how straight-forward this type of conversion would be and, in particular, whether it would be possible to create some kind of macro to convert infix to prefix.

    It would be great to have a macro that allows more maths-intensive algorithms in Clojure to be expressed in an infix style. For example, one could write (i2p [4/3 * Math.PI * r^3]) and have it expanded to the equivalent prefix Clojure code before it is executed.

    How difficult do you think something like this would be?

  2. Thanks for the comments.

    That's exactly why I was looking at it! I tried to jump immediately to writing the macro, but quickly got myself tied up in knot.

    Now I've got something simple in place (with tests) hopefully I can refactor it into a function working with nested lists, and then into a macro.

    I'll have a go at doing it tomorrow and post the results here!