Friday, 26 December 2008

Bubbling Clojure

There's quite a few sorting algorithms around. I wanted to do something similar to Sorting Algorithms, only in Clojure + Swing as some kind of learning exercise. (99 Problems is still on-going, but to be honest it's boring (!) I'll reach the end one day...).

Perhaps the simplest is bubble sort, go through the list and if you find two adjacent items out of order, swap them. Repeat until done.


(defn bubble [lst]
(if (or (nil? lst) (nil? (second lst)))
lst
(if (> (first lst) (second lst))
(cons (second lst) (cons (first lst) (nthrest lst 2)))
(lazy-cons (first lst) (bubble (rest lst))))))

(defn bubble-sort [lst]
(if (= (bubble lst) lst)
lst
(recur (bubble lst))))


I couldn't find a neater way of doing it. An alternative is to use iterate to apply bubble until no more swaps are necessary. This at least makes the time complexity pretty obvious.


(defn bubble-sort [lst]
(last (take (* (count lst) (count lst)) (iterate bubble lst))))


We can simplify this further. The bubble function sucks because it only applies a single swap and then just rebuilds up the sequence. Applying the function as we go up we get:


(defn bubble [lst]
(if (or (nil? lst) (nil? (second lst)))
lst
(if (> (first lst) (second lst))
(cons (second lst) (lazy-cons (first lst) (bubble (nthrest lst 2))))
(lazy-cons (first lst) (bubble (rest lst))))))


Quicksort is a much better algorithm. The algorithm is:

  • Pick a pivot X

  • Move all items less than the pivot to the left and all greater to the right

  • Apply recursively to each side



This has a very simple implementation in Clojure


(defn qsort [lst]
(if (nil? lst)
lst
(concat
(qsort (filter (partial > (first lst)) (rest lst)))
(list (first lst))
(qsort (filter (partial <= (first lst)) (rest lst))))))


How do the two compare?

First I needed to find out how to generate a big random sequence. repeatedly applies a function of no arguments and generates an infinite list. Using repeatedly and take, you can get a list of random elements like this:


(take 500 (repeatedly (fn [] (rand-int 100)))


So now I can compare the two:


user> (time (count (bubble-sort (take 1000 (repeatedly (fn [] (rand-int 100)))))))
"Elapsed time: 575.713898 msecs"
1000
user> (time (count (qsort (take 1000 (repeatedly (fn [] (rand-int 100)))))))
"Elapsed time: 38.933079 msecs"
1000


Things I still need to work out:

  • Why does my bubble sort implementation blow the stack with large lists? Shouldn't the laziness mean it shouldn't? last looks daft, replace with take/drop combo
  • Multiple return values (or faking them). It'd be nice to stop swapping in the bubble sort when no swaps are done. Clojure doesn't support multiple return values (for compatibility with Java)


Things I have learnt:

  • Iterate is a cool function