Saturday 31 January 2009

Google is Malware

Google is Malware?
Today, Google had a big bug, and marked all search results as malware

Thursday 29 January 2009


Eliza was one of the first AI programs, written by Weizenbaum in 1966. It's the basis of the Emacs Psychotherapist.

It is very basic and simply works on pattern matching. For example:

Pattern: (I need a ?X)
Input: (I need a vacation)
Output: {?X vacation}

By formulating some stock rules (and writing the pattern matching code obviously) we get Eliza.

Code is in my GitHub repository as eliza.clj.

Useful functions I found whilst developing this:

  • Declare for forward declarations
  • Some - which returns the first non-true value for fn[x] in any sequence e.g.

    user> (some (fn [x] (when (> x 0) x)) '(0 -1 -2 -1 0 1 2 3)) ==> 1

  • Replace - takes a map of source->replacement and performs replacements

    user> (replace '{a,b b,c c,d} '(a b c)) ==> (b c d)

Tuesday 27 January 2009

Clojure Forward Declarations

Whilst working on Eliza I found I had a couple of functions that were mutually recursive; this meant I couldn't compile them because one didn't have the declaration of the other.

The solution is declare e.g.

(declare my-odd? my-even?)
(defn my-even? [x]
(if (zero? x)
(my-odd? (dec x))))

(defn my-odd? [x]
(if (zero? x)
(my-even? (dec x))))

(Update after comment from Matt)

As pointed out, this doesn't actually work because the stack can blow:

user> (my-even? 1000000000) ==> stackoverflow!

Using trampoline can fix the issue if you rewrite it so that the tail calls are actually closures

(declare my-odd?)
(defn my-even? [x] (if (zero? x) true #(my-odd? (dec x))))
(defn my-odd? [x] (if (zero? x) false #(my-even? (dec x))))

user> (trampoline #(my-even? 100000000)) ==> true

The syntax #(my-even? 1000000000) creates an anonymous function of no arguments (a thunk to be evaluated later) e.g.

user> (apply #(+ 1 1) ) ==> 2

Passing Parameters in Clojure

Clojure has support for a number of ways of passing parameters. A function body can have multiple function bodies associated with it, each with different arity. This isn't any where near as general as Haskell's pattern matching (see [1]).

user> (defn foo ([x] x) ([x y] (+ x y)) ([x y z] (+ x y z)))
user> (foo 1)
user> (foo 1 2)
user> (foo 1 2 3)

Similarly, you can make a var arg function with the & notation.

user> (defn bar ([x] x) ([x & rest-args] (reduce + (cons x rest-args))))
user> (bar 1)
user> (bar 1 2 3 4 5 6 7)

The last option for parameter passing is keywords, the parameter list becomes a map. This is discussed in more detail here.

Monday 26 January 2009

Design Patterns in Dynamic Programming

Peter Norvig (of Google fame) has a page Design Patterns in Dynamic Programming which is exactly the kind of reference I've been after, in terms of discussing common object-oriented patterns in a functional context. It also features a number of other FP patterns:

  • Coroutines
  • Control Abstraction
  • Memoization
  • Rule-based Translator

As an aside, Norvig's book (Paradigms of AI Programming) is next on my list to port a few interesting examples across from. Eliza is first on my hit list!

I'm still yet to see something akin to GoF for functional programming. Perhaps this is because it doesn't exist and functional programming "in-the-large" is exactly the same as in the small?

Wednesday 21 January 2009

Multi-methods in Clojure

Multimethods (more commonly known as multiple dispatch) means that a method call is dispatched dynamically based upon the type of multiple arguments.

In Java / C++ you'd typically approach this with the visitor pattern, with something like the following.

interface Foo {
// ... lots of exciting methods
void accept( FooVisitor visitor );

interface FooVisitor {
void visit( Bar bar );
void visit( Baz baz );

class Bar implements Foo {

//... implementation

public void accept ( FooVisitor visitor ) {
visitor.visit( this );

class Baz implements Foo {

//... implementation

public void accept( FooVisitor visitor ) {
visitor.visit( this );

Why's this not good? Well, the FooVisitor has to know about all the subclasses, which is pretty nasty. In addition, well, it's just a lot of typing - way too much scaffolding. Languages like Smalltalk and Lisp both support multi-methods natively which give you all the benefits of this "pattern", without the crazy verbosity of the Java version.

Clojure multimethods support dispatching on variables other than the runtime type of an argument. From the Clojure docs, you can dispatch on "types, values, attributes and metadata of, and relationships between, one or more arguments".

Clojure multi-methods consist of two things:

    * Some methods (the multi!)
    * A dispatcher function (chooses which one)

Here's a trivial example defining an increment function that dispatches on the class of the arguments. class is a built in function within Clojure core that gives the class of the object.

(defmulti my-increment class)

(defmethod my-increment String [s]
(let [x (Integer/valueOf s)]
(str (my-increment x))))

(defmethod my-increment Integer [i]
(+ i 1))

user> (my-increment 4) ==> 5
user> (my-increment "4") ==> "5"

But, we do not just have to be boring and dispatch only on the type of the argument, we could be dispatching on the value:

(defmulti my-decrement identity) ;; identify is built in (fn [x] x)

(defmethod my-decrement 0 [x]

(defmethod my-decrement :default [x]
(- x 1))

user> (my-decrement 2) ==> 1
user> (my-decrement 1) ==> 0
user> (my-decrement 0) ==> 99999

The dispatching function isn't limited to just the first argument, you could dispatch on the type of multiple arguments e.g.

(defmulti my-add (fn [x y] (and (string? x) (string? y))))

(defmethod my-add true [x y]
(str x y))

(defmethod my-add false [x y]
(+ x y))

user> (my-add 3 4) ==> 7
user> (my-add "3" "4") ==> "34"

This gives about the most general method of method dispatch imaginable!

The documentation explains how this goes further still, allowing relationships to be created with derive and how you can add/remove/define an ordering implementations of methods at runtime.

Reading Programming Clojure shows that multi-methods are very rarely used in most of the current Clojure code (only about 1 occurrence per thousand lines of code). There's some guidelines in the book about choosing whether to use multi-methods, but it's summed up best with "try writing the function in both styles, and pick the one that seems more reliable".

Tuesday 20 January 2009


Compojure is a web-framework for Clojure.

Grab the latest source from Git and build with Ant.

git clone git://
cd compojure

Now you've got to add Compojure into your classpath. I use Emacs and Slime so I edited my .emacs file based on some suggestions from IRC and got (in addition to the usual)

(setq swank-clojure-jar-path (concat clj-root "clojure/trunk/clojure.jar")))
(setq swank-clojure-extra-classpaths (directory-files "~/lisp/clj/" t ".jar$"))

When ~/lisp/clj contains all my JARs (well symlinks anyway).

After this, I wasted (well wasted probably isn't true, I got some new knowledge) precious minutes of my life wondering why I couldn't get anything working, but on the plus side I found some functions that helped:

user> (all-ns)
;; big list of name spaces, check that you've got Compojure on here!

user> (System/getProperty "java.class.path")
;; big class path list, make sure you have Compojure + all the dependent JARs

Then, I got a bit lost because I'd mismatched versions of compojure.jar and! Rather than using the version of Clojure that came with Compojure, I was using the head revision - bad idea! Use a consistent set of JARs that come from the git repository.

And now I'm able to compile the examples from the Wikibook and all is well.

I also decided to finally by the Programming Clojure book, now all I need is a decent e-book reader!

Monday 19 January 2009

Functional Tree Editing

The zipper is a Functional Pearl which allows localized editing of a list-based data structure (most commonly trees).

The idea is very simple, for any list-based data structure you can create a data-structure (the zipper) which allows you to navigate (next, previous, up, down) but also do functional-editing (by which I mean localized editing without having to reconstruct the entire data structure). The original data is untouched.

In the example below (see [1] for docs), we create a zip structure from a vector zip/vector-zip. Navigating along we move along with zip/next, zip/remove and return to the root with zip/root

user> (let [zip ( [1 2 3 4 5])]
( ( ( zip))))
[2 3 4 5]

This example looks a bit backward; we can make things clearer with ->. This is a macro that "threads" function calls, we can explain this a bit clearer with macroexpand as this shows what a macro actually becomes.

user> (macroexpand '(-> [1 2 3] rest rest))
(rest (clojure.core/-> [1 2 3] rest))

user> (macroexpand '(-> [1 2 3] rest))
(rest [1 2 3])

;; Put the two together and we get (rest (rest [1 2 3])) ==> 3

If we apply this code to the previous zip example, we get something much clearer (apart from the namespace gumph).

user> (let [zip ( [1 2 3 4 5])]
(-> zip

The main point of zippers is that the original is untouched.

user> (let [data [1 2 3 4 5] zip ( data)]
(prn "Changed is " (-> zip
(prn "Original is " data))

"Changed is " [2 3 4 5]
"Original is " [1 2 3 4 5]

Clojure provides zippers for vectors, sequences and XML elements.

Learning Emacs

Emacs and vi/vim are the two most popular text editors on Unix. I'm currently learning Emacs (yes, learning, there's a bit more to it than Notepad).

The coolest thing I've found so far is Emacs keyboard macros.

  • Start recording with C-x (
  • Type keys
  • End recording with C-x )
  • Execute recording with C-x e

You can combine the execution with C-u to repeat the command a given number of times.

Wednesday 14 January 2009

Using Agents

I've had a hard time groking Clojure agents, so this is mostly just a series of daft micro-examples to understand them together with restating the bleeding obvious.

An agent is created with agent and the associated data. You can use deref or @ to access the data associated with an agent (same syntax as I mentioned previously for atoms).

user> (agent '(1 2 3 4 5 6))
user> @(agent '(1 2 3 4 5 6))
(1 2 3 4 5 6)

Agents are reactive - they'll only do something if you tell them so. All communication with agents is through functions. There are two commands for sending functions to agents, send (used for non-blocking calls) and send-off (used for potentially blocking calls). Both send and send-off return immediately, the difference being that send-off guarantees the message will be processed in a different thread.

user> (let [a (agent 4)]
(send a + 1) ; schedules a call of (apply + agent_state 1)
(await a)
(prn @a))

Without the invocation of await, this may return 4 not 5. await blocks the current thread indefinitely until all actions have been completed (which makes it quite dangerous!).

What if the function you are sending has errors? Let's look at a divide by zero:

user> (let [a (agent 4)]
(send a / 0)
(await a)
(prn @a))
; Evaluation aborted.
java.lang.Exception: Agent has errors (NO_SOURCE_FILE:0)
[Thrown class clojure.lang.Compiler$CompilerException]

Errors in agents can be inspected with agent-errors which returns a sequence of exceptions. Once an agent is in an error state it can not process any more messages until the errors have been cleared with clear-agent-errors.

user> (let [a (agent 4)]
(send a / 0)
(await a)
(prn (agent-errors a))
(clear-agent-errors a)
(prn @a))


So agents seem incredibly simple - why are they so powerful?

  • Integration with STM, which leads too...
  • Always observable - no matter what calculation is going on, you can always get the current (valid) value with deref.
  • No message-loops - agents are purely reactive which (to me) seems a simpler programming model to understand
  • Open - agents are mutated by functions and therefore this can be extended by anyone (in contrast with having to add a new type of message for the loop to switch on)

Monday 12 January 2009

Metadata in Clojure

Each Clojure function can have a documentation string associated with it. This must be declared immediately before the parameters, for example:

(defn foo-baz
"Foo-bazes bar"
(+ bar 77))

user> (doc foo-baz)
Foo-bazes bar

This is actually short hand for:

(defn #^{:doc "Foo-bazes bar"} foo-baz
(+ bar 77))

#^ indicates to the reader that the following map is metadata to be associated with the next element read. You can use this to document def'd values too (e.g. (def #^{:doc "This is the value x"} x).

Metadata is associated with symbols or collections, and meta can be used to view the associated values. Initially I was doing (meta count) and expecting it to come up with the details about agent. This is wrong because count is an instance of the function, whereas what I need to pass as an argument to meta is the symbol associated with the function e.g.

user> (meta count)
user> (meta #'count) ; ^#'count also works (^ is a reader character to get the metadat)
{:ns #, :name count, :doc "Returns the number of items in the collection.
(count nil) returns\n 0. Also works on strings, arrays, and Java Collections and Maps",
:arglists ([coll]), :line 842, :file "core.clj"}

Note how extra metadata has arrived when it was compiled, including the line number of the original file, name space etc.

There are many cool potential applications of metadata in Clojure, though I haven't seen any implemented yet!:

  • Aspect oriented programming (side note inspiration for this came from Lisp Advice (see Emacsspeak mentioned in the book Beautiful Code.)
  • Software tools - knowing the line number and file helps in a number of ways, maybe could be used to do semantic diff (for example, re-ordering my functions wouldn't make any difference because they'd semantically still be the same).


The Ants simulation is the de-facto example of agents in Clojure. Grab the full code from the Clojure Concurrency presentation.

Saturday 10 January 2009

Software Transactional Memory

Software Transactional Memory is a technique that aims to simplify developing concurrent programming. It allows you to define units of work (transactions) which obey the ACID properties:

  • Atomic - work either happens completely or not at all.
  • Consistent - only valid work is carried out
  • Isolation - no-one can see intermediate work, only the final result
  • Durable - Once the work has been completed, it's done and will survive failure

During a unit of work each thread completes modifications to the (shared) data structure. All modifications are recorded in a log. When the transaction reaches the end, the memory logs are analysed; if the logs are consistent (e.g. no read/write of the same variable) then the transaction is commited, if not the transaction is aborted or retried. All functions in a transaction must be pure. IO in a transaction is wrong. A language like Haskell enforces this via its type system - IO operations are performed with a monad. STM is still very new, there's no guarantee that this will be the "winner" in the concurrency programming war...

Friday 9 January 2009

Generating Text

Often you need a chunk of vaguely real looking text to test some code, web layout, file handling etc. In ANSI Common Lisp (which I'm currently reading if you hadn't guessed) there's an example of how to generate random text.

The idea is simple - read in a list of words, for each pair of words keep a count of the number of occurrences of that pair. Once you've got that data, you can pick a word at random and then pick from a probability distribution what the next word should be. Apply that pattern until you've generated enough text. There's much more sophisticated work based on same ideas.

The example Lisp code is quite nasty and it's written in what feels like an iterative style. In Clojure we have access to a richer standard library (Java). For example, we can read in a list of words thus:

(defn file-as-wordlist [f]
(filter (fn [x] (> (count x) 0)) (.split (slurp f) "\n|[ ]|\r|[.]|[,]|[\"]")))

This takes a file name as an arguments, slurps the entire file into memory and splits it using String.split

Next we need to be a frequency map which has an entry for each word, together with a count of each word that follows it. We use a map of Word => (Word => Count).

(defn build-frequency-map [words]
(let [word-pairs (mapcat (fn [x y] (list [x y])) (cons (last words) words) words)]
(reduce (fn [accum v]
(let [w1 (first v) w2 (second v) val (get accum w1)]
(if (nil? val)
(assoc accum w1 {w2 1})
(let [currentVal (get val w2)]
(if (nil? currentVal)
(assoc accum w1 (conj val {w2 1}))
(assoc accum w1 (conj val {w2 (inc currentVal)})))))))
{} word-pairs)))

This is a beefy function, but I couldn't see how to simplify it. It also assumes that the last word maps to the first, which is daft but then this is only playing around! The idea is to get the two sequences, shift one to the left and then count the two in a map. For example the frequency map of Caroline Kennedy would be, well you know?, something like this:

user> (build-frequency-map ["you" "know" "it's" "great" "you" "know"])
{"know" {"it's" 1, "you" 1}, "great" {"you" 1}, "it's" {"great" 1}, "you" {"know" 2}}

So once we've generated a frequency map we need to work out what the next work should be based on the probability. "know" is followed by 1 occurrence of either "it's" or "you", so each should have a 50/50 chance of getting picked.

(defn frequency-map-count [m word]
(let [v (get m word)]
(if (nil? v)
(reduce (fn [x y] (+ x (second y))) 0 v))))

(defn next-word [m word]
(let [following (get m word) p (rand-int (frequency-map-count m word))]
((fn [words prob]
(let [word-count (second (first words))]
(if (>= word-count prob)
(first (first words))
(recur (rest words) (- prob word-count))))) following p)))

frequency-map-count counts the number of occurrences of a word. next-word rolls a dice and selects the next word based on the number of occurrences.

Finally, all we need is a function to apply next-word

(defn generate-text [example n & [start]]
(let [words (file-as-wordlist example) fm (build-frequency-map words)
start-word (if (nil? start) "the" start)]
(apply str (interpose " " (take n (iterate (partial next-word fm) start-word))))))

There's several nice things here from a functional programming point of view:
* Use of partial the function of a single argument (currying)
* Use of iterate/take to have an infinite stream of text (laziness)

So is it any good? Here's 250 words Dracula

user> (generate-text "/home/jfoster/Documents/345.txt" 250)

"the same condition of those which pretend and said Finis and I agree later period He evidently the geologic and women jumped down one whom each side manifestly as to set in them he had pressed upon us? Be not in through my poor Lucy noticed it up the section their swords can be more and had expected A great brain just finished the third person or sea I do for the belt with so sweet little before he woke bright smile of it afterwards when we did anent something like a night till tonight it and on in paragraph 1 October --As I describe when Arthur I have unchecked sway a low shrill call him what will have not friend John was no opinion there would indeed happiness were parted at rest Happily Van Helsing hypnotized me But you? Whilst we may deign to be at rest seemed to the facts of a cluster and mixed up the wide spread out one by the library of whips The soft voice gave them both quiet joy when we were they come to the big swan thought struck with my eyelids and pondered over to take away I took out of the free Nay sir bethink yourself It would like it so strange tongues fierce voice came close to bear you consated his hands Our dear Lucy after you shall not look into my hands our intent I will grace said nothing The opiate of white cloud in spirit Perhaps it"

Well, it's not going to win the Booker Prize, but it's almost believable! Full code available here

Other applications of this:
  • Predictive messaging
  • Generating auto-blog posts by "rewriting" other content
  • Language recognition based on work occurrence / frequency
  • Wednesday 7 January 2009

    Regular Expressions in Clojure

    Regular Expressions are a incredibly powerful language for performing text manipulation.

    ANSI Common Lisp doesn't feature regex support, though there are a large number of libraries that do (see here). Clojure uses the Java implementation of regular expressions (see the regex tutorial) and adds some nice syntactic sugar to the mix.

    To define a regex pattern in Clojure you use the #"<regex>" syntax. This is actually the regex pattern, so evaluating it at the REPL gives you itself. The pattern is compiled, so putting anything nonsensical results in an error.

    user> #"[0-9]+"

    user> #"[]+"
    ; Evaluation aborted.
    ; clojure.lang.LispReader$ReaderException: java.util.regex.PatternSyntaxException:
    ; Unclosed character class near index 2

    The Clojure regex functions are well documented. Unfortunately, I didn't really read the documentation! re-find returns either the first match (directly as a string) OR a vector of matches if multiple matches. This is optimized for the normal case where a user enters the text and the regex is well known ahead of time e.g.

    user>(re-find #"bar" "bar")

    user>(re-find #"(foo)|(bar)" "foo bar")
    ["foo" "foo" nil]

    When learning regular expressions, I found Regex Coach invaluable (which is actually Lisp powered!). It's an application that lets you immediately see how a regular expression matches some text. Let's do the basics of this in Clojure.

    Regular Expression Viewer

    You have two areas, one for the regular expression and one for the text. As you press keys the matching regular expression (if any) is highlighted in the text area.

    Firstly we need a function that given a regular expression and some text returns where to do some highlighting:

    (defn first-match [m]
    (if (coll? m) (first m) m))

    (defn match [regex text]
    (let [m (first-match (re-find (re-pattern regex) text))]
    (if (nil? m)
    [0 0]
    (let [ind (.indexOf text m) len (.length m)]
    [ind (+ ind len)]))))

    first-match is a helper function that gives the first match (handling the case of re-find returning multiple entries). match just gives you a vector [x y] representing the index of the match.

    Next a bit of UI that features a couple of new things:

    Each time a key is pressed match returns the area to highlight and Highlighter does the rest. The exception handling ensures that if there was an error compiling the regex then that's printed on the status bar.

    (defn regexcoach []
    (let [frame (JFrame. "Regular Expression Coach") pane (JPanel.) regexText (JTextField.)
    targetText (JTextField. "")
    statusBar (JLabel. "Match from 0 to 0")
    keyHandler (proxy [KeyAdapter] []
    (keyTyped [keyEvent]
    (let [m (match (.getText regexText) (.getText targetText))
    hl (.getHighlighter targetText)
    pen (DefaultHighlighter$DefaultHighlightPainter. Color/RED)]
    (.removeAllHighlights hl)
    (.addHighlight hl (first m) (second m) pen)
    (.setText statusBar (format "Match from %s to %s" (first m) (second m))))
    (catch PatternSyntaxException e (.setText statusBar (.getMessage e))))))]
    (doto regexText
    (.addKeyListener keyHandler))
    (doto targetText
    (.addKeyListener keyHandler))
    (doto pane
    (.setLayout (BoxLayout. pane BoxLayout/Y_AXIS))
    (.add (JLabel. "Regular Expression"))
    (.add regexText)
    (.add (JLabel. "Target String"))
    (.add targetText)
    (.add statusBar))
    (doto frame
    (.add pane)
    (.setSize 300 300)
    (.setVisible true))))

    Full code here.

    Tuesday 6 January 2009

    Game of Life Part II

    A helpful comment on my previous blog entry suggested that instead of crazy zip rubbish, I could instead view the grid as a Point => Value mapping. Is that a good change to make?

    Here's the changes from the previous version. Full code on my Git repository.

    (defstruct point :x :y)

    (defn world-at [world point]
    (get world point))

    (defn toggle-pos [world point]
    (if (zero? (world-at world point))
    (assoc world point 1)
    (assoc world point 0)))

    (defn neighbours [p]
    (let [x (:x p) y (:y p)]
    [(struct point (dec x) (dec y)) (struct point x (dec y)) (struct point (inc x) (dec y))
    (struct point (dec x) y) (struct point (inc x) y)
    (struct point (dec x) (inc y)) (struct point x (inc y)) (struct point (inc x) (inc y))]))

    (defn neighbour-count [world p]
    (reduce + (map (fn [x] (let [v (world-at world x)] (if (nil? v) 0 v))) (neighbours p))))

    (defn new-state [world p]
    (let [neighbours (neighbour-count world p) alive (world-at world p)]
    (and (= alive 1) (< neighbours 2)) 0 ;; under population
    (and (= alive 1) (> neighbours 3)) 0 ;; over-crowding
    (and (= alive 1) (or (= 2 neighbours) (= 3 neighbours))) 1 ;; unchanged to the next generation
    (and (= 3 neighbours)) 1 ;; any tile with exactly 3 live neighbour cells becomes alive
    :else 0)))

    (defn life-step [w]
    (into (hash-map) (map (fn [x] [(first x) (new-state w (first x))]) w)))

    (defn create-world [w h]
    (let [x (range 0 w) y (range 0 h)]
    (apply hash-map (mapcat (fn [a] (mapcat (fn [b] (list (struct point a b) 0)) y)) x))))

    Most of the functions above are much clearer (apart from create-world) than they were previously. In addition the SLOC has decreased from 74 to 66, so the code is more concise too.

    I did sacrifice sparseness for neatness. I needed to have the values populated as dead such that life-step could just be written as a map function. If the values didn't exist, I'd have to create something from nothing. In this case, I think the trade off is OK.

    Overall, I think this is a definite improvement over the previous version.

    Monday 5 January 2009

    The Game of Life in Clojure.

    Game of Life in Clojure

    Conway's Game of Life is by far the most famous example of cellular automaton. It's not really a game, you set an initial state and then watch it evolve.

    The "game" is played on a grid with each cell having two states (alive or dead). The next state can be computed from the previous one using the following rules:

    1. Any live cell with fewer than two live neighbours dies, as if by needs caused by under-population.
    2. Any live cell with more than three live neighbours dies, as if by overcrowding.
    3. Any live cell with two or three live neighbours lives, unchanged, to the next generation.
    4. Any tile with exactly three live neighbours cells will be populated with a living cell.

    The first thing we need to do is determine a representation for a grid. I've gone for a simple approach of representing a grid as a list of lists. create-world creates a blank grid

    (defn create-world [w h]
    (replicate h (replicate w 0)))

    Now we need some helper functions for getting the state of an individual element and toggling values.

    (defn world-at [world x y]
    (if (and (>= x 0) (>= y 0) (< x (count world)) (< y (count (first world))))
    (nth (nth world x) y)

    (defn toggle [x]
    (if (= x 0) 1 0))

    (defn toggle-row-at [row pos]
    (map (fn [x] (if (= pos (first x)) (toggle (second x)) (second x))) (zipmap (range 0 (count row)) row)))

    (defn toggle-pos [world x y]
    (map (fn [v] (if (= (first v) x)
    (toggle-row-at (second v) y)
    (second v)))
    (zipmap (range 0 (count world)) world)))

    toggle-row-at and toggle-pos both feel wrong to me - I'm not sure of a better way to carry position information into a map function (for example, I want to perform a map operation with positional information - zip is the only way I can see of doing that - wonder if there is a better way?).

    Armed with functions to explore the world we need to write the logic to transition from one-state to the next.

    (defn neighbour-count [world x y]
    (+ (world-at world (dec x) (dec y)) (world-at world x (dec y)) (world-at world (inc x) (dec y))
    (world-at world (dec x) y) (world-at world (inc x) y)
    (world-at world (dec x) (inc y)) (world-at world x (inc y)) (world-at world (inc x) (inc y))))

    (defn new-state [world x y]
    (let [neighbours (neighbour-count world x y) alive (world-at world x y)]
    (and (= alive 1) (< neighbours 2)) 0 ;; under population
    (and (= alive 1) (> neighbours 3)) 0 ;; over-crowding
    (and (= alive 1) (or (= 2 neighbours) (= 3 neighbours))) 1 ;; unchanged to the next generation
    (and (= 3 neighbours)) 1 ;; any tile with exactly 3 live neighbour cells becomes alive
    :else 0)))

    new-state encodes the rules, and neighbour-count just does exactly what it says on the tin. Now all we need to do is apply new-state to each cell in the grid and produce the next one. This is suprisingly simple:

    (defn life-step [w]
    (let [width (count w) height (count (first w))]
    (fn [row] (map (fn [col]
    (let [x (first row) y (first col)]
    (new-state w x y)))
    (zipmap (range 0 height) (second row))))
    (zipmap (range 0 width) w))))

    It again makes use of the zipmap idiom for applying mapping function with positional information. I wish I could find a reference which states whether this is a good thing or a bad thing :)

    In order to implement the GUI around this, we define *world* as an atom, and provide a mouse listener which on a left-click toggles elements, and on a right-click moves forward one step. Animating it would be fairly simple too (just hook up a Swing timer a la bubble sort).

    Full code is availabe on my Git Repository

    Sunday 4 January 2009

    Ray Tracing in Clojure (Part II)

    In my previous post, I ported across a ray-tracing example from ANSI Common Lisp. Now, I'll try and work out how threading functions work.

    Each area of the viewing image is independent, so let's divide up the viewing surface into a number of independent buffered images. We have a little helper function which given a width, height and tile size gives us back a list of positions to draw at.

    (defn create-work-list [width height unitX unitY]
    (let [xs (range 0 width unitX) ys (range 0 height unitY)]
    (mapcat (fn [x] (mapcat (fn [y] (list (list x y))) ys)) xs)))

    For example, if we want to draw use a tile size of 150x150 with a viewing surface of 300x300 we get the following offsets:

    user> (create-work-list 300 300 150 150)
    ((0 0) (0 150) (150 0) (150 150))

    Now we should make the ray-trace function take notice of these co-ordinates. The previous version of the ray-trace function wasn't very functional as it performed IO (drawing to the canvas). Side-effects are the enemy of referential transparency.

    (defn ray-trace [world w h ox oy]
    (let [buffered-image (BufferedImage. w h BufferedImage/TYPE_BYTE_GRAY)]
    (doseq [x (range 0 (dec w))]
    (doseq [y (range 0 (dec h))]
    (.setRGB buffered-image x y (color-at (+ x ox) (+ y oy)))))

    Now the function is pure because it always gives the same output for the same input and no IO occurs.

    Finally, we need to distribute this work across all the cores. Clojure has a built in function, pmap that works the same as map but in parallel.

    Initially I performed the drawing to the canvas within a pmap expression - this is very wrong! pmap is lazy, it's not evaluated unless it is needed. Clojure doesn't know that I intended that to always been evaluated, so it was only ever drawing the first four tiles (presumably because that's how many cores my desktop has).

    If you want to force the evaluation there's a number of functions which do that:
    • doseq
    • dorun
    • doall
    In this case I decided I'd use ray-trace in a pmap expression to produce a list of images, and then use doseq to perform the side effects.
    (def canvas (proxy [JPanel] []
    (paintComponent [g]
    (proxy-super paintComponent g)    
    (.setColor g Color/RED)
    (let [width (.getWidth this) height (.getHeight this) unitX (/ width 16) unitY (/ height 16) work-list (create-work-list width height unitX unitY)]
    (doseq [image (pmap (fn [pos] (list (apply ray-trace (list world unitX unitY (first pos) (second pos))) (first pos) (second pos))) work-list)]
    (.drawImage g (first image) (second image) (nth image 2) unitX unitY nil))))))
    The separation of IO and pure functions is something advocated in "A wish list for the next mainstream programming language". Clojure doesn't force this, whereas something like Haskell does. Haskell uses monads to achieve this, which is something I'll visit at some point. See LtU for some explanations. Ok, enough theory - what difference does this actually make? Well for me, about a 4x difference, and I have 4 cores, so that's good! Timing's aren't that exciting, but you can see the difference with the system monitor. Ray Tracing Performance Picture
    Pretty much full CPU utilization across all cores, with only a few lines of code changed, funky!

    Saturday 3 January 2009

    Ray Tracing in Clojure

    ANSI Common Lisp, by Paul Graham, is a great introduction to Lisp. Lots of compelling examples on run-length encoding (see previous post), poetry generation and ray-tracing.

    This post looks at translating the example (from Chapter 9) into Clojure.

    Ray-tracing is a very simple technique. From ACL:
    "To generate a 3D image, we need to define at least four things: an eye, one or more light sources, a simulated world consisting of one or more surfaces, and a plane (the image plane) that serves as a window onto this world. The image we generate is the projection of the world onto a region of the image plane."

    So how do we generate the pictures? That's pretty simple too, all we do is for every pixel in the image plane just trace the ray that the eye would see from there. Done.

    We start off by defining some simple maths functions and an object to represent a point in 3D space.

    (defn square [x] (* x x))
    (defstruct point :x :y :z)
    (defn magnitude [p]
    (Math/sqrt (+ (square (:x p)) (square (:y p)) (square (:z p)))))
    (defn unit-vector [p]
    (let [d (magnitude p)]
    (struct point (/ (:x p) d) (/ (:y p) d) (/ (:z p) d))))
    (defn point-subtract [p1 p2]
    (struct point 
    (- (:x p1) (:x p2))
    (- (:y p1) (:y p2))
    (- (:z p1) (:z p2))))
    (defn distance [p1 p2]
    (magnitude (point-subtract p1 p2)))
    (defn minroot [a b c]
    (if (zero? a)
    (/ (- c) b)
    (let [disc (- (square b) (* 4 a c))]
    (if (> disc 0)
    (let [discroot (Math/sqrt disc)]
    (min (/ (+ (- b) discroot) (* 2 a))
    (/ (- (- b) discroot) (* 2 a))))))))

    The original Lisp code mixed the point structure with individual values. I felt this made the code a bit ugly and hard to read, so in here we try to use the point structure as much as possible. (struct point 1 2 3) feels like clunky syntax, but I was unable to find anything better. Perhaps an alternative is to just use a plain vector / map? Or wait for the future and see if struct support improves?

    Anyway, the code above is self explanatory, minroot is the big one and that's just a solver for the quadratic equation. function.

    Next we need to define some of the environment. For this we'll fix the image plan between (0,0) and (300,300) and we'll just render spheres. Each surface has a grey-scale colour associated with it (a surface).

    (def eye (struct point 150 150 200))
    (defstruct surface :color)
    (defstruct sphere :color :radius :centre)
    (defn defsphere [point r c]
    (struct sphere c r point))
    (def world [(defsphere (struct point 150 150 -600) 250 0.32)
    (defsphere (struct point 175 175 -300) 100 0.64)])

    One thing I did find was that Clojure doesn't support the :include for structs that Common Lisp does. For this example, the world is a couple of spheres one smaller than the other and in front (and slightly brighter).

    The following functions determine where a sphere gets hit with a ray from a specific source (in this case a point) and the surface normal of a hit.

    (defn sphere-normal [s pt]
    (let [c (:centre s)]
    (unit-vector (point-subtract c pt))))
    (defn sphere-intersect [s pt ray]
    (let [c (:centre s)
    n (minroot (+ (square (:x ray)) (square (:y ray)) (square (:z ray)))
    (* 2 (+ (* (- (:x pt) (:x c)) (:x ray))
    (* (- (:y pt) (:y c)) (:y ray))
    (* (- (:z pt) (:z c)) (:z ray))))
    (+ (square (- (:x pt) (:x c)))
    (square (- (:y pt) (:y c)))
    (square (- (:z pt) (:z c)))
    (- (square (:radius s)))))]
    (if n
    (struct point (+ (:x pt) (* n (:x ray)))
    (+ (:y pt) (* n (:y ray)))
    (+ (:z pt) (* n (:z ray)))))))

    sphere-intersect can return nil if it doesn't hit. Now we define the Lambert function

    (defn lambert [s intersection ray]
    (let [normal (sphere-normal s intersection)]
    (max 0 (+ (* (:x ray) (:x normal))
    (* (:y ray) (:y normal))
    (* (:z ray) (:z normal))))))

    That's it for the machinery to actually generate the image - now we need some UI and something to actually draw it. The original code generated a PPM format image, but since Clojure has a decent UI toolkit with Swing, let's just render something in a Window instead. The UI just uses the "canvas" idiom I used for the bubble sort application.

    (def canvas (proxy [JPanel] []
    (paintComponent [g]
    (proxy-super paintComponent g)    
    (.setColor g Color/RED)
    (ray-trace world 1 g (.getWidth this) (.getHeight this)))))
    (defn raytraceapp []
    (let [frame (JFrame. "Ray Tracing")]
    (doto frame
    (.add canvas)
    (.setSize 300 300)
    (.setResizable false)
    (.setVisible true))))

    All that remains is to define the ray-trace function

    ;; second item = what we hit
    ;; first item = where we hit
    (defn first-hit [pt ray]
    (fn [x y]
    (let [hx (first x) hy (first y)]
    (nil? hx) y
    (nil? hy) x
    :else (let [d1 (distance hx pt) d2 (distance hy pt)]
    (if (< d1 d2) x y)))))
    (map (fn [obj]
    (let [h (sphere-intersect obj pt ray)]
    (if (not (nil? h))
    [h obj]))) world)))
    (defn send-ray [src ray]
    (let [hit (first-hit src ray)]
    (if (not (nil? hit))
    (let [int (first hit) s (second hit)]
    (* (lambert s ray int) (:color s)))
    (defn color-at [x y]
    (let [ray (unit-vector (point-subtract (struct point x y 0) eye))]
    (* (send-ray eye ray) 255)))
    (defn ray-trace [world res g w h]
    (let [buffered-image (BufferedImage. w h BufferedImage/TYPE_BYTE_GRAY)]
    (doseq [x (range 1 w)]
    (doseq [y (range 1 h)]
    (.setRGB buffered-image x y (color-at x y))))
    (.drawImage g buffered-image 0 0 Color/RED nil)))
    The only major difference between this and the ACL code, is prefering to use map and reduce instead of the nested do code. This feels more functional to me and also opens up parallelism opportunities which I'll look at for the next post. So what does it look like (not very good, but a background of code looks cool!)? Ray Trace application example picture

    Thursday 1 January 2009

    Functional Programming Patterns.

    Roundabout presents a set of patterns for writing recursive programs.

    • Structural Recursion - use recursion to traverse inductively defined data structures (e.g. list ortree)

    • Interface Procedure - use a helper method to adapt a function call if you need an extra argument

    • Mutual Recursion - use multiple recursive functions to traverse inductively defined data structures where elements in the data structure are also inductively defined

    • Accumulator Variable - use an additional parameter to carry calculations through recursion

    • Syntax Procedure - extract methods that mean something rather than accessing raw members of data structures (e.g. for trees written as lists use (lhs foo) instead of (first foo)

    • Local Procedure - use anonymous methods to hide details from the outside world

    • Program Derivation - inline code if you want more performance

    The relationship between patterns and refactoring has already been written about ("Refactoring to Patterns"). In this case the relationships are quite clear.

    Not all of the patterns are directly applicable to all languages. For example, Program Derivation is unwarranted for Lisp type languages because macros provide all those capabilities. I guess that's another one for the design patterns are a weakness of the language meme.