Friday, 27 August 2010

Ants and Haskell

Software Transactional Memory (STM) is a concurrency control mechanism designed to simplify programming for shared memory computers. Beautiful Code contains a great introduction to the concepts of STM in the Haskell language

In most languages, locks and condition variables are the main mechanisms for controlling access, but these are notoriously hard to get right. Java Concurrency in Practice is a good read to understand just how many ways there are to shoot yourself in the foot (too few locks, too many locks, wrong locks, race conditions, wrong order, error conditions, deadlock, livelock, live stock, brain explosion). STM simplifies shared memory programming by providing database like semantics for changing memory. Reads/Writes to shared memory happen within a transaction - each memory access appears to happens in isolation from the others and appears atomically to observers. If a transaction conflict with another, then one of the transactions is retried. An implementation typically records the memory accesses somehow and then can decide whether there was a conflict. Languages that restrict mutability (like Clojure and Haskell) have a significantly simpler implementation than imperative languages such as C/C++.

Composability is another advantage for STM. For example, take java.util.Hashtable - what if you want to do an insert/delete as a single atomic operation and only make the contents visible to other threads once finished? As the original design didn't do this, you're on your own. In contrast STM composes well.

Both Clojure and Haskell feature support for STM. The canonical example in Clojure is Ants.clj that demonstrates STM via a simple simulation of foraging ants (see also Flocking about with Clojure). As a learning exercise I thought it'd be neat to try to convert this over to Haskell!

Ants in Haskell

To model the ants world, I use the following data structures. Transactional variables (TVars) are used to hold a reference to a mutable variable. For the Ants simulation, I use a Vector of TCell's to represent the ants world.

TVars can only be modified within the STM context. The key thing is that the only way to mutate transactional variables is from within the STM monad. To fiddle with the variables within TVar you can use newTVar, readTVar and writeTVar. Oddly there didn't seem to be a primitive operation to update a TVar based on its current value. updateTVar updates the TVar by applying a function to the value inside.

check verifies that a condition is true and if it isn't true then the transaction is retried. The key point is that the transaction is only retried when there's a reason to do so (e.g. memory read/write) so you aren't just heating the CPU whilst the condition is being validated. As an example, when we move an ant forward, we want to check that there is not an ant in the way. If there is an ant in the way, we'll wait till the coast is clear before moving.

At some point you have to run your STM actions. atomically runs the STM transactions from the IO monad (e.g. the top-level) and returns the result. A very important point is that you want your actions to be in the STM monad as much as possible. If all of your functions are run within the IO monad then you lose the composability aspect. The pattern to use is make everything use the STM monad and glue together randomly and you won't have a threading problem (you still have other problems though, it's not magic).

The Clojure code used agents to represent each ant. I'm not sure what the most idiomatic translation to Haskell is, but I spawned a thread for each ant using Control.Concurrent and forkIO. Haskell threads are incredibly light-weight so spawning even thousands of them is not a problem. Each ant thread simply evaluates the behaviour, moves, sleeps and repeats.

The rest of the code is more or less a direction translation from the Clojure. It's pretty verbose so I won't bother posting it here, but the full code is on my github page. You should be able to compile it with ghc -lglut --make -main-is AntsVis AntsVis.hs. Any hints on how to make it suck less appreciated!

Performance seems very good with the default compiler options, I'm able to run with 100+ ant agents all running concurrently.  The programming is very simple and once I'd found out and added the appropriate check logic into move everything worked properly.

Hurrah for simple concurrent programming!

(update 30/8/2010 - after finding out the performance sucked after a while with a few more ants than I'd tested with I looked at speeding up the Ants program).