Saturday, 4 April 2009

Twitter using Scala

Twitter was previously using Ruby for large parts of the application, but has recently introduced some parts of Scala.

An interview with some of the developers over at Artima discusses some of the reasoning there.

It's an interesting decision, seemingly motivated by the need for performance and static typing. Let's look at the static typing reasoning:

As our system has grown, a lot of the logic in our Ruby system sort of replicates a type system, either in our unit tests or as validations on models. I think it may just be a property of large systems in dynamic languages, that eventually you end up rewriting your own type system, and you sort of do it badly. You’re checking for null values all over the place. There’s lots of calls to Ruby’s kind_of? method, which asks, “Is this a kind of User object? Because that’s what we’re expecting. If we don’t get that, this is going to explode.”

I'm not sure I agree with this reasoning of the Twitter guys description. Firstly, Ruby code with a bunch of kind_of? littered everywhere is almost certainly bad. Perhaps this is because as the system has evolved the initial design decisions are not quite right, and the system hasn't evolved and accrued too much technical debt in the process. The benefits of hindsight mean that you could probably rewrite bits in any language and see improvements.

Secondly, what does replicating a type-system mean? Before we can replicate a type system, we need to know what one is:

[A type system is a] tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute. (Pierce 2002).

This sounds an awful lot like how I define a test; I'm looking to test that certain behaviours hold true under whatever conditions I write my test for.

Has using Scala static-typing meant that you can write any less tests? The flavour of errors that static typing can detect is pretty large but it's not exhaustive, you've still got to write tests to show that some behaviours work as expected. I can't find any information on this, but this would be really good information to get!

The other argument seems to be performance. Scala (and Clojure) leverage the JVM so have the advantage of 10 years+ of development on that. There's a version of Ruby for the JVM (JRuby), but that was rejected because some third-party Ruby libraries depended on native C libraries that aren't compatible.

I'm not sure where I sit yet on the dynamic / static typing debate, and to be honest I'm not sure it's that interesting given the tests as types argument (unless the amount of tests decreases significantly if you use types). I don't think the Artima article tells us any of the real answers of whether one is better than the other for Twitter. As I mentioned before, rewriting it in any language is probably going to result in cleaner, clearer code just because of the accumulated knowledge of the past one.