Monday, 23 February 2009

Bit Fields using Clojure

A bit array is a way of getting a very compact array of Boolean values with each value being represented as a single bit. Bit arrays are usually associated with low(er)-level like C, but you can do them in Clojure too.

Clojure provides array functions through aget and aset (slightly strange in that it mutates a data structure in place). Using Java arrays isn't very functional, but in my opinion this is about providing the balance between strictness (e.g. Haskell) and leniency (e.g. C).

We define a bit-array as a structure consisting of some array data and the width of each field.

(defstruct bit-field :element-width :array-data)

(defn bit-array
(struct bit-field 31 (int-array (inc (int (/ n 31))))))

Where 31 is the range of an integer in Java (it doesn't support unsigned) (yeah, my thinking is fuzzy here, but I think that's right... Certainly 32 fails all my tests.).

A bit-array only consists of a couple of operations, get-bit and set-bit!. The bang (!) notation at the end of the set-bit function name is an informal way of indicating that the function mutates its data.

(defn set-bit!
[bitfield bit val]
(let [r (mod bit (bitfield :element-width))
n (int (/ bit (bitfield :element-width)))
x (aget (bitfield :array-data) n)]
(if (not (zero? val))
(aset (bitfield :array-data) n (bit-or x (bit-shift-left 1 r)))
(aset (bitfield :array-data) n (bit-xor x (bit-shift-left 1 r))))

(defn get-bit
[bitfield bit]
(let [r (mod bit (bitfield :element-width))
x (aget (bitfield :array-data) (int (/ bit (bitfield :element-width))))]
(if (= 0 (bit-and x (bit-shift-left 1 r))) 0 1)))

We work out the index in the array to change by dividing the bit to set by the width of each element and use bit-shift-left to identify the bit in question to twiddle.

How can we be sure this works? Well, we can never be 100% sure without a proof, but we can at least run it against a reasonable set of data! Clojure Contrib has a very simple library for testing. You define tests using the deftest macro, each test consists of a number of is functions that verify assertions.

(deftest test-bits
(let [n 3000
f (bit-array n)]
(is (= 31 (f :element-width)))
(doseq [x (range 0 n)]
(is (= 0 (get-bit f x))))
(doseq [x (range 0 n)]
(set-bit! f x 1)
(is (= 1 (get-bit f x))))
(doseq [x (range 0 n)]
(set-bit! f x 0)

Tests are run using run-tests. For example:

bitarray> (run-tests 'bitarray)

Testing bitarray

Ran 1 tests containing 9001 assertions.
0 failures, 0 errors.

So it passes the tests therefore it at least vaguely works!