Saturday, 26 September 2009

Building Basic Types in Haskell

Types are the building blocks of Haskell programs. Algebraic Data Types (not to be confused with Abstract Data types) are one category of types. An algebraic data type is defined as either an enumeration (a set of distinct types) or a discriminated union (a choice between multiple alternatives). For example:

-- An enumeration type
data PrimaryColor = Red
| Green
| Blue

*Main> :type Red
Red :: PrimaryColor

-- A discrimated union type
data MyIntList = Empty
| List Int MyIntList
deriving Show

*Main> :t Empty
Empty :: MyIntList

*Main> :t List 3 Empty
List 3 Empty :: MyIntList

Note that MyIntList is a recursively defined data type because it uses itself in the definition. This type of structural type recursion is just the same as calling a function recursively. As long as we have a base case (in this case Empty then we can easily build the structure. The way an instance of a type was constructed is available at runtime. Pattern matching is a way of breaking down a value into constiuent parts. For example, we could (but should not!) write a function to calculate the length of the MyIntList type as follows:

myIntListLength :: MyIntList -> Integer
myIntListLength Empty = 0
myIntListLength (List _ rest) = 1 + myIntListLength rest

There's a good talk about the use of OCaml for financial trading systems available here. Amongst other things the strengths of the pattern matching idea are discussed. In a large system you can change a type, recompile and find out any redundant, missed or impossible cases. In OCaml this seems to be on by default, with GHC you need to use the "-fwarn-incomplete-patterns" option.

Record syntax is an alternative way of specifying a type such that parameters can be referred to be name.

data Person = Person {
firstName :: String
, lastName :: String
, title :: String
} deriving (Show)

-- Normal way of creating a Person
*Main> Person "Bill" "Bob" "Mr"
Person {firstName = "Bill", lastName = "Bob", title = "Mr"}

-- Record syntax allows you to specify parameters by name
-- and therefore order does not matter
*Main> Person { title="Mr", firstName="bill", lastName="bob" }
Person {firstName = "bill", lastName = "bob", title = "Mr"}

The MyIntList type is very poor. It's only defined for lists of type integer. We can improve this by making it a parametrized type. In the example below we defined MyList for any type a.

data MyList a = Empty
| MyList a (MyList a)
deriving (Show)

-- MyList of numbers
*Main> :t MyList 3 (MyList 4 Empty)
MyList 3 (MyList 4 Empty) :: (Num t) => MyList t

-- MyList of characters
*Main> :t MyList 'a' (MyList 'b' Empty)
MyList 'a' (MyList 'b' Empty) :: MyList Char

You can have multiple parametrized types. For example, if we wanted to create a list that alternated types, we could:

data MyOddList a b = OddEmpty
| MyOddList a (MyOddList b a)
deriving (Show)

*Main> :t MyOddList 3 (MyOddList 'c' (MyOddList 4 OddEmpty))
MyOddList 3 (MyOddList 'c' (MyOddList 4 OddEmpty)) :: (Num t) => MyOddList t Char