Monday, 9 November 2015

The (very) Basics of R with the Game of Life

R is a programming language for statistical computing and graphics.  It's also the language of choice amongst pirates.  Arrr! R is increasingly important for big data analysis, and both Oracle and Microsoft have recently announced support for database analytics using R.

So, how do you get started with R?  Well, for the rest of this I'm going to assume that you already know how to program in a { } language like Java / C# and I'm going to cover the minimum amount possible to do something vaguely useful. The first step is to download the environment.  You can get this from here.  Once you've got something downloaded and installed you should be able to bring up a terminal and start R.  I really like the built in demos.  Bring up a list of them with demo() and type demo(graphics) to get an idea of the capabilities of R.

These are the boring syntax bits:
  • R is a case sensitive language
  • Comments start with # and run to the end of the line
  • Functions are called with parentheses e.g. f(x,y,z)
The "standard library" of R is called the R Base Package. When you bring up R, you bring up a workspace.  A workspace is just what is in scope as any one time.  You can examine the workspace by calling the ls function.

    # Initially my workspace is empty
    > ls()
    character(0)

    # Now I set a value and lo-and-behold, it's in my workspace
    > x = "banana"
    > ls()
    [1] "x"

    # I can save my workspace with
    > save(file="~/foo.RData");

    # I can load my workspace with
    > load("~/foo.RData");

    # I can remove elements from the workspace with rm
    > rm(x)
    > ls()
    character(0)

    # I can nuke my workspace with rm(list=ls())
    > x = 'banana'
    > rm(list=ls())
    > ls()
    character(0)   

We've seen above that R supports string data, but it also supports vectors, lists, arrays, matrices, tables and data frames. To define a vector you use the c function.  For example:


    > x = c(1,2,3,4,5)
    [1] 1 2 3 4 5

    > length(x)
    [1] 5

Remember everything in a vector must be of the same type.  Elements are co-erced to the same type, so c(1,'1',TRUE) results in a vector of string types.  Indexing into vectors starts at 1 (not zero).  You can use Python style list selection:

    > x = c(1,2,3,4,5,6,7,8,9,10)
    >  x[7:10] # select 7 thru 10
    [1]  7  8  9 10
    > x[-(1:3)] # - does exclusion
    [1]  4  5  6  7  8  9 10

To define a list, you use, ahem, list.  Items in list are named components (see the rules of variable naming).

    > y = list(name="Fred", lastname="Bloggs", age=21)
    > y
    $name
    [1] "Fred"
    $lastname
    [1] "Bloggs"
    $age
    [1] 21
    > y$name # access the name property
    [1] "Fred"

Finally, let's look at matrices.  You construct them with matrix and pass in a vector to construct from, together with the size.

  > m = matrix( c(1,2,3,4,5,6,7,8,9), nrow=3, ncol=3)
  > m
           [,1] [,2] [,3]
    [1,]    1    4    7
    [2,]    2    5    8
    [3,]    3    6    9

OK, that should be enough boring information out the way to let me write a function for the Game of Life.  All I want to do is take a matrix in, apply the update rules, and return a new one.  How hard can that be? You write R files with the extension ".R" and bring them into your workspace with the source function. Here's an embarrassingly poor go at the Game of Life (note I've only spent 5 minutes with the language, so if you've got any improvements to suggest or more idiomatic ways of doing the same thing, they are greatly received!).



Testing this at the REPL with a simple pattern.

    > blinker = matrix(0,5,5)
    > blinker[3,2:4]=1
    > blinker
         [,1] [,2] [,3] [,4] [,5]
    [1,]    0    0    0    0    0
    [2,]    0    0    0    0    0
    [3,]    0    1    1    1    0
    [4,]    0    0    0    0    0
    [5,]    0    0    0    0    0
    > nextStep(blinker)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    0    0    0    0    0
    [2,]    0    0    1    0    0
    [3,]    0    0    1    0    0
    [4,]    0    0    1    0    0
    [5,]    0    0    0    0    0

Huzzah! Next steps are probably to write some unit tests [PDF] around it, but learning how to install packages can wait till another day!