Thursday 13 December 2012

Technetium scans with mock objects

I've been doing a lot of work recently with legacy code.  I like Michael Feathers' definition of legacy code, namely that it's code that isn't under test.  This particular code goes further still: it isn't under test and it has incomprehensible dependencies.  I'm looking at something at the moment that takes 26 parameters as input (and has a million and one hidden dependencies).

With code like this, reasoning about it is really hard.  The particular area I struggle with is capturing all of the second-order dependencies: assumptions about the dependencies that aren't immediately apparent.  For example, if the Law of Demeter is violated (a.b.c), then I need to capture somehow the knowledge that a must return a non-null b. Once I've got this information, it's a lot easier to act upon it (perhaps I just introduce c on the interface of a, and then I've broken a dependency relationship).

In medicine, a technetium scan is a technique where a radioactive isotope is injected into a patient, and this tracer allows you to visualize what's happening internally. We can use mock objects to simulate this technique with code. A mock object provides a simulated object that you can inspect to see how it is used. It is the radioactive isotope object, injected in and observed.


I've found this super useful for capturing hidden behaviour in a system. As a simple example, let's take a hideous class like this.

class HorribleMess {
    private readonly Foo foo;
    private readonly Bar bar;
    private readonly Baz baz;

    HorribleMess(Foo foo, Bar bar, Baz baz) {
      this.foo = foo; this.bar = bar; this.baz = baz;
    }

    void DoHorribleThings() {
      var xyz = foo.getXYZ();
      bar.process(xyz.abc(), xyz.def());
    }
  }

There's a slightly hidden dependency here. HorribleMess depends not only on Foo, but on the type returned by Foo.getXYZ(). Not only that, but it also depends on the results of methods invoked on those objects. In this example, this is dead simple to see, but in a real legacy code base finding this information is a real challenge.

This is where I think mock objects can help. The approach I've found useful is to create a unit test class and just try to instantiate the object, passing in mocks wherever possible. If you can construct the object straight away, great: there are no hidden constructor dependencies, and you can start to explore how the methods work. By using strict mocks you force yourself to spell out the dependencies in the test (the test will fail unless you explicitly set up the response of each mock object).

The tests I end up with become a piece of documentation about the dependencies of the class. This is similar in spirit to the effect sketching advocated in Working Effectively with Legacy Code, but with strict mocks it has the advantage of being more difficult to make a mistake. It also serves as a living record of the hidden dependencies of a particular class.

  class HorribleMessTest {

    // 1st order dependencies
    MockFoo mockFoo;
    MockBar mockBar;
    MockBaz mockBaz;

    // 2nd order dependencies
    MockXyz mockXyz;

    void Test() {
      // 1st order
      mockFoo.When(mockFoo.getXYZ()).Return(mockXyz);

      // 2nd order
      // set up abc() and def() on mockXyz

      new HorribleMess(mockFoo, mockBar, mockBaz);
    }
  }
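
To make that concrete with a real mocking framework, here's roughly what the same test might look like using Moq's strict mocks and NUnit. This is just a sketch: it assumes the dependencies have been extracted behind IFoo, IBar, IBaz and IXyz interfaces (and that HorribleMess has been changed to accept them), so all the names here are illustrative.

  using Moq;
  using NUnit.Framework;

  [TestFixture]
  class HorribleMessMoqTest {

    [Test]
    public void SpellsOutTheHiddenDependencies() {
      // Strict mocks: any call that hasn't been explicitly set up will throw,
      // so every dependency the class touches has to be documented here.
      var mockFoo = new Mock<IFoo>(MockBehavior.Strict);
      var mockBar = new Mock<IBar>(MockBehavior.Strict);
      var mockBaz = new Mock<IBaz>(MockBehavior.Strict);

      // 2nd order dependency: the object handed back by getXYZ()
      var mockXyz = new Mock<IXyz>(MockBehavior.Strict);
      mockFoo.Setup(f => f.getXYZ()).Returns(mockXyz.Object);
      mockXyz.Setup(x => x.abc()).Returns(1);
      mockXyz.Setup(x => x.def()).Returns(2);
      mockBar.Setup(b => b.process(1, 2));

      var mess = new HorribleMess(mockFoo.Object, mockBar.Object, mockBaz.Object);
      mess.DoHorribleThings();
    }
  }

The nice thing is that the setups read as a list of everything HorribleMess quietly assumes about its collaborators.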

In legacy code, I often find that there are three or more levels of dependencies. Once these are explicitly spelt out you have a trace of a particular execution path through the code, and you can start to feel slightly more confident about changing it.

The spelt-out dependencies often immediately suggest the refactoring needed to make the solution cleaner. I've found Remove Middle Man to be a great first step in eliminating the multiple levels of dependencies.
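
In the HorribleMess example that might mean cutting Foo out of the picture entirely and handing the class the XYZ it actually uses. A rough sketch (again assuming the hypothetical IXyz and IBar interfaces from the test above):

  class LessHorribleMess {
    private readonly IXyz xyz;
    private readonly IBar bar;

    LessHorribleMess(IXyz xyz, IBar bar) {
      this.xyz = xyz;
      this.bar = bar;
    }

    void DoHorribleThings() {
      // No more foo.getXYZ() navigation: the 2nd order dependency is now 1st order.
      bar.process(xyz.abc(), xyz.def());
    }
  }

The dependency list shrinks, and the strict-mock test gets correspondingly simpler.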

I do have some concerns about whether this will continue to be a useful technique in the future. Perhaps it will create too much baggage in the code base (in terms of tests needing to be kept up to date). Time will tell!

Sunday 9 December 2012

Code Retreat Cambridge

Yesterday I had the pleasure of attending a Global Day of Code Retreat event held at Red Gate in Cambridge, hosted by the local software craftsmanship group.

The format of the day is to solve the same problem a number of times, with different constraints each time. Each attempt is done in a pair and generally uses test-driven development. At the end of each session the code is thrown away. Although this might seem a little strange, it means you can't form an attachment to the code and can more readily embrace the given constraints.

The problem used this time was Conway's Game of Life. I think this is a great choice of problem: it's simple enough to be easily understood, but there's a really rich variety of possible approaches to solving it.

I alternated the sessions between C# and Haskell. I found the value of pairing directly related to the asymmetry in ability; the more asymmetry the better! Showing Haskell to people unfamiliar with it was a rewarding experience. People seemed to grok it very quickly, and were impressed with the readability of the solution.

As a side note, I was really impressed with Mono.  I needed to get a C# development environment up and running on my Linux laptop, and I was slightly dreading doing so.  Turned out the dread was completely unfounded: it just worked.  The IDE lacks a few niceties (it's certainly not as polished as Visual Studio + ReSharper), but it's not bad!

On the C# side, I learnt more about design than anything else. In particular, one thing that stuck out for me was the difference between the top-down and bottom-up approaches. Sometimes when I paired, my partner would aim to get a top-level test working first (evolving a grid produces a new grid, for example). I found this really hard to drive from a TDD point of view - breaking down from the top is much harder for me. The other way up, writing the rules first (a dead cell with exactly three live neighbours becomes alive, for example), feels much more natural and is certainly a lot easier to drive with tests. There did seem to be a middle ground that was more fun to work through: get the structure right with a top-level test, and then drive the functionality bottom-up. This seems to be the same approach as advocated in Growing Object-Oriented Software, Guided by Tests, and I look forward to trying it out in anger.
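
To give a flavour of the bottom-up style, a rule-level test might look something like this. It's only a sketch - the Cell type and its factory methods are made up, not the code from any particular session:

  using NUnit.Framework;

  [TestFixture]
  class RuleTests {

    [Test]
    public void DeadCellWithExactlyThreeLiveNeighboursBecomesAlive() {
      // Cell is a hypothetical immutable cell type with Dead()/Alive() factories
      var next = Cell.Dead().Next(liveNeighbours: 3);
      Assert.IsTrue(next.IsAlive);
    }

    [Test]
    public void LiveCellWithFewerThanTwoLiveNeighboursDies() {
      var next = Cell.Alive().Next(liveNeighbours: 1);
      Assert.IsFalse(next.IsAlive);
    }
  }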

I didn't pair with anyone who even considered mutable objects.  I'm not sure whether this was just a self-selecting group of people, but everyone was firmly in the "mutable state is bad" camp.  I would really like to run a session with a set of constraints to do with high performance to see how that altered people's perceptions.

The constraints that were chosen were interesting:
  • No primitives (including enums)
  • No if statements
  • No mouse! (use the IDE)
  • No long methods

The "no mouse" constraint wasn't very interesting for me; I've always been a big fan of learning keyboard shortcuts (bashing keys makes it look like you know what you are doing, so I've found it a good way to hide my incompetence sometimes!).

The most interesting constraint for me was "no if statements" (see the anti-if campaign for some justification). The lack of if statements forced you into building object hierarchies and using polymorphism.  It was a little artificial for this problem, but it did result in cleaner code (just maybe a bit more of it!).
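
To give a flavour of where that leads, here's a sketch (not the code from the day): the next-generation rule is dispatched polymorphically, and the neighbour-count check becomes a table lookup rather than a conditional. It cheerfully ignores the no-primitives constraint for brevity.

  using System;

  interface ICell {
    ICell Next(int liveNeighbours);
  }

  class LiveCell : ICell {
    // Indexed by live neighbour count (0..8): a live cell survives only on 2 or 3.
    static readonly Func<ICell>[] next = { Dead, Dead, Live, Live, Dead, Dead, Dead, Dead, Dead };

    public ICell Next(int liveNeighbours) { return next[liveNeighbours](); }

    static ICell Live() { return new LiveCell(); }
    static ICell Dead() { return new DeadCell(); }
  }

  class DeadCell : ICell {
    // A dead cell comes to life only with exactly three live neighbours.
    static readonly Func<ICell>[] next = { Dead, Dead, Dead, Live, Dead, Dead, Dead, Dead, Dead };

    public ICell Next(int liveNeighbours) { return next[liveNeighbours](); }

    static ICell Live() { return new LiveCell(); }
    static ICell Dead() { return new DeadCell(); }
  }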

So now I need to try and take this back to my day job.  Always a bit harder trying to apply techniques learnt in the small to problems faced in the large.  Legacy code makes life harder!

Thursday 6 December 2012

Code Smells - The Lying Interface

I came across a code smell today that I hadn't seen or heard of before. I've christened it the "Lying Interface". At work, we've been trying to refactor some code into a shared service accessible over WCF. To do this, we need to understand the existing interfaces and then develop the appropriate protocol to model them.

I've been faced with some interfaces like this:

public interface IPoller {
    void Poll();
}

This doesn't look too bad. The Poll method is presumably called every so often to poll something. Looking at the implementation's constructor, though, reveals something a bit more hidden.

public class Poller : IPoller {
    private IPollListener m_Listener;

    public Poller(IPollListener listener) {
        m_Listener = listener;
    }

    public void Poll() {
        // run some code and generate some results
        m_Listener.OnPolled(results);
    }
}

So despite the Poll method looking simple, it's deceptive. It actually relies on a constructor argument being passed in, and pushes the results onto it. The interface is really lying: it's not showing the true responsibilities of the class. Perhaps another name for this is "hidden dependency"?

My opinion is that an interface should be self-describing: it should say what the class does without saying how. In this case, the hidden dependency on IPollListener is not expressed properly. I ended up refactoring the code to be more self-describing by using an event instead; roughly, the new code ends up looking like this.

public interface IPoller {
    void Poll();

    event EventHandler<PollingEventArgs> Polled;
}

public class PollingEventArgs : EventArgs {}

It's still not perfect (without a comment I have no way of knowing from the signatures alone when the Polled event is raised), but I think it's better than it was. This has the neat side effect of breaking the dependency of Poller on IPollListener and switching over to an Observer pattern.
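
Roughly, the refactored Poller then just raises the event rather than calling back into a listener it was handed. A sketch only - the actual polling work is elided, and PollingEventArgs could be extended to carry the results:

public class Poller : IPoller {
    public event EventHandler<PollingEventArgs> Polled;

    public void Poll() {
        // run some code and generate some results...

        // ...then notify whoever subscribed, rather than a hard-wired listener
        var handler = Polled;
        if (handler != null) {
            handler(this, new PollingEventArgs());
        }
    }
}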

I'm still not sure what the right name for this smell is (maybe it's more to do with hidden dependencies), but it's a start!