Friday, 11 April 2014

How does it feel to give a terrible conference talk?

Have you been to a conference and sat through an awful presentation and wondered just how the hell someone got there? Me too!

Recently I attended the ACCU conference in Bristol and got to experience what it feels like to deliver something that went down like a lead balloon. One evening many moons ago, I thought I'd send in a proposal. By some small miracle I got accepted and was all set to run a 90-minute introduction to Haskell.

I'd already run through the workshop once at a local user group. The material isn't amazing, but I was confident in delivering it and thought it offered people a chance to get a taste of Haskell and programming with functions.

Then the problems started. It's ACCU. It's full of clever people, therefore I should level-up the material and assume more knowledge. Right? I should make it more hands-on, more interactive and better in every way.

I prepared hard. I updated the slides. I added more and more. I wrote notes, I dug out references and I was confident it would kick ass.

And then the day arrived.

90 minutes seems like a long time. It isn't. I spent a good 15 minutes ensuring that everyone could run "hello world". Very rapidly 90 minutes became 60 minutes.

Then my cleverness got the better of me. The Curry-Howard isomorphism is fascinating, but perhaps it's not the best subject matter within the first 30 minutes of any presentation. Trying to explain it under pressure with questions from an audience eager to learn makes it even worse. I probably lost another 20 minutes trying and failing to explain that const :: a -> b -> a only has one valid implementation in Haskell. And what the hell are the poor attendees going to do with this information? GAH!

And so it continued. On to writing some code. I'd wanted to make it easier to compose higher order functions to produce results, so I'd made the initial data structures in the exercises a bit more complicated than those I'd shown in the example slides. Big mistake. This made it much harder for people to grok the syntax; I'd shown simple syntax but not given enough direction. 30 minutes rapidly disappeared and I'm now *way* behind schedule.

At this point, I'd already realized the situation was going Pete Tong. But what'd you do? You can't just down tools and walk out the room (well, I suppose you could, but that'd be worse), so you just have to knuckle down and carry on. And carry on I did, through more examples (well over-egged) and then onto the Universality of Fold (brain, what the hell are you thinking?!).

With 5 minutes left, there's plenty of time to throw a demo of QuickCheck in, right? But then I realized I'm in an Emacs buffer. How'd I increase the font-size so people can read it? GNARGH!! It's over to Notepad and bump the fonts up in that. "Should have used vi!" went the audience. ARGH!

And then the buzzer sounds (well, not really, but it's time to go). Bring things to a halt and escape to a corner of the building. I can't imagine that was particularly fun for the participants. A few people kept up (hurrah!) and there were a couple of positive things said, but I knew it'd gone wrong and boy, that doesn't feel good.

So, at least now I know how it feels (bad, very bad) and I also learnt an important lesson. Keep the message simple! Focus on the single takeaway you want participants to have. I wanted people to leave knowing that Haskell isn't impenetrable and looking at how far you can get just by reading type signatures. However, I lost this in a noise of other random related things and tried (and failed) to communicate a million and one other features.

KISS!

Wednesday, 9 April 2014

Agile - What Next?

I'm at ACCU at the moment, and instead of preparing my talk on Haskell for Thursday, I'm writing up my notes from Bob Martin's talk on agile yesterday.

Agile was originally founded by a bunch of programmers over a decade ago. The aim (from Kent Beck) was to devise a system that eliminated the trust divide between programmers and managers (them and us). Transparency was the aim of the game. Programmers would record velocity using story points. Managers would track number of story points per sprint and produce burn-down charts. Everyone is happy.

Unfortunately, burn-down and velocity charts track only one part of software development: features. There's a hidden part of software development that isn't captured by these charts: the ability to change. If there's one thing that's certain in software development, it's that people will change their minds and features will need to adapt. It's no good having software with the correct features today if it can't have the correct features tomorrow. Arguably, a code base's ability to respond to change is the primary responsibility of the developers.

In the original light-weight process, XP, this was kept in check by Ron Jeffries' concentric circles.

Concentric Circles

This, again, is part of transparency and trust. At the inner-level, TDD, pair-programming and simple design keep the software honest. A suite of tests gives transparency on the system functionality. Moving further out we reinforce these practices with collective ownership (transparency again, no siloed development). And so on, and so forth.

Fast-forward a decade or so, and where are we now? Agile is the domain of the manager. There are no developers at agile conferences any more; it's all about the secondary value of a software product (shipping features) rather than the primary one (reacting to change).

The XP Practices have been forgotten. Scrum empowers teams to take ownership of their practices and opt out of ones that don't work. Of course, it's easier (in the short run!) to forget about TDD, simple design and refactoring. However, in the long run productivity grinds to a halt (see Design Stamina Hypothesis).

Bob argues (The Corruption of Agile) that agile doesn't exist without the practices that support it. I agree; most agile teams aren't agile in their ability to react to change. Martin Fowler has a term for it, "Flaccid Scrum", where we adopt the project management side but not the underlying practices that keep the code base malleable and responsive to change.

With all this in mind, the trust issues have reemerged. Dropping the velocity (number of story points per sprint) is a bad idea, so developers have rebelled. Let's just make the stories smaller. The points counted are the same, but the size of the stories is much smaller. Teams are wading through custard, developing features just as slowly as ever.

The thrust against this has come in the form of "software craftsmanship". This is trying to reimagine the circles from the inside out, but it's failed. It's failed because it doesn't attempt to bridge the divide between the managers and the coders. It might help the engineers to "do the right thing" more often, but it doesn't show transparency.

And the talk ended there, with no answers for the future and a little depressing. I've definitely seen the scenarios Bob describes, but what's the solution? It's probably not "kill all the project managers" as someone suggested. I'd love to make the "ability to change" a tangible concept that teams can explore and understand. It's not an easily measured property, but I think taking data-driven decisions about code is part of the answer. Project managers need options to meet business constraints. Sometimes it's OK to go quick and dirty, to spike a feature that may not live longer than a week, but you have to accept that recovering from that burst of activity has a remedial cost, and understand what that cost is.

Right, now to finish off a few slides for this Haskell thing.

Saturday, 1 February 2014

The First International Conference on Software Archaeology

I recently attended The First International Conference on Software Archaeology, much more memorably shortened to #ticosa.

It was a slightly strange conference, in that it was never particularly clear what software archaeology was, but that was a good thing as it gave a great variety of talks encompassing everything from metrics, to tools for understanding, to philosophical thoughts on the architecture of information.

Process Echoes in Code


Michael Feathers opened the proceedings with a question: what's the real point of version control systems? The most common answer is that they help you roll back to previous revisions should something go wrong, or support multiple product lines. The truth is this doesn't really happen. If your team deploys something to production that goes wrong, then I imagine you'll revert the deploy (not the VCS) and simply deploy again. The real purpose of source control is providing change logging. By looking at those changes we can see the traces of the way we work that are indelibly written in the version control system.

Michael demonstrated a tool (delta-flora) to explore the traces left in the source code. The tool was a simple Ruby program that mapped the Git commit history (SHA1, files changed, author, code diff) into Method event objects (methods added, changed and deleted). This is a simple transformation, but one that seems to yield a vast amount of useful information.

Exploring the temporal correlation of class changes seems like an incredibly useful way of identifying an area of related objects. I'm working on a large, badly understood code-base. We're already finding that adding features requires touching multiple files. By mining information from the past, maybe we can make more educated decisions in the future?
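To make the idea concrete, here's a minimal sketch of mining co-changes. It is not Michael's tool; the file names and the commit list are made up, and I'm assuming you've already extracted the set of files touched by each commit (for example from `git log --name-only`).

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each pair of files appears in the same commit."""
    pairs = Counter()
    for files in commits:
        # Sort so that (a, b) and (b, a) are counted as the same pair.
        for pair in combinations(sorted(set(files)), 2):
            pairs[pair] += 1
    return pairs


# Hypothetical history: each entry is the list of files touched by one commit.
commits = [
    ["Order.java", "OrderValidator.java"],
    ["Order.java", "OrderValidator.java", "Invoice.java"],
    ["Invoice.java"],
    ["Order.java", "OrderValidator.java"],
]

counts = co_change_counts(commits)
# The pair that changes together most often is a candidate "related area".
most_coupled = counts.most_common(1)[0]
print(most_coupled)
```

Raw counts are the crudest possible measure. On a real code-base you'd probably want to normalise by how often each file changes at all, otherwise the busiest files dominate everything.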

Another area Michael mentioned that sent my synapses firing was analysing classes by closure date. Even if you have a huge code-base, identifying the closed classes (those that haven't changed) helps reduce the surface area you have to understand. One graph he showed (plotting the set of active classes against the open classes) was particularly interesting.



I'd love to plot this on a real code-base, but my understanding is that whilst you've got open classes, chances are you haven't finished a feature and the code-base is in an unstable phase. Looking forward to trying this one out.

Are you a lost raider in the code of doom?


Daniel Brolund followed with a quick overview of the Mikado Method. The Mikado Method provides a pragmatic way of dealing with a big ball of mud. We've probably all experienced the "shockwave" refactoring (or refucktoring?) where we've attempted to make a change, only to find that change requires another change, then another and before you know it you have a change set with 500 files in and little or no confidence that anything works.

The Mikado Method helps you tackle problems like this by recognizing that doing things wrong and reverting is not a no-op. You've gained knowledge. Briefly, the method seems to consist of trying the simplest possible thing, then using the compiler and other feedback to find prerequisites (e.g. "if only that class was in a separate package..."). By repeatedly finding the dependent refactorings you can arrange a safe sequence of refactorings to tackle larger problems.
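As I understood it, what you end up with is a dependency graph that you work through from the leaves inwards. Here's a rough sketch of that ordering step only; the goal and prerequisite names are invented for illustration, and it leans on `graphlib` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical Mikado graph: each goal maps to the prerequisites discovered
# by trying the change naively, watching it fail, and reverting.
prerequisites = {
    "move Billing into its own package": {
        "break Billing -> Ui dependency",
        "extract PaymentGateway interface",
    },
    "break Billing -> Ui dependency": {"introduce BillingEvents"},
    "extract PaymentGateway interface": set(),
    "introduce BillingEvents": set(),
}

# static_order() yields every prerequisite before the things that depend on
# it, so the leaves (self-contained refactorings) come first, the goal last.
order = list(TopologicalSorter(prerequisites).static_order())
for step in order:
    print(step)
```

Each step in that order can be committed on its own while the code keeps compiling, which is the whole point.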

I completely agree with this approach. Big bang refactorings on branches are no longer (if they ever were!) acceptable ways to work. Successful refactoring keeps you compiling and keeps you working in the smallest possible batch size. I liked the observation that the prerequisites form a graph; I've previously worked in pairs where we've kept a stack of refactorings (the Yak stack?) but it's an interesting observation that sometimes it's a graph.

How much should I refactor?


Matt Wynne gave a great metaphor for keeping code clean. If you imagine that software engineers are chefs and their output is meals, then the code base is the kitchen. What does your kitchen look like?

Matt had an exemplar code base (Cucumber rewrite), created as greenfield code, test-first, small-team, small commits and no commercial pressures. By analysing commits, a rough and ready guess was that 75% of commits were pure refactoring.

The answer to the question "how much should I refactor?" is simple.

More than you currently do.

Code Metrics


Keith Braithwaite gave us a talk about metrics and in particular the dangers of not knowing what you are doing.

He gave some examples from earlier analysis that (allegedly) demonstrated that TDD code had longer methods than test-last code. This doesn't fit our intuition, and indeed digging into the analysis showed that the results were based on the mean. If we plot the method length distribution, we find it's not a normal distribution but a power-law distribution, so the mean is a poor summary. Doing a more statistically sound analysis actually gives the opposite result.
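A toy example shows how easily the mean misleads on this kind of data. The numbers below are entirely made up (they are not from the study Keith discussed); code base A is mostly tiny methods plus one monster, code base B is uniformly middling.

```python
from statistics import mean, median

# Made-up method lengths (lines of code) for two hypothetical code bases.
code_base_a = [1, 1, 2, 2, 2, 3, 3, 4, 5, 120]  # heavy tail: one huge method
code_base_b = [3, 4, 4, 5, 5, 6, 6, 7, 8, 9]    # no tail, all middling

mean_a, mean_b = mean(code_base_a), mean(code_base_b)
median_a, median_b = median(code_base_a), median(code_base_b)

# The mean says A has the longer methods; the median says the opposite.
print(mean_a, mean_b)      # 14.3 5.7
print(median_a, median_b)  # 2.5 5.5
```

One outlier is enough to flip the conclusion, which is why the summary statistic needs to suit the shape of the distribution.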

The moral of the story for me was that reducing a data set without knowing what you are doing is very dangerous!

Visualizing Project History


Dmitry Kandalov showed us an amazing analysis of a number of open source projects by mining the version control history (see here). This was the highlight of the conference for me, seeing interactive history of real code bases. Neat!

I really enjoyed seeing the way Scala and Clojure have evolved. Scala has progressively added more complexity and more code. Clojure however, has stabilised. Draw from that what you will!

Tools for Software Business Intelligence


Stephane Ducasse gave us an overview of some of the tools he used for software business intelligence. There was a call to action that we need dedicated tools for understanding code bases and I couldn't agree more with that. There were many interesting links:

Understanding Historical Design Decisions

Stuart Curran gave a presentation on "Understanding Historical Design Decisions". Stuart's perspective was very different as he comes from an information architecture / design background and didn't consider himself a programmer.

Some books to add to my ever-growing reading list:

Confronting Complexity

Robert Smallshire gave a talk on Confronting Complexity and returned us to metrics (see also notes from Software Architect 2013).

We started by analysing how to calculate cyclomatic complexity. One interesting observation was that cyclomatic complexity gives us a minimum bound on the number of tests we need to get code coverage. If you follow this through, then if you add a conditional statement every fifth line, every five lines of code you write demand another test. Ouch.
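For the curious, here's a rough sketch of the calculation for Python source, using the common "one plus the number of decision points" shortcut. Which nodes count as decision points is my own simplification; real tools disagree on the details, so treat the numbers as approximate.

```python
import ast

# Node types treated as decision points. This is a simplification: real tools
# differ on details such as comprehensions, assert and match statements.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate McCabe complexity as 1 + the number of decision points."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds one extra path per operator.
            decisions += len(node.values) - 1
    return 1 + decisions


example = '''
def classify(n):
    if n < 0:
        return "negative"
    if n == 0 or n == 1:
        return "small"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "prime"
'''

print(cyclomatic_complexity(example))  # 6
```

Ten lines of fairly innocent code and the score is already six.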

We looked at a simpler proxy for code complexity, Whitespace Integrated over Lines of Text (WILT). This is a really simple measure and incredibly quick to calculate so it lends itself to visualizing code data quickly.
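My reading of WILT is that you simply add up the leading whitespace on every line, so deeply nested code scores higher than flat code. A minimal sketch under that assumption (the `tab_width` parameter and the choice to skip blank lines are mine, not part of any official definition):

```python
def wilt(source, tab_width=4):
    """Sum the leading whitespace of every non-blank line."""
    total = 0
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no indentation information
        expanded = line.expandtabs(tab_width)
        total += len(expanded) - len(expanded.lstrip(" "))
    return total


flat = "a = 1\nb = 2\nc = a + b\n"
nested = (
    "for x in xs:\n"
    "    if x:\n"
    "        for y in ys:\n"
    "            print(x, y)\n"
)

print(wilt(flat), wilt(nested))  # 0 24
```

There's no parsing involved, so it works on any language that is indented sensibly, which is what makes it so cheap to run across a whole history.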

There was a really good quote attributed to Rob Galanakis (technical director at Eve Online):

How many if statements does it take to add a feature?

Again, this comes back to one of the recurring themes of the conference, Bertrand Meyer's open-closed principle. One of my takeaways from this was to pay much more attention to OCP!

Rob mentioned that Refactoring Reduces Complexity and gave the example of "Replace switch with polymorphism". I'd agree with this for the most part, but there are exceptions. Rename for example preserves code complexity, but increases code comprehensibility: the two don't always align. It'd be interesting to hook in a plugin to refactoring tools to calculate WILT before and after refactorings and report on the cumulative benefits.

Rob finished off by presenting an alternative model-driven approach to software engineering. The visualizations were neat and helped show the range of possibilities. That immediately seems like an improvement over other models such as COCOMO. Interestingly, going back to COCOMO shows that developer half-life isn't considered in the model, nor is complexity of the code produced (I guess the assumption is that complexity of the product => complexity of the code?).

Lightning Talks

Finally, we ended up with a set of lightning talks. Nat Pryce gave a quick demo of using neo4j to analyse a heap dump. Graph databases are cool!

Ivan Moore gave a few opinions on how you can protect your software for archaeologists from the future.

  • Ship your source with your product
  • Put your documentation in source control
  • Put your dependencies in source control (reminded me of nuget package restore considered harmful)
  • Make sure you put instructions to build the product in source control (chef!)

There was a presentation towards the end that showed how adding sound to a running program (initially for the purposes of accessibility) produced some interesting effects. I've done this kind of thing before (creating animations for log files). Sometimes you can just rely on your brain to find the interesting things when you present it in another way.

Conclusion

TICOSA was a great conference. There was a good line up of speakers and lots of interesting content to muse over. What would I like to see next year? I'd really like to hear more war stories. I'd love to hear stories of archaeological digs. I'd especially love to hear about restorations. My general impression is that very few code bases start a restore process and come out better at the end (usually you hear about the big rewrite and sometimes those fail too), but I'd love to hear otherwise!

I'm looking forward to getting back to work on Monday and scraping through the commit logs to see what I can uncover!