Wednesday 30 October 2013

Not all names are created equal

I think everyone agrees that naming things is one of the hardest things you can do? Books like Clean Code devote whole chapters to naming. Names should convey meaning so that the next person reading the code has an easier job understanding what it does. After all, we read code far more than we write it. It's definitely OK to spend some time arguing about the right name. It's important.

So that's it. Names are important. Job done? Of course not! There's more to the story than that.

At Agile Cambridge 2013, I attended a session (Unpicking the Haystack) where the source code was only available as decompiled byte code (some sad story involving not using version control, not backing up and all the things that no-one ever does). Our task was to recover what the original program actually did. When you're looking at decompiled code, almost all the naming information has gone. By the time you've gone from source code to binary and back to source code, you've lost the variable names. Unsurprisingly, trying to decipher what the code does with local variables such as a1 to a999 is very hard.

With variable names gone, we have to look for other clues for programmer intent. So what else is there? Well, it certainly helps that public methods aren't lost. In this respect, method names are more important to get right than variable names. The naming is stickier. But something far more important gives us even more clues about this mystery code base.

Enter types. Decompilation reveals the names of public types. A type name can convey much more information than a variable name. For example, string s reveals little, whereas URL s reveals much more. If we're disciplined followers of domain-driven design then our types align with the problem we're solving. I'd say that types sit right at the top of the most-important-things-to-name-correctly hierarchy.

In this view of decompiled code, some names are more important than others. Parameter names and local variables are least important, whereas type names are the most important (with methods a close second).

Coming at names from decompiled source is certainly a weird way to do it, but this seems to fit with Bob Martin's guidance on name length.



I'd like to try to reinforce the view that types are by far the most important thing to get right. Crisply named abstractions matter more than almost anything else. To explore this area, we'll look at a strongly, statically typed language, Haskell, and explore just enough syntax to understand its types. But first...

What is a type? A type is a label that describes the properties of all objects that are instances of that type. If you see string in C#, you know you are getting an immutable sequence of characters with certain methods available. If you see an AbstractSingletonFactoryVisitorBean then you know you've got problems. I'm kidding.

Anyway, back to sensible types. Types describe program behaviour. Don't believe me? Let's begin our detour into Haskell:


-- Whenever you see "::" replace it with "is of type"
-- When a name starts with a capital letter, you've got a type
-- add5 is of type Int -> Int: it takes an Int and returns an Int
add5 :: Int -> Int
add5 x = x + 5

-- Parameters are separated by ->
-- For the purposes of this, let's just say the last one is the return
-- type and the rest are the arguments
-- add is of type Int -> Int -> Int: two Ints in, one Int out
add :: Int -> Int -> Int
add x y = x + y

-- Generics are represented with lower-case names
-- middle takes three generic parameters (a, b, c) and returns a b
middle :: a -> b -> c -> b
middle x y z = y


Let's look at that last one again. middle :: a -> b -> c -> b. From the name we might guess that it returns the middle argument (e.g. middle 1 2 3 returns 2). Is there any other definition of what the function could do? In Haskell there's no such thing as type-casting; if all I know is that something could be any type, there aren't many options. I can't add anything to it. I can't convert it to a string. In fact, I can't do anything with it other than return it. The types don't let me. Types constrain the implementation choices to a more sensible subset.

Do the names matter? We know that the argument x has type a. Is there any more descriptive name? Probably not: from the type, we have no idea what properties hold for the arguments, so a long descriptive name is just wasting space. For all we know, the argument could be a function. Or it could be a monad. What are you going to call it?

Is the method name important? It's definitely nice to have a good name, but is it essential? If I gave you quux :: a -> b -> a I'm betting you could tell me what it does?
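In fact the signature pins the implementation down almost completely. Here's a sketch of the only reasonable total definition quux can have (parametricity at work):

```haskell
-- We know nothing about a or b, so the only way to produce an a
-- is to return the first argument; the second must be ignored.
quux :: a -> b -> a
quux x _ = x
```

Try writing any other total definition; the type checker won't let you conjure an a out of thin air.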

In fact, armed with just a little knowledge about types, you can start to infer what functions do without even needing to see their definitions. Here are a few random functions with really poor names; what do they do?


bananaFactory :: a -> a

-- (a,b) is a tuple of two elements of type a and type b
spannerBlender :: (a,b) -> a

-- (a -> b) is a function taking anything of type a and returning type b
-- [a] is a list of items of type a
omgWTF :: (a -> b) -> [a] -> [b]

-- "Num a =>" says a must be an instance of the Num typeclass
-- think of this as specifying an interface
-- boing takes two numbers and returns a number
boing :: (Num a) => a -> a -> a

-- m is a type constructor that takes an argument of any type a
mindBlown :: (a -> b) -> m a -> m b


Armed with this basic knowledge of reading Haskell type signatures, you're now equipped to use Hoogle. You can search for the type signatures given above (a -> a, (a,b) -> a, (a -> b) -> [a] -> [b] and (a -> b) -> m a -> m b) and get a good idea of what these functions do.
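If you'd rather not leave the page, here's my guess at the standard-library answers, keeping the silly names from above. Note how the Num signature only narrows the field: boing could just as well be subtraction or multiplication, and mindBlown needs m to be a Functor.

```haskell
-- a -> a: the identity function (id)
bananaFactory :: a -> a
bananaFactory x = x

-- (a, b) -> a: the first element of a pair (fst)
spannerBlender :: (a, b) -> a
spannerBlender (x, _) = x

-- (a -> b) -> [a] -> [b]: apply a function to every element (map)
omgWTF :: (a -> b) -> [a] -> [b]
omgWTF = map

-- (Num a) => a -> a -> a: some numeric combination, e.g. addition
boing :: Num a => a -> a -> a
boing x y = x + y

-- (a -> b) -> m a -> m b: fmap, lifting a function into a Functor
mindBlown :: Functor m => (a -> b) -> m a -> m b
mindBlown = fmap
```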

So that's why I think long variable names are less common in functional programming. It's because the languages are terser (Uncle Bob's rule still applies) and because the type signature gives you the power of reasoning, not the variable names.



Names are important; but not all names are equally important.

Thursday 10 October 2013

Software Architect 2013 Day #2

What's wrong with current software architecture methods and 10 principles for improvement

Tom Gilb showed us a heck of a lot of slides and tried to convince us that we must take architecture seriously. I don't disagree with this, our industry could definitely do with a bit more rigour. Tom was very forthright in his views, and I appreciated his candour.

The system should be scalable, easily customizable and have a great user interface

That's a typical "design constraint" that we're probably all guilty of saying. This is nothing more than architectural poetry (putting it politely) or complete and utter bullshit. In order to take architecture seriously we should measure. Architecture is responsible for the values of the system. We should know these values and be able to measure them. If a given architecture isn't living up to these values, we should replace it with something that does. Architecture exists solely to satisfy the requirements.

Real architecture has multi-dimensional objectives and clear constraints, and estimates the effects of changes. Pseudo-architecture has no dedication to objectives and constraints, no idea of the effects and no sense of the relationship between the architecture and the requirements.

If we're going to take architecture seriously, then we need to start treating it as engineering. We must understand the relationship between our architecture and the requirements of the system. We must demonstrate that our architecture works.

And then the wheels came off.

I don't work with huge systems, but I can clearly see that understanding the relationship between an architecture and the requirements is a good thing. Unfortunately, Tom presented examples from a domain that was unfamiliar to me (300 million dollar projects). In the examples, incredibly precise percentages were shown (302%). At that point, I lost the thread. Estimates are just that, and if experience has taught me anything, it's that estimates have HUGE error bars. I didn't really see how all that up-front planning led to a more measurable design. I've got a copy of Tom's book, Competitive Engineering, so hopefully I can fill in the blanks.

Building on SOLID foundations

Nat Pryce and Steve Freeman gave a thought-provoking presentation entitled "Building on SOLID foundations" which explored the gap between low-level detail and high-level abstractions.

At the lowest level we have guidelines for clean code, such as the SOLID principles. At this level, it's all about the objects, but not about how they collaborate and become assembled into a functioning system. Even with the SOLID principles applied, macro-level problems occur (somehow all related to food metaphors), colourfully referred to as "ravioli code": individual blocks are well organized, but as a whole it still looks like a mess. "Filo code" is code with so many layers you can't tell what's going on. "Spaghetti and meatballs" code is an application with a good core, but the communication glue surrounding it is a huge mess.

At the highest level we have principles such as Conway's Law, Postel's Robustness Principle, CAP, end-to-end principle and REST.

But what's in the middle?

In the middle there are some patterns, such as Cockburn's Hexagonal Architecture, that help us structure systems as an inner domain language surrounded by specific adapters converting that data to the needs of the client. The question remains though: what are the principles between low- and high-level design?

Nat and Steve assert that compositionality is the principle for the middle. We should adopt a functional-style approach and build a series of functions operating on immutable data in a stateful context. That sounds complicated, so what does code written in this style look like? Hamcrest gives us some examples, where by using simple combinators (functions that combine data) you can build up complicated expressions from simple operations (see the examples).
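To make that concrete, here's a tiny sketch of the combinator idea, in Haskell rather than Java and with made-up names (this is not Hamcrest's actual API): a matcher is just a predicate, and combinators glue simple ones into bigger expressions.

```haskell
-- A matcher is just a predicate over some value.
type Matcher a = a -> Bool

-- Two primitive matchers (hypothetical, Hamcrest-flavoured names).
greaterThan :: Ord a => a -> Matcher a
greaterThan n = (> n)

lessThan :: Ord a => a -> Matcher a
lessThan n = (< n)

-- Combinators: build compound matchers out of simple ones.
allOf :: [Matcher a] -> Matcher a
allOf ms x = all ($ x) ms

anyOf :: [Matcher a] -> Matcher a
anyOf ms x = any ($ x) ms

-- A complicated expression built from simple operations:
betweenOneAndTen :: Matcher Int
betweenOneAndTen = allOf [greaterThan 0, lessThan 11]
```

The point is that each piece is trivially understandable on its own, and the composition carries the meaning.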

Having done a fair bit of Haskell, I found it really easy to agree with this point of view. When there's no mutable state you can reason about code locally (without checking for mutation). Local reasoning means that I can understand the code without jumping around. This is a hugely important part of a well-designed system.

I was slightly concerned to hear this style of programming described as Modern Java. I hope it's not, because using Java like this feels like putting lipstick on a pig. One of the things I value in Haskell is that composition is a first-class citizen. Partial application, function composition and first-class functions mean that gluing simple code together to make something powerful is incredibly easy. I hope we're at that awkward point in language evolution where we're stretching our current languages to do things they don't want to do. Maybe this is finally the time when a functional language hits the mainstream? (Maybe it's Clojure or Scala.)

We tried adopting this style of programming at Dynamic Aspects when building domain/j [PDF]. It was fantastic fun, and I really loved Java's static imports for making the code lovely and terse (finding out that $ is a legal identifier also helps). Something about it felt dirty though. I haven't quite put my finger on what; hopefully with lambdas in Java 8 it'll feel more natural.

So what is the bit in the middle? The bit in the middle is the language that describes your domain. Naming is everything, and you should do whatever you can to make it as easy as possible to understand. Eschewing mutable state and using functional programming to compose multiple simple operators seems to work!

Agile Architecture - Part 2

Allen Holub gave a presentation on agile techniques for design. Allen examined the fragile base class in some depth, before recapping CRC cards (not used enough!). Allen is a good presenter, so it was great to have a recap and have a few more examples to stick in my brain!


Leading Technical Change

Nate closed out the day with a presentation on Leading Technical Change. It was well presented and focused on two things: how do you keep up with technology, and how do you engage your organization to move to different technologies?

Nate presented some really disturbing statistics about how much time Americans (and presumably other countries) waste on TV. Apparently the average American watches 151 hours of TV a month! Wow.

Nate introduced the audience to the idea of the technology radar, which allows you to keep track of technology that is hot for yourself or your organization. We're trying to build one at Red Gate. We've also experimented with skills maps, and you can see an example from a software engineering point of view here (I'd love to know what you think).

Introducing change is hard, and Nate presented the same sort of ideas that Roy presented the previous day. Change makes things worse in the beginning, but better in the end. Having the courage to stick out the dip is a hard thing!

I have to admit, I didn't take many notes from this talk because I was enjoying it instead :) It was well presented and engaging with the audience. In summary, change is hard and it's all about the people. I think deep down I always knew that (people are way more complicated than code) but it was great to hear it presented so well!

Software Architect 2013 Day #1

The Coaching Architect

Roy Osherove presented "The Coaching Architect". If you want a better idea of the manifesto, read Notes to a Software Team Leader; it's a great book!

Roy asserts that your role as a "Software Architect" / team leader / leader of any kind is to grow the team to solve problems on their own. Far from making you redundant, this makes you a highly valued employee; by growing others, you'll always be wanted. Unfortunately, this means stepping outside your comfort zone and dealing with people.

Many managers like to take the money, but not do all the hard parts (Gerald Weinberg)

Learning something new is tough. Everything you learn has a downslope initially: you lose productivity, and it's hard. However, once that thing has clicked, your performance rises. This pattern never ends!

I've seen this behaviour before in Programming Epiphanies. Initially I'll try a new technique or language (let's say when I first found C++ and objects). I was terrible; I constructed code at work that made me cry when I read it the next day. Eventually things started to get better. I had my code critiqued by clever people (I say critiqued, I mean brutally torn apart), rewrote it and practised some more. Eventually, I felt as if I was a C++ ninja and I finally got the language. A few years later I felt pretty comfortable. Until Alexandrescu wrote Modern C++ Design and it felt like throwing myself off the cliff again!

Roy argues that to grow your team, you should push them off this cliff and challenge them. You need some risk to learn a new technology. Learning a framework in your spare time is not a risk; learning a new framework on the job? That takes some balls. Pushing people to learn is also a risky thing, but to grow the team we must first realize that we can put ourselves in that scary place and grow.

There's a time and a place for learning. Roy outlines three phases of team development and a suitable stance for a leader to take in each.

  • Survival - Teams are fire-fighting at this stage. Chaos rules! There's no time for learning. Teams in this mode need to get out of it, and the best way to do this is "Command and Control". Prioritize the tasks, use a clean slate and exit into the next mode.
  • Learning - Teams in this mode have time to learn new techniques. Roy asserts that teams in this mode might go 300-400% slower whilst they learn a new skill (say TDD). As a leader on a team like this, your role is to support the team through coaching.
  • Self Organizing - The team doesn't need you. They can solve problems on their own. Roy estimates that fewer than 10% of software teams find themselves in this place.

Teams get addicted to survival mode. Faced with the choice between "write it now and get it out" and "test it", teams often pick the former and get away with it. It feels good. It's only later, when we have to do a rewrite, that we realize the folly of this decision.

It's OK, I hear you "lean" people. It's a waste doing it right, surely? It's an MVP man. What's the point in testing it if I'm going to throw it away? That'd be fine if you did throw it away, but you don't. We also overestimate how long we can get away with poor quality code. The design stamina hypothesis doesn't label the time axis, but in my opinion it's probably days or weeks not months or years.

Anyway. In order to break survival addiction, it's up to us as developers to take this under control. Give realistic estimates that build in time for doing it to a proper level of quality (you want to write unit tests? Make sure your estimate includes time for that; don't show it as a separate activity).

In order to understand why people don't change, Roy recommends the book Influencer: The Power to Change Anything.

For each behaviour, the world is perfectly designed for that to happen

In order to change behaviours, we need to understand the personal, social and environmental conditions that allow that behaviour to happen.

Roy ended with a song. Which was weird. But good.

Implementing micro-service architecture

Fred George gave an awesome presentation on micro-service architectures. He started with a brief history lesson of how he arrived at micro-services, showing how his career had progressed from big monolithic applications through to service-oriented architecture. Each time he felt there was a collection of smaller things trying to get out, until one day he had the chance to try something crazy with a desperate team.

What if we built our applications as many tiny services, each fewer than 100 lines of code? Each service is like a class, a small and crisp conceptualization. Each service has a segregated database and encapsulates that information. Services publish conclusions; not raw data. This brings up some really interesting questions. How do you monitor systems like this? How do you keep it running?

The slides are available and I'd encourage you to take a read. It's full of dangerous ideas. Why do we need unit tests if each service is fewer than 100 lines of code? Why should we adopt a uniform language when we could just rewrite services anyway? Why do we need to worry about copy-and-paste code?

I don't necessarily agree with everything (because I've never been involved in a system like this, and I'm a sceptical kind of guy), but it's great to see something different that challenges the way you think.

Architecture in the age of Agile

Rob Smallshire talked to us about architecture in the age of agile.

Lean thinking (I should read The Goal) says that it's all about the flow, and we should reduce our batch size to reduce waste. TDD does a great job of reducing batch size. You get feedback quicker and you find defects earlier. Rob argues that architecture is a counterpart to this; it helps you find defects in your system design earlier, it just works on a different time-scale.

A calendar and a clock also work on a different time scale, but we view these as complementary; not competing.

Architecture is about maximizing our ability to keep working sprint after sprint. Without architecture, how will you ever reach 200 sprints? Scrum is feature-driven: I have a backlog of features which I rapidly complete and throw into a system, and the non-functional requirements are emergent properties.

This really clicked for me. In teams I've worked on, cross-cutting concerns (performance, usability) often get neglected in scrum. We try to make features out of them, but that never works (how do you estimate "improve performance"?), so they instead get added to the bug tracker, usually split out into lots of little bugs (performance in the history pane is too slow, performance on dialog X is too slow). The solution to these is not a local optimization; we should be considering the system as a whole and solving it from that perspective.

Rob presented some scary statistics on the half-life of code and of developers on systems. On average, after about 3 years on any project, 50% of the developers will have left. However, the code often lasts longer than that.

As architects, we should take that into account. Deliberate design provides context and structure and, most importantly, continuity for the project.

Keeping Agile Development from becoming Fragile

Just because you can go fast, doesn't mean you should. (1995 Darwin awards)

In my opinion the technical debt metaphor is overused. In this talk, Howard used conscious technical debt to illustrate the point: that's a trade-off I'm willing to make sometimes. He gave some good examples of how you could recognize problems early on and counteract them.

Final Thoughts

It's been a bit of a mixed-bag so far. The format of the conference doesn't encourage the conversations that happened at Agile Cambridge. Looking forward to the rest of it though, some exciting sessions on Thursday.