Definition of Done Should Include a Definition of Undo(ne)
Everyone building software products today aspires to update production software seamlessly and continuously, to deploy code without the ‘normal’ friction of process controls, reviews, test departments and committee meetings. As Martin Fowler describes in his review of Jez Humble and Dave Farley’s book ‘Continuous Delivery’, the last mile of software delivery is often the hardest part. But is continuous delivery really the aim, or is it something more? To answer this question, I want us to think about how our approach has evolved from Continuous Integration to Continuous Delivery and beyond.
In the beginning, there was Continuous Integration (CI). Well, actually a long time after the beginning. CI became popular with the advent of better working practices popularized by Extreme Programming. The idea was that when you finished your work, you committed it to the main development branch along with everyone else’s, and it was integrated, deployed to some magical environment and then tested. I must admit, CI changed my life. As a very average developer, I lived in my branch, my own separate environment, avoiding integration until I really had to. I polished my apples/code until I was so sure it worked that I would be tempted to show others. Integration with others was normally a nightmare fraught with blame, finger-pointing and problems. It broke up my otherwise perfect job of sitting on my own thinking big thoughts. The work by Martin Fowler and others made me realize that maybe if I integrated earlier, in small chunks, my code would be better! Then automated testing and then....
But there was a flaw: we delivered our code, even tested it, but moving it into production was a nightmare, and there was a cost. The challenges we had in integrating the code were nothing compared to the perils of moving that code into production. Differences in configurations and data made the likelihood of success low. To reduce the risk, complex processes and toolchains were introduced.
Continuous Delivery was the response to this. It basically applied the ideas of CI on a much grander scale. Let’s not just integrate and test, but let’s integrate, test and deploy to production. Let’s strive for a continuous process and remove all waste that gets in the way of that process. And organizations like Amazon, Facebook and others took this mission to extremes, deploying every XX seconds and creating automation to reduce the cost and effort required.
But then many started asking a simple question: is the goal to release software, or actually to answer questions and learn something? Along came Lean UX, or Sense and Respond: the addition of instrumentation and data gathering to the process, to provide insights into the use of a feature. Add to that A/B testing, where you release competing features and compare the results. Continuous delivery became CD plus data capture/analysis.
And the Scrum community embraced the idea of continuous delivery + data capture/analysis. After all, Scrum is an empirical process. It needs continuous learning to allow course corrections, improving both the ability to deliver stuff and how the stuff is delivered. This whole CI/CD/DCA would be documented in a simple artifact, the Definition of Done (DoD). The DoD guides the team as they plan, do and deliver work. It is used to communicate between teams, allowing those teams to know what the bar is for finishing their work. So, we start to see DoDs that include not only finished software but people using it and data coming back into the team. We start seeing discussions about when we know if something worked or not. And all of this is driven by the DoD.
But what about undone?
Recently I was trying to persuade a person at a large financial company that they should release more frequently. His response was about risk. He said, “we can’t release until we are really certain, or at least I have done enough that if something goes wrong I can say I did everything I could to avoid this”. So, rather than arguing that you can never do enough, and that a lack of production outages is an indicator not of success but of a failure to push the envelope, we started talking about something else. We talked about whether we could release a little bit and, if it doesn’t work, bring it back. Could we add an undone to our definition?
The idea is not new. Most robust websites have the ability to roll back, but it is much harder to do with complex transactional systems. Data dependencies make rollbacks hard and complex, but if we ever want to break the ‘we can’t release yet’ cycle, we have to start thinking about undone. How do we pull this back? Can we automate that? Can we build that functionality in parallel to the done functionality? And frankly, it would really help development if you included it when building the main functionality, allowing simple testing to happen over and over again without having to run DB and server refreshes.
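To make ‘building the undo in parallel’ a little more concrete, here is a minimal sketch in Python using the standard-library sqlite3 module and an in-memory database. Everything in it is hypothetical and invented for illustration: the ReversibleChange helper, the orders and loyalty_points tables, and the points rule. The idea is simply that a change ships with its own undo step, and a round-trip check proves the undo restores the exact state that existed before. A real transactional system with data dependencies will be far harder than dropping one derived table, so treat this as the shape of the idea, not a recipe.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ReversibleChange:
    """A change that ships with its own undo step (hypothetical helper)."""
    name: str
    do: Callable[[sqlite3.Connection], None]
    undo: Callable[[sqlite3.Connection], None]


def add_loyalty_points(conn: sqlite3.Connection) -> None:
    # The "done" half: create a new table and backfill it from existing orders.
    conn.execute(
        "CREATE TABLE loyalty_points ("
        "customer TEXT PRIMARY KEY, points INTEGER NOT NULL)"
    )
    conn.execute(
        "INSERT INTO loyalty_points (customer, points) "
        "SELECT customer, SUM(amount) / 10 FROM orders GROUP BY customer"
    )


def remove_loyalty_points(conn: sqlite3.Connection) -> None:
    # The "undone" half: written alongside the change, not after an outage.
    conn.execute("DROP TABLE loyalty_points")


def snapshot(conn: sqlite3.Connection) -> list:
    """Capture every table's schema and rows so two states can be compared."""
    tables = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [
        (name, sql, conn.execute(f'SELECT * FROM "{name}" ORDER BY 1').fetchall())
        for name, sql in tables
    ]


def round_trips(conn: sqlite3.Connection, change: ReversibleChange) -> bool:
    """True if the change alters the database and undo restores it exactly."""
    before = snapshot(conn)
    change.do(conn)
    changed = snapshot(conn) != before
    change.undo(conn)
    return changed and snapshot(conn) == before


def demo_database() -> sqlite3.Connection:
    # Assumed sample data; a real check would run against realistic fixtures.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, amount INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("alice", 120), ("bob", 80), ("alice", 35)],
    )
    return conn
```

Calling round_trips(demo_database(), ReversibleChange("loyalty-points", add_loyalty_points, remove_loyalty_points)) is the kind of check that could sit in a Definition of Done. Because the undo leaves the database as it found it, the same check can run over and over without a DB refresh.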
So, as we start into 2018 I challenge you all – You are not done until you can undo(ne). :-)