Using a "Technical Debt Register" in Scrum
"Debts and lies are generally mixed together" - Rabelais
As I get older, I'm turning into one of those annoying nostalgic types who reminisce too much. Things were better back in the day, son. We had standards see, and there was less of this "dumbing-down". Yip.
But sometimes at dusk, as I rise from my rocking chair on the porch, hitting the spittoon one last time and leaning over to fix the crick in my back, I reckon there's one thing that's actually better now than in the early 2000s. What modern kids don't stop to appreciate is that these days a man can talk about "technical debt", and folks don't always assume he's a nut-job.
Technical debt can be defined as the longer-term consequences of poor design decisions. Originally described as a metaphor by Ward Cunningham, pretty much everyone now accepts that technical debt is a real risk which can genuinely be incurred. That recognition is good. Shoveling Java and damning the consequences isn't really "agile", it's just being a cowboy. In truth it always was, but for years this went unrealized until the debt certain goons racked up became unsustainable.
It's worth bearing in mind, however, that not all flaws and defects constitute technical debt. This is because they don't reflect "design decisions" which were actually taken. They are often just errors, no matter how irresponsible or egregious they might be. Also, if a decision doesn't directly compromise product quality, then it isn't technical debt. Hence a team may wish for a slick new IDE or plug-in, but the failure to provide the same isn't "technical debt", since it doesn't put the quality of the product itself at risk. It might very well reduce velocity, because they have to limp on using the old development platform, but that's a separate concern.
In Scrum, the expectation is that a Definition of Done should be sufficiently robust for unmanageable levels of genuine debt not to accrue. Hence if technical debt is known to be building up, the Definition of Done should be revisited. It's essential to find out why the debt is being incurred and how this can be avoided.
There are certain other controls which can be used to keep debt down. For example, a team should allow for refactoring (and indeed any other tasks) when deciding how much work they can induct into a Sprint Backlog without compromising long-term quality. Yet apart from these checks and balances, there's no prescription for how technical debt should be handled once you've got it. This isn't an oversight, since Scrum is deliberately as non-prescriptive as possible. It's a framework, and it's up to teams how they implement it.
Note however that implementing "special sprints" to clean up technical debt isn't an option. Technical debt sprints, also referred to as hardening sprints, are essentially an anti-pattern. Each and every Sprint must yield an increment of genuine release quality. That's why the Definition of Done is the primary bulwark against debt building up in the first place.
What some teams do is to maintain a technical debt register, whereby the design decisions which lead to debt can be rationalized. You can think of this register as a RAID log (Risks, Assumptions, Issues, and Dependencies) which is under the team's own purview. It details the technical consequences of expedient decisions on the quality of product implementation, often in terms of probability, impact, and suggested remedy. It may be possible to recommend mitigation during a certain Sprint, if the team have sight of a sufficiently well ordered Product Backlog. Assumptions and dependencies may also feature in the register, and of course some risks can eventually turn into issues which are pressing.
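To make that concrete, here is a minimal sketch of what one entry in such a register might hold. Nothing here is prescribed by Scrum: the field names, the 1 to 5 impact scale, the sample entries, and the idea of ranking by "exposure" (probability multiplied by impact) are all assumptions for illustration, borrowed from the way conventional risk logs are often scored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DebtEntry:
    """One hypothetical row in a technical debt register."""
    decision: str                 # the expedient design decision that was taken
    consequence: str              # its technical effect on product quality
    probability: float            # chance the consequence bites, 0.0 to 1.0
    impact: int                   # assumed scale: 1 (minor) to 5 (severe)
    remedy: str                   # suggested way of paying the debt back
    target_sprint: Optional[str] = None  # only if the Product Backlog gives enough sight

    @property
    def exposure(self) -> float:
        # Assumed scoring rule: probability times impact.
        return self.probability * self.impact


def rank_by_exposure(entries: List[DebtEntry]) -> List[DebtEntry]:
    """Return the entries with the highest exposure first."""
    return sorted(entries, key=lambda e: e.exposure, reverse=True)


# Invented sample entries, purely for illustration.
register = [
    DebtEntry("Hard-coded tax rates", "Rates cannot change without a release",
              probability=0.3, impact=5, remedy="Move rates into configuration"),
    DebtEntry("Skipped integration tests for the payment adapter",
              "Regressions may reach production unnoticed",
              probability=0.6, impact=4, remedy="Add contract tests",
              target_sprint="Sprint 14"),
    DebtEntry("Copied validation logic into two services",
              "Rules may drift apart over time",
              probability=0.8, impact=2, remedy="Extract a shared library"),
]

for entry in rank_by_exposure(register):
    print(f"{entry.exposure:.1f}  {entry.decision} -> {entry.remedy}")
```

A team would of course pick its own scales and fields. The point is only that each entry ties a decision to its consequence and a suggested remedy, so the conversation about when to pay it back has something to stand on.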
A technical debt register can help inform teams about how the debt they incur should be managed. They can make sensible, informed decisions about whether or not to incur debt and when to pay it back. The use of a register like this isn't part of Scrum, but nonetheless it can help a team to get a grip on the technical debt they decide to take on. Sometimes, for example, technical debt can be paid off when implementing related backlog items. In other cases that might not be possible and the debt must be addressed separately.
Certain instances of debt may need to be exposed to the Product Owner for consideration, and others may not. In severe cases technical debt may have been accumulated which exceeds the value of the project itself, possibly several times over. In such extreme cases the most pragmatic approach may be to can the project and start again. Needless to say, this variation on "fail now, not later" takes extreme courage.
Another option when faced with substantial technical debt is to whittle it down gradually, Sprint by Sprint. The team may need to conspire with the Product Owner to deliver something, each and every iteration, which releases sufficient value to stakeholders. Clearly this isn't a great option. It can end up in horse-trading, where the scale and ownership of the debt problem are not admitted to or made known to those stakeholders. Technical debt then becomes more a case of technical embezzlement. Affected parties might be encouraged to turn a blind eye as long as they get something of immediate business value in return, but none of this is good for openness, trust, and transparency.
Now, although I've compared a technical debt register to a RAID log, I'm not suggesting that it should take the usual "documentary" form of one. In my experience it's generally better to use an information radiator. In some cases it may be as rudimentary as a sheet of paper taped to the back of a Scrum Master's chair, but I prefer to coach teams to use a separate card wall for this purpose. Typical states are To Do, In Progress, Done, and Escalate. Items are Done when they are either mitigated or accepted.
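A physical wall is the point here, but if a team does mirror it in a tool, the rules are simple enough to sketch. The following is only an illustration under my own assumptions: the class and function names are hypothetical, and the one rule enforced is the one stated above, that a card is Done only once it has been mitigated or accepted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    TO_DO = "To Do"
    IN_PROGRESS = "In Progress"
    DONE = "Done"
    ESCALATE = "Escalate"


class Resolution(Enum):
    MITIGATED = "mitigated"
    ACCEPTED = "accepted"


@dataclass
class Card:
    """A hypothetical card on the technical debt wall."""
    title: str
    state: State = State.TO_DO
    resolution: Optional[Resolution] = None


def move(card: Card, new_state: State,
         resolution: Optional[Resolution] = None) -> Card:
    """Move a card to a new column, insisting on a resolution for Done."""
    if new_state is State.DONE and resolution is None:
        raise ValueError("A card is Done only when mitigated or accepted")
    card.state = new_state
    # A resolution only means something in the Done column.
    card.resolution = resolution if new_state is State.DONE else None
    return card


card = Card("Hard-coded tax rates")
move(card, State.IN_PROGRESS)
move(card, State.DONE, Resolution.ACCEPTED)
print(card.state.value, "-", card.resolution.value)
```

Note that "accepted" is a legitimate way to close a card. The team has looked at the debt, decided to live with it, and said so in the open.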
Incidentally, this also works for conventional RAID logs at project or program level. Escalation from a Technical Debt Register would imply promotion to such a level. Managers are thereby encouraged to take an active interest in these registers, as well as impediments on a Scrum board, and to query any unescalated problems that appear to have stalled. It can be a great way to encourage "gemba", where they get out of the office, walk around, put their reports and filters to one side, and actually see things for themselves.