Faking It: Estimates and Metrics in Scrum

(United Kingdom)

February 28, 2018

"The most important metrics are: did we execute the way in which we said we would, and did we deliver the value to the business that we had promised?" - Jamie S. Miller

In an earlier post we took a critical look at metrics and at how easily they can be abused. Pretty much anything can be measured, and the gratuitous presentation of numbers can give a sheen of science to an undertaking, no matter how absurd it might really be. The problem is that a wealth of data can seem to make a convincing case, even when the numbers have not been correlated to an hypothesis by rigorous empirical means. Hence phrenology, although it is now understood to be a pseudo-science, was thought to be a credible enough discipline by our forebears. Careful measurements of people's skulls were made in an attempt to ascertain their mental condition. Only over time, and through sceptical enquiry, did it eventually become clear that the shape of a person's head relates very poorly indeed to their psychological make-up. We can trust that any measurements taken were accurate and extensive, but the data was informationally useless when applied in pursuit of this supposed science. The measurements could never validate the phrenological method, irrespective of their quality and quantity. Today it is dismissed as the relict superstition of a bygone age.

Simply put, an abundance of metrics, irrespective of the precision with which they might be taken, cannot cheat a fundamentally weak correlation between hypothesis and data. The descendants of yesterday's “bump-readers”, however, can still be found in the board-rooms and management offices of large corporations today. With many people under their assumed control, they demand standardized measures of productivity by means of which employees might be compared, punished, and rewarded. Any straws may be grasped at as long as they can be counted. Thus agile teams, which ought to be assessed empirically by the incremental release of value, are instead gauged by the higher-ups in terms of how much estimated work they appear to have "delivered".

Those are the bumps of today. Estimates proxy for value in this grotesque dystopia. Measures like "story points" have become commoditized as a surrogate currency, inviting bizarre inflationary pressures and market distortions upon any numbers which might be arrived at. The actual provision of value to stakeholders is ignored as a quantity too difficult to measure, and so cock-eyed metrics are appropriated in compensation. Our work is cut out for us in trying to persuade delinquent executives to do the right thing - to master the science of measurement - and to value the empiricism which would allow informed decisions to be made.

A further irony is that these suspect techniques, whereby projections are made which are based on estimates, can be used quite rationally by agile teams themselves. The numbers represent a collaborative assessment of essential criteria, such as how much work a team believes it can take on. Having taken these measurements the teams which own them can then make reasoned forecasts. It is their data which they may use for their own projective purposes, even though other stakeholders can only be assured by the receipt of actual value. One reasonable forecast might be how much work they think is likely to remain at a given point before one of those valuable increments is delivered to customers. The shorter the time-period under consideration, the smaller the leap-of-faith a team will make when determining the likelihood of a valuable, empirical outcome.

The Sprint Burndown

The "Sprint Burndown" is an example of this sort of projective practice. It is based on estimates, and is quite familiar to many Scrum Teams. During Sprint Planning, a Development Team will meet with the Product Owner to agree on a selection of work from the Product Backlog. The selection forms the basis of the Sprint Backlog, which is a forecast of the work needed to achieve a jointly agreed Sprint Goal. This body of work may be revised during the Sprint time-box in order to better meet the Goal. Achieving a Sprint Goal is an accomplishment which is of signal importance in Scrum. Completing the original forecast of work arrived at during Sprint Planning is, in truth, somewhat irrelevant. The critical thing is to have a plan which allows the Goal to be met. It is the Sprint Goal, and not the Sprint Backlog, which represents the more artful team commitment. In essence, measuring how much work is left in the Sprint Backlog ought to be nothing more than an exercise in forecasting whether the Goal will be met. It relies on having up-to-date estimates which allow the team's progress itself to be continually estimated until an increment is delivered, at which point the delivery empirically validates the work which has been undertaken.

A Sprint Burndown is a forecast of the work which remains to be done by a team, for which projections can be made based on prior forecasts, and it is updated throughout the Sprint until the goal is met. The Sprint Burndown may therefore be a projection based on estimates, but it is understood that the measurements are made by a team for its own purposes, and for no-one else's. It tells them whether or not they are actually on course to provide empirical evidence, by the end of a Sprint, that the complex challenge they have undertaken has been mitigated. External stakeholders will gauge progress only through the evidence vouched by actual delivery. Story points and other estimates should never proxy for this value, or be traded or commoditized. These measures are only useful to the teams which make them, within the context of their Sprint and their own development concerns.
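As a rough illustration of the arithmetic behind such a chart, the following Python sketch projects when the remaining estimate would reach zero. It is not something Scrum prescribes: the function names, the straight-line "ideal" and the average-burn projection are all assumptions made for the example, and the inputs are the team's own daily re-estimates of remaining work.

```python
from typing import List, Optional


def ideal_line(total_points: float, sprint_days: int) -> List[float]:
    """Straight line from the initial estimate on day 0 down to zero on the last day."""
    return [total_points * (1 - day / sprint_days) for day in range(sprint_days + 1)]


def projected_days_to_zero(remaining: List[float]) -> Optional[float]:
    """Project the day on which remaining work reaches zero.

    remaining[0] is the estimate made at Sprint Planning; each later entry
    is the team's re-estimate of remaining work at the end of that day.
    The projection assumes (hypothetically) that the average daily burn
    observed so far continues. Returns None when no projection is possible,
    for example when the Sprint Backlog has grown faster than work was done.
    """
    days_elapsed = len(remaining) - 1
    if days_elapsed < 1:
        return None
    burned = remaining[0] - remaining[-1]
    if burned <= 0:
        return None
    rate = burned / days_elapsed
    return days_elapsed + remaining[-1] / rate


if __name__ == "__main__":
    # Illustrative figures only: 40 points forecast, three days of re-estimates.
    print(ideal_line(40, 10))
    print(projected_days_to_zero([40, 36, 30, 28]))
```

With the illustrative figures above, 12 points have burned in three days, so the projection lands on day 10. The result is still a projection built on estimates; only the delivered increment validates it.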

Advocates of empirical process control may not be entirely satisfied with this. Even if we accept that value will be evidenced empirically by the end of each Sprint, we still see an attempt to measure progress using estimates. We see promissory notes for value instead of work genuinely done. The leap-of-faith being made through a story point Sprint Burndown is admittedly time-boxed and carefully limited, but it is a leap-of-faith nevertheless.

Why Estimate?

So why do it? Why estimate at all? Why not just focus on completing one item on a Sprint Backlog at a time, bringing it to release quality, and so measure progress in terms of the rate of value honestly and genuinely delivered? If we need a burndown to show us progress towards a goal, why not track that progress in terms of actuals rather than estimates? Moreover, wouldn't this allow empirical process control towards that very goal to be brought within the Sprint itself?

The argument is a sound one, and the case for "no estimates" in agile delivery has a lot to be said for it. Certainly, we must understand and accept that measuring progress on the basis of story points is indeed unempirical, even within the narrow confines of a Sprint. The delivery of working features, early and often, is the only measure of progress which can be truly satisfactory at any scale. What a story point burn-down may reasonably do, however, is to give a team transparency over a complex event. You see, that's what a Sprint really is. It isn't just a stream of work where independent and discrete pieces of value are exposed to uniform pull and flow. Their joint purpose is to meet a Sprint Goal. That goal can mitigate a very significant risk which ultimately makes a Sprint Backlog more than the sum of its parts. Incremental release certainly doesn't have to be deferred to the end of a Sprint, and it may indeed occur on the basis of pull-into-production and continuous flow. However, it might only make sense to effect a release at the end of a Sprint where a complex deliverable is at hand, and there are multiple unknowns to be juggled. Scrum makes no prescription about any of these scenarios or about the metrics which a trusted and self-organizing team ought to use. A story point burn-down is an interim construct through which empirical process control can be faked. When release happens, the fakery ends and progress is recalibrated. As long as we understand and accept this as well, then there may not be a problem.
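The contrast between tracking estimates and tracking actuals can be made concrete with a small Python sketch. The `BacklogItem` type and its fields are hypothetical, invented for this example; the point is only that the two views of "remaining work" are computed from different things, and that only the second counts work genuinely finished.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BacklogItem:
    name: str
    points: int  # the team's estimate (hypothetical field)
    done: bool   # True only once the item is at release quality


def remaining_points(items: List[BacklogItem]) -> int:
    """Estimate-based view: sum of story points on items not yet done."""
    return sum(item.points for item in items if not item.done)


def remaining_items(items: List[BacklogItem]) -> int:
    """Actuals-based view: count of items not yet brought to release quality."""
    return sum(1 for item in items if not item.done)


if __name__ == "__main__":
    # Illustrative Sprint Backlog, not real data.
    sprint_backlog = [
        BacklogItem("login form", 8, True),
        BacklogItem("password reset", 3, False),
        BacklogItem("audit log", 5, False),
        BacklogItem("copy fixes", 1, True),
    ]
    print(remaining_points(sprint_backlog), remaining_items(sprint_backlog))
```

Both numbers describe the same Sprint Backlog. The points figure depends on how good the estimates were, whereas the item count changes only when something is actually finished.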

The Product Burndown

Now let's consider another common way of projecting delivery by means of story-point estimates, one which is found in many Scrum implementations. The "Product Burndown", like the Sprint Burndown, is a forecast which shows how much work is likely to remain over time, and projected dates for its likely completion. However, unlike a Sprint Burndown - which constrains a forecast to the Sprint Backlog - a Product Burndown attempts a forecast over perhaps the entire corpus of work. Estimates like story-points may be used to calibrate it, and to make projections which extend over many months, and possibly even into years of anticipated product development. Moreover, these estimates are not intended primarily for Development Team consumption, but rather for the benefit of senior stakeholders and other higher-ups who wish to be apprised of longer-term delivery outcomes. How reasonable is it to use Development Team estimates for these purposes? Shouldn't those people care more about receiving value iteratively and incrementally, rather than about graphs and charts and projections? Aren't we getting perilously close to the old bump-reading problem, where careful measurements end up being badly used, reality is misrepresented, and empiricism takes a back seat? In short, can executive types be trusted with Development Team measures and metrics?

Let's remind ourselves that, at its root, the only purpose of estimation is to allow a Development Team to figure out how much work it thinks it can take on. When those estimates are exposed beyond the team's circle of trust we may indeed run the risk of abuse, of story points being commoditized, of teams being compared or obliged to bid for work using points as a cryptocurrency, and other abominations. In Scrum this is a risk which lies squarely with a Product Owner to manage. As a member of the Scrum Team, the Product Owner is trusted to understand and respect the Development Team's estimates and to use any associated projections sensibly. The Product Owner will understand the limitations of using estimates to measure progress, and the importance of recalibrating a Product Burndown and any forecasts in light of the empirical evidence brought about by release. If there is doubt about the ability of other stakeholders to consume this data, then the Product Owner - as their representative, advocate and arbiter - must decide whether or not they ought to be exposed to estimated measures and forecasts at all. Perhaps they ought not to be. A Product Owner might be the only trusted consumer of Product Burndown information. The Product Owner must be respected as the authority who must interpret the available data, including forecasts, and who will make decisions for optimizing and releasing product value. He or she is the one customer representative who must lie within the Scrum Team circle of trust. It is an unprecedented level of responsibility and accountability...and it comes with the job.
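One way a Product Owner might recalibrate such a forecast is sketched below in Python. This is an assumption-laden illustration rather than a Scrum rule: the function name is hypothetical, and the choice to project a range from the best and worst Sprints observed so far is simply one way of keeping the uncertainty visible.

```python
import math
from typing import List, Tuple


def sprints_remaining_range(
    backlog_points: float, delivered_per_sprint: List[float]
) -> Tuple[int, int]:
    """Return (optimistic, pessimistic) counts of Sprints left.

    backlog_points is the current estimate of remaining Product Backlog work.
    delivered_per_sprint holds the points actually delivered in past Sprints.
    The best observed Sprint gives the optimistic bound and the worst gives
    the pessimistic bound (an illustrative heuristic, not a standard).
    """
    observed = [points for points in delivered_per_sprint if points > 0]
    if not observed:
        raise ValueError("no delivered work observed yet; nothing to project from")
    best, worst = max(observed), min(observed)
    return math.ceil(backlog_points / best), math.ceil(backlog_points / worst)


if __name__ == "__main__":
    # Illustrative figures only.
    history = [18, 24, 20]
    print(sprints_remaining_range(120, history))

    # After a release, recalibrate with what was actually delivered.
    history.append(30)
    print(sprints_remaining_range(90, history))
```

With the illustrative figures, the first call gives a range of 5 to 7 Sprints. After the release is recorded and the remaining backlog is re-estimated at 90 points, the range becomes 3 to 5. The forecast is rebuilt from evidence each time, which is the recalibration described above.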

Comments (16)

David Sosa

04:47 pm February 28, 2018

Great article! Sharing with the other ScrumMasters in my org and my teams. Can't wait to see what conversations come up.

disqus_VQ04LfiRab

08:57 am December 19, 2018

Thank You! To me this might actually be the greatest practical benefit of SCRUM implementation - there is no way around accepting that dev work is unpredictable and abstract measures are not worth anything without proper correlation and personal interpretation and assumption of responsibility (in SCRUM by the product owner). This should be mandatory reading for every management level.

Stefan

10:55 pm December 20, 2019

Scrum is a lightweight but this article was a bit too heavy for me. Instead of the many nested (e.g. phrenology or descendants of yesterday's "bump-readers",) hints and long suspense curves, I would like to see this article written in a clear and direct way.

Alex Fragkiadakis

05:35 pm April 20, 2020

Thank you for this article. I believe that the most powerful feature of the Sprint / Product Burndown charts is not the number of story points remaining (as elaborately analysed in the article, these are forecasts done by and for the development team) but rather the illustration of additional points that are introduced during the product lifecycle. Combined with justification of these additions, it can be invaluable for the Sprint Retrospective.

Marko

08:03 am May 10, 2020

Could not agree more.

Maik Kade

08:06 am July 3, 2020

Very insightful article. Thank you so much. I agree on so many levels with it as, on the one hand, it is very debatable whether such metrics have any useful value within an agile framework, since agility means being able to handle unforeseen situations. And something unforeseen is by definition difficult or impossible to estimate in advance.

On the other hand, it can make sense for a team to work with such metrics, especially if they wish to do so themselves. This can give a team experience values for their own productivity in different (complex) situations. This might be especially good for newer scrum teams, but I would recommend a different approach to story points to what you "usually" see:

An interesting variant is to record the story point estimate only retrospectively after the completion of a sprint and also only in a simple way (light, medium, large; 1/2/3 points or similar). This increases the self-perception of complexities of individuals and teams without building up pressure of having to fulfill a predefined story point number for a sprint.

After a while this approach allows a further retrospective perspective by asking the question: "Would you have initially assessed the complexity of the task as it ultimately turned out?"

This could open up, with some patience, the window for estimates to be implemented in sprint planning, especially since this strategy could decrease the probability of team members or teams choosing a noncommittal or meaningless value (normally exactly the middle value on any given scale) in order not to have to make a specific commitment.

But, as said before, all of this is only possible if the team itself considers such a thing to be useful and these metrics and estimates are not instrumentalized for the wrong purposes by stakeholders, Project Managers, etc.

KENMEUGNE TCHUINKAM Romuald Fr

05:18 am September 8, 2020

Truly said, but taking enough time to read, it becomes "digestible", haha.

Daisy Dai

09:39 am November 24, 2020

Thanks! However, I think for non-native English speakers or those who are new to Agile methodology, this article is a little bit hard to understand :D

Saurabh Sharma

07:03 pm December 19, 2020

All these metrics are for the development team to be used as gauges as they progress just like the multitude of gauges in cockpit of a plane for pilots. The ultimate goal / measure of success is NOT whether the right altitude is maintained all through the flight, whether speed was always optimal, whether head wind was under check etc. etc. BUT whether the plane landed safely or not with all passengers sound and safe. That's just like the measure of value that all stakeholders of a scrum team should be measuring and should be interested in.

Qing Mu

04:25 pm January 26, 2021

Best comment at scrum.org

aditya pandey

10:09 pm April 2, 2021

Good Write up ! Thank you for this !!

Viola Cocos

02:45 pm August 16, 2021

Who can create the Sprint Burndown?

Jesus Espinoza

02:14 am March 24, 2022

Great, great post!!! Congrats!!

Katrina Latyshava

10:35 am May 11, 2022

I can't but agree it's a heavy reading. Main ideas could have been voiced simpler and in a less fussy way, sorry.

Franck

03:59 pm January 27, 2023

The article certainly contains many interesting reflections and remarks. It does a very effective work in demonstrating that any estimate is a bad proxy for forecasting. Yet product sponsors (i.e. the one financing the product development) want to know by when the MVP and subsequent releases will reach the shore of their customers. Acknowledging that time-2-market is among the top key criteria for a successful product launch, what does scrum offer to address this foremost requirement on the agenda of their stakeholders? Advocating that scrum can only commit on one negotiated sprint goal every sprint, and pushing this responsibility only on the product owner shoulders is unfortunately insufficient. This certainly does not promote scrum nor make it sustainable at executive level.

disqus_fGLYY2MOlz

02:14 pm April 10, 2023

a lot of 'interesting business-speak' to make the article sound interesting but fully agree with other commenters it adds no value. I'm using ChatGPT to summarise articles like this in an easy to understand style and tone