Where should defects discovered during test-stabilisation work be tracked in Scrum?
I’m looking for guidance on how Scrum teams typically handle defects discovered during a test-stabilisation story that asks the team to analyse test failures and log them.
We have multiple Scrum teams working on the same product. In one sprint we created a story focused on stabilising historically failing automated UI regression tests. The acceptance criteria include identifying failing tests and performing root cause analysis.
Many of these tests were created a long time ago and have been failing for various reasons (automation issues, test data/environment changes, and in some cases application behaviour). During the investigation, some failures are confirmed to be genuine product defects rather than test problems.
This has raised a process question in our teams.
One view is:
Because the defect was discovered while validating work in the sprint, it should be logged and associated with the sprint work (for traceability), and then triaged.
Another view is:
Because the defect is not necessarily related to current sprint feature development (and may belong to another team’s area), it should go directly into the product backlog rather than being associated with the sprint.
We are not trying to force teams to fix the defect within the sprint, only to understand what Scrum guidance suggests regarding visibility and tracking.
In Scrum, when a defect is discovered while performing investigation or verification work during a sprint (especially system/regression testing across multiple teams), is it considered part of the sprint work (until triaged), or should it be recorded only as a product backlog item?
Appreciate any advice or references from Scrum Guide interpretations or real team practices.
You'll have to look well beyond the Scrum Guide for any kind of guidance. Scrum doesn't have the concepts of "stories" or "test stabilization".
In Scrum, the team works from Product Backlog Items. Each Product Backlog Item represents a discrete change that would improve the product. The Scrum framework doesn't specify how to structure Product Backlog Items, but stories are a common technique that emerged from the Extreme Programming community. Although user stories are common, teams may also find alternative story formats useful.
Regardless of how you structure your Product Backlog Items, stories or otherwise, "test stabilization" work is often seen as something to avoid. During a Sprint, "quality does not decrease". That is, if defects are found during implementation of a Product Backlog Item, they are fixed before the Product Backlog Item is considered done. This doesn't mean the team will find and fix all defects, though, and the work to fix defects that escape is tracked as a Product Backlog Item and ordered appropriately. Some teams adopt strict policies and prioritize fixing known defects before any other feature work, but this doesn't work for every team.

When defects arise, especially critical ones that require interrupting a team's Sprint plan, the team usually takes the opportunity, through root cause analysis and/or the Sprint Retrospective, to review the Definition of Done and find ways to better detect or even prevent similar defects in the future.
Going back to your current situation...
Rather than a "test stabilization story", I'd suggest that each failing test, or small group of failing tests, be treated as an independent Product Backlog Item. "Fixing" a failing test could involve several types of changes. In some cases, the test was not updated, or was incorrectly updated, to represent desired functionality, so the work is to update the test. In other cases, the test is "flaky" and fails intermittently, so understanding the root causes and stabilizing the test are the solutions. Finally, the test may be accurate and reveal a defect in the product that needs to be corrected.
By grouping the work into individual tests or a small number of closely related tests, the Product Owner can better order the work. There may even be dependencies in which the team fixes tests in a particular part of the product before implementing new features or functionality in that part. The team responsible for implementing a desired change would first fix all tests associated with that change, then implement the change, and add or revise tests to cover it. If there's a true defect in the product, there may be a decision to put the work on the backlog (perhaps for another team) or fix the defect.
Something else to consider is disabling the tests. Right now, the tests aren't useful to the team, especially if you don't know why they're failing. Disabling them ensures they don't block teams from doing their work, but the team still needs the discipline not to forget about them. The team can re-enable each test as part of fixing it, when the specific work calls for it.
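One way to keep disabled tests traceable is to record the backlog item in the skip reason itself. Here's a minimal sketch using Python's `unittest` (the test name and the backlog ID "PROD-1234" are hypothetical placeholders; substitute whatever your tracker uses):

```python
import unittest

class CheckoutRegressionTests(unittest.TestCase):
    # "PROD-1234" is a hypothetical backlog-item ID; point the skip
    # reason at the real Product Backlog Item created during triage
    # so the disabled test stays visible and traceable.
    @unittest.skip("Failing UI regression; root cause tracked as PROD-1234")
    def test_order_total_includes_discount(self):
        self.fail("known failure against current application behaviour")
```

The skip reason shows up in test reports, so a quick search for backlog IDs in the test suite tells you which tests are parked and why.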
The most important thing, though, would be to take steps to prevent additional tests from failing as the system is developed.
The work isn't Done, and the Developers will be accountable for the consequences of any quality failings. This should shape everyone's understanding of when to fix those defects.
In short, technical debt has been incurred. The Developers may choose to defer remediation of this debt to future Sprints, or they may decide the risk is unacceptable and they have to fix those defects now.
Either way, these items are unlikely to be the "stories" you refer to. Framing them that way is usually just a canard: they are typically defects, and that's it.
- Automation issues or problematic test data/environment changes represent an absence of the quality that stakeholders should be able to take for granted. They aren't about product value that stakeholders might expect to negotiate.
- Application behaviour issues may or may not represent a quality failing. They could relate to value, depending on the specifics, and hence represent a new or an unsatisfied "story".
In short, my understanding is that all work not Done in a sprint, including defects, goes to the Product Backlog.
The Sprint Backlog (and related Sprint planning artefacts) exists only for the duration of the Sprint.
If work is not Done by the end of the Sprint, it is returned to the Product Backlog for reordering by the Product Owner. How the Product Backlog is organised (grouping, tagging, linking, labelling, etc.) is up to the team and organisational conventions.
So if a defect is identified during investigation work in a sprint, and cannot be fixed in the sprint:
- Log it as a Product Backlog Item
- Link it to the investigation or feature story for traceability (if possible)
- The Product Owner orders (prioritises) it alongside other backlog items
- It is selected in a future Sprint if and when the team and Product Owner decide it is appropriate
Any errors relating to tasks being worked on in the current sprint must be fixed within that sprint. If this is not possible, the story will be marked as incomplete and returned to the backlog for inclusion in the next sprint. Anything else would create more technical debt.
However, this is not the case here. In this instance, errors originating from previous sprints are specifically sought out in order to reduce technical debt.
In a previous team, we had a working agreement that I found very sensible and straightforward. It could also be applied here, even though it is formulated more generally.
If an error pops up during the sprint that does not relate to the current sprint's work:
- If you can fix it without jeopardising the sprint goal, fix it!
- If you cannot fix it without jeopardising the sprint goal, then:
  - If the error is critical (e.g. production-threatening or safety-relevant), make a new plan together with the Product Owner.
  - If the error is not critical, create a backlog item.
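The working agreement above can be sketched as a small decision function. This is purely an illustrative encoding of the branching, not part of any Scrum tooling; the function name and return strings are my own:

```python
def triage_out_of_scope_defect(jeopardises_sprint_goal: bool,
                               is_critical: bool) -> str:
    """Return the agreed next step for a defect unrelated to the
    current sprint's work, per the working agreement sketched above."""
    if not jeopardises_sprint_goal:
        # Small enough to absorb: just fix it within the sprint.
        return "fix it now"
    if is_critical:
        # Production-threatening or safety-relevant: re-plan together.
        return "make a new plan with the Product Owner"
    # Not critical and too big for this sprint: make it visible.
    return "create a Product Backlog Item"
```

Note that criticality only matters once the fix would jeopardise the sprint goal; cheap fixes are simply done regardless.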