July 14, 2021

What Makes Scrum Teams Effective? A scientific investigation of 1,200 Scrum teams


This post is a non-technical version of an academic paper about Scrum teams that I wrote with Daniel Russo. Daniel is a Professor at the University of Aalborg who specializes in empirical software engineering. I am an organizational psychologist and Scrum practitioner with a love for survey development and statistics. Please note that our paper is currently being reviewed by academic peers.

How can you make a Scrum team more effective? Most of the books, podcasts, blog posts, and material that we find online have to do with this question. How do scaling frameworks impact effectiveness? What about Sprint Goals? How can we empower teams to take more ownership? How can Scrum Masters and Agile coaches support this through exercises and workshops?

Most of this content is based on the personal experience and opinions of the creators. While there is great value in personal experience, it is also unwise to extrapolate conclusions from a few data points to all Scrum teams. If you’ve seen Sprint Goals work well with the dozen teams you have experience with, does that mean that all teams will benefit from them? What about stakeholder involvement? Or the degree to which teams are cross-functional? What if none of this matters, and we simply don’t see it?

I’ve always wondered what would happen if we subjected these questions to a scientific approach. What if we collected a lot of data from actual Scrum teams and then applied scientific methods to answer what really matters, regardless of personal preferences, intuitions, dogmas, and vested interests?

In short, that is what happened. I created a free online app called the Scrum Team Survey. This free tool allows Scrum teams to diagnose themselves with an extensive survey and to receive detailed results and evidence-based feedback upon completion. We were able to use the anonymous data for academic purposes. I then collaborated with Prof. Daniel Russo to write an academic paper that has been submitted to the academic journal “IEEE Transactions on Software Engineering”. Although our paper is currently undergoing peer review, you can download the pre-publication of the paper here. This post is a non-technical companion to explain our method and results in a less academic form.

Model
Screenshots of the survey. Scrum Teams can participate for free in the survey. A detailed profile is offered afterward to help teams identify areas of improvement. You can try it at https://scrumteamsurvey.org.

 

Phase 1: A theory for Scrum team effectiveness

Our primary research question was: “Which are the key factors of effective Scrum teams and how do they relate to each other?”. Although the Scrum framework is built on scientific insights to some extent, we found no existing theories for Scrum team effectiveness in scientific literature. So we had to develop one in order to know what to measure and test.

We chose to develop our model from observational data from 13 case studies that took place over a period of five years. Although the case studies were not exhaustive of all kinds of Scrum teams, they provided a great starting point for model development by grounding it in real Scrum teams. The case studies allowed us to define important variables in the terminology of Scrum teams, rather than the other way around. I will define the core variables below. The case studies also provided insights into potential patterns. For example, Scrum teams seemed to be more effective when they collaborated closely with stakeholders. We also observed that autonomy varied greatly and that it seemed to impact effectiveness. We then combined the patterns we observed in the case studies with established scientific research to develop the model below.

Hypothesized model
The theoretical model we constructed from the case studies and existing literature. The gray spheres are our core factors. Each core factor, except management support, is measured and defined by a number of lower-order indicators. The arrows between the factors represent the expected effects. The next step is to test this model: does the grouping of factors meet our expectations (or not), and do the expected effects exist in the data?

We expect five core factors (the gray spheres) that are measured and defined through several narrower lower-order factors (the white spheres) that we expect to cluster together in their scores. So “continuous improvement” consists of variables like “psychological safety” and “concern for quality”.

The model that we created from the observations in the case studies rests on the following definitions and propositions:

  • We defined “effectiveness” in terms of what the Scrum teams in our case studies expected to achieve. Effective Scrum teams can satisfy the needs of their stakeholders (external) and experience high morale while doing so (internal).
  • We defined “Responsiveness” as the capability of teams to release every Sprint. “Stakeholder concern” is the degree to which the team as a whole has a good sense of who their stakeholders are and what needs they have. “Continuous improvement” is a general climate where teams take ownership of their improvements and feel supported and safe to do so. We defined “Team Autonomy” as the (relative) freedom from internal and external constraints. Finally, we defined “Management Support” as the degree to which teams feel that management supports them and their work with Scrum. Curious readers who would like to dig deeper into the definitions can check the paper for more background.
  • The effectiveness of Scrum teams is primarily determined by the ability of teams to release frequently, close collaboration with stakeholders, the autonomy of teams, the degree to which teams operate in environments that encourage continuous improvement, and support from management.
  • Team autonomy, management support, and a climate of continuous improvement are essential “hygiene factors”. These need to be in place for Scrum teams to collaborate closely with their stakeholders and to release frequently.
  • While stakeholder collaboration, team autonomy, continuous improvement, and management support positively impact effectiveness, this positive effect is diminished when Scrum teams don’t actually release anything frequently. In a sense, this is what we’ve come to call Zombie Scrum (“It looks like Scrum, but there’s no working software”).
  • The effects that we expect in this model can be generalized across Scrum teams.

We formalized the proposed effects as “hypotheses” and marked them in the model (H1-H6). Now that we had a preliminary theoretical model for Scrum team effectiveness, we wanted to test it more rigorously with a dataset that was much larger than 13 case studies.

Phase 2: Testing the model with 1,200 Scrum teams

In research of this kind, we’re ultimately looking for signals and patterns in the data that confirm or disprove our model. This is where the size of our dataset is important. Because every model is a simplification of the real world, there is always “noise” because of pure randomness and factors we didn’t include in the model. This noise can make you miss patterns that are actually there or see patterns that don’t exist. It's a bit like a telescope. The larger the opening of your telescope, the more light you can catch with it and the clearer the picture. A small telescope might lead you to mistake a smudge for a fuzzy planet or vice versa, whereas the larger telescope offers such a clear picture that nobody can refute it. So larger datasets are generally better.

So we knew that to make robust claims, we needed a lot of data from actual teams. We used the Scrum Team Survey to gather data from 1,200 Scrum teams. This provided us with the statistical equivalent of a telescope with a very large opening that is capable of catching a lot of light. This allows us to easily distinguish noise from actual patterns. For those with statistical training: our dataset was large enough to detect even tiny effects (0.05 or less) with a statistical power approaching 100% (calculated with G*Power).

Our measurement consisted of a 100+ question survey. We measured each topic, like “psychological safety”, “release frequency” and so on with two or more questions each. We used multiple questions to give us slightly different angles on that topic. With statistical techniques (CFA and HTMT) we can then determine if the questions are indeed measuring the same thing or different things. We ran several trials before the actual data collection and made some modifications to improve the survey. Some items were dropped, others were added and we also dropped a topic called “release automation” because it turned out to be very hard to measure consistently.
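For readers curious about what a check like HTMT does, here is a minimal Python sketch. It is not the analysis from the paper: the data is simulated, the topic and function names are hypothetical, and the 0.85 cutoff is a commonly cited rule of thumb that is not necessarily the threshold used in the paper. The idea is to compare how strongly questions correlate across two topics with how strongly they correlate within each topic.

```python
import math
import random
from itertools import combinations, product


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean(values):
    return sum(values) / len(values)


def htmt(items_a, items_b):
    """Heterotrait-monotrait ratio for two topics.

    Each argument is a list of questions; each question is a list of
    scores, one per respondent. Assumes every topic has at least two
    questions that correlate positively with each other.
    """
    between = mean([pearson(a, b) for a, b in product(items_a, items_b)])
    within_a = mean([pearson(p, q) for p, q in combinations(items_a, 2)])
    within_b = mean([pearson(p, q) for p, q in combinations(items_b, 2)])
    return between / math.sqrt(within_a * within_b)


def simulate_items(latent, n_items, rng, noise=0.5):
    """Hypothetical questions: the underlying topic score plus random noise."""
    return [[score + rng.gauss(0, noise) for score in latent] for _ in range(n_items)]


rng = random.Random(42)
n_respondents = 500

# Two unrelated, simulated topics (stand-ins for e.g. "psychological safety"
# and "release frequency"; these are not real survey responses).
safety = [rng.gauss(0, 1) for _ in range(n_respondents)]
release = [rng.gauss(0, 1) for _ in range(n_respondents)]

safety_items = simulate_items(safety, 4, rng)
release_items = simulate_items(release, 3, rng)

# Different topics: the ratio stays well below the 0.85 rule of thumb.
htmt_distinct = htmt(safety_items, release_items)

# One topic split in two halves: the ratio lands close to 1,
# signalling that both halves measure the same thing.
htmt_same = htmt(safety_items[:2], safety_items[2:])

print(f"HTMT, different topics: {htmt_distinct:.2f}")
print(f"HTMT, same topic split: {htmt_same:.2f}")
```

A low ratio suggests that two sets of questions capture different topics, while a ratio near 1 suggests they overlap so much that treating them as separate topics is hard to defend.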

We then applied advanced statistical techniques (structural equation modeling) to analyze patterns in the data. Although these techniques are hard to explain briefly, the most important point is that they allow us to test specifically if our model accurately predicts the data we observed in real teams. If our model reached the required thresholds, we could safely conclude that the factors and the effects we propose indeed properly describe what happens in real Scrum teams.

So we entered our model and the data into specialized software called AMOS, and this is what we got. Because the picture probably doesn’t tell you much, I will capture our most important findings below.

Model
The statistical model from AMOS. The numbers above the paths indicate their strength (ranging from -1.0 to 1.0). The percentages in the circles represent the percentage of the variation from the real world that our model explains. The dashed lines represent effects that we expected to find but were not present in the data. The other effects were significant at .05.

Finding #1: Five factors explain a substantial part of Scrum team effectiveness

Our most important conclusion is that our model indeed accurately describes what happens in Scrum teams. The core factors in our model (responsiveness, stakeholder concern, team autonomy, continuous improvement, and management support) explain 47% of the real-world variation in stakeholder satisfaction and 29% in team morale. This may not seem like much if you’d expect 100%, but anything above 20% is considered “large” in the social sciences. Getting a value close to 100% requires a model that is as complex as the real world and that includes an impractically large number of variables, running into the thousands. For example, perhaps the number of plants in the team room has a microscopic influence on a team’s effectiveness. Or the physical distance from the Scrum Master. Ultimately, science prefers simpler models over more complex ones because simpler models are more practical.
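To make “explained variation” concrete, here is a minimal Python sketch with made-up numbers. It fits a straight line through five hypothetical teams and computes R², the share of the variation in the outcome that the line accounts for. Our actual analysis used structural equation modeling with several factors at once, so treat this single-predictor version purely as an illustration of the idea.

```python
def r_squared(x, y):
    """Share of the variation in y explained by a straight-line fit on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n

    # Ordinary least squares for a single predictor.
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x

    # Compare what the line leaves unexplained with the total variation.
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total


# Hypothetical scores for five teams (not data from the study).
responsiveness = [1, 2, 3, 4, 5]
stakeholder_satisfaction = [2, 1, 4, 3, 5]

explained = r_squared(responsiveness, stakeholder_satisfaction)
print(f"Explained variation: {explained:.0%}")  # prints "Explained variation: 64%"
```

The remaining 36% in this toy example is the “noise”: everything that influences the outcome but is not captured by the one factor in the model.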

So this is a great result. Our model predicts a substantial part of stakeholder satisfaction and team morale from just the few team-level factors we included. And the diversity and size of the dataset mean that we can generalize these findings to other Scrum teams without issue. Although we included “organization size”, “team age”, “team size” and “type of product (internal or external)” in preliminary versions of the model, none of these influenced the effectiveness of Scrum teams across the board. The only contextual factor that seemed to have a small (positive) influence was the experience of Scrum teams.

Also, note that our model doesn’t include how strictly Scrum teams adhere to Scrum or what practices they use. Still, we were able to predict a substantial amount of Scrum team effectiveness from more general team dynamics.

Practical implications

If you want to diagnose and support Scrum teams in their journey to increased effectiveness, the five factors in our model are a great starting point. It's a safe assumption that Scrum teams that score low on the five core factors will be less effective than those that score high. It’s also a safe assumption that effectiveness will increase when you invest in one or more of the core factors. When organizations undergo changes that impact the core factors, effectiveness will likely change accordingly.

Our findings should make it easier to convince management. It is clear from our results that if you want to make Scrum teams more effective, you need to invest in autonomy, continuous improvement, collaboration with stakeholders, and responsiveness. And, as we will discuss further down, some of these factors are more important than others depending on the situation a team is in.

We created the Scrum Team Survey to diagnose Scrum teams on the factors in our model and to offer evidence-based recommendations. The survey is available for free, with some advanced features for subscribers. You can also invite your stakeholders to participate for a clearer picture.

“It is clear from our results that if you want to make Scrum teams more effective, you need to invest in team autonomy, continuous improvement, collaboration with stakeholders, and responsiveness”

Finding #2: The most effective Scrum teams release at least every Sprint

Responsiveness is central to our model. This is the ability of Scrum teams to release every Sprint. We very consistently found that Scrum teams are more effective when they (can) release more frequently. In fact, our model suggests that “Responsiveness” acts as a gatekeeper (or ‘mediator’). The positive influence from team autonomy, continuous improvement, management support, and stakeholder concern on effectiveness diminishes as Scrum teams are less able to release frequently.
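For readers who want to see what a “mediator” means in a path model, here is a small Python sketch. In a linear path model, the effect that runs from one factor to an outcome through a mediator equals the product of the two path strengths along the way. The coefficients and variable names below are invented for illustration and are not the values from our model.

```python
def indirect_effect(path_to_mediator, path_from_mediator):
    """Effect that flows through the mediator: the product of the two paths."""
    return path_to_mediator * path_from_mediator


def total_effect(direct, path_to_mediator, path_from_mediator):
    """Direct effect plus the effect that runs through the mediator."""
    return direct + indirect_effect(path_to_mediator, path_from_mediator)


# Hypothetical standardized path strengths (each between -1.0 and 1.0).
concern_to_responsiveness = 0.5       # stakeholder concern -> responsiveness
responsiveness_to_satisfaction = 0.4  # responsiveness -> stakeholder satisfaction
concern_to_satisfaction = 0.3         # direct path that bypasses responsiveness

via_responsiveness = indirect_effect(
    concern_to_responsiveness, responsiveness_to_satisfaction
)
overall = total_effect(
    concern_to_satisfaction, concern_to_responsiveness, responsiveness_to_satisfaction
)

print(f"Through responsiveness: {via_responsiveness:.2f}")
print(f"Total effect: {overall:.2f}")
```

If either path around the mediator is weak, the product shrinks with it, which is the sense in which a mediator can act as a gatekeeper.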

Although we didn’t set out to find empirical evidence for the Scrum framework, these findings do provide evidence for a foundational principle of Agile software development (“Ship it Fast”), and the Scrum framework specifically. 

Practical implications

I’m sure that every Scrum practitioner has seen Scrum teams that don’t release often in one form or another, and what happens because of that. Barry Overeem, Johannes Schartau, and I jokingly called it “Zombie Scrum” in our book, the Zombie Scrum Survival Guide; it is something that looks like Scrum but doesn’t have the beating heart of working software. And although that metaphor is as unscientific as it gets, our study underscores the importance of frequent releases with evidence from 1,200 teams.

Our results show that Scrum teams that don’t release frequently are not as effective as teams that do, regardless of how they score on the other factors like stakeholder concern, team autonomy, continuous improvement, and management support. No matter how great your Sprint Retrospectives are, how supportive management is, or how concerned your team is with their stakeholders, our data shows that it doesn’t matter when releases to those stakeholders remain incidental. Hopefully, our study will make it easier to convince management of the urgent need to release frequently, even when this is challenging.

To offer some guidance on how to do this, Barry Overeem and I created five do-it-yourself workshops to help your team become more responsive.

“No matter how nice you make your Sprint Retrospectives, how supportive management or how concerned your team is with their stakeholders, our data shows that it doesn’t matter when releases to those stakeholders remain incidental.”

Finding #3: Shared Product Ownership is essential

In our model, stakeholder concern is the other side of the same coin as responsiveness. While releasing frequently is great, there is no benefit when what is released isn’t relevant to the needs of stakeholders (or vice versa). 

We defined “stakeholder concern” as a combination of focus on stakeholder needs, the presence of valuable goals, collaboration with stakeholders, and the quality of Sprint Reviews. Although this obviously connects to the role of the Product Owner, we did not specifically focus on Product Owners. You can also think of this factor as a “shared Product Ownership” in a team that may (or may not) be facilitated by Product Owners. Teams that do well on this factor understand why their work matters to stakeholders and can make decisions based on that. We defined “stakeholders” in the survey as “users, customers or people with a substantial stake in the product”.

Those Scrum teams in our dataset that scored high on stakeholder concern were also substantially more effective. With one catch: this positive effect diminished when teams were unable to actually release frequently (see above).

Practical implications

A clear practical implication of our findings is that there is a great benefit for Scrum teams in getting to know their actual stakeholders. Unfortunately, many Scrum teams still operate in environments where there is no direct access to users and stakeholders, and requirements are handed down from other departments.

We created six do-it-yourself workshops specifically aimed at this purpose. They’re a good start if you’re looking for inspiration.

“A clear practical implication of our findings is that there is great benefit for Scrum Teams in getting to know their actual stakeholders.”

Finding #4: Team Autonomy and Continuous Improvement create the right conditions for effective Scrum teams

The ability to release frequently and to work closely with stakeholders is clearly at the heart of effective Scrum teams. It is also hard to do, and not usually something that Scrum teams are good at from the start. So we also investigated two factors that create the right environment, or “hygiene factors”, for Scrum teams to prosper. 

First, the data confirms that teams that improve continuously are also more likely to focus on stakeholder needs. This effect was substantial. We defined “continuous improvement” fairly broadly in our study, and included psychological safety, concern for quality, the quality of Sprint Retrospectives, a learning environment, and shared learning. The picture that emerges from the data is that teams where it is safe to explore and learn, are also more inclined to create products that delight stakeholders. 

“The picture that emerges from the data is that teams where it is safe to explore and learn, are also more inclined to create products that delight stakeholders.”

The second hygiene factor we considered was team autonomy. There is already a substantial body of evidence on the benefits of high autonomy to knowledge work in general. But many Scrum teams still operate in environments where their autonomy is minimal and they can’t change how they work, in what order, and how to distribute work.

We defined team autonomy as the freedom from external constraints (self-management) and freedom from internal constraints (cross-functionality). We found that team autonomy contributes most strongly to continuous improvement and, to a lesser extent, to the ability of teams to focus on the needs of stakeholders.

However, our results did not confirm two expectations. We expected that both team autonomy and continuous improvement would also positively influence responsiveness. This was not the case. One possible explanation is that responsiveness is generally limited more by technology and skills than by autonomy and continuous improvement. Another explanation is that autonomy and continuous improvement lose their value when teams can’t focus on stakeholder needs and can’t release to them frequently. More research is needed to understand what interactions are happening here, specifically.

Practical implications

While an increasing number of organizations see the value of autonomous teams, many still consider it a “nice to have”. Or they treat it as optional in addition to the more visible elements of the Scrum framework — like the roles, artifacts, and events. Although many studies have shown how high autonomy makes knowledge workers and teams more effective in general, our data specifically confirms how autonomy contributes to Scrum team effectiveness. It gives Scrum teams more opportunities to improve their process and take ownership of it, while also increasing their ability to take ownership of their product.

We recommend that organizations design and support Scrum teams with high autonomy in mind. Management can support teams here by removing obstacles and limitations to that autonomy, and by helping teams to create a climate where they can learn, experiment, and improve.

Barry Overeem and I created thirteen do-it-yourself workshops and experiments specifically to invest in continuous improvement. There are also nineteen do-it-yourself workshops available to invest in team autonomy.

Finding #5: Management Needs To Create The Right Environment

What is the influence of management support on the effectiveness of Scrum teams? Although there is much to say about how and where managers can support Scrum teams, we specifically investigated their support as perceived by Scrum teams. From our data, we can clearly tell that management support positively impacts team autonomy, stakeholder concern, and continuous improvement. The effect on team autonomy is clearly the strongest. We also expected a positive effect on the responsiveness of a team but found none.

Exactly how management can support Scrum teams is a research question we’d like to explore later.

Practical implications

The relation between management and autonomous teams has been studied extensively by academics. Our study confirms what has already been well-established. If managers are serious about making their Scrum teams more effective, the best course of action for them is to actively support teams. Scrum teams are unlikely to become effective when they feel that management doesn’t have their backs or doesn’t understand why they are working with Scrum.

“If managers are serious about making their Scrum teams more effective, the best course of action for them is to support teams rather than direct them.”

If you’re in an organization where management support is lacking, perhaps our study can help start the conversation around where management support is needed. The factors in our model can also guide this conversation into the areas that matter most.

We created five do-it-yourself workshops to involve management and start this conversation. 

The Road From Here

As we mentioned in the introduction, the paper we present in this post is a prepublication. We have submitted it to the academic journal “IEEE Transactions on Software Engineering”. Our paper will first undergo a rigorous peer review by other academics. They will check our methodology, analyses, and conclusions and offer feedback on what needs to be improved or clarified in order to meet established quality criteria for scientific publications. Most papers go through several iterations before they are officially published and this can easily take a year. We’re prepublishing our paper in the spirit of transparency and to show how this process leads to reliable knowledge.

This paper is the first in a series of investigations into what makes Scrum teams effective. Daniel Russo and I are on a mission to bring a more scientific perspective to the Agile community. Compared to many other professional fields, our field hasn’t been strongly connected to scientific research yet. And like those other professional fields, we share the same ethical responsibility to support our clients in ways that align with scientific insights.

Here are some of the research questions Daniel and I would like to pursue:

  1. How does diversity (gender, functional, cultural) influence effectiveness? Existing research suggests a strong effect, but this has never been studied for Scrum teams.
  2. What is the influence of conflict in teams on their effectiveness?
  3. Which interventions are most helpful at what stages of Scrum team development? How does their development generally progress?
  4. What is the influence of co-location and working from home?

Conclusion

Scrum teams are the beating heart of many organizations. Unfortunately, academic research into what makes them effective has been limited despite the popularity of Scrum. We feel there is great value in providing practitioners with a more evidence-based perspective on designing, supporting, and diagnosing Scrum teams.

We developed a theory for Scrum team effectiveness based on 13 case studies that took place over 5 years. We then tested and confirmed this model with a large dataset containing 1,200 Scrum teams from all over the world. Our results show that the most effective teams are those that both release frequently and focus on the needs of their stakeholders, rather than doing only one or the other. In turn, this requires a high degree of team autonomy, a climate of continuous improvement, and support from management. Organizations that want to make their Scrum teams more effective are well-advised to make substantial investments in these areas.

The model in this paper can act as a grounding framework to inform future research. And we sincerely hope it will — along with other research in this area — lead to more reliance on evidence-based approaches that demonstrably increase effectiveness.

How You Can Help

There are two ways you can help with this research. The first is to participate with your team in the free Scrum Team Survey. We’re especially interested in following Scrum teams over a longer period of time. Your (anonymous and aggregated) data is vital for our future studies. We recently added a subscription feature to generate some revenue to afford the hosting and continued development.

Second, this research is self-funded by The Liberators and our patrons. We don’t operate on grants or research funding as is usually the case for academic research. So if you see the value of this work, donations are very much appreciated.