DORA continues to be the benchmark for evaluating the performance and efficiency of software engineering teams. However, now that the industry relies heavily on feature flags, we should take a closer look at the DORA metrics and how we approach them.
When it comes to Mean Time to Restore (MTTR), it’s hard to ignore the impact that feature flags have upon the numbers and the way in which we respond to feature-related issues. Before I go into further detail, let’s do a quick review of DORA.
The 4 DORA Metrics
What are the DORA metrics? They’re a set of four metrics that measure software delivery performance. How did they come to be? They’re a result of seven years of surveys conducted by the DevOps Research and Assessment Group.
Here’s how each is currently defined:
1. Lead Time for Changes
This is the length of time between when a code change is committed to the trunk and when it is deployed to production.
2. Change Failure Rate
The change failure rate is the percentage of code changes that require hotfixes, rollbacks, or other remediation after they reach production. It does not count failures that are caught by testing and fixed before the code is deployed.
3. Deployment Frequency
This measures the frequency of new code being deployed into production. Many teams use the term “delivery” to mean code changes that are released into a pre-production staging environment, while “deployment” is reserved only for production environments.
4. Mean Time to Restore (MTTR)
This is how long it takes to restore a service from a partial interruption or total failure. Whether the interruption is the result of a recent deployment or an isolated system failure, MTTR is important to track.
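To make the four definitions above concrete, here’s a minimal sketch of how they could be computed from your own deployment and incident records. The `Deployment` and `Incident` record shapes and the function names are my own assumptions for illustration; they aren’t part of any DORA tooling, and real pipelines will likely use medians or percentiles rather than simple averages.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Sequence


@dataclass
class Deployment:
    # Hypothetical record shape, assumed for this sketch.
    committed_at: datetime            # change committed to trunk
    deployed_at: datetime             # change reached production
    needed_remediation: bool = False  # hotfix, rollback, etc. after production


@dataclass
class Incident:
    # Hypothetical record shape, assumed for this sketch.
    started_at: datetime   # service partially or totally interrupted
    restored_at: datetime  # service back to normal


def lead_time_for_changes(deployments: Sequence[Deployment]) -> timedelta:
    """Average time from trunk commit to production deployment."""
    return timedelta(seconds=mean(
        [(d.deployed_at - d.committed_at).total_seconds() for d in deployments]
    ))


def change_failure_rate(deployments: Sequence[Deployment]) -> float:
    """Percentage of production deployments that needed remediation."""
    failed = sum(1 for d in deployments if d.needed_remediation)
    return 100.0 * failed / len(deployments)


def deployment_frequency(deployments: Sequence[Deployment], window_days: int) -> float:
    """Production deployments per day over the observation window."""
    return len(deployments) / window_days


def mean_time_to_restore(incidents: Sequence[Incident]) -> timedelta:
    """Average time from the start of an interruption to restoration."""
    return timedelta(seconds=mean(
        [(i.restored_at - i.started_at).total_seconds() for i in incidents]
    ))
```

Feed these functions whatever your deployment pipeline and incident tracker already record; the point is only that each metric reduces to a handful of timestamps.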
In this particular article, I focus primarily on how feature flags redefine the way we respond to feature-related issues. Because of this, it’s worth reimagining the full potential of MTTR in a way that’s relevant to today’s work streams.
Feature Flags Elevate the Old Standard
One of the metrics that feature flags blow the doors off is Mean Time to Restore (MTTR). With feature flags, restoring a previously stable state is as simple as flipping a switch, quite literally. Engineers who adopt feature flags also tend to shift toward small, frequent releases. As a result, the blast radius of a release is minimized, as is the level of recovery needed should a feature-related issue occur.
All of this is great news for MTTR numbers. But what is MTTR missing in a feature flag-driven world?
Beyond Shipping Fixes Fast, It’s About Switching Off the Pain
While it’s important for engineering teams to get faster at creating and shipping fixes, there’s another modern tool in the toolbox that doesn’t require rushing through a half-baked fix. Feature flags allow you to instantly “turn off the pain” of an issue, so customers are not affected throughout the remediation process. As a result, you can spend the extra time needed to fully repair the problem without harming the user experience. This is a major plus for risk mitigation.
As we look at this DORA metric through that lens, we shouldn’t just be considering the pace of restoration. Fixing things right without a customer noticing the problem is just as important as fixing things fast. Therefore, we should also be prioritizing the time it takes to “stop the pain.” “How quickly can I turn this problematic feature flag off, so it’s not impacting customers?” This is another important efficiency to gain and improve upon.
Can MTTR accurately capture this metric? If not, what’s the new one? Switch Off Speed (SOS)? Let’s leave that to DORA to figure out.
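In the meantime, here’s a rough sketch of what tracking the two durations separately might look like. Everything here is hypothetical: the `FlaggedIncident` record, its field names, and the two measures are my own illustration, not an official DORA definition.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Sequence


@dataclass
class FlaggedIncident:
    # Hypothetical record shape, assumed for this sketch.
    detected_at: datetime      # problem noticed (alert or customer report)
    flag_off_at: datetime      # feature flag turned OFF; customers stop feeling it
    fix_released_at: datetime  # repaired feature turned back ON


def time_to_switch_off(incident: FlaggedIncident) -> timedelta:
    """How long customers were exposed to the problem ("stop the pain")."""
    return incident.flag_off_at - incident.detected_at


def time_to_full_fix(incident: FlaggedIncident) -> timedelta:
    """How long the team took to ship a proper repair."""
    return incident.fix_released_at - incident.detected_at


def mean_duration(durations: Sequence[timedelta]) -> timedelta:
    """Average a list of durations."""
    return timedelta(seconds=mean([d.total_seconds() for d in durations]))
```

Reporting `mean_duration` over `time_to_switch_off` next to the same average over `time_to_full_fix` makes the gap visible: the first number is what customers feel, and the second is how much breathing room the team had to fix things properly.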
Improve MTTR & More With Feature Management
By adopting a feature flag approach to instant triage, you can cut down MTTR to a matter of seconds (and in a way that barely harms user experiences). All you need is feature management to help.
With the right feature management platform, there’s no need to scramble and make major repairs to a big bang release in a rush. Instead, you’ll be able to release feature by feature, attaching each one to a feature flag and measuring the impact as soon as it’s turned ON. Is it creating latency issues? Is it breaking the experience? If so, you’ll be automatically alerted to the problem-causing feature, and all you have to do is turn it OFF. Then, rather than rushing to ship a major fix, you’re just isolating the problem and taking it out of the equation.
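The pattern described above can be sketched in a few lines. To be clear, this is not Split’s SDK or any real platform’s API: `InMemoryFlags`, `checkout`, `kill_if_unhealthy`, the `new_checkout` flag name, and the 5% error threshold are all made up for illustration, and a real platform would handle the monitoring and alerting for you.

```python
from typing import Dict


class InMemoryFlags:
    """Hypothetical stand-in for a feature management client."""

    def __init__(self) -> None:
        self._state: Dict[str, bool] = {}

    def turn_on(self, name: str) -> None:
        self._state[name] = True

    def turn_off(self, name: str) -> None:
        self._state[name] = False

    def is_on(self, name: str) -> bool:
        # Unknown flags default to OFF, so the stable path is the fallback.
        return self._state.get(name, False)


def checkout(flags: InMemoryFlags) -> str:
    """Serve the new code path only while its flag is ON."""
    if flags.is_on("new_checkout"):
        return "new checkout flow"
    return "stable checkout flow"


def kill_if_unhealthy(
    flags: InMemoryFlags,
    name: str,
    errors: int,
    requests: int,
    max_error_rate: float = 0.05,  # assumed threshold, tune for your service
) -> bool:
    """Turn the flag OFF when its observed error rate crosses the threshold.

    Returns True if the flag was switched off by this call.
    """
    if not flags.is_on(name) or requests == 0:
        return False
    if errors / requests > max_error_rate:
        flags.turn_off(name)
        return True
    return False
```

Once `kill_if_unhealthy` switches the flag OFF, the very next call to `checkout` falls back to the stable path with no redeploy, which is why the time to stop the pain can shrink to seconds.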
Hotfixes aren’t really a thing in this new way of working, and feature management platforms like Split are redefining the standards for speedy MTTR metrics and beyond.
As a new standard of nearly automatic triage emerges, it’s important that you have the right tools and techniques at your disposal. Otherwise, you’ll be left in the dust. Stay on the right side of today’s faster MTTR standards by building in the ability to isolate issues throughout the process. Strengthen your approach to DORA metrics with a feature management platform that has automated rollout monitoring baked right in. You’ll eliminate downtime and hotfixes, and stop the pain experienced by customers, with the push of a button.
More on “Rethinking the DORA Metrics”
I speak in more depth about reimagining the DORA metrics and leveraging feature flags in a recent podcast interview on Dev Interrupted. Be sure to listen here. Plus, be sure to check back soon on the Split blog feed for my upcoming discussion around Deployment Frequency.