Traveling this weekend, I realized there are gatekeepers of two types: those that are people, and those that are software. Both are everywhere, guarding our access to valuable assets and granting permission to those who are qualified. That TSA agent is a gatekeeper, and so is the software check that happens when she scans my passport. Either has the power to accept or deny my request to board my flight, and either result has myriad follow-on effects.
In that way, your company’s software release process can be a lot like airport security. With the exception of the smallest startups and projects, every company has human and software gatekeepers controlling which changes can be shipped to production, and when.
On the human side, instead of the TSA agent there’s the QA engineer. Our airport analogy is helpful because we’re all familiar with the flaws in human gatekeepers when confronted with too much traffic: long lines, missed flights, and frustrated travellers. In a software company a similar cascade can occur: long waits and fierce competition for QA resources, missed release deadlines, and a crawling product development cadence.
In the software world, though, we have the opportunity to improve this process through automation and better software delivery techniques.
Making Better Use of QA than ‘Validating Releases’
Don’t get me wrong; QA engineers are incredibly important. Our first backend developer was a QA engineer. However, he takes on a different role at Split, acting not as a gatekeeper for feature development, but as a creative engineer for customer experience quality. He’s able to do this because we rely on software to be our gatekeeper.
I first experienced this difference in human vs software gatekeeping during my time at LinkedIn. When I joined, software delivery was painfully slow; every release had to be certified by a large QA team. By the time I left, we were releasing software multiple times a day, with QA enabling releases instead of gatekeeping them.
From Branches to Flags
How did we achieve this transition? A fundamental change in the way we delivered software: moving from feature branching to feature flagging (also known as feature toggles, feature switches, or feature flippers).
Feature branching is the idea that every feature be developed on its own long-lived branch. We call this branch ‘long-lived’ because it exists for the entire lifecycle of the feature’s development and isn’t merged into main/trunk until QA has validated the finished branch in an integration environment. The whole process can take weeks or months, and after it’s merged back in, the QA team still has to validate the entire release in a staging environment before the trunk is promoted to production. Only at this point is the feature live for customers.
The biggest benefit of this approach is that it allows isolated development. This isolation comes at a steep cost, though: competition among developers for QA resources and delayed integration dramatically slow down your software delivery cadence.
The alternative is feature flags: features built on ‘short-lived’ branches. These branches are merged daily into the trunk, which is then periodically deployed to production; your feature is technically in production within days of starting development, even if it’s unfinished. The key is that the feature is behind a flag, which controls access to it for each user—a software gatekeeper. By changing the flag’s configuration, you can make the feature inaccessible to every customer (also known as a dark launch), make it available to internal employees, grant access to beta customers when the time is right, or randomly deliver it to a percentage of your customers to test it out.
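To make the targeting options above concrete, here is a minimal sketch in Python of how a flag might decide who sees a feature. It is an illustration only, not Split’s API: the Flag class, its field names, and the is_enabled and bucket functions are all hypothetical, and the hashing scheme is just one common way to make a percentage rollout deterministic per user.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Flag:
    """Hypothetical flag configuration; every field name here is illustrative."""
    name: str
    enabled: bool = False        # False means a dark launch: no one sees the feature
    internal_only: bool = False  # True means only employees see the feature
    beta_users: set = field(default_factory=set)  # customers granted early access
    rollout_percent: int = 0     # 0-100, share of all other customers who get it


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) so a rollout is sticky per user."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: Flag, user_id: str, is_employee: bool = False) -> bool:
    """Decide whether this user should see the flagged feature."""
    if not flag.enabled:
        return False  # dark launch: code is in production but inaccessible
    if is_employee:
        return True   # internal employees get access first
    if flag.internal_only:
        return False
    if user_id in flag.beta_users:
        return True   # beta customers
    # Percentage rollout for everyone else
    return bucket(flag.name, user_id) < flag.rollout_percent
```

Application code would then wrap the new feature in a check such as is_enabled(flag, user_id), and changing who sees the feature becomes a configuration change rather than a deployment.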
By releasing your features early, targeted to only the right groups of people, you can gain valuable feedback and iterate on it quickly. The result is more productive development, or an early decision to abandon ideas that don’t seem to be working out before they drain more of your resources.
Enabling Experience-focused QA
In this model, the feature flag does the gatekeeping, not the QA team. If a problem happens, the feature can be turned off via the flag. This leaves the QA team to instead focus on building automation software to detect problems, software that can continually be running checks to ensure your customers are receiving the best possible experience. That’s what our QA organization spends time on here at Split.
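As a rough sketch of what that automation could look like, the Python below pairs a flag with a check that turns the feature off when an error rate crosses a threshold. Everything in it is an assumption made for illustration: the KillSwitchFlag class, the check_and_kill function, and the 5% default threshold are hypothetical and do not describe how Split’s QA tooling works.

```python
from dataclasses import dataclass


@dataclass
class KillSwitchFlag:
    """Hypothetical flag that automation is allowed to switch off."""
    name: str
    enabled: bool = True


def check_and_kill(flag: KillSwitchFlag, errors: int, requests: int,
                   max_error_rate: float = 0.05) -> bool:
    """Turn the flag off if the observed error rate exceeds the threshold.

    Returns True only when this call disabled the flag.
    """
    if requests == 0 or not flag.enabled:
        return False  # nothing to measure, or already off
    if errors / requests > max_error_rate:
        flag.enabled = False  # the software gatekeeper closes the gate
        return True
    return False
```

A check like this could run on a schedule against real traffic metrics, so that a misbehaving feature is switched off without a rollback or a new deployment.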
All of this doesn’t mean that feature flags are a panacea—some changes will never be appropriate for flagging, like backward-incompatible database schema changes. For those, branching is still a valuable and useful approach. But for most discrete features being developed in software today, feature flags offer an unprecedented opportunity for precise control with a much greater speed of development, for individual engineers and the entire team.