Canary Release and Feature Release with Feature Flags in Continuous Delivery

By: Dave Karow

Canary releases and feature flag rollouts are two common release strategies for testing in production. Both make continuous delivery safer, so you can deploy faster and more often. Both aim to reduce the “blast radius” of unforeseen problems and build confidence in a release. Both gradually expose new code to users on the production infrastructure. That’s about all they have in common!

The differences between canary releases and feature flag rollouts are significant, impacting development velocity and your quality of life. Let’s consider an example to understand what you can control and how you can react when things go wrong, using each of the two strategies.

Imagine it’s Thursday night and you have plans for the weekend ahead.  Then this happens:

A tale of two feature releases

Let’s say we are working on a service called the airline-booking-service.  It lets users search for airline tickets as well as book them. Today your team is releasing two new features in this service: a bags-fly-free feature that lets travelers search for flights where bags are free and a prepay-for-bags feature that lets travelers prepay discounted fees for bags on the other flights.

Now, let’s look at two ways to do this feature release safely and consider the differences.

How does a canary release work?

A “canary release” involves networking magic to deploy the new version of the airline-booking-service to a small subset of production machines. The canary phase starts with most machines running the old version of airline-booking-service, and a smaller subset of machines running your new version. A router (usually a load balancer) will route users to the two sets of machines, with stickiness turned on to ensure that each user has a consistent experience. Users routed to the canary release will keep getting access to the new features while the rest will not.
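To make the “stickiness” idea concrete, here is a minimal sketch of how a router might pin each user to one pool. It is an illustration only: real load balancers do this with cookies, connection hashing, or their own configuration, and the class name CanaryRouter, the method names, and the pool labels "canary" and "stable" are all made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical sketch of sticky, percentage-based canary routing (not a real load balancer API). */
final class CanaryRouter {
    private final int canaryPercent; // share of users sent to the canary pool, 0..100

    CanaryRouter(int canaryPercent) {
        if (canaryPercent < 0 || canaryPercent > 100) {
            throw new IllegalArgumentException("canaryPercent must be between 0 and 100");
        }
        this.canaryPercent = canaryPercent;
    }

    /** Maps a user id to a stable bucket in [0, 100). */
    static int bucketFor(String userId) {
        CRC32 crc = new CRC32();
        crc.update(userId.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    /** Returns "canary" or "stable"; the same user always gets the same answer for a given percentage. */
    String poolFor(String userId) {
        return bucketFor(userId) < canaryPercent ? "canary" : "stable";
    }
}
```

Because the bucket depends only on the user id, a user who lands on the canary at 10% stays on it when the percentage is raised, which is what keeps the experience consistent as the deployment widens.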

Determining the success of your canary release

By monitoring the logs and performance of the canary machines compared to others, we get early feedback about whether the release is ready for primetime. If all goes well with the canaries, we continue the deployment of new code to additional machines until we’ve got it running on all of them. Here is the canary release in a picture, adapted from a blog post by Itay Shakuri:
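One simple way to turn “compare the canaries to the others” into a decision is to compare error rates. The sketch below assumes you already collect request and error counts for both pools; the class CanaryAnalysis, the Verdict names, and the threshold parameter are hypothetical, and real canary analysis usually looks at more signals (latency, saturation, business metrics) than this.

```java
/** Hypothetical canary check: compares the canary error rate with the baseline error rate. */
final class CanaryAnalysis {
    enum Verdict { PROMOTE, ROLL_BACK }

    /** Error rate as a fraction of requests; 0.0 when there is no traffic yet. */
    static double errorRate(long errors, long requests) {
        return requests == 0 ? 0.0 : (double) errors / requests;
    }

    /**
     * Rolls back when the canary error rate exceeds the baseline rate by more than
     * maxAbsoluteIncrease (an assumed threshold, e.g. 0.01 for one percentage point).
     */
    static Verdict evaluate(long baselineErrors, long baselineRequests,
                            long canaryErrors, long canaryRequests,
                            double maxAbsoluteIncrease) {
        double baseline = errorRate(baselineErrors, baselineRequests);
        double canary = errorRate(canaryErrors, canaryRequests);
        return canary - baseline > maxAbsoluteIncrease ? Verdict.ROLL_BACK : Verdict.PROMOTE;
    }
}
```

Note that the verdict covers the whole deployment: there is no way to say which of the new features caused a rise in errors.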

Typical canary release timeline

Canary release routes traffic to an instance of your new app on a separate server. It acts at the level of the full deployment, not individual features.

Deployment and release using a feature flag rollout

When using feature flags, the new version of airline-booking-service is deployed to all production machines, but the new features are hidden behind feature flags that limit exposure to a targeted subset of users. The traffic router exposing new functionality no longer lives in the networking layer. Instead, it is pulled into the application code and controlled by an if/else statement that wraps each of the new features. These if/else statements are the feature flags. Here is sample code for a feature flag:

// uid: id of the user
// map: a key-value map with user data that is used to make the flag decision
if (flags.getTreatment(uid, "bags-fly-free", map).equals("on")) {
    // this code executes if bags fly free is on
} else {
    // this code executes if bags fly free is off
}

Turning up the volume with feature flags

Think of feature flags as volume dials on an audio mixer. The dial for each feature can be turned up or down independently to control the rollout of that feature to subsets of users.

Examples of “turning the dial up” include releasing the feature first to employees, to a percentage of users, to users in a certain geography, or to users of a certain type.  As with a canary release, you gain confidence in the new code while reducing the blast radius of mistakes. When you gain confidence in each feature, you can dial it up to all users independently of other features. At that point, you can also make plans to remove the feature flag (if/else statement) from your code to tidy things up.
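Here is a rough sketch of what one of those dials could look like inside the app. It is not the API of any particular feature flag product: FeatureDial, its allow list, and setPercent are names invented for this example, and a real flag service would store and update this state outside the application.

```java
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.zip.CRC32;

/** Hypothetical in-memory "dial" for one feature; real flag services keep this state outside the app. */
final class FeatureDial {
    private final String featureName;
    private final Set<String> allowList; // e.g. employees who always see the feature
    private volatile int percent;        // share of all other users who see it, 0..100

    FeatureDial(String featureName, Set<String> allowList, int percent) {
        this.featureName = featureName;
        this.allowList = Set.copyOf(allowList);
        setPercent(percent);
    }

    /** Turns the dial up or down at runtime, with no redeployment. */
    void setPercent(int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        this.percent = percent;
    }

    /** Returns "on" or "off", mirroring the treatment strings in the sample above. */
    String getTreatment(String userId) {
        if (allowList.contains(userId)) {
            return "on";
        }
        CRC32 crc = new CRC32();
        // Salting with the feature name gives each feature its own independent bucketing.
        crc.update((featureName + ":" + userId).getBytes(StandardCharsets.UTF_8));
        return crc.getValue() % 100 < percent ? "on" : "off";
    }
}
```

A rollout might start with the dial at 0 and only employees on the allow list, then move through something like 5, 25, and 100 via setPercent as confidence grows.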

Let’s do another diagram, but this time, think of the green color as the users passing through the code where the new features are active and the blue color as all the rest of the known-good features from V1:

Differences in feature flag release vs canary release

A feature flag rollout acts at the level of feature blocks in your code on a per-user basis. A simple if/else statement decides which code path gets executed for each user. The rules for that if/else can be changed on the fly at any time.

See how the feature flags now control the “rollout” of the new features, or feature release, irrespective of the version of software deployed to production? Those green feature flag dials might represent internal users only at first, then partial rollouts to other users in the field, and finally all users, at which time it would be safe to remove the feature flags altogether. On a side note, you may choose to keep some of the flags around for other reasons, such as an ops-toggle to shed load during peak traffic, but that’s another topic for another day 🙂

Differences between feature flag rollouts & canary releases

If you are a quick study, you have already spotted a few fundamental differences between these two deployment and release strategies. Let’s walk through them one at a time.

Where do canary releases live?

Canary releases live in the infrastructure networking layer. They are configured and orchestrated by infrastructure teams, such as site reliability (SREs) and DevOps teams.

Feature flags live in the application. Feature flag rollouts are configured and orchestrated by application teams, such as engineers, product managers, and agile testers.

Feature release targeting

Canary releases have limited targeting capabilities: most often they target by percentage of incoming traffic, and/or by whitelisting the IP addresses of internal users.  Turning them off means sending all incoming users to the old app instances while the canary traffic “drains” away as users complete their journey.

Feature flag rollouts have targeting superpowers: anything your app knows about the current execution context is fair game and can be acted on instantly. As a result, a feature flag can target the rollout to any cohort of users, e.g. 20% of users in NYC, admins of customer accounts, or 50% of free users. This flexibility, which isn’t limited by what the infra team and load balancer can offer you, is a big deal.
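The cohorts named above can be sketched as a small list of rules evaluated against whatever the app knows about the user. Everything here is an assumption made for illustration: the TargetingRules class, the attribute keys "role", "city", and "plan", and the first-match-wins ordering are not taken from any specific flag product.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.zip.CRC32;

/** Hypothetical attribute-based targeting; the attribute keys ("role", "city", "plan") are assumptions. */
final class TargetingRules {
    /** A cohort matcher plus the percentage of that cohort that gets the feature. */
    static final class Rule {
        final Predicate<Map<String, String>> matches;
        final int percent;

        Rule(Predicate<Map<String, String>> matches, int percent) {
            this.matches = matches;
            this.percent = percent;
        }
    }

    private final List<Rule> rules;

    TargetingRules(List<Rule> rules) {
        this.rules = List.copyOf(rules);
    }

    /** The first rule whose cohort matches decides; users matching no rule get "off". */
    String getTreatment(String userId, Map<String, String> attributes) {
        for (Rule rule : rules) {
            if (rule.matches.test(attributes)) {
                return bucketFor(userId) < rule.percent ? "on" : "off";
            }
        }
        return "off";
    }

    /** Stable bucket in [0, 100) so a user keeps the same treatment between requests. */
    static int bucketFor(String userId) {
        CRC32 crc = new CRC32();
        crc.update(userId.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    /** Example rule set built from the cohorts named in the text. */
    static TargetingRules example() {
        return new TargetingRules(List.of(
            new Rule(a -> "admin".equals(a.get("role")), 100),
            new Rule(a -> "NYC".equals(a.get("city")), 20),
            new Rule(a -> "free".equals(a.get("plan")), 50)));
    }
}
```

The attribute map plays the same role as the map argument in the sample flag check earlier: the app passes in what it knows, and the rules decide.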

Rolling back feature releases

Canary releases operate at the level of a deployment: the entire deployment is rolled back if things go wrong. If bags-fly-free has a critical bug, but prepay-for-bags is ready to go, both features will have to be rolled back for now to be deployed some other day. Ouch.

Feature flag rollouts operate at the most granular code level: an if/else statement inside any piece of your code. One feature can be dialed up or rolled back independently of any other, without a hotfix or redeployment. Yay! Now bags-fly-free can be turned off for bug fixes, while prepay-for-bags is dialed up to 100% of users.
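The independence of the two features can be shown with a toy flag registry. This is only a sketch: FlagRegistry and its methods are hypothetical, the state lives in memory, and a real setup would change flag state from a dashboard or API rather than from application code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory flag registry showing independent per-feature control. */
final class FlagRegistry {
    private final Map<String, Boolean> enabled = new ConcurrentHashMap<>();

    void turnOn(String feature) {
        enabled.put(feature, true);
    }

    /** The "kill switch": affects only the named feature. */
    void turnOff(String feature) {
        enabled.put(feature, false);
    }

    /** Unknown features default to off, so unreleased code stays dark. */
    boolean isOn(String feature) {
        return enabled.getOrDefault(feature, false);
    }
}
```

In the Thursday-night scenario, calling turnOff("bags-fly-free") leaves isOn("prepay-for-bags") untouched, so one feature goes dark while the other stays live on the same deployed version.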

Software development velocity and quality of life

Take a minute to ponder the implications of these differences, in particular the difference between killing the entire deployment and killing just one feature. Think of the flywheel effect of forward momentum your team gets by continuing onward with the good parts of the release. Compare that to the buzz-kill of stopping everything to roll back the whole deployment because of one bug you missed. Talk about “reducing the blast radius,” and at a more personal level too! Hmm… let’s see… you avoid an all-hands fire drill, a delay in being “done” for the rest of your team, and the explanation to external stakeholders as to why the whole release didn’t go live as planned. Instead, you contain the issue by dialing production exposure to zero, then whitelist the feature flag “on” for yourself and a few colleagues so you can debug it. Hopefully, you keep your weekend plans intact. Certainly this all goes down with WAY less drama and fireworks.

When to do a canary release, a feature flag rollout, or both

Canary releases make the most sense when the thing you are changing is a piece of infrastructure itself or an app that is effectively a single black box to you. They are not a good idea when you have a manifest of multiple features contained in a single deployment package.

If you do have multiple features in your deployments, consider using both canary releases and feature flag rollouts. In that scenario, you start every deployment as a canary with all new feature flags turned off.  You do that to watch for an obvious regression. If all is well with the canary, you deploy the new code to all machines and begin your feature flag rollout.

If you have to pick one continuous delivery strategy, pick feature flag rollouts. They are easy to implement, give you targeting superpowers, decouple features from one another and give you more control of your speed and safety.

Takeaways

Canary releases and feature flag rollouts are both powerful release strategies that leverage testing in production. Both catch bugs and scalability issues that were never anticipated before release while limiting the impact on your user population. Beyond that, there are quite a few reasons to choose feature flag rollouts as the sharper tool to have in your box. Why? They make for happier, faster-moving engineering teams thanks to the fundamental pattern shift that decouples code deployment from feature release. Life for you and your team is much better when a buggy feature can be quickly “faded to black” while the other shiny new ones continue to see the light of day in production.
