Blue/green deployment is a continuous deployment process that reduces downtime and risk by having two identical production environments, called blue and green. (The names blue and green aren’t special or important; this process is also called red/black deployment or A/B deployment, but for the purpose of this post, we’ll call them blue and green.)
Let’s say the blue environment is active, while the green one is idle. When a developer wants to release new code of any variety (a new feature release, a new version of the application, etc.), the new version is deployed and tested in the green environment, while the old version keeps serving users in the blue. Once the new release has been tested and is ready, the load balancer switches all production traffic to the green version, and the blue version is maintained as a backup.
After the green version has been live for a while and is deemed stable, the old blue version is scrapped, the currently live version becomes the new blue, and a fresh clone of production is created to become the new green.
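To make the switch concrete, here is a minimal sketch in Python. It is a toy model, not a real load balancer: the `Environment` and `LoadBalancer` classes, their method names, and the version labels are all hypothetical and exist only to illustrate the idea that a release is a swap of which environment receives traffic.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    name: str      # "blue" or "green"
    version: str   # release currently deployed in this environment


class LoadBalancer:
    """Toy stand-in for a real load balancer: all traffic goes to `active`."""

    def __init__(self, active: Environment, idle: Environment) -> None:
        self.active = active
        self.idle = idle

    def route(self, request: str) -> str:
        # Every request is served by whichever environment is currently active.
        return f"{request} -> {self.active.name} ({self.active.version})"

    def switch(self) -> None:
        # The release itself is a single swap; the old environment stays around as backup.
        self.active, self.idle = self.idle, self.active


blue = Environment("blue", "v1")
green = Environment("green", "v2")
lb = LoadBalancer(active=blue, idle=green)

print(lb.route("GET /"))   # served by blue
lb.switch()                # release: green goes live, blue becomes the backup
print(lb.route("GET /"))   # served by green
```

In a real setup the swap would be a configuration change in whatever load balancer or router you use, but the shape of the operation is the same.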
Benefits of Blue/Green Deployment
The major benefits of blue/green deployment are simple rollouts, quick rollbacks, and easy disaster recovery.
Have you ever had to deploy a feature release at an insane hour, because that was the only time you could take down the system without losing sales? Or maybe you’ve had a hard time finding any time to release, because your business is global enough that the middle of the night in one place is prime selling time in another? Blue/green deployment is zero-downtime, so the development team can make the switch and let the load balancing system automatically shift all users to the green version instantaneously — no staying up till 4am required.
Have you ever been called in on a weekend to roll back a buggy deployment? With blue/green deployment, the old version is ready and waiting in case something goes wrong, so all that’s required for a rollback is to ask the load balancer to switch users back to the blue version. This way, the programmers can come in at a normal, sane hour on a Monday to fix the issues with the green version, then deploy it again when it’s ready.
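A rollback of this kind can even be automated. The sketch below assumes a health check is available for each environment; the function `release_with_rollback` and the fake health check are hypothetical names made up for this illustration, not part of any particular tool.

```python
from typing import Callable


def release_with_rollback(
    current: str,
    candidate: str,
    is_healthy: Callable[[str], bool],
) -> str:
    """Return the environment that should receive traffic after a release attempt."""
    live = candidate              # cut traffic over to the new environment
    if not is_healthy(live):
        live = current            # rollback: point traffic back at the old environment
    return live


# Hypothetical health check that pretends the green release is broken.
def fake_health_check(env: str) -> bool:
    return env != "green"


print(release_with_rollback("blue", "green", fake_health_check))  # prints "blue"
```

Because the blue environment was never torn down, the rollback is just a matter of choosing it again.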
Still, there is a strategy even safer than blue/green deployment: the canary deployment strategy. With canaries, the team doesn’t just create two clones of production and test in only one. They roll out the new code gradually, exposing it to only a subset of users before deploying to the entire user base. So in a new release, instead of an immediate switch from 100% of users seeing version blue to 100% seeing version green, the initial deployment can switch over only 10% of users and leave the rest on blue. This limits the blast radius of a bad release.
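Here is a rough sketch of how a 10% canary split might be decided per user. The function name `choose_environment` and the hashing approach are assumptions made for illustration; real routers and feature-flag systems have their own mechanisms. Hashing the user ID, rather than picking randomly on each request, keeps a given user on the same version from one request to the next.

```python
import hashlib


def choose_environment(user_id: str, canary_percent: int) -> str:
    """Send roughly `canary_percent` percent of users to green, the rest to blue."""
    # Hash the user ID so the same user always lands in the same bucket (0-99).
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "green" if bucket < canary_percent else "blue"


# Simulate 10,000 hypothetical users with a 10% canary.
users = [f"user-{i}" for i in range(10_000)]
on_green = sum(choose_environment(u, 10) == "green" for u in users)
print(f"{on_green / len(users):.1%} of users see green")
```

Raising `canary_percent` step by step until it reaches 100 completes the rollout, and setting it back to 0 is the rollback.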
Drawbacks of Blue/Green Deployment
There are some drawbacks to blue/green deployment. For one thing, running two identical environments is expensive. Whether you run multiple physical servers or multiple instances in Kubernetes or Amazon Web Services (AWS), you have to pay for and maintain two environments: the production environment and a production-cloned staging environment that could be pushed to production at any time. That is neither cheap nor simple.
Furthermore, there is the database problem. The process of maintaining two clones of production and pushing only one of them live can cause all kinds of database problems. Do you clone the database? Don’t clone the database? And what if the database schema is going to be changed as a part of the new release? There are no easy answers. Database refactoring can fix the schema problem, and a mirror database can fix a few other issues, but in general, caution is necessary when any blue/green deployment involves a database component.
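As one illustration of the schema problem, a common refactoring approach (often called "expand and contract") is to make schema changes additive first, so that the old and new versions can share a single database during the switch. The sketch below uses an in-memory SQLite database; the `users` table and `email` column are made up for the example, and this shows only the "expand" step under those assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (name) VALUES ('Ada')")

# Expand: add the column the green version needs. It is nullable,
# so inserts from the blue version (which omit it) keep working.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

# Blue (old code) does not know about `email` and is unaffected.
conn.execute("INSERT INTO users (name) VALUES ('Grace')")

# Green (new code) writes the new column.
conn.execute("INSERT INTO users (name, email) VALUES ('Linus', 'linus@example.com')")

rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
print(rows)
# [('Ada', None), ('Grace', None), ('Linus', 'linus@example.com')]
```

The cleanup step (making the column required, or dropping columns only blue used) would wait until blue has been retired, since that is the point at which nothing depends on the old schema.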
Blue/green deployment is a great way to mitigate risk and prevent problems from update downtime, but consider both the benefits and drawbacks before diving in.