March 8th, 2022
What Are Canary Deployments and How to Best Use Them
Canary deployments are a deployment strategy in which a release is rolled out to a subset of users or servers before being deployed to everyone. They are useful for testing new releases in production while minimizing the impact of code malfunctions or other unforeseen issues. Any problems with a deployment are limited in scope and can be rolled back before most users or servers are affected. Read on to learn more about canary deployments and why you should consider using them.
What Canary Deployments Are and How They Work
The term “canary deployment” comes from coal mining, where canaries were famously used as an early warning indicator for toxic gases. Bird lungs are much more sensitive than human lungs, so if a canary became agitated or died, miners had a chance to escape before succumbing to the gases themselves. Canary deployments work in much the same way, though fortunately, no animals are harmed this time around.
When a new release is ready to be pushed to production, there are several deployment strategies to consider. You could simply push the new release to all production servers at once. All users would immediately benefit from the new release, but they would also all be exposed to any bugs or issues that ship with it. Canary deployments are a way of testing deployments with a limited scope to mitigate the impact of any bugs or deployment issues.
A typical canary deployment involves choosing a server, or a subset of users, to act as the canary. New releases are deployed to the canary server ahead of the others. You can then simply monitor it until you are satisfied, or actively test the deployment in production, before either rolling the release back or rolling it out to the remaining servers.
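As an illustration, here is a minimal sketch of how a small share of traffic might be routed to a canary server. It assumes a simple weighted random split in your own routing layer; the server names and the 5% share are hypothetical, and in practice this is usually configured in a load balancer rather than written by hand.

    import random

    # Hypothetical pool: one canary server plus the stable production servers.
    CANARY_SERVER = "app-canary-01"
    STABLE_SERVERS = ["app-prod-01", "app-prod-02", "app-prod-03"]

    CANARY_TRAFFIC_SHARE = 0.05  # send roughly 5% of requests to the canary

    def pick_backend() -> str:
        """Return the server that should handle the next request."""
        if random.random() < CANARY_TRAFFIC_SHARE:
            return CANARY_SERVER
        return random.choice(STABLE_SERVERS)

Raising the canary share toward 1.0 (or adding the canary back into the stable pool) completes the rollout, while dropping it to 0 effectively rolls the release back as far as users are concerned.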
Rolling Deployments
Another deployment strategy you can use is to phase your deployments to each server (or server cluster). Instead of pushing a release to a canary server and then to all servers at once, rolling deployments push updates incrementally. For example, you might deploy a new release to one server at a time, testing after each instance is updated before moving on to the next.
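A minimal sketch of that loop is shown below, with hypothetical deploy, health_check, and rollback helpers standing in for whatever your deployment tooling actually provides.

    SERVERS = ["app-01", "app-02", "app-03", "app-04"]

    def deploy(server: str, version: str) -> None:
        print(f"deploying {version} to {server}")  # placeholder for your deploy tooling

    def health_check(server: str) -> bool:
        return True  # placeholder: query your monitoring or a health endpoint here

    def rollback(server: str) -> None:
        print(f"rolling back {server}")  # placeholder

    def rolling_deploy(version: str) -> None:
        for server in SERVERS:
            deploy(server, version)        # update one server (or cluster) at a time
            if not health_check(server):   # verify before touching the next server
                rollback(server)
                raise RuntimeError(f"{version} failed health checks on {server}")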
Rolling deployments further limit the impact of any bugs or issues, but they also extend the duration of your deployment process, in some cases significantly. You will also need to support both the old and the new release during the rollout, as some servers will be updated while others won't. This may make rolling deployments an inappropriate strategy for releases with known breaking changes.
Side-by-Side Deployments
Side-by-side deployments are also known as blue/green (and sometimes red/black) deployments. Two identical environments (such as staging and production) run different versions of the application: tests are performed on the blue environment while live user traffic runs on the green environment. Once testing passes, user traffic is switched over to the blue environment.
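Conceptually, the switch can be as small as repointing a single routing entry. Here is a minimal sketch; the environment URLs are hypothetical, and in a real setup the pointer would live in your load balancer or DNS rather than in application code.

    # Two environments exist at all times; a single pointer decides which
    # one receives user traffic. The URLs below are illustrative only.
    ENVIRONMENTS = {
        "blue": "https://blue.example.internal",    # runs the new version under test
        "green": "https://green.example.internal",  # currently serving users
    }

    active_environment = "green"

    def switch_traffic() -> None:
        """Point user traffic at the other environment once testing passes."""
        global active_environment
        active_environment = "blue" if active_environment == "green" else "green"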
Side-by-side deployments are easy to implement and low risk, but costly: you have to run two full production environments in order to test and switch user traffic between them. Canary deployments, by contrast, don't require additional resources, as a canary server can function as both a staging and a production server.
Why Use Canary Deployments?
Canary deployments are an incremental deployment strategy that is low risk compared to bolder alternatives, but the slower, more methodical process also comes with some drawbacks.
Pros
Canary deployments allow new releases to be tested in a production environment, with no downtime, while limiting the scope of any issues in a deployment. There is no substitute for testing under real production conditions, and early analysis of how users interact with new features can be invaluable for both testing and feature tweaks. The canary deployment can also be compared against servers running live instances of older versions, which is useful for further testing, analysis and optimization.
Cons
The downside to testing in production is that you're actually doing it. This generally involves more overhead, as it is more complicated to set up than simple blue/green staging and production environments. You must support two versions of your application in production at once: the existing version and the new one. You then need to partition your users, via a load balancer or some other method, so that most users are directed to the old version and a small subset to the new one.
Even after all this, you are still exposing that small subset of users to your less-tested new version, along with any issues it contains. Generally speaking, this is still preferable to exposing all of your users, but real users will be impacted, and they may not like being used as canaries.
Best Uses for Canary Deployment
Canary deployments are best used for projects that like to experiment and innovate. They’re a low-risk means of testing new deployments and releases without having to maintain fully separate staging and production environments. A canary server can be used to test new releases ahead of other servers and then simply continue functioning as part of the production environment once the new release is rolled out to all servers, making it a cost-effective deployment strategy.
Guidelines for Implementing Canary Deployments
Canary deployment is relatively simple to implement. One server (or group of servers) is identified as the canary and receives the new release ahead of all the other servers. Once testing or monitoring confirms there are no issues, the release is rolled out to the rest of the servers.
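Put together, the flow looks something like the sketch below. The deploy and passes_checks helpers are placeholders for your own tooling and monitoring, and the server names are hypothetical.

    CANARY = "app-canary-01"
    FLEET = ["app-prod-01", "app-prod-02", "app-prod-03"]

    def deploy(server: str, version: str) -> None:
        print(f"deploying {version} to {server}")  # placeholder for your deploy tooling

    def passes_checks(server: str) -> bool:
        return True  # placeholder: error rates, latency, manual sign-off, etc.

    def canary_release(version: str, previous_version: str) -> None:
        deploy(CANARY, version)                 # the canary gets the release first
        if passes_checks(CANARY):
            for server in FLEET:                # promote to the rest of the fleet
                deploy(server, version)
        else:
            deploy(CANARY, previous_version)    # roll the canary back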
Feature Flags
An alternative implementation is to use feature flags to identify groups of users, rather than servers, to act as canaries. With feature flags, a new release is pushed to all users, but flags in the codebase explicitly enable new features only for specified users. This can be as simple as an if/else statement that checks whether a user is a canary user and runs a different block of code accordingly.
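For example, a minimal sketch of such a check might look like the following; the user IDs and dashboard functions are purely illustrative.

    CANARY_USER_IDS = {"user-42", "user-77"}  # illustrative only

    def is_canary_user(user_id: str) -> bool:
        return user_id in CANARY_USER_IDS

    def render_new_dashboard(user_id: str) -> str:
        return f"new dashboard for {user_id}"   # placeholder for the new feature

    def render_old_dashboard(user_id: str) -> str:
        return f"old dashboard for {user_id}"   # placeholder for existing behaviour

    def render_dashboard(user_id: str) -> str:
        if is_canary_user(user_id):
            return render_new_dashboard(user_id)  # new code path, canary users only
        return render_old_dashboard(user_id)      # everyone else keeps the old path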
Feature flags allow for finely tuned, granular control of user exposure to new features, and you may choose to release new features in stages this way. For example, internal testers or staff members might act as the first group of canary users, followed by users who opt in to early access, before the feature is finally rolled out to all users.
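One way to sketch that staged rollout is to give each flag a current stage and open it to progressively larger groups; the group names and their ordering below are illustrative, and unknown groups are not handled.

    ROLLOUT_STAGES = ["staff", "early_access", "everyone"]  # earliest group first

    def feature_enabled(current_stage: str, user_group: str) -> bool:
        """A user sees the feature once the rollout has reached their group."""
        return ROLLOUT_STAGES.index(user_group) <= ROLLOUT_STAGES.index(current_stage)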
Canary Deployment vs Feature Flags
There are some fundamental differences between canary deployments and feature flags. Canary deployments segregate users at the server level, splitting up traffic and routing a subset of users to a canary server, generally by IP whitelisting or load balancing. This leaves limited scope for how those users are targeted.
Feature flags live at the application level and can use all the data available to your application to identify and segment users. You can choose to test only guests or registered users, users from certain countries or regions, or even segment by marketing preferences. This is much more flexible, and it also means that different features can be turned on or off independently of each other.
For example, if there are two new features in a release and an issue is found with one of them on a canary server, both features have to be rolled back. With feature flags, if an issue is found with one feature but not the other, the working feature can stay live while only the problematic one is rolled back. You must, however, be confident that your feature flags work as intended and that no issues occur outside of the flagged code paths.
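As a small illustration, two independently controlled flags might be evaluated against user attributes like this; the rules, attribute names and regions are purely illustrative.

    def new_checkout_enabled(user: dict) -> bool:
        # roll out only to registered users in a couple of pilot regions
        return user.get("registered", False) and user.get("country") in {"DE", "NL"}

    def new_search_enabled(user: dict) -> bool:
        # staff only for now; can be switched off on its own if a problem is found
        return user.get("is_staff", False)

Because each flag is evaluated separately, disabling one feature has no effect on the other.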
Final Thoughts
Canary deployments allow you to control your new release rollout and offer a low-risk deployment strategy for testing in production in a limited scope before rolling out to all users. Issues can be identified early on and any negative impact can be limited. You also don’t need to maintain both a staging and production environment, as the canary server can function as both staging during the rollout and production once the rollout is complete.
However, setting up a canary deployment process is more complicated: a subset of servers or users must be identified, and load balancers or other routing mechanisms must be configured to direct traffic appropriately. If you are interested in integrating canary deployment into your deployment process but don't know how, or you have questions about the process, then get in touch with Guide Rails. Guide Rails offers an all-in-one collaboration and coding platform that makes integrating canary deployment into your overall development and deployment process easy.