Using Failover Groups to Avoid Rain on Your Parade

If you were one of the unfortunate devs whose stack was caught up in the storm-related AWS outage last week, we felt your pain. It just goes to show that none of us is immune to the unforeseen perils of Mother Nature. The reality is that outages like this happen all the time.

While I'm not one for panic and scaremongering, I did want to raise some awareness around how you can make app availability bulletproof. Enter failover groups.

What's a failover group?

A failover group is a managed, quick-response DNS address that automatically follows your stack's web endpoints. You can connect it to up to two stacks at a time, giving you a primary and a backup stack. If you're experiencing an outage and need to switch traffic between stacks, you can simply flip the switch and your traffic will flow to the backup stack within minutes.
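To make the primary/backup relationship concrete, here is a minimal Python sketch of a failover group as a data model. The class and field names (`FailoverGroup`, `address`, `switch`) and the example hostnames are hypothetical and purely illustrative. This is not the Cloud 66 API, just a way to picture what the switch does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailoverGroup:
    """Illustrative model only: one DNS address in front of at most two stacks."""
    address: str                    # hypothetical failover hostname
    primary: str                    # stack that normally serves traffic
    backup: Optional[str] = None    # optional second stack
    active: str = ""                # stack currently receiving traffic

    def __post_init__(self) -> None:
        # Traffic starts on the primary stack unless told otherwise.
        if not self.active:
            self.active = self.primary

    def switch(self) -> str:
        """Flip traffic to whichever of the two stacks is not active."""
        if self.backup is None:
            raise ValueError("no backup stack attached to this failover group")
        self.active = self.backup if self.active == self.primary else self.primary
        return self.active


group = FailoverGroup(
    address="failover.example.com",   # assumed name, for illustration
    primary="app-us-east",
    backup="app-eu-west",
)
group.switch()  # traffic now goes to the backup stack
```

Calling `switch()` a second time sends traffic back to the primary, which mirrors switching back once the outage is over.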

Failover groups follow the web head of your stack. In other words, a failover group points to your load balancer if you have one, and to your web server if you don't. It is designed to update automatically to point at a newly added load balancer, and it is likewise updated automatically when you rename your stack or web servers.
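The "follow the web head" rule can be sketched in a few lines of Python. The `Stack` class and `web_head` function below are hypothetical names invented for this illustration, and picking the first web server when there are several is an assumption, not documented behaviour.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stack:
    """Hypothetical stack description used only for this illustration."""
    name: str
    web_servers: List[str] = field(default_factory=list)
    load_balancer: Optional[str] = None


def web_head(stack: Stack) -> str:
    """Return the endpoint a failover group would follow for this stack."""
    # A load balancer, when present, takes precedence over the web server.
    if stack.load_balancer is not None:
        return stack.load_balancer
    if not stack.web_servers:
        raise ValueError(f"stack {stack.name!r} has no web endpoint")
    # Assumption: with no load balancer, use the first web server.
    return stack.web_servers[0]


stack = Stack(name="app", web_servers=["web1.example.com"])
before = web_head(stack)                 # the web server
stack.load_balancer = "lb.example.com"   # a load balancer is added later
after = web_head(stack)                  # now the load balancer
```

The point of the sketch is that the target is recomputed from the stack's current shape, so adding a load balancer changes where traffic lands without you editing any DNS records.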

Easy and seamless traffic switching between stacks

There are some major advantages to using failover groups, which include:

Application resilience
By building and nominating a secondary backup stack on a different cloud provider or in a different data center, you can use a failover group to switch your visitors from the primary to the backup stack with ease.

Cloning stacks
If you need to clone or rebuild your stack, you can use a failover group to switch your traffic to the new stack without any interruption to your service.

Critical availability
Availability of applications for mission-critical workloads can be secured by switching traffic from one data centre to another, from one cloud provider to another, or even to an on-prem server.

Reducing downtime threat
Having a failover group helps prevent unwanted downtime should your primary stack become unresponsive, creating a seamless, always-on user experience.

Setting up a failover group

Cloud 66 customers take a mix-and-match approach to how they set up their failover groups. The most typical scenario is a mirrored clone of the stack, housed in a separate geographical data centre, usually with the same cloud vendor.

Other customers with stricter high-availability requirements opt to replicate their stack with a different cloud provider. In either scenario, as with any migration, you'll need to deal with moving your code, data and traffic to use the failover feature of Cloud 66:

Code: clone your existing stack to a different cloud vendor or data center, and put it into maintenance mode to prevent it from serving content. We highly recommend that you build a stack with similar server specifications to your main stack to avoid issues during a switch.
Data: enable database replication between your stacks. This will set up a master/slave architecture between your stacks, whereby the slave is an exact replica of the master at all times.
Traffic: use Failover Groups to make it easy for you to switch between stacks. By pointing your domain at the failover address, you will be able to switch your traffic between stacks at the click of a button.
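
The three steps above can be turned into a simple pre-switch checklist. The Python sketch below is hypothetical: `BackupStatus` and `switch_blockers` are names made up for this example, and you would need to gather the values yourself from your own stacks and DNS provider, since nothing here calls Cloud 66.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BackupStatus:
    """Hypothetical snapshot of the backup stack; fill in these values yourself."""
    code_deployed: bool          # Code: backup stack is cloned and built
    specs_match_primary: bool    # Code: server sizes are similar to the main stack
    replication_healthy: bool    # Data: replica is receiving changes from the master
    domain_uses_failover: bool   # Traffic: your domain points at the failover address


def switch_blockers(status: BackupStatus) -> List[str]:
    """Return reasons not to switch yet; an empty list means no known blockers."""
    blockers = []
    if not status.code_deployed:
        blockers.append("backup stack has not been built")
    if not status.specs_match_primary:
        blockers.append("backup server specifications differ from the primary")
    if not status.replication_healthy:
        blockers.append("database replication is not healthy")
    if not status.domain_uses_failover:
        blockers.append("domain does not point at the failover address")
    return blockers


status = BackupStatus(
    code_deployed=True,
    specs_match_primary=True,
    replication_healthy=False,
    domain_uses_failover=True,
)
blockers = switch_blockers(status)  # one blocker: replication
```

Running a check like this in staging first is a cheap way to find out which of the three areas (code, data, traffic) still needs attention before you rely on the switch in production.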

Though the setup is straightforward, there are several steps involved in getting your settings right for your particular use case. As always, we recommend you test your setup in a staging environment to reduce any potential downtime. If you have questions about how to set up failover groups, please don't hesitate to get in touch.