Amazon Route 53: Health Checks and DNS Failover

In this article, you will learn:

What is the Amazon Route 53 health check?
Types of Route 53 health checks
Route 53 DNS failover routing
Route 53 health check and failover tutorial

It’s common to use two or more resources that perform the same function in the AWS Cloud, such as two or more web servers (EC2 instances) or whole multi-tier infrastructures with web servers, databases, and static data storage. But how can you know that part of your infrastructure is working and that you are prepared to route your traffic somewhere else if it fails? This is where Amazon Route 53 DNS health checks and failover come in.

In this blog post, you will learn about Route 53 health-checking features and how to route traffic only to healthy AWS resources.

What is the Amazon Route 53 health check?

Route 53 health checks are a feature that allows you to monitor the health of selected types of AWS resources or any endpoint that can respond to requests. A health check can send notifications when its state changes, and it helps Route 53 recognize when a record points to an unhealthy resource so that Route 53 can fail over to an alternate record.

Types of Route 53 health checks

Three types of Route 53 health checks are currently available. Endpoint health checks are the most common.

Endpoint health checks: You can configure a health check to monitor an endpoint that you specify by IP address or domain name. At a fixed interval that you specify, Route 53 submits automated requests over the internet to your application, server, or other resource to verify that it is reachable, available, and functioning properly.
Health checks that monitor other health checks: This type of health check monitors other Route 53 health checks. A "parent" health check monitors one or more "child" health checks. If at least the configured number of child health checks report as healthy, the parent health check is also healthy. If the number of healthy child checks falls below that threshold, the parent check is unhealthy.
Health checks for Amazon CloudWatch alarms: You can also create health checks associated with alarms created in the CloudWatch service. This type of health check monitors the CloudWatch data stream behind a previously configured alarm. If the status of the CloudWatch alarm is OK, the health check reports as healthy.
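
To make the parent/child rule concrete, here is a minimal Python sketch of the threshold logic described above. It does not call AWS at all; the ChildCheck class, the check names, and the parent_is_healthy function are hypothetical names made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ChildCheck:
    name: str      # hypothetical label, not a real Route 53 health check ID
    healthy: bool  # last reported status of the child check


def parent_is_healthy(children: list[ChildCheck], health_threshold: int) -> bool:
    """Parent is healthy when at least `health_threshold` children are healthy."""
    healthy_count = sum(1 for child in children if child.healthy)
    return healthy_count >= health_threshold


children = [
    ChildCheck("web-1", True),
    ChildCheck("web-2", False),
    ChildCheck("web-3", True),
]

print(parent_is_healthy(children, health_threshold=2))  # True: 2 of 3 are healthy
print(parent_is_healthy(children, health_threshold=3))  # False: only 2 are healthy
```

Raising the threshold makes the parent check stricter, while a threshold of 1 keeps the parent healthy as long as any single child is healthy.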

Amazon Route 53 DNS failover routing

Failover redirects your production traffic from the primary region to the recovery region. If you use Route 53 for DNS, you can set up your primary region and recovery region endpoints under one domain name.

A routing policy is then selected to determine which endpoint receives traffic for that domain name. With failover routing, if the primary endpoint is unhealthy according to your configured health checks, Route 53 automatically sends traffic to the recovery region.
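
The decision described above can be sketched in a few lines of Python. This is a simplified model under our own assumptions, not Route 53's actual implementation: the Endpoint class, the example.com names, and select_endpoint are hypothetical, and edge cases such as both endpoints being unhealthy are deliberately not modelled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Endpoint:
    region: str
    dns_name: str  # placeholder value for illustration
    healthy: bool  # result of the associated health check


def select_endpoint(primary: Endpoint, recovery: Endpoint) -> Endpoint:
    """Send traffic to the primary while it is healthy, otherwise to recovery."""
    return primary if primary.healthy else recovery


primary = Endpoint("primary", "primary.example.com", healthy=False)
recovery = Endpoint("recovery", "recovery.example.com", healthy=True)

print(select_endpoint(primary, recovery).dns_name)  # recovery.example.com
```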

Route 53 active-passive vs active-active failover

The main difference between the two architectures is which resources serve traffic during normal operation. In active-active failover, all resources are in use during normal operation. In active-passive failover, the backup resources are usually on standby and only serve traffic during a failover.

Active-active failover

Active-active failover is used when you want all of your app nodes in all regions to be available simultaneously. Use this failover configuration when you want all of your resources to be available the majority of the time.

In an active-active setup across two regions, both region 1 and region 2 are active all the time. When a resource becomes unavailable, Route 53 can detect that it's unhealthy and stop including it when responding to queries.

For example, you can build an active-active configuration with the Route 53 weighted, geolocation, geoproximity, latency, or multivalue answer routing policies.
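
As a rough illustration of active-active behaviour, the sketch below filters out unhealthy records before answering a query. It is a simplification under our own assumptions: the record layout, the documentation-range IP addresses, and answer_query are hypothetical, and the real routing policies apply additional rules (weights, latency, location) to the healthy records that remain.

```python
def answer_query(records):
    """Return the values of all records whose health check is passing."""
    return [record["value"] for record in records if record["healthy"]]


records = [
    {"region": "region-1", "value": "192.0.2.10", "healthy": True},
    {"region": "region-2", "value": "192.0.2.20", "healthy": False},
]

print(answer_query(records))  # ['192.0.2.10']
```

When region 2 recovers and its health check passes again, it is included in the answers without any manual change.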

Active-passive failover

Use an active-passive failover configuration when you want a primary resource or group of resources to be available the majority of the time and you want a secondary resource or group of resources to be on standby in case all the primary resources become unavailable.

In an active-passive setup, only region 1 is active during normal operation, and region 2 becomes active only when failover starts (after region 1 becomes unavailable).

This can be created by using the Route 53 failover routing policy.
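
If you prefer to script this instead of using the console, a failover record pair might look like the following sketch. It only builds the dictionaries and never calls AWS. The keys follow the record set structure of the Route 53 ChangeResourceRecordSets API, but every value (domain name, ALB DNS names, hosted zone IDs, health check IDs) and the failover_record helper itself are placeholders we made up for illustration.

```python
def failover_record(name, role, alb_dns_name, alb_zone_id, health_check_id):
    """Build one change entry for a failover alias record (no AWS call is made)."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be 'PRIMARY' or 'SECONDARY'")
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": f"{role.lower()}-alb",  # any label unique per record
            "Failover": role,
            "AliasTarget": {
                "HostedZoneId": alb_zone_id,   # placeholder, not a real zone ID
                "DNSName": alb_dns_name,       # placeholder ALB DNS name
                "EvaluateTargetHealth": True,
            },
            "HealthCheckId": health_check_id,  # placeholder health check ID
        },
    }


change_batch = {
    "Changes": [
        failover_record("app.example.com", "PRIMARY",
                        "primary-alb.example.com", "ZONE-ID-1", "hc-primary"),
        failover_record("app.example.com", "SECONDARY",
                        "secondary-alb.example.com", "ZONE-ID-2", "hc-secondary"),
    ]
}

for change in change_batch["Changes"]:
    record = change["ResourceRecordSet"]
    print(record["Failover"], "->", record["AliasTarget"]["DNSName"])
```

Both records share the same name and type and differ only in their failover role, set identifier, and target, which is what lets Route 53 treat them as a primary/secondary pair.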

Route 53 health check and failover example

This guide will show you how to set up a simple Route 53 active-passive failover with health checks in the AWS Management Console. Our architecture spans two regions and consists of Application Load Balancers (ALBs), Auto Scaling groups, and EC2 instances (simple web servers).

The primary region is in Frankfurt. The secondary region is in Ireland.

This guide does not cover how to deploy these resources. In this example, everything is already in place except for the health checks and failover settings in Route 53.

Configuring Route 53 health checks for ALB

1. Log in to the Route 53 console.

2. Click on “Health checks” and then on “Create health check”.

3. Enter a name for your health check and provide the domain name of your ALB. Under “Advanced configuration”, set “Request interval” to 10 seconds and “Failure threshold” to 1. Click on “Next”.

4. In the next step, simply click on “Create health check”.

5. Repeat steps 2, 3, and 4 with a different name and domain name for the second ALB in the other region. You can use the same Advanced configuration values.
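
For reference, the console settings above correspond roughly to the following health check configuration. This is a sketch that only builds and validates dictionaries, without calling AWS. The keys mirror the HealthCheckConfig structure of the Route 53 CreateHealthCheck API; the protocol, port, path, and ALB domain names are our own assumptions, since the steps above do not specify them, and alb_health_check_config is a hypothetical helper.

```python
def alb_health_check_config(alb_domain_name, request_interval=10, failure_threshold=1):
    """Build an endpoint health check configuration (no AWS call is made)."""
    if request_interval not in (10, 30):
        raise ValueError("request_interval must be 10 or 30 seconds")
    if not 1 <= failure_threshold <= 10:
        raise ValueError("failure_threshold must be between 1 and 10")
    return {
        "Type": "HTTP",                 # assumption: plain HTTP check
        "FullyQualifiedDomainName": alb_domain_name,
        "Port": 80,                     # assumption
        "ResourcePath": "/",            # assumption
        "RequestInterval": request_interval,
        "FailureThreshold": failure_threshold,
    }


# Placeholder ALB domain names for the primary (Frankfurt) and secondary (Ireland) regions.
frankfurt = alb_health_check_config("primary-alb.example.com")
ireland = alb_health_check_config("secondary-alb.example.com")

# Rough estimate of how long a failure takes to register: interval x threshold.
# It ignores DNS TTLs and resolver caching, so treat it as a lower bound.
detection_seconds = frankfurt["RequestInterval"] * frankfurt["FailureThreshold"]
print(detection_seconds)  # 10
```

A 10-second interval with a failure threshold of 1 reacts quickly but can also trigger failover on a single failed request, so a higher threshold may suit production workloads better.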