AWS Auto Scaling: Everything That You Need to Know
In this article, you will learn:
- What is AWS Auto Scaling?
- How does AWS Auto Scaling work?
- Getting started with AWS Auto Scaling
- AWS Auto Scaling vs. Amazon EC2 Auto Scaling
- AWS Auto Scaling vs. Elastic Load Balancing
- AWS Auto Scaling benefits and disadvantages
- AWS Auto Scaling pricing
Whether you are growing rapidly or demand is slowing and you need to scale down, AWS Auto Scaling can help. Manual scaling is time-consuming and costly, whereas automatic scaling adjusts capacity to maintain steady, predictable performance at the lowest possible cost, helping you reduce waste and optimize your AWS Cloud spend.
To help you better understand AWS Auto Scaling and its benefits, this blog post will introduce what AWS Auto Scaling is, how it works, its advantages and disadvantages, and more.
What is AWS Auto Scaling?
AWS Auto Scaling is a service in the AWS Cloud that lets you configure scaling for selected AWS services that are part of your application in minutes. It helps ensure that you have enough resources to handle your application load, even when traffic spikes sharply or suddenly, within the capacity limits you set.
With AWS Auto Scaling, you can configure and manage the scaling of resources through scaling plans. This ensures that you add the computing power you need to handle the load on the application, and then remove it when it is no longer needed.
It’s useful for those who, for example, have an application that uses multiple Amazon EC2 instances and Amazon DynamoDB. AWS Auto Scaling can be used to manage resource provisioning for all of the EC2 Auto Scaling groups and DynamoDB tables in one place.
At the time of writing, you can use AWS Auto Scaling to scale the following AWS services:
- Amazon Elastic Compute Cloud (EC2) Auto Scaling groups: Launch or terminate EC2 instances in an Auto Scaling group.
- Amazon EC2 Spot Fleet: Launch or terminate instances from a Spot Fleet request.
- Amazon Elastic Container Service (Amazon ECS): Adjust the desired count of tasks in an ECS service in response to load variations.
- Amazon DynamoDB: Enable a DynamoDB table or a global secondary index to increase or decrease its provisioned capacity.
- Amazon Aurora: Dynamically adjust the number of read replicas.
AWS Auto Scaling is focused on horizontal scaling of these services. If you want to know more about horizontal scaling, read our blog post: Scalability in Cloud Computing: Horizontal vs. Vertical Scaling
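In the scaling plans API, each of the resource types listed above is addressed by a service namespace and a scalable dimension. The lookup table below is a small sketch of that mapping. The identifier strings are given as we understand them from the AWS Auto Scaling API reference and should be checked against it; the helper name dimensions_for is hypothetical.

```python
# Service namespace -> scalable dimensions, as named in the scaling plans API.
# Assumption: strings reproduced from the API reference; verify before use.
SCALABLE_DIMENSIONS = {
    "autoscaling": ["autoscaling:autoScalingGroup:DesiredCapacity"],
    "ec2": ["ec2:spot-fleet-request:TargetCapacity"],
    "ecs": ["ecs:service:DesiredCount"],
    "dynamodb": [
        "dynamodb:table:ReadCapacityUnits",
        "dynamodb:table:WriteCapacityUnits",
        "dynamodb:index:ReadCapacityUnits",
        "dynamodb:index:WriteCapacityUnits",
    ],
    "rds": ["rds:cluster:ReadReplicaCount"],  # Aurora read replicas
}


def dimensions_for(namespace):
    """Return the scalable dimensions for a namespace (hypothetical helper)."""
    try:
        return SCALABLE_DIMENSIONS[namespace]
    except KeyError:
        raise ValueError(f"not scalable through a scaling plan: {namespace!r}")
```

Every dimension here is a count or a capacity figure, which matches the horizontal-scaling focus of the service.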
How does AWS Auto Scaling work?
AWS Auto Scaling scans your AWS Cloud environment and automatically discovers the scalable resources so you don’t have to manually identify them one by one.
You can find scalable resources by CloudFormation stack, tag, or EC2 Auto Scaling groups.
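When a scaling plan is created through the API rather than the console, the resources to discover are described by an application source. The sketch below builds that structure for the CloudFormation stack and tag options. The key names follow the ApplicationSource shape of the scaling plans API as we understand it, and the function name application_source is hypothetical.

```python
def application_source(stack_arn=None, tags=None):
    """Build an ApplicationSource dict from a stack ARN or a tag mapping.

    tags maps a tag key to the list of accepted values,
    for example {"environment": ["production"]}.
    """
    if (stack_arn is None) == (tags is None):
        raise ValueError("provide exactly one of stack_arn or tags")
    if stack_arn is not None:
        return {"CloudFormationStackARN": stack_arn}
    return {
        "TagFilters": [
            {"Key": key, "Values": list(values)} for key, values in tags.items()
        ]
    }
```

The function rejects a call that supplies both options or neither, on the assumption that a plan is scoped to one source at a time.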
Here is a simple diagram of the whole process:
The two main components of AWS Auto Scaling are the scaling plan and the scaling strategy.
The scaling plan is the core component of AWS Auto Scaling and you can simply create it in the AWS Console. It's a set of directions for scaling your resources. If you work with AWS CloudFormation or add tags to scalable resources, you can set up scaling plans for different sets of resources.
The scaling strategy is within the scaling plan and includes everything that AWS Auto Scaling needs to know to properly scale your application resources. You can optimize for availability, for cost, or a balance of both.
Alternatively, you can also create your own custom strategy, per the metrics and thresholds you define. You can set separate strategies for each resource or resource type.
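To make the strategy idea concrete, here is a minimal sketch that turns a strategy name into one scaling instruction for an EC2 Auto Scaling group. The 40, 50, and 70 percent CPU targets are assumed from the console presets and are worth confirming there. The dictionary keys follow the ScalingInstructions shape of the scaling plans API as we understand it, and the function name is hypothetical.

```python
# Assumed console presets: target average CPU utilization per strategy.
STRATEGY_CPU_TARGET = {
    "availability": 40.0,  # more headroom, higher cost
    "balance": 50.0,
    "cost": 70.0,          # less headroom, lower cost
}


def scaling_instruction(asg_name, strategy, min_capacity, max_capacity):
    """Build one scaling instruction for an Auto Scaling group (sketch)."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    return {
        "ServiceNamespace": "autoscaling",
        "ResourceId": f"autoScalingGroup/{asg_name}",
        "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
        "TargetTrackingConfigurations": [
            {
                "PredefinedScalingMetricSpecification": {
                    "PredefinedScalingMetricType": "ASGAverageCPUUtilization"
                },
                "TargetValue": STRATEGY_CPU_TARGET[strategy],
            }
        ],
    }
```

A custom strategy would replace the preset lookup with your own metric and target value.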
When choosing your scaling strategy, you can also enable these two features:
1. Dynamic scaling
Dynamic scaling changes capacity automatically in response to live changes in resource utilization (for example, average CPU utilization or network in/out). It relies on CloudWatch alarms that fire when a metric drifts away from its target and trigger a scaling action. The intention is to provide enough capacity to maintain utilization at the target value.
For example, you can configure your scaling plan to keep the average CPU utilization of your ECS service at 75 percent. When the CPU utilization of your service rises above 75 percent (meaning that more than 75 percent of the CPU that is reserved for the service is being used), this triggers your scaling policy to add tasks to your service to help with the increased load.
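The arithmetic behind keeping a metric at a target value can be approximated with a proportional rule. The sketch below is a simplification for illustration, not the algorithm AWS actually runs (which also involves alarm evaluation periods and cooldowns). The function name and parameters are hypothetical.

```python
import math


def desired_task_count(current_tasks, current_cpu, target_cpu,
                       min_tasks, max_tasks):
    """Estimate the task count that brings average CPU back to the target.

    Proportional rule: total load is current_tasks * current_cpu, so
    dividing by target_cpu gives the number of tasks needed. The result
    is rounded up and clamped to the configured capacity limits.
    """
    needed = math.ceil(current_tasks * current_cpu / target_cpu)
    return max(min_tasks, min(max_tasks, needed))


# Four tasks at 90 percent CPU with a 75 percent target: scale out.
scale_out = desired_task_count(4, 90, 75, min_tasks=1, max_tasks=10)
# Four tasks at 30 percent CPU with the same target: scale in.
scale_in = desired_task_count(4, 30, 75, min_tasks=1, max_tasks=10)
print(scale_out, scale_in)  # prints "5 2"
```

Rounding up rather than to the nearest integer errs on the side of availability, and the minimum and maximum keep the service from scaling to zero or growing without limit.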
2. Predictive scaling
With predictive scaling, capacity changes are scheduled ahead of time based on the predictable traffic patterns of the application. Within AWS Auto Scaling, only Amazon EC2 Auto Scaling groups support this feature.
It works by analyzing the historical records of the specified load metric (for example, CPU utilization or network in/out) for up to the past 14 days; at least 24 hours of data is required. It then generates a forecast for the next two days and schedules scaling actions on your Auto Scaling group to adjust capacity. The goal of predictive scaling is to keep the scaling metric as close as possible to the target value.
For example, you can enable predictive scaling and configure it to keep the average CPU utilization of your Auto Scaling group at 60 percent. Suppose the forecast calls for traffic spikes every day at 9 a.m. Predictive scaling then creates future scheduled actions so that your infrastructure is ready to handle that traffic ahead of time.
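The forecast-then-schedule idea can be illustrated with a deliberately naive model that averages past load by hour of day. AWS uses its own forecasting model, so treat this only as a toy under stated assumptions: load is expressed as total CPU percentage summed across instances, and both function names are hypothetical.

```python
import math
from collections import defaultdict


def forecast_hourly_load(history):
    """Average past total load per hour of day.

    history is an iterable of (hour_of_day, total_cpu_load) samples.
    """
    buckets = defaultdict(list)
    for hour, load in history:
        buckets[hour].append(load)
    return {hour: sum(loads) / len(loads) for hour, loads in buckets.items()}


def scheduled_capacity(forecast, target_cpu, min_capacity, max_capacity):
    """Instances needed per hour so average CPU stays near target_cpu."""
    plan = {}
    for hour, load in sorted(forecast.items()):
        needed = math.ceil(load / target_cpu)
        plan[hour] = max(min_capacity, min(max_capacity, needed))
    return plan


# Two days of samples: quiet at 3 a.m., busy at 9 a.m.
history = [(3, 100), (9, 480), (3, 140), (9, 520)]
plan = scheduled_capacity(forecast_hourly_load(history), target_cpu=60,
                          min_capacity=1, max_capacity=20)
print(plan)  # prints "{3: 2, 9: 9}"
```

With a 60 percent target, the forecast load of 500 at 9 a.m. calls for nine instances, so capacity would be raised shortly before that hour rather than after the spike arrives.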
Getting started with AWS Auto Scaling
There is no single way to enable this service in your AWS environment, because the setup depends on your use case. The official AWS documentation provides a step-by-step guide, Getting started with AWS Auto Scaling. Alternatively, open the AWS Auto Scaling console and walk through the creation of a new scaling plan.
Comparison of AWS Auto Scaling with other AWS services
AWS Auto Scaling vs. Amazon EC2 Auto Scaling
Sometimes, understanding the difference between these two AWS services can be confusing, so let's take a closer look at when it is best to use these services.
AWS Auto Scaling is best suited to managing scaling for multiple resources. It lets you use scaling plans to define dynamic scaling for multiple EC2 Auto Scaling groups or other services. Configuring scaling for several services at once is much faster than managing the scaling policies of each resource separately. Scaling plans also support predictive scaling for EC2 Auto Scaling groups, although Amazon EC2 Auto Scaling now offers predictive scaling policies of its own as well.
If you only need to scale one or a couple of Amazon EC2 Auto Scaling groups, or if you are only interested in maintaining the health of your EC2 instances, use Amazon EC2 Auto Scaling. If you need to set up scheduled or step scaling policies, you should also use EC2 Auto Scaling, because AWS Auto Scaling does not support these policy types.
AWS Auto Scaling vs. Elastic Load Balancing
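Elastic Load Balancing (ELB) distributes incoming traffic across your instances so that no single instance is overloaded by the traffic it receives. The two services are complementary rather than alternatives: ELB spreads traffic across the instances you currently have, while auto scaling changes how many instances you have. AWS offers several types of load balancers (Application, Network, Gateway, and Classic), and they can be used to distribute traffic across instances created by AWS Auto Scaling or Amazon EC2 Auto Scaling.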
AWS Auto Scaling benefits and disadvantages
AWS Auto Scaling benefits
- Set up auto scaling of multiple resources: AWS Auto Scaling lets you set target utilization levels (CPU, network in/out) for multiple resources in a single place in the AWS Management Console.
- Improved cost management: You can scale multiple services in or out (horizontally) at once, according to the requirements of your organization. This allows you to save on the costs of managing these services.
- Reliability: Automatic scaling is efficient and reliable. It's also simpler to do in one place, and whenever scaling is initiated, AWS can send you notifications.
Disadvantages of AWS Auto Scaling
There are no major disadvantages to using AWS Auto Scaling, but here are some common things you should understand before implementing it.
- Increased deployment complexity: Integrating any type of auto scaling may make deployment and configuration more complicated. For example, newly launched instances must start with the current version of your code, so you need a separate mechanism to keep code changes in sync.
- Regionally limited: A scaling plan only works within a single Region, and it's not possible to use it across resources in multiple Regions. You'll have to set up AWS Auto Scaling in every Region separately, which can be a challenge for a multi-Region application.
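One way to soften the Regional limitation is to create the same scaling plan once per Region from a script. The sketch below assumes a client object per Region that exposes create_scaling_plan, as the boto3 "autoscaling-plans" client does. The function name and the client_factory parameter are hypothetical, and the Region names are only examples.

```python
def create_plan_in_regions(regions, client_factory, plan_kwargs):
    """Create the same scaling plan once per Region.

    client_factory is any callable that returns a client for a Region.
    With boto3 installed it could be:
        lambda region: boto3.client("autoscaling-plans", region_name=region)
    plan_kwargs holds ScalingPlanName, ApplicationSource, and
    ScalingInstructions for create_scaling_plan.
    """
    responses = {}
    for region in regions:
        client = client_factory(region)  # one client per Region
        responses[region] = client.create_scaling_plan(**plan_kwargs)
    return responses
```

Passing the client factory in as a parameter keeps the loop testable without AWS credentials, since a stub client can stand in for the real one. Each Region still ends up with its own independent plan that must be updated or deleted separately.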
AWS Auto Scaling pricing
The AWS Auto Scaling service is free to use. You only pay for the AWS resources you scale (EC2 instances, DynamoDB tables, etc.). Because AWS Auto Scaling relies on Amazon CloudWatch metrics and alarms, you also pay the CloudWatch monitoring fees.
Automatic scaling in AWS is a very broad topic, and many services in the AWS Cloud can be scaled automatically. The AWS Auto Scaling service helps you do this in one place for your whole application, including, for example, EC2 instances and DynamoDB tables.