Whether your application is growing rapidly or demand is slowing and you need to scale down, AWS Auto Scaling can help. Manual scaling is time-consuming and costly, whereas automatic scaling adjusts capacity to maintain steady, predictable performance at the lowest possible cost, helping you reduce waste and optimize your AWS Cloud spend.
To help you better understand AWS Auto Scaling and its benefits, this blog post will introduce what AWS Auto Scaling is, how it works, its advantages and disadvantages, and more.
AWS Auto Scaling is a service in the AWS Cloud environment that lets you configure scaling for selected AWS services that are part of your application in minutes. It helps ensure that you have enough resources (for example, EC2 instances) to handle your application load, even when traffic spikes sharply or suddenly.
With AWS Auto Scaling, you can configure and manage the scaling of resources through scaling plans. This ensures that you add the computing power you need to handle the load on the application, and then remove it when it is no longer needed.
It’s useful for those who, for example, have an application that uses multiple Amazon EC2 instances and Amazon DynamoDB. AWS Auto Scaling can be used to manage resource provisioning for all of the EC2 Auto Scaling groups and DynamoDB tables in one place.
AWS Auto Scaling is focused on horizontal scaling of these services. If you want to know more about horizontal scaling, read our blog post: Scalability in Cloud Computing: Horizontal vs. Vertical Scaling
AWS Auto Scaling scans your AWS Cloud environment and automatically discovers the scalable resources so you don’t have to manually identify them one by one.
You can find scalable resources by CloudFormation stack, tag, or EC2 Auto Scaling groups.
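As a rough sketch of how these discovery options might look in code, the helper below builds an `ApplicationSource`-style structure, which is how the scaling plans API describes "find resources by CloudFormation stack or by tag". The function name, the example ARN, and the tag values are hypothetical, and nothing here calls AWS; check the field names against the current API reference before using them.

```python
from typing import Dict, List, Optional


def build_application_source(
    stack_arn: Optional[str] = None,
    tags: Optional[Dict[str, List[str]]] = None,
) -> dict:
    """Return an ApplicationSource-style dict for a scaling plan.

    Exactly one of stack_arn or tags must be given (an assumption made
    here to keep the sketch simple).
    """
    if (stack_arn is None) == (tags is None):
        raise ValueError("provide exactly one of stack_arn or tags")
    if stack_arn is not None:
        # Discover every scalable resource created by one CloudFormation stack.
        return {"CloudFormationStackARN": stack_arn}
    # Discover scalable resources that carry the given tag keys and values.
    return {
        "TagFilters": [
            {"Key": key, "Values": list(values)} for key, values in tags.items()
        ]
    }


# Hypothetical tag: every resource of the application is tagged app=web-shop.
source = build_application_source(tags={"app": ["web-shop"]})
print(source)
```

A dict like this could then be passed as the application source when creating a scaling plan, for example through the `autoscaling-plans` client in boto3.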


The scaling plan is the core component of AWS Auto Scaling and you can simply create it in the AWS Console. It's a set of directions for scaling your resources. If you work with AWS CloudFormation or add tags to scalable resources, you can set up scaling plans for different sets of resources.
The scaling strategy is within the scaling plan and includes everything that AWS Auto Scaling needs to know to properly scale your application resources. You can optimize for availability, for cost, or a balance of both.
Alternatively, you can also create your own custom strategy, per the metrics and thresholds you define. You can set separate strategies for each resource or resource type.
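To make the idea of a scaling strategy more concrete, here is a minimal sketch that maps a strategy name (or a custom target) to a single scaling instruction for an EC2 Auto Scaling group. The target percentages in `STRATEGY_TARGETS`, the function name, and the group name `web-asg` are assumptions for illustration, not values taken from this post; the dictionary keys follow the scaling plans API as commonly documented, so verify them before relying on them.

```python
# Illustrative target CPU utilization per strategy (assumed values, in percent).
STRATEGY_TARGETS = {
    "availability": 40.0,  # more headroom, higher cost
    "balanced": 50.0,
    "cost": 70.0,          # less headroom, lower cost
}


def build_scaling_instruction(asg_name, min_capacity, max_capacity,
                              strategy="balanced", custom_target=None):
    """Build one scaling instruction for an EC2 Auto Scaling group."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    # A custom target, when given, overrides the predefined strategy.
    if custom_target is not None:
        target = custom_target
    else:
        target = STRATEGY_TARGETS[strategy]
    return {
        "ServiceNamespace": "autoscaling",
        "ResourceId": f"autoScalingGroup/{asg_name}",
        "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
        "TargetTrackingConfigurations": [
            {
                "PredefinedScalingMetricSpecification": {
                    "PredefinedScalingMetricType": "ASGAverageCPUUtilization"
                },
                "TargetValue": target,
            }
        ],
    }


instruction = build_scaling_instruction("web-asg", 2, 10, strategy="cost")
print(instruction["TargetTrackingConfigurations"][0]["TargetValue"])
```

Because each resource gets its own instruction, a plan can mix strategies, for example a cost-optimized target for a batch fleet and an availability-optimized target for a customer-facing one.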

With dynamic scaling, the number of instances changes automatically in response to live changes in resource utilization (for example, average CPU or network in/out). CloudWatch alarms watch the chosen metric and trigger the scaling policy, which then adds or removes capacity. The intention is to provide enough capacity to maintain utilization at the target value.
For example, you can configure your scaling plan to keep the average CPU utilization of your ECS service at 75 percent. When the CPU utilization of your service rises above 75 percent (meaning that more than 75 percent of the CPU that is reserved for the service is being used), this triggers your scaling policy to add another task to your service to help out with the increased load.
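The arithmetic behind holding a metric at a target can be sketched as a simple proportional rule. This is only an approximation under stated assumptions: AWS does not publish the exact algorithm, and the real service also applies cooldowns and scales in more cautiously than it scales out. The function name and the task limits are hypothetical.

```python
import math


def desired_task_count(current_tasks, current_cpu_percent, target_cpu_percent,
                       min_tasks=1, max_tasks=10):
    """Estimate how many tasks would bring average CPU back to the target."""
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    # Total demand stays the same; spread it so each task sits at the target.
    raw = current_tasks * current_cpu_percent / target_cpu_percent
    # Round up so utilization lands at or below the target, then clamp.
    return max(min_tasks, min(max_tasks, math.ceil(raw)))


print(desired_task_count(4, 90, 75))  # 4 tasks at 90% CPU, target 75%
print(desired_task_count(4, 30, 75))  # load dropped, fewer tasks needed
```

With four tasks at 90 percent CPU and a 75 percent target, the rule asks for five tasks; when utilization falls to 30 percent, it asks for two.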

With predictive scaling, actions on your instances are based on the predictable traffic patterns of the application. Currently, only Amazon EC2 Auto Scaling groups support this feature.
It works by analyzing the historical records of the specified load metric (CPU utilization, network input/output) for up to the past 14 days; at least 24 hours of data is required before it can produce a forecast. It then generates a forecast for the next two days and schedules scaling actions on your EC2 instances to adjust the capacity. The goal of predictive scaling is to keep the utilization metric as close as possible to the target value.
For example, you can enable predictive scaling and configure it to keep the average CPU utilization of your Auto Scaling group at 60 percent. If the forecast calls for traffic spikes every day at 9 a.m., predictive scaling creates scheduled actions in advance to make sure that your infrastructure is ready to handle that traffic ahead of time.
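The forecast-then-schedule idea can be illustrated with a deliberately naive stand-in: average the observed load per hour of day and convert that into an instance count. AWS uses its own forecasting models, so this is not how the service computes its forecast; the function names, the sample data, and the load unit ("percent of one instance's CPU", so 300 means three fully busy instances) are all assumptions made for the sketch.

```python
import math
from collections import defaultdict


def forecast_hourly_load(history):
    """Average the observed load for each hour of the day.

    history: iterable of (hour_of_day, load) samples, where load is the
    total CPU demand expressed in percent of a single instance.
    """
    buckets = defaultdict(list)
    for hour, load in history:
        buckets[hour].append(load)
    return {hour: sum(values) / len(values) for hour, values in buckets.items()}


def schedule_capacity(forecast, target_utilization, min_capacity=1):
    """Turn a load forecast into a planned instance count per hour."""
    return {
        hour: max(min_capacity, math.ceil(load / target_utilization))
        for hour, load in sorted(forecast.items())
    }


# Two days of made-up samples: quiet at 8 a.m., a spike at 9 a.m.
history = [(8, 100), (9, 300), (8, 140), (9, 360)]
plan = schedule_capacity(forecast_hourly_load(history), target_utilization=60)
print(plan)
```

With a 60 percent target, the sketch plans two instances for 8 a.m. and six for 9 a.m., so the extra capacity is in place before the daily spike arrives.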

There is no single, simple way to enable this service in your AWS environment because the setup depends heavily on your use case. The official AWS documentation (Getting started with AWS Auto Scaling) walks through every step, or you can open the AWS Auto Scaling console and try creating a new scaling plan.
The difference between AWS Auto Scaling and Amazon EC2 Auto Scaling can be confusing, so let's take a closer look at when it is best to use each service.

AWS Auto Scaling is best suited to managing multiple resources. It allows you to use scaling plans to define dynamic scaling policies for multiple EC2 Auto Scaling groups or other services. Using AWS Auto Scaling to configure scaling for several services at once is much faster than managing the scaling strategy for each resource separately. If you want to create predictive scaling for EC2 instances, you should also use AWS Auto Scaling.
If you only need to scale one or a couple of Amazon EC2 Auto Scaling groups, or if you are only interested in maintaining the health of your EC2 instances, use Amazon EC2 Auto Scaling. If you need to set up scheduled or step scaling policies, you should also use EC2 Auto Scaling because AWS Auto Scaling does not support these policy types.
Elastic Load Balancing distributes traffic evenly between your instances to ensure that none of them is overloaded by the traffic it receives. AWS offers several types of load balancers (Application, Network, Gateway, and Classic), and they can be used to distribute traffic across instances created by AWS Auto Scaling or Amazon EC2 Auto Scaling.
There are no major disadvantages to using AWS Auto Scaling, but here are some common things you should understand before implementing it.
The AWS Auto Scaling service is free to use. You pay only for the AWS resources it manages (EC2 instances, DynamoDB tables, etc.) and, because AWS Auto Scaling relies on Amazon CloudWatch metrics and alarms, for CloudWatch monitoring.
Automatic scaling in AWS is a very broad topic, and many services in the AWS Cloud can be scaled automatically. The AWS Auto Scaling service helps you do this in one place for your whole application, for example across EC2 instances, DynamoDB tables, and more.
Adam Novotny is an AWS Solutions Architect at Stormit with 5+ years of experience designing and optimizing AWS cloud architectures.
He supports customers across the full cloud lifecycle — from pre-sales consulting and solution design to AWS funding programs such as AWS Activate, Proof of Concept (PoC), and the Migration Acceleration Program (MAP).
Adam holds the AWS Certified Solutions Architect – Professional and AWS Certified CloudOps Engineer – Associate certifications.