Amazon EC2 Auto Scaling is a region-specific service.
Amazon EC2 Auto Scaling helps you ensure that you have the correct number of Amazon EC2 instances available to handle the load for your application. You create collections of EC2 instances, called Auto Scaling groups.
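The desired capacity you specify is the number of instances the Auto Scaling group launches right after it is created. If you later change the desired capacity, the group launches or terminates instances to match the new value.
There are no additional fees for Amazon EC2 Auto Scaling itself, so it's easy to try out. You pay only for the EC2 instances and other AWS resources it uses.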
Launch Configuration - Instance type, AMI, security group, key pair. It can’t be edited after creation; to change a setting, create a new launch configuration.
Auto Scaling Group - Group name, group size, VPC, subnets, health check grace period.
Scaling policy - Metric type, target value.
Scale Out - The scale-out events direct the Auto Scaling group to launch EC2 instances and attach them to the group.
Scale In - The scale-in events direct the Auto Scaling group to detach EC2 instances from the group and terminate them.
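As a rough sketch, the launch configuration, Auto Scaling group, and scaling policy described above can be written down as three plain Python dictionaries. The key names follow the boto3 Auto Scaling request parameters as best recalled, so check them against the API reference before use. Every name and ID below is a hypothetical placeholder.

```python
# Hypothetical names and IDs throughout; replace them with real values.
# Key names are assumed to match the boto3 request parameters.
launch_configuration = {
    "LaunchConfigurationName": "web-lc-v1",
    "ImageId": "ami-EXAMPLE",            # placeholder AMI ID
    "InstanceType": "t3.micro",
    "KeyName": "example-key-pair",
    "SecurityGroups": ["sg-EXAMPLE"],    # placeholder security group
}

auto_scaling_group = {
    "AutoScalingGroupName": "web-asg",
    "LaunchConfigurationName": launch_configuration["LaunchConfigurationName"],
    "MinSize": 1,
    "MaxSize": 4,
    "DesiredCapacity": 2,
    "VPCZoneIdentifier": "subnet-EXAMPLE-A,subnet-EXAMPLE-B",  # placeholder subnets
    "HealthCheckGracePeriod": 300,       # seconds
}

scaling_policy = {
    "AutoScalingGroupName": auto_scaling_group["AutoScalingGroupName"],
    "PolicyName": "cpu-target-70",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 70.0,
    },
}


def validate_group(group):
    """Check the basic size invariant: MinSize <= DesiredCapacity <= MaxSize."""
    return group["MinSize"] <= group["DesiredCapacity"] <= group["MaxSize"]
```

With boto3 installed and credentials configured, dictionaries like these could presumably be passed to `create_launch_configuration`, `create_auto_scaling_group`, and `put_scaling_policy` on an `autoscaling` client. The sketch itself makes no AWS calls.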
Auto Scaling keeps the number of EC2 instances balanced across AZs. If the group becomes unbalanced, it launches new instances first and then terminates the old ones.
A running EC2 instance can be attached to an Auto Scaling group, provided it is not already part of another Auto Scaling group.
During rebalancing, Auto Scaling can temporarily exceed the group's maximum size by 10% or by one instance, whichever is greater.
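The rebalancing margin can be illustrated with a little arithmetic. This is a minimal sketch: the function name is hypothetical, and rounding the 10% margin down to a whole instance is an assumption, since the notes do not say how fractions are handled.

```python
def rebalance_ceiling(max_size: int) -> int:
    """Highest instance count the group may briefly reach while rebalancing.

    The margin is 10% of the maximum size or one instance, whichever is
    greater. Integer division (rounding down) is an assumption.
    """
    margin = max(1, max_size // 10)
    return max_size + margin
```

For a group with a maximum size of 5, the 10% margin is less than one instance, so the one-instance margin applies and the ceiling is 6. For a maximum size of 30, the 10% margin applies and the ceiling is 33.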
EC2 instances can be detached from an Auto Scaling group.
When an Auto Scaling group is deleted, its minimum, maximum, and desired capacity are set to zero, which terminates all of its EC2 instances.
An Elastic Load Balancer can be attached to an Auto Scaling group, but both must be in the same region.
Once attached, any EC2 instances already in the group or later added by it are automatically registered with the ELB.
Instance and elastic load balancer should be in the same VPC.
Unlike rebalancing, replacing unhealthy instances works in the opposite order: Auto Scaling terminates the unhealthy instance first and then attempts to launch a new instance to replace it.
When a launch configuration is created through the CLI, detailed monitoring (60-second intervals) is enabled for EC2 instances by default. When it is created through the console, basic monitoring (5-minute intervals) is enabled by default.
Instances in a Standby state continue to be managed by the Auto Scaling group. However, they are not an active part of your application until you put them back into service.
The Auto Scaling group does not perform health checks on instances in the Standby state.
Keep this group at its initial size.
If there are no other scaling conditions attached to the Auto Scaling group, the group maintains this number of running instances even if an instance becomes unhealthy.
To maintain the same number of instances, Amazon EC2 Auto Scaling performs a periodic health check on running instances within an Auto Scaling group. When it finds that an instance is unhealthy, it terminates that instance and launches a new one.
If you stop or terminate a running instance, the instance is considered to be unhealthy and is replaced.
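The replace-on-unhealthy behavior described above can be sketched as a single reconciliation pass. This is a simplified simulation, not the service's actual implementation: `Instance`, `launch`, and `health_check_pass` are hypothetical names, and no AWS calls are made.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # source of fake instance IDs for the simulation


@dataclass
class Instance:
    instance_id: str
    healthy: bool = True


def launch() -> Instance:
    """Pretend to launch a new, healthy instance."""
    return Instance(f"i-{next(_ids):04d}")


def health_check_pass(instances, desired_capacity):
    """Terminate unhealthy instances first, then launch replacements.

    Returns (new_instance_list, terminated_ids).
    """
    terminated = [i.instance_id for i in instances if not i.healthy]
    survivors = [i for i in instances if i.healthy]
    # Launch replacements until the group is back at its desired capacity.
    while len(survivors) < desired_capacity:
        survivors.append(launch())
    return survivors, terminated
```

Marking one of three instances as unhealthy and running `health_check_pass(group, 3)` returns a list of three healthy instances, along with the ID of the instance that was terminated.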
Scale the capacity of your Auto Scaling group in response to changing demand.
Target tracking scaling: increase or decrease the current capacity of the group based on a target value for a specific metric. This is similar to the way a thermostat maintains the temperature of your home: you select a temperature and the thermostat does the rest. For example, with a target of 70% CPU utilization, the Auto Scaling group adds or removes EC2 instances to keep utilization near 70%.
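To make the thermostat analogy concrete, here is a minimal sketch of a proportional rule for estimating capacity from a metric and its target. The exact calculation the service uses is not given in these notes, so treat the formula and the function name as illustrative assumptions.

```python
import math


def estimate_desired_capacity(current_capacity, metric_value, target_value,
                              min_size, max_size):
    """Estimate the capacity that would bring the metric back to the target.

    Assumes the metric scales inversely with instance count (a proportional
    rule), then clamps the result to the group's size limits.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_capacity * metric_value / target_value)
    return max(min_size, min(max_size, raw))
```

With 4 instances at 90% CPU and a 70% target, the rule suggests 6 instances. With 4 instances at 35% CPU, it suggests 2.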
To use step scaling, you first create a CloudWatch alarm that monitors a metric for your Auto Scaling group. Define the metric, threshold value, and number of evaluation periods that determine an alarm breach.
Then, create a step scaling policy that defines how to scale your group when the alarm threshold is breached.
You can use a percentage of the current capacity of your Auto Scaling group or capacity units for the scaling adjustment type.
Add the step adjustments in the policy. You can define different step adjustments based on the breach size of the alarm. For example:
Scale out by 10 instances if the alarm metric reaches 60 percent
Scale out by 30 instances if the alarm metric reaches 75 percent
Scale out by 40 instances if the alarm metric reaches 85 percent
When the alarm threshold is breached for the specified number of evaluation periods, Amazon EC2 Auto Scaling will apply the step adjustments defined in the policy. The adjustments can continue for additional alarm breaches until the alarm state returns to OK.
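The step adjustments in the example above can be sketched as intervals measured from the alarm threshold. The dictionary keys mirror the step adjustment fields of the Auto Scaling API as best recalled. Using `None` for an open upper bound and taking 60 percent as the alarm threshold are assumptions made for this illustration.

```python
ALARM_THRESHOLD = 60.0  # assumed alarm threshold, in percent

# Bounds are offsets from the alarm threshold: 60-75 -> +10, 75-85 -> +30, 85+ -> +40.
STEP_ADJUSTMENTS = [
    {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 15.0, "ScalingAdjustment": 10},
    {"MetricIntervalLowerBound": 15.0, "MetricIntervalUpperBound": 25.0, "ScalingAdjustment": 30},
    {"MetricIntervalLowerBound": 25.0, "MetricIntervalUpperBound": None, "ScalingAdjustment": 40},
]


def pick_adjustment(metric_value, threshold=ALARM_THRESHOLD, steps=STEP_ADJUSTMENTS):
    """Return the number of instances to add for the given metric value.

    Returns 0 when the metric is below the alarm threshold.
    """
    breach = metric_value - threshold
    for step in steps:
        lower = step["MetricIntervalLowerBound"]
        upper = step["MetricIntervalUpperBound"]  # None means no upper limit
        if breach >= lower and (upper is None or breach < upper):
            return step["ScalingAdjustment"]
    return 0
```

A metric of 80 percent breaches the threshold by 20, which falls in the second interval, so `pick_adjustment(80.0)` selects the 30-instance step.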
Each instance has a warmup period to prevent scaling activities from being too reactive to changes that occur over short periods of time. You can optionally configure the warmup period for your scaling policy.
Simple scaling policies are similar to step scaling policies, except they're based on a single scaling adjustment, with a cooldown period between each scaling activity.
A single adjustment (scale out or scale in) is made in response to an alarm, for example adding instances when CPU utilization rises above 70%.
Predictive Scaling, a feature of AWS Auto Scaling, uses machine learning to schedule the right number of EC2 instances in anticipation of approaching traffic changes. Predictive Scaling predicts future traffic, including regularly occurring spikes, and provisions the right number of EC2 instances in advance.
Scheduled scaling is used for predictable load changes. You configure a scheduled action that scales out or in at a specific date and time, along with the capacity required at that time.
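A scheduled action boils down to a start time plus the capacity to apply from then on. The sketch below simulates how a list of one-time scheduled actions determines capacity at a given moment. The field names, dates, and `capacity_at` helper are hypothetical and do not call AWS.

```python
from datetime import datetime, timezone

# Hypothetical one-time scheduled actions for a single day.
scheduled_actions = [
    {"name": "morning-scale-out",
     "start": datetime(2030, 1, 7, 8, 0, tzinfo=timezone.utc),
     "desired_capacity": 6},
    {"name": "evening-scale-in",
     "start": datetime(2030, 1, 7, 20, 0, tzinfo=timezone.utc),
     "desired_capacity": 2},
]


def capacity_at(now, actions, initial_capacity):
    """Return the capacity in effect at `now`.

    The most recent action whose start time has passed wins; before any
    action has started, the initial capacity applies.
    """
    capacity = initial_capacity
    for action in sorted(actions, key=lambda a: a["start"]):
        if action["start"] <= now:
            capacity = action["desired_capacity"]
    return capacity
```

Between 08:00 and 20:00 UTC on the example date, the group would run at 6 instances, then drop to 2 in the evening.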
A scaling policy instructs Amazon EC2 Auto Scaling to track a specific CloudWatch metric, and it defines what action to take when the associated CloudWatch alarm is in ALARM.
The metrics that are used to trigger an alarm are an aggregation of metrics coming from all of the instances in the Auto Scaling group.
For example, let's say you have an Auto Scaling group with two instances where one instance is at 60 percent CPU and the other is at 40 percent CPU. On average, they are at 50 percent CPU.
When the policy is in effect, Amazon EC2 Auto Scaling adjusts the group's desired capacity up or down when the alarm is triggered.
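The two-instance example above can be checked with a short sketch. It averages per-instance CPU values and compares the result against an alarm threshold. The function names are hypothetical, and a real CloudWatch alarm also requires the breach to persist for the configured number of evaluation periods, which this single-sample version ignores.

```python
from statistics import mean


def group_average(cpu_by_instance):
    """Average CPU utilization across all instances in the group."""
    return mean(list(cpu_by_instance.values()))


def alarm_state(cpu_by_instance, threshold):
    """Return "ALARM" if the group average exceeds the threshold, else "OK"."""
    return "ALARM" if group_average(cpu_by_instance) > threshold else "OK"
```

For instances at 60 and 40 percent, the group average is 50 percent, so an alarm with a 70 percent threshold stays in the OK state even though one instance is busier than the other.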
With simple scaling, after a scaling activity the Auto Scaling group waits for a cooldown period to complete before any further scaling activities triggered by simple scaling policies can start.
An adequate cooldown period helps to prevent the initiation of an additional scaling activity based on stale metrics.
By default, all simple scaling policies use the default cooldown period associated with your Auto Scaling group, but you can configure a different cooldown period for certain policies.
If a value for the default cooldown period is not provided, its default value is 300 seconds.
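The cooldown behavior can be sketched with a small class that ignores alarms arriving before the cooldown has elapsed. This is a simplified simulation under stated assumptions: time is passed in as plain seconds, the cooldown is measured from the moment the adjustment is applied, and `SimpleScalingGroup` is a hypothetical name.

```python
DEFAULT_COOLDOWN_SECONDS = 300  # default stated in the notes above


class SimpleScalingGroup:
    def __init__(self, capacity, cooldown=DEFAULT_COOLDOWN_SECONDS):
        self.capacity = capacity
        self.cooldown = cooldown
        self.last_scaled_at = None  # time of the last applied adjustment

    def on_alarm(self, now, adjustment):
        """Apply the adjustment unless the cooldown is still running.

        Returns True if the group scaled, False if the alarm was ignored.
        """
        in_cooldown = (
            self.last_scaled_at is not None
            and now - self.last_scaled_at < self.cooldown
        )
        if in_cooldown:
            return False
        self.capacity += adjustment
        self.last_scaled_at = now
        return True
```

An alarm at t=0 scales the group. A second alarm at t=120 is ignored because the 300-second cooldown has not finished, and a third at t=300 is applied.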
Auto Scaling waits for a newly launched EC2 instance to warm up until it is ready to share the load.
During this warm-up time, the instance is not counted toward the group's aggregated CloudWatch metrics.
This avoids unnecessary scaling while the new instances prepare themselves to take on their share of the load.
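The effect of the warm-up period on the group metric can be sketched by leaving warming instances out of the average. The `Instance` fields and the `warmed_up_average` helper are hypothetical, and times are plain seconds for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    launched_at: float  # launch time, in seconds
    cpu: float          # current CPU utilization, in percent


def warmed_up_average(instances, now, warmup_seconds):
    """Average CPU over instances that have finished warming up.

    Returns None if no instance has completed its warm-up yet.
    """
    ready = [i.cpu for i in instances if now - i.launched_at >= warmup_seconds]
    if not ready:
        return None
    return sum(ready) / len(ready)
```

With two established instances at 80 and 70 percent and a freshly launched one idling at 5 percent, the warmed-up average is 75 percent. Including the new instance would pull the figure down to about 52 percent and could mask the load that triggered the scale-out.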