Scalability capabilities with appropriate use cases (for example, Amazon EC2 Auto Scaling, AWS Auto Scaling)

Task Statement 3.2: Design high-performing and elastic compute solutions.

📘 AWS Certified Solutions Architect – Associate (SAA-C03)

Overview

Scalability is the ability of a system to handle growing or shrinking workloads without degrading performance. In AWS, a scalable design lets your application serve more users or process more data as demand changes, ideally without manual intervention.

There are two main types of scaling:

Vertical Scaling (Scale Up/Down)
- Increases or decreases the capacity of a single resource.
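- Example: Changing an Amazon EC2 instance from t3.large (2 vCPUs, 8 GiB) to t3.xlarge (4 vCPUs, 16 GiB) to get more CPU and memory.
- Pros: Simple; usually needs no application changes.
- Cons: Limited by the largest available instance size; an EBS-backed instance must be stopped to change its type, which causes downtime; the instance remains a single point of failure.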
Horizontal Scaling (Scale Out/In)
- Adds or removes multiple resources to meet demand.
- Example: Adding more EC2 instances behind a load balancer.
- Pros: Can handle very large workloads, more fault-tolerant.
- Cons: More complex to manage; works best when the application is stateless or stores session state outside the instances.
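
As a rough illustration of horizontal scaling, the sketch below estimates how many identical instances a given peak load would need. The function name, the per-instance throughput, and the 25% headroom are all hypothetical values chosen for the example, not AWS figures.

```python
import math


def instances_needed(peak_rps: float, rps_per_instance: float, headroom: float = 0.25) -> int:
    """Estimate fleet size for a peak load, keeping some spare capacity.

    All inputs are illustrative assumptions, not AWS benchmarks.
    """
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    required = peak_rps * (1 + headroom) / rps_per_instance
    # Keep at least two instances so a single failure does not take the app down.
    return max(2, math.ceil(required))


if __name__ == "__main__":
    print(instances_needed(1000, 200))  # 1000 rps at 200 rps per instance, 25% headroom
```

Scaling vertically would instead mean raising `rps_per_instance` by choosing a larger instance type, which only works until the largest size is reached.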

The exam focuses more on horizontal scaling, because it is the foundation of high availability and elasticity.

Key AWS Scalability Services

1. Amazon EC2 Auto Scaling

What it is: Automatically adjusts the number of EC2 instances in an Auto Scaling group based on demand, and replaces instances that fail health checks.
How it works:
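1. Define a launch template (AMI, instance type, security groups, etc.).
2. Create an Auto Scaling group with a minimum, maximum, and desired capacity, ideally spanning subnets in multiple Availability Zones.
3. Set scaling policies (conditions to add or remove instances).
4. Optionally attach the group to a load balancer to distribute traffic.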
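
The setup steps above can be sketched as a request payload. This is a minimal sketch, assuming a launch template already exists; the helper `build_asg_request` and the resource names are hypothetical. It only builds the keyword arguments for the EC2 Auto Scaling `CreateAutoScalingGroup` API and does not call AWS.

```python
def build_asg_request(name, launch_template_name, subnet_ids, target_group_arns=(),
                      min_size=2, max_size=6, desired=2):
    """Build keyword arguments for a CreateAutoScalingGroup request.

    Hypothetical helper for illustration; it performs no AWS calls.
    """
    if not (min_size <= desired <= max_size):
        raise ValueError("desired capacity must lie between min_size and max_size")
    request = {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": launch_template_name, "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }
    if target_group_arns:
        # Attaching a load balancer target group is optional.
        request["TargetGroupARNs"] = list(target_group_arns)
        request["HealthCheckType"] = "ELB"
        request["HealthCheckGracePeriod"] = 300  # seconds; an assumed value
    return request


if __name__ == "__main__":
    print(build_asg_request("web-asg", "web-template", ["subnet-aaa", "subnet-bbb"]))
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("autoscaling").create_auto_scaling_group(**request)`.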
Types of Scaling Policies:
- Target Tracking Scaling: Adjusts instances to keep a metric (like CPU usage) at a target value.
- Step Scaling: Adds/removes instances in steps whose size depends on how far a CloudWatch alarm threshold is breached.
- Scheduled Scaling: Adds or removes instances at specific times (e.g., peak hours).
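
To make the difference between target tracking and step scaling concrete, here is a simplified model of each. It is a sketch under stated assumptions: the proportional formula only approximates how target tracking behaves (the real service works through CloudWatch alarms and scales in more cautiously), and both function names are hypothetical.

```python
import math


def target_tracking_capacity(current, metric_value, target_value, min_size, max_size):
    """Approximate target tracking: size the fleet so the metric returns to the target."""
    proposed = math.ceil(current * metric_value / target_value)
    return max(min_size, min(max_size, proposed))


def step_adjustment(metric_value, threshold, steps):
    """Return the capacity change for a step policy.

    steps is a list of (lower_offset, upper_offset_or_None, change) tuples,
    with offsets measured relative to the alarm threshold.
    """
    breach = metric_value - threshold
    for lower, upper, change in steps:
        if breach >= lower and (upper is None or breach < upper):
            return change
    return 0  # threshold not breached, so no change


if __name__ == "__main__":
    print(target_tracking_capacity(4, 90, 60, min_size=2, max_size=10))
    print(step_adjustment(90, 70, [(0, 15, 1), (15, None, 3)]))
```

Target tracking needs only one number (the target), while a step policy lets you respond harder to larger breaches.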
Use Case Example:
If CPU usage goes above 70% for 5 minutes, Auto Scaling adds 2 more EC2 instances automatically. When CPU drops below 30%, it removes the extra instances.
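
The use case above can be simulated with a few lines of Python. This is a toy model, assuming one CPU sample per minute, a scale-in rule that mirrors the scale-out rule (5 consecutive minutes below 30%, removing 2 instances), and no cooldown or instance warm-up; the function name and defaults are hypothetical.

```python
def simulate(cpu_samples, start=2, min_size=2, max_size=10,
             high=70.0, low=30.0, sustain=5, step=2):
    """Return the group capacity after each one-minute CPU sample."""
    capacity = start
    high_run = low_run = 0  # consecutive minutes above / below the thresholds
    history = []
    for cpu in cpu_samples:
        high_run = high_run + 1 if cpu > high else 0
        low_run = low_run + 1 if cpu < low else 0
        if high_run >= sustain:
            capacity = min(max_size, capacity + step)  # scale out
            high_run = 0
        elif low_run >= sustain:
            capacity = max(min_size, capacity - step)  # scale in
            low_run = 0
        history.append(capacity)
    return history


if __name__ == "__main__":
    samples = [80] * 5 + [50] * 3 + [20] * 5
    print(simulate(samples))
```

The minimum and maximum sizes act as guard rails: the group never shrinks below `min_size` or grows past `max_size`, whatever the metric does.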

2. AWS Auto Scaling (Application-Level Scaling)

What it is: Manages scaling across multiple AWS services, not just EC2, through a single scaling plan.
Services that can scale automatically:
- EC2 instances (via Auto Scaling groups)
- ECS services (number of running tasks)
- DynamoDB tables and global secondary indexes (read/write capacity)
- Aurora Replicas (database read replicas)
- Spot Fleet requests (managed Spot Instances)
How it works:
- Create a scaling plan for multiple resources.
- Use CloudWatch metrics to trigger scaling actions.
Use Case Example:
An application uses EC2, DynamoDB, and Aurora. AWS Auto Scaling can automatically adjust all three services based on traffic, keeping performance high and costs optimized.

3. Elastic Load Balancing (ELB)

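Why it matters for scalability: ELB distributes incoming traffic across multiple targets (such as EC2 instances, containers, or IP addresses), ideally in more than one Availability Zone.
Types:
- Application Load Balancer (ALB): Layer 7; for HTTP/HTTPS and web applications.
- Network Load Balancer (NLB): Layer 4; for TCP/UDP/TLS traffic and high-performance, low-latency needs.
- Gateway Load Balancer (GWLB): For deploying and scaling third-party network appliances such as firewalls.
Role in scaling: Instances launched by Auto Scaling are registered with the load balancer automatically and start receiving traffic once they pass health checks; terminated instances are deregistered.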
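
The effect of adding a healthy instance behind a load balancer can be shown with a toy round-robin model. This is only a sketch: real load balancers apply their own routing algorithms and health checks, and the `distribute` function and instance IDs here are hypothetical.

```python
from collections import Counter
from itertools import cycle


def distribute(request_count, targets):
    """Spread requests round-robin over healthy targets.

    targets maps a target id to True (healthy) or False (unhealthy).
    """
    healthy = [target for target, ok in targets.items() if ok]
    if not healthy:
        raise RuntimeError("no healthy targets to route to")
    rotation = cycle(healthy)
    return Counter(next(rotation) for _ in range(request_count))


if __name__ == "__main__":
    fleet = {"i-a": True, "i-b": True, "i-c": False}  # i-c fails its health check
    print(distribute(9, fleet))
    fleet["i-d"] = True  # instance added by a scale-out event
    print(distribute(9, fleet))
```

Unhealthy targets receive nothing, and each scale-out lowers the share of traffic every remaining instance has to serve.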

4. Amazon ECS & AWS Fargate

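ECS (Elastic Container Service): Service Auto Scaling can adjust the number of running tasks based on metrics such as CPU or memory utilization.
Fargate: Serverless compute for containers; AWS provisions and manages the underlying compute, so there are no EC2 instances to scale. You still configure scaling for the number of tasks.
Use Case Example:
A microservice receives sudden traffic spikes. With Service Auto Scaling configured, ECS launches additional tasks on Fargate to handle the load.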
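
ECS task scaling is configured through Application Auto Scaling. The sketch below builds the two request payloads involved (a scalable target and a target tracking policy) without calling AWS. The helper name, cluster and service names, cooldowns, and the 60% CPU target are assumptions made for illustration.

```python
def ecs_scaling_requests(cluster, service, min_tasks=2, max_tasks=10, cpu_target=60.0):
    """Build Application Auto Scaling payloads for an ECS service.

    Hypothetical helper for illustration; it performs no AWS calls.
    """
    common = {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
    }
    # Payload for RegisterScalableTarget: bounds on the task count.
    target = {**common, "MinCapacity": min_tasks, "MaxCapacity": max_tasks}
    # Payload for PutScalingPolicy: keep average CPU near the target value.
    policy = {
        **common,
        "PolicyName": f"{service}-cpu-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": cpu_target,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
            "ScaleOutCooldown": 60,   # seconds; assumed values
            "ScaleInCooldown": 120,
        },
    }
    return target, policy


if __name__ == "__main__":
    target, policy = ecs_scaling_requests("prod-cluster", "orders")
    print(target)
    print(policy)
```

With boto3, these could be sent via `client = boto3.client("application-autoscaling")`, then `client.register_scalable_target(**target)` and `client.put_scaling_policy(**policy)`.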

5. Amazon DynamoDB Auto Scaling

What it is: Automatically adjusts the provisioned read/write capacity of DynamoDB tables and global secondary indexes.
How it works:
- Set target utilization (e.g., 70% of provisioned capacity) along with minimum and maximum capacity.
- DynamoDB adjusts provisioned throughput up or down within those limits.
Benefit: No need to manually monitor or provision throughput; reduces throttling when traffic grows and cost when it falls.
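
The target utilization arithmetic can be worked through directly: if consumed capacity should sit at 70% of provisioned capacity, then provisioned capacity is roughly consumed / 0.70, clamped to the configured limits. The function below is a simplified sketch with hypothetical names and default limits; the real service reacts to sustained CloudWatch metrics rather than a single reading.

```python
import math


def provisioned_for_target(consumed_units, target_percent=70, min_units=5, max_units=1000):
    """Capacity units needed so that consumption sits at the target utilization."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in (0, 100]")
    wanted = math.ceil(consumed_units * 100 / target_percent)
    # Never go below the minimum or above the maximum configured capacity.
    return max(min_units, min(max_units, wanted))


if __name__ == "__main__":
    # 350 consumed read units at a 70% target
    print(provisioned_for_target(350))
```

When demand would push the result past `max_units`, capacity stays capped and requests above it can still be throttled, so the maximum should be chosen with peak traffic in mind.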

Key Exam Points

Elasticity ≠ Scalability:
- Elasticity is the ability to add and release resources automatically as demand changes.
- Scalability can be manual or automatic, horizontal or vertical.
EC2 Auto Scaling vs AWS Auto Scaling:
- EC2 Auto Scaling: Only for EC2 instances.
- AWS Auto Scaling: Can scale multiple services (EC2, Spot Fleet, ECS, DynamoDB, Aurora) through scaling plans.
Auto Scaling Policies:
- Target Tracking: Keep metrics at a target value.
- Step Scaling: Adjust resources in steps based on thresholds.
- Scheduled Scaling: Adjust resources at specific times.
Monitoring Metrics:
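- Common metrics: CPUUtilization, request count per target (ALB), and queue length (SQS). Memory utilization is not published by EC2 by default; it requires the CloudWatch agent and a custom metric.
- Scaling policies use CloudWatch alarms to trigger scaling actions.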
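
How an alarm decides to fire can be sketched as an "M out of N datapoints" check. This is a simplified model, assuming a greater-than comparison and ignoring missing-data handling; the function name is hypothetical, while the state names match those CloudWatch uses.

```python
def alarm_state(datapoints, threshold, evaluation_periods=3, datapoints_to_alarm=3):
    """Evaluate the most recent datapoints against a threshold."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in recent if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


if __name__ == "__main__":
    cpu = [50, 75, 80, 90]  # one average CPU datapoint per period
    print(alarm_state(cpu, threshold=70))
```

Requiring several breaching datapoints keeps a brief spike from launching instances that are no longer needed by the time they start.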
Load Balancers and Scaling:
- For request-driven workloads, pair Auto Scaling with ELB so traffic is spread across all healthy instances.
Cost Optimization:
- Scale in when demand decreases to save cost.
- Use a mix of On-Demand and Spot Instances where the workload tolerates interruption, and cover steady baseline capacity with Reserved Instances or Savings Plans.

Simple Exam Tip

Think of it this way:

If traffic or demand increases, a properly configured scaling policy adds resources automatically (Scale Out).
If demand decreases, it removes resources automatically (Scale In).
Auto Scaling + CloudWatch metrics + Load Balancer = High-performing, elastic system.