Task Statement 3.2: Design high-performing and elastic compute solutions.
📘 AWS Certified Solutions Architect – Associate (SAA-C03)
1. What Scaling Means in AWS
In AWS, scaling is about adjusting your computing resources to match the demand:
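- Scale Out – Add more resources (e.g., EC2 instances, containers) when demand increases. This is horizontal scaling; scaling up (vertical scaling) instead means moving to a larger instance size.
- Scale In – Remove resources when demand decreases to save cost. Scaling down is the vertical counterpart: moving to a smaller instance size.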
To do this automatically, AWS needs metrics (measurements of your system) and conditions (rules that trigger actions).
2. Key Metrics for Scaling
Metrics are numbers that tell you how busy or stressed your system is. In AWS, these are often monitored via Amazon CloudWatch.
Here are the common metrics used for scaling:
a) CPU Utilization
- Measures how much of the CPU capacity is being used by your servers (EC2 instances, containers, etc.).
- Example: If CPU usage exceeds 70% for several minutes, the system might need more instances (scale out).
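The "exceeds 70% for several minutes" condition can be sketched in plain Python. This is only an illustration of the idea of a sustained breach, not how CloudWatch evaluates alarms internally; the function name `breaches_threshold` and the sample readings are made up for the example.

```python
from typing import Sequence


def breaches_threshold(datapoints: Sequence[float], threshold: float, periods: int) -> bool:
    """Return True if the most recent `periods` datapoints all exceed `threshold`."""
    if periods <= 0 or len(datapoints) < periods:
        return False  # not enough data to decide
    return all(value > threshold for value in datapoints[-periods:])


# One-minute average CPU readings in percent, oldest first (illustrative values).
cpu_samples = [55.0, 72.5, 74.0, 81.2, 78.9, 76.3]

# "CPU above 70% for 5 consecutive minutes" -> candidate for scale out.
should_scale_out = breaches_threshold(cpu_samples, threshold=70.0, periods=5)
print(should_scale_out)
```

Requiring several consecutive datapoints, rather than a single reading, keeps a short spike from triggering a scaling action.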
b) Memory Utilization
- Measures how much RAM is used by your applications.
- Not published to CloudWatch by default for EC2 instances; install the CloudWatch agent on the instance to collect it.
- Example: If memory usage is above 80%, add more instances.
c) Network Traffic
- Measures data sent/received by your servers (in/out traffic).
- Example: If network traffic exceeds a certain threshold, you might need more instances behind a load balancer.
d) Disk I/O
- Measures the speed/volume of reading/writing to storage (EBS volumes, for example).
- Example: High disk read/write can slow down applications → scale out or increase instance size.
e) Custom Application Metrics
- You can define your own metrics from the application, like:
  - Number of requests per second
  - Queue length in SQS
  - Active users connected
Custom metrics are especially important when CPU or memory isn’t enough to measure application load.
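As a rough sketch of a custom metric, the snippet below derives a "backlog per instance" value from an SQS queue length and builds the request parameters for CloudWatch's `PutMetricData` API. The namespace `MyApp/Workers`, the metric name, the group name, and the numbers are all hypothetical; no AWS call is made here.

```python
def backlog_per_instance(visible_messages: int, running_instances: int) -> float:
    """Messages waiting in the queue divided by the number of running workers."""
    # Treat zero instances as one to avoid division by zero.
    return visible_messages / max(running_instances, 1)


def build_metric_payload(namespace: str, metric_name: str, value: float, group_name: str) -> dict:
    """Build keyword arguments in the shape expected by CloudWatch PutMetricData."""
    return {
        "Namespace": namespace,
        "MetricData": [
            {
                "MetricName": metric_name,
                "Dimensions": [{"Name": "AutoScalingGroupName", "Value": group_name}],
                "Value": value,
                "Unit": "Count",
            }
        ],
    }


# Hypothetical values: 1200 visible messages shared by 4 worker instances.
payload = build_metric_payload(
    namespace="MyApp/Workers",
    metric_name="BacklogPerInstance",
    value=backlog_per_instance(1200, 4),
    group_name="my-worker-asg",
)
print(payload["MetricData"][0]["Value"])

# With boto3 installed and credentials configured, the payload could be sent with:
#   boto3.client("cloudwatch").put_metric_data(**payload)
```

A per-instance backlog tends to be a better scaling signal than raw queue length, because it stays comparable as the number of workers changes.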
3. Conditions to Trigger Scaling Actions
Once metrics are tracked, you need conditions or thresholds that tell AWS when to scale:
- Threshold-based (simple) scaling: Scale by a fixed amount when a metric exceeds (or drops below) a certain value.
  - Example: “Add 1 EC2 instance if CPU > 70% for 5 minutes.”
- Step scaling: Scale by different amounts depending on how far the metric is from the threshold.
  - Example: CPU at 80% → add 1 instance, CPU at 90% → add 3 instances.
- Target tracking: Automatically keeps a metric at a target value.
  - Example: Keep average CPU at 50% by adding/removing instances automatically.
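The arithmetic behind step scaling and target tracking can be sketched as follows. The step table reuses the 80% and 90% example above. The target tracking function is a simplified proportional model chosen for illustration; it is an assumption, not the algorithm AWS actually runs.

```python
import math

# (lower bound in percent CPU, instances to add), taken from the example above.
STEPS = [(80.0, 1), (90.0, 3)]


def step_adjustment(cpu_percent: float, steps=STEPS) -> int:
    """Return how many instances to add for the highest step the metric has reached."""
    adjustment = 0
    for lower_bound, instances_to_add in sorted(steps):
        if cpu_percent >= lower_bound:
            adjustment = instances_to_add
    return adjustment


def target_tracking_capacity(current_capacity: int, metric_value: float, target_value: float) -> int:
    """Estimate the capacity that would bring the average metric back to the target.

    Assumes load spreads evenly, so the metric scales inversely with capacity.
    """
    return max(1, math.ceil(current_capacity * metric_value / target_value))


print(step_adjustment(85.0))                     # between 80 and 90
print(step_adjustment(95.0))                     # at or above 90
print(target_tracking_capacity(4, 75.0, 50.0))   # above target: capacity grows
print(target_tracking_capacity(4, 25.0, 50.0))   # below target: capacity shrinks
```

Step scaling needs you to pick the thresholds and step sizes yourself, while target tracking only needs the target value, which is why it is usually the simpler choice.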
4. AWS Services that Use Metrics and Conditions
a) Auto Scaling Groups (ASG)
- Automatically adds/removes EC2 instances.
- Uses CloudWatch metrics and scaling policies.
- Policies:
  - Simple scaling – Trigger one fixed adjustment based on a single metric threshold, then wait for the cooldown before acting again.
  - Step scaling – Trigger different adjustments based on how far the metric exceeds the threshold.
  - Target tracking – Automatically adjust capacity to maintain a metric (like CPU) at a target value.
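For illustration, the request parameters for two of these policy types might look like the dictionaries below, in the shape accepted by the EC2 Auto Scaling `PutScalingPolicy` API (`put_scaling_policy` on the boto3 `autoscaling` client). The group name `web-asg`, the policy names, and the assumed alarm threshold of 80% are hypothetical; the code only builds the parameters and does not call AWS.

```python
def target_tracking_policy(group_name: str, target_cpu: float) -> dict:
    """Parameters for a target tracking policy on average CPU utilization."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": "cpu-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu,
        },
    }


def step_scaling_policy(group_name: str) -> dict:
    """Parameters for a step scaling policy.

    Step bounds are offsets from the alarm threshold. Assuming the alarm
    fires at 80% CPU: 80-90% adds 1 instance, 90% and above adds 3.
    """
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": "cpu-step-scale-out",
        "PolicyType": "StepScaling",
        "AdjustmentType": "ChangeInCapacity",
        "MetricAggregationType": "Average",
        "StepAdjustments": [
            {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 10.0, "ScalingAdjustment": 1},
            {"MetricIntervalLowerBound": 10.0, "ScalingAdjustment": 3},
        ],
    }


tracking = target_tracking_policy("web-asg", 50.0)
stepping = step_scaling_policy("web-asg")
print(tracking["PolicyType"], stepping["PolicyType"])

# With boto3 installed and credentials configured:
#   boto3.client("autoscaling").put_scaling_policy(**tracking)
```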
b) Application Auto Scaling
- Used for other resources like:
  - ECS/Fargate tasks
  - DynamoDB read/write capacity
  - Aurora replicas
- Works the same way: metrics → condition → scaling action.
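A minimal sketch of the same pattern for an ECS service: first register the service as a scalable target, then attach a target tracking policy. The dictionaries follow the shape of the Application Auto Scaling `RegisterScalableTarget` and `PutScalingPolicy` APIs; the cluster name, service name, task limits, and cooldown values are hypothetical, and nothing is sent to AWS.

```python
def ecs_scalable_target(cluster: str, service: str, min_tasks: int, max_tasks: int) -> dict:
    """Parameters that register an ECS service's desired task count as scalable."""
    return {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "MinCapacity": min_tasks,
        "MaxCapacity": max_tasks,
    }


def ecs_cpu_policy(cluster: str, service: str, target_cpu: float) -> dict:
    """Parameters for a target tracking policy on the service's average CPU."""
    return {
        "PolicyName": "ecs-cpu-target-tracking",
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_cpu,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
            "ScaleOutCooldown": 60,   # seconds, illustrative
            "ScaleInCooldown": 300,   # seconds, illustrative
        },
    }


target = ecs_scalable_target("demo-cluster", "web", min_tasks=2, max_tasks=10)
policy = ecs_cpu_policy("demo-cluster", "web", target_cpu=50.0)
print(target["ResourceId"])

# With boto3 installed and credentials configured:
#   client = boto3.client("application-autoscaling")
#   client.register_scalable_target(**target)
#   client.put_scaling_policy(**policy)
```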
c) Amazon CloudWatch Alarms
- Monitors metrics.
- Triggers simple and step scaling policies when a threshold is breached. Target tracking policies create and manage their own alarms.
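To make the alarm-to-policy link concrete, here is a hedged sketch of the parameters for a CloudWatch alarm that watches average CPU across an Auto Scaling group and invokes a scaling policy. The shape follows the CloudWatch `PutMetricAlarm` API; the group name and the policy ARN placeholder are hypothetical, and no request is sent.

```python
def high_cpu_alarm(group_name: str, policy_arn: str, threshold: float = 70.0) -> dict:
    """Parameters for an alarm on average CPU > threshold over one 5-minute period."""
    return {
        "AlarmName": f"{group_name}-high-cpu",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": group_name}],
        "Statistic": "Average",
        "Period": 300,              # seconds per datapoint
        "EvaluationPeriods": 1,     # consecutive datapoints that must breach
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [policy_arn],  # scaling policy to invoke when in ALARM
    }


# Placeholder ARN; a real one is returned when the scaling policy is created.
alarm = high_cpu_alarm("web-asg", "<scaling-policy-arn>")
print(alarm["AlarmName"])

# With boto3 installed and credentials configured:
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```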
5. Best Practices for Scaling
- Choose the right metrics – Use metrics that truly represent system load. CPU alone might not be enough.
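- Set realistic thresholds – Thresholds that are too sensitive cause frequent scaling back and forth. A scale-out threshold that is too high leaves you under-provisioned; a scale-in threshold that is too low leaves you over-provisioned.
- Use cooldown periods – Prevents launching/removing too many resources too quickly. For EC2 Auto Scaling, cooldowns apply to simple scaling policies; step scaling and target tracking rely on instance warm-up instead.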
- Monitor and adjust – Continuously review metrics and adjust scaling rules.
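The effect of a cooldown period can be sketched with a small gate that rejects scaling actions arriving too soon after the previous one. This is a toy model for intuition only; the class name `CooldownGate` and the timestamps are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CooldownGate:
    """Allow a scaling action only if the cooldown since the last one has elapsed."""

    cooldown_seconds: float
    last_action_at: Optional[float] = None

    def try_scale(self, now: float) -> bool:
        in_cooldown = (
            self.last_action_at is not None
            and now - self.last_action_at < self.cooldown_seconds
        )
        if in_cooldown:
            return False  # ignore the trigger; the previous action is still settling
        self.last_action_at = now
        return True


gate = CooldownGate(cooldown_seconds=300)

# Alarm triggers arriving at these times (seconds); only some result in an action.
decisions = [gate.try_scale(t) for t in (0, 120, 299, 300, 450)]
print(decisions)
```

Triggers that arrive during the cooldown are dropped, which gives newly launched capacity time to start absorbing load before another adjustment is made.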
6. Summary (Exam-Focused)
For the exam, remember:
| Concept | Key Points |
|---|---|
| Metrics | CPU, Memory, Network, Disk I/O, Custom metrics |
| Conditions | Threshold-based, Step scaling, Target tracking |
| AWS Services | Auto Scaling Groups, Application Auto Scaling, CloudWatch Alarms |
| Best Practices | Right metric, realistic threshold, cooldown, monitoring |
Tip: The exam often asks scenario-based questions like:
“Your web application sees CPU spikes at 80%. Which service and scaling policy would automatically maintain performance while controlling cost?”
Answer: Auto Scaling Group with target tracking or step scaling on the CPU utilization metric.
