Task Statement 4.2: Design cost-optimized compute solutions.
📘 AWS Certified Solutions Architect – Associate (SAA-C03)
1. What is Scaling?
Scaling means adjusting your compute resources (like servers) based on demand.
Why scaling is important:
- Avoid paying for unused resources (cost optimization)
- Maintain performance during high traffic
- Ensure high availability and fault tolerance
2. Types of Scaling in AWS
There are two main types of scaling:
A. Vertical Scaling (Scale Up/Down)
- Increase or decrease the size of a single instance
- Example:
- Move from t3.micro → m5.large
Key points:
- Requires stopping and restarting the instance to change its type (brief downtime)
- Limited by maximum instance size
- Not highly available (single instance)
Exam Tip:
Vertical scaling is NOT preferred for highly available architectures.
B. Horizontal Scaling (Scale Out/In)
- Add or remove multiple instances
- Example:
- Add more EC2 instances behind a load balancer
Key points:
- Uses multiple instances
- Works with Elastic Load Balancing (ELB)
- Supports high availability
Exam Tip:
Horizontal scaling is the preferred approach in AWS.
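To make scale out/in concrete, the instance count can be derived from the load. This is a minimal sketch only: the function name and the assumed capacity of 500 requests per second per instance are hypothetical values chosen for illustration.

```python
import math


def instances_needed(requests_per_second: float,
                     capacity_per_instance: float = 500.0) -> int:
    """Return how many identical instances are needed to serve the load.

    capacity_per_instance is an assumed figure, not an AWS value.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Keep at least one instance so the service stays reachable.
    return max(1, math.ceil(requests_per_second / capacity_per_instance))


print(instances_needed(1200))  # 3
print(instances_needed(300))   # 1
```

Adding instances (scale out) as the load grows and removing them (scale in) as it drops is what keeps cost proportional to demand.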
3. Auto Scaling (Core Concept)
What is Auto Scaling?
Auto Scaling automatically adds or removes EC2 instances based on demand.
Main service:
- Amazon EC2 Auto Scaling
Components of Auto Scaling
1. Launch Template (preferred) / Launch Configuration (legacy)
Defines:
- AMI (Operating System)
- Instance type
- Security groups
- Key pair
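The four settings above map onto the fields of a launch template. The sketch below only builds the data structure; the key names follow the `LaunchTemplateData` shape accepted by the EC2 `create_launch_template` API, while the helper name and all IDs are hypothetical placeholders.

```python
def build_launch_template_data(ami_id, instance_type, security_group_ids, key_name):
    """Assemble launch template settings as a plain dictionary.

    No AWS call is made here; the result could be passed as
    LaunchTemplateData to an EC2 client's create_launch_template call.
    """
    return {
        "ImageId": ami_id,                       # AMI (operating system)
        "InstanceType": instance_type,           # instance type
        "SecurityGroupIds": list(security_group_ids),  # security groups
        "KeyName": key_name,                     # key pair
    }


# Placeholder values for illustration only.
template = build_launch_template_data(
    ami_id="ami-0123456789abcdef0",
    instance_type="t3.micro",
    security_group_ids=["sg-0123456789abcdef0"],
    key_name="example-key",
)
print(template["InstanceType"])  # t3.micro
```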
2. Auto Scaling Group (ASG)
- Group of EC2 instances
- Controls scaling behavior
Key settings:
- Minimum capacity → always running
- Desired capacity → target number
- Maximum capacity → upper limit
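The relationship between the three settings can be sketched as a simple clamp: whatever a scaling policy requests, the group stays between the minimum and the maximum. The function name is hypothetical.

```python
def clamp_desired(desired: int, minimum: int, maximum: int) -> int:
    """Keep the desired capacity inside the group's min/max bounds."""
    if minimum > maximum:
        raise ValueError("minimum cannot exceed maximum")
    return max(minimum, min(desired, maximum))


print(clamp_desired(10, 2, 6))  # 6 (capped at the maximum)
print(clamp_desired(0, 2, 6))   # 2 (raised to the minimum)
print(clamp_desired(4, 2, 6))   # 4 (already within bounds)
```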
3. Scaling Policies
These define WHEN to scale.
Types of Scaling Policies
A. Target Tracking Scaling (Recommended)
- Automatically adjusts capacity to maintain a target metric
Example:
- Keep CPU utilization at 50%
Key points:
- Easiest to configure
- Most commonly used
- AWS manages scaling decisions
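The idea behind target tracking can be approximated with a proportional rule: if the metric is above the target, capacity grows by the same ratio. This is a simplified model for intuition only, not the algorithm AWS actually runs; the function name and parameters are hypothetical.

```python
import math


def target_tracking_capacity(current_capacity: int, metric_value: float,
                             target_value: float, minimum: int, maximum: int) -> int:
    """Estimate the capacity that would bring the metric back to its target.

    Assumes the metric scales inversely with the number of instances,
    which is a simplification.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    proposed = math.ceil(current_capacity * metric_value / target_value)
    # The result still has to respect the group's min/max settings.
    return max(minimum, min(proposed, maximum))


# 4 instances at 75% CPU with a 50% target
print(target_tracking_capacity(4, 75, 50, 2, 10))  # 6
# 4 instances at 25% CPU with a 50% target
print(target_tracking_capacity(4, 25, 50, 2, 10))  # 2
```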
B. Step Scaling
- Scale based on thresholds
Example:
- CPU > 70% → add 2 instances
- CPU > 90% → add 4 instances
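The two thresholds in the example can be sketched as a lookup from CPU utilization to an adjustment. Real step scaling policies define step ranges relative to a CloudWatch alarm threshold; this sketch uses the absolute values from the example, and the function name is hypothetical.

```python
def step_adjustment(cpu_percent: float) -> int:
    """Return how many instances to add for a given CPU utilization."""
    if cpu_percent > 90:
        return 4  # severe breach: larger step
    if cpu_percent > 70:
        return 2  # moderate breach: smaller step
    return 0      # below both thresholds: no change


print(step_adjustment(95))  # 4
print(step_adjustment(80))  # 2
print(step_adjustment(60))  # 0
```

The point of the steps is that a larger breach triggers a larger response, which simple scaling cannot express.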
C. Simple Scaling (Less used)
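- Makes a single capacity adjustment when an alarm fires, then waits for the cooldown period before responding to another alarm
- Slower and less flexible than step or target tracking scaling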
D. Scheduled Scaling
- Scale at specific times
Example:
- Increase instances during business hours
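The business-hours example can be sketched as a function of the clock. The hours (09:00 to 17:00, Monday to Friday) and the capacities are assumptions for illustration, and the function name is hypothetical.

```python
from datetime import datetime


def scheduled_capacity(now: datetime, business_capacity: int = 6,
                       off_hours_capacity: int = 2) -> int:
    """Return the desired capacity for the given moment.

    Assumes business hours are 09:00-17:00 on weekdays.
    """
    is_weekday = now.weekday() < 5          # Monday is 0, Sunday is 6
    in_business_hours = 9 <= now.hour < 17
    if is_weekday and in_business_hours:
        return business_capacity
    return off_hours_capacity


print(scheduled_capacity(datetime(2024, 1, 8, 10, 0)))  # Monday morning: 6
print(scheduled_capacity(datetime(2024, 1, 8, 20, 0)))  # Monday evening: 2
print(scheduled_capacity(datetime(2024, 1, 6, 10, 0)))  # Saturday: 2
```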
Scaling Based on Metrics
Auto Scaling uses Amazon CloudWatch metrics, such as:
- CPU utilization
- Network traffic
- Request count (with load balancer)
Cooldown Period
- Time Auto Scaling waits after a scaling activity before starting another one
- Prevents rapid scaling in/out
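A cooldown can be sketched as a gate that blocks new scaling actions until enough time has passed since the last one. The 300-second value and the function name are assumptions for illustration.

```python
def can_scale(now_seconds: float, last_scaling_seconds, cooldown_seconds: float = 300) -> bool:
    """Return True if the cooldown since the last scaling action has elapsed."""
    if last_scaling_seconds is None:
        return True  # nothing has scaled yet, so nothing to wait for
    return now_seconds - last_scaling_seconds >= cooldown_seconds


print(can_scale(1000, None))  # True
print(can_scale(1000, 900))   # False (only 100 s since the last action)
print(can_scale(1300, 900))   # True  (400 s since the last action)
```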
Health Checks
Auto Scaling detects and replaces unhealthy instances using:
- EC2 status checks
- ELB health checks
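The replacement decision can be sketched as a filter over the fleet: an instance that fails either check is marked for replacement. The dictionary keys, instance IDs, and function name are hypothetical.

```python
def unhealthy_instance_ids(instances):
    """Return the IDs of instances failing the EC2 or the ELB check.

    Each instance is a dict with assumed keys:
    id, ec2_status_ok, elb_healthy.
    """
    return [
        instance["id"]
        for instance in instances
        if not (instance["ec2_status_ok"] and instance["elb_healthy"])
    ]


fleet = [
    {"id": "i-aaa", "ec2_status_ok": True, "elb_healthy": True},
    {"id": "i-bbb", "ec2_status_ok": True, "elb_healthy": False},
    {"id": "i-ccc", "ec2_status_ok": False, "elb_healthy": True},
]
print(unhealthy_instance_ids(fleet))  # ['i-bbb', 'i-ccc']
```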
Benefits of Auto Scaling
- Cost optimization → only pay for what you use
- High availability → replaces failed instances
- Elasticity → handles variable workloads automatically
Exam Tips for Auto Scaling
- Use Target Tracking when possible
- Combine with Elastic Load Balancing (ELB) so traffic is spread across the instances
- Use multiple Availability Zones for high availability
- Auto Scaling is dynamic, unlike manual scaling
4. EC2 Hibernation (Cost Optimization Strategy)
What is Hibernation?
Hibernation allows you to pause an EC2 instance and save its RAM (memory) state to disk.
When restarted:
- The instance resumes exactly where it left off
How Hibernation Works
- RAM content is saved to the EBS root volume
- Instance is stopped
- When restarted, memory is restored
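Hibernation is requested through the normal stop call with a hibernate flag. The sketch below assumes a boto3-style EC2 client whose `stop_instances` method accepts `InstanceIds` and `Hibernate`; the `RecordingClient` class is a hypothetical stand-in so the example runs without AWS credentials, and the instance ID is a placeholder.

```python
class RecordingClient:
    """Hypothetical stand-in for an EC2 client, used for a dry run."""

    def __init__(self):
        self.calls = []

    def stop_instances(self, **kwargs):
        # Record the request instead of contacting AWS.
        self.calls.append(kwargs)
        return {"StoppingInstances": [{"InstanceId": i} for i in kwargs["InstanceIds"]]}


def hibernate_instance(ec2_client, instance_id: str):
    """Stop the instance and ask for its RAM to be saved to the root volume."""
    return ec2_client.stop_instances(InstanceIds=[instance_id], Hibernate=True)


client = RecordingClient()
response = hibernate_instance(client, "i-0123456789abcdef0")
print(client.calls[0]["Hibernate"])  # True
```

With a real client (for example `boto3.client("ec2")`), the same function would send the request to AWS; starting the instance again restores the saved memory.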
Difference Between Stop vs Hibernate
| Feature | Stop | Hibernate |
|---|---|---|
| RAM saved | ❌ No | ✅ Yes |
| Restart time | Normal boot | Faster resume |
| Application state | Lost | Preserved |
Requirements for Hibernation
- Supported instance types only
- Must use EBS-backed instances
- Root volume must be encrypted
- Instance RAM must be less than 150 GiB (the limit is lower on Windows)
- Supported operating systems (Linux/Windows)
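The requirements can be read as a checklist. The sketch below is a hypothetical pre-flight check, not an AWS API: the class, field names, and messages are invented for illustration, and the RAM limit is treated as strictly less than 150 GiB.

```python
from dataclasses import dataclass


@dataclass
class InstanceSpec:
    """Assumed description of an instance; field names are hypothetical."""
    instance_type_supported: bool
    ebs_backed: bool
    root_volume_encrypted: bool
    ram_gib: float
    os_supported: bool


def hibernation_blockers(spec: InstanceSpec) -> list:
    """Return the list of unmet requirements (empty if hibernation is possible)."""
    blockers = []
    if not spec.instance_type_supported:
        blockers.append("instance type not supported")
    if not spec.ebs_backed:
        blockers.append("instance is not EBS-backed")
    if not spec.root_volume_encrypted:
        blockers.append("root volume not encrypted")
    if spec.ram_gib >= 150:
        blockers.append("RAM too large")
    if not spec.os_supported:
        blockers.append("operating system not supported")
    return blockers


ok = InstanceSpec(True, True, True, 16, True)
bad = InstanceSpec(True, True, False, 256, True)
print(hibernation_blockers(ok))   # []
print(hibernation_blockers(bad))  # ['root volume not encrypted', 'RAM too large']
```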
Use Cases (IT-focused)
- Applications that take a long time to initialize
- In-memory processing workloads
- Development/testing environments with state
Cost Consideration
- You do not pay for instance compute while the instance is hibernated
- You still pay for EBS storage, including the space that holds the saved RAM contents
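The saving can be estimated with simple arithmetic: compute is billed only for running hours, while storage is billed all month. All prices and hours below are made-up figures for illustration, not AWS pricing, and the function name is hypothetical.

```python
def monthly_cost(hours_running: float, hourly_rate: float,
                 ebs_gb: float, ebs_rate_per_gb_month: float) -> float:
    """Return compute plus storage cost for one month."""
    compute = hours_running * hourly_rate          # billed only while running
    storage = ebs_gb * ebs_rate_per_gb_month       # billed even when hibernated
    return round(compute + storage, 2)


# Assumed figures: $0.10 per hour, 100 GB volume at $0.08 per GB-month.
always_on = monthly_cost(730, 0.10, 100, 0.08)
hibernated_off_hours = monthly_cost(200, 0.10, 100, 0.08)
print(always_on, hibernated_off_hours)
```

Under these assumed numbers the storage charge is identical in both cases; only the compute portion shrinks when the instance is hibernated for most of the month.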
Exam Tips for Hibernation
- Use when you need fast resume with preserved state
- Not suitable for:
- Stateless applications
- Auto Scaling groups (not commonly used together)
5. Auto Scaling vs Hibernation
| Feature | Auto Scaling | Hibernation |
|---|---|---|
| Purpose | Handle demand changes | Save state & reduce cost |
| Scaling | Yes | No |
| Automation | Fully automatic | Manual/limited |
| High availability | Yes | No |
| Cost optimization | Dynamic scaling | Idle cost reduction |
6. Best Practices for Exam
1. Prefer Auto Scaling + Load Balancer
- Ensures:
- High availability
- Fault tolerance
- Cost efficiency
2. Use Multiple AZs
- Always distribute instances across AZs
3. Choose Right Scaling Policy
- Default → Target Tracking
- Predictable workloads → Scheduled Scaling
4. Avoid Over-Provisioning
- Let Auto Scaling adjust resources dynamically
5. Use Hibernation for Stateful Workloads
- When application state must be preserved
7. Common Exam Scenarios
Scenario 1:
System must handle unpredictable traffic
✅ Use:
- Auto Scaling
- Target tracking policy
Scenario 2:
System runs only during office hours
✅ Use:
- Scheduled scaling
Scenario 3:
Application needs fast restart with saved memory
✅ Use:
- EC2 hibernation
Scenario 4:
High availability required
❌ Avoid:
- Single instance (vertical scaling)
✅ Use:
- Auto Scaling + ELB + Multi-AZ
Final Summary
- Scaling = adjusting resources based on demand
- Horizontal scaling + Auto Scaling = best practice
- Auto Scaling Group (ASG) is the core component
- Target Tracking is the most important scaling policy
- Hibernation is used to save instance state and reduce cost
- Combine Auto Scaling + ELB + Multi-AZ for best results
