Task Statement 4.2: Design cost-optimized compute solutions.

📘 AWS Certified Solutions Architect – Associate (SAA-C03)

1. What is Scaling?

Scaling means adjusting your compute resources (like servers) based on demand.

Why scaling is important:

Avoid paying for unused resources (cost optimization)
Maintain performance during high traffic
Ensure high availability and fault tolerance

2. Types of Scaling in AWS

There are two main types of scaling:

A. Vertical Scaling (Scale Up/Down)

Increase or decrease the size of a single instance
Example:
- Move from t3.micro → m5.large

Key points:

Requires stopping and restarting the instance (brief downtime)
Limited by maximum instance size
Not highly available (single instance)

Exam Tip:

Vertical scaling is NOT preferred for highly available architectures.

B. Horizontal Scaling (Scale Out/In)

Add or remove instances to match demand
Example:
- Add more EC2 instances behind a load balancer

Key points:

Uses multiple instances
Works with Elastic Load Balancing (ELB) to spread traffic across instances
Supports high availability

Exam Tip:

Horizontal scaling is the preferred approach in AWS.
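As a rough illustration of scaling out, the sketch below estimates how many identical instances are needed for a given load. The function name and the per-instance capacity figure are assumptions made for this example, not AWS values.

```python
import math


def instances_needed(requests_per_second: float, capacity_per_instance: float) -> int:
    """Estimate how many identical instances are needed to serve a load.

    capacity_per_instance is an assumed figure (requests/second one
    instance can handle); measure it for a real workload.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Round up: a fraction of an instance still requires a whole instance.
    return max(1, math.ceil(requests_per_second / capacity_per_instance))


print(instances_needed(450, 100))   # 5
print(instances_needed(1200, 100))  # 12
```

Scaling in works the same way in reverse: when the load drops, fewer instances are needed and the extra ones can be terminated.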

3. Auto Scaling (Core Concept)

What is Auto Scaling?

Auto Scaling automatically adds or removes EC2 instances based on demand.

Main service:

Amazon EC2 Auto Scaling

Components of Auto Scaling

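1. Launch Template / Launch Configuration

Defines how instances in the group are launched. AWS recommends launch templates; launch configurations are the older, legacy option. A template specifies:

AMI (operating system image)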
Instance type
Security groups
Key pair
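The four settings above can be pictured as a small data structure. The sketch below builds a plain dictionary whose keys follow the LaunchTemplateData field names of the EC2 CreateLaunchTemplate API (ImageId, InstanceType, SecurityGroupIds, KeyName). The helper name and all IDs are hypothetical placeholders, and nothing here calls AWS.

```python
def build_launch_template_data(ami_id, instance_type, security_group_ids, key_name):
    """Collect the launch settings into one dictionary (no AWS call is made)."""
    return {
        "ImageId": ami_id,                           # AMI (operating system image)
        "InstanceType": instance_type,               # e.g. t3.micro
        "SecurityGroupIds": list(security_group_ids),
        "KeyName": key_name,                         # key pair name
    }


# All identifiers below are made-up placeholders.
template = build_launch_template_data(
    ami_id="ami-EXAMPLE",
    instance_type="t3.micro",
    security_group_ids=["sg-EXAMPLE"],
    key_name="example-key",
)
print(sorted(template))
```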

2. Auto Scaling Group (ASG)

Group of EC2 instances
Controls scaling behavior

Key settings:

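Minimum capacity → the group never drops below this number of instances
Desired capacity → the number of instances the group tries to maintain
Maximum capacity → the group never grows beyond this number of instances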
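One way to picture how these three settings interact: whatever capacity is requested, the group keeps it between the minimum and the maximum. A minimal sketch, with a hypothetical function name:

```python
def effective_capacity(requested: int, minimum: int, maximum: int) -> int:
    """Clamp a requested capacity to the group's minimum/maximum bounds."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(requested, maximum))


print(effective_capacity(requested=12, minimum=2, maximum=10))  # 10
print(effective_capacity(requested=0, minimum=2, maximum=10))   # 2
print(effective_capacity(requested=5, minimum=2, maximum=10))   # 5
```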

3. Scaling Policies

These define WHEN to scale.

Types of Scaling Policies

A. Target Tracking Scaling (Recommended)

Automatically adjusts capacity to maintain a target metric

Example:

Keep CPU utilization at 50%

Key points:

Easiest to configure
Most commonly used
AWS manages scaling decisions
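The idea behind target tracking can be approximated with a proportional rule: if the metric is above the target, capacity grows in the same ratio. This is a simplified model for intuition only; AWS does not publish the exact algorithm, and the function name and parameters are assumptions.

```python
import math


def target_tracking_capacity(current_capacity, current_metric, target_metric,
                             minimum, maximum):
    """Simplified proportional model of target tracking (not AWS's real algorithm)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Capacity needed so the average metric falls back to the target.
    proposed = math.ceil(current_capacity * current_metric / target_metric)
    # Stay within the group's minimum and maximum capacity.
    return max(minimum, min(proposed, maximum))


# 4 instances at 75% CPU with a 50% target -> scale out to 6.
print(target_tracking_capacity(4, 75, 50, minimum=1, maximum=10))  # 6
# 4 instances at 25% CPU with a 50% target -> scale in to 2.
print(target_tracking_capacity(4, 25, 50, minimum=1, maximum=10))  # 2
```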

B. Step Scaling

Scales by different amounts depending on how far a CloudWatch alarm threshold is breached

Example:

CPU > 70% → add 2 instances
CPU > 90% → add 4 instances
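The two thresholds in the example above can be written as a small lookup. This sketch only mirrors the example's numbers; the function name is hypothetical and a real policy is defined through CloudWatch alarms and step adjustments.

```python
def step_adjustment(cpu_percent: float) -> int:
    """Return how many instances to add for a given CPU utilization."""
    if cpu_percent > 90:
        return 4   # larger breach, larger step
    if cpu_percent > 70:
        return 2
    return 0       # below every threshold: no change


print(step_adjustment(65))  # 0
print(step_adjustment(80))  # 2
print(step_adjustment(95))  # 4
```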

C. Simple Scaling (Less used)

Makes a single adjustment per alarm, then waits for the cooldown period to finish
Slower and less flexible than step or target tracking scaling

D. Scheduled Scaling

Scale at specific times

Example:

Increase instances during business hours
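Scheduled scaling amounts to a time-based rule. The sketch below assumes business hours of 09:00 to 18:00 on weekdays and made-up capacity numbers; in a real group the schedule is defined as scheduled actions rather than application code.

```python
from datetime import datetime

# Assumed business hours for this example (local time, weekdays only).
BUSINESS_START_HOUR = 9
BUSINESS_END_HOUR = 18


def scheduled_capacity(moment: datetime, business_capacity: int = 6,
                       off_hours_capacity: int = 2) -> int:
    """Return the desired capacity for a given point in time."""
    is_weekday = moment.weekday() < 5  # Monday=0 ... Friday=4
    in_hours = BUSINESS_START_HOUR <= moment.hour < BUSINESS_END_HOUR
    return business_capacity if (is_weekday and in_hours) else off_hours_capacity


print(scheduled_capacity(datetime(2024, 1, 8, 10, 0)))  # Monday 10:00 -> 6
print(scheduled_capacity(datetime(2024, 1, 8, 22, 0)))  # Monday 22:00 -> 2
print(scheduled_capacity(datetime(2024, 1, 6, 10, 0)))  # Saturday 10:00 -> 2
```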

Scaling Based on Metrics

Auto Scaling uses Amazon CloudWatch metrics, such as:

CPU utilization
Network traffic
Request count (with load balancer)

Cooldown Period

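Time Auto Scaling waits after a scaling activity before starting another one (300 seconds by default)
Prevents rapid scaling in/out while earlier changes are still taking effect
Applies mainly to simple scaling policies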
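A cooldown can be thought of as a simple time gate. The sketch below uses 300 seconds, the default cooldown for EC2 Auto Scaling; the function name and the use of plain second counters are assumptions for illustration.

```python
DEFAULT_COOLDOWN_SECONDS = 300  # default cooldown in EC2 Auto Scaling


def cooldown_elapsed(last_activity_ts: float, now_ts: float,
                     cooldown_seconds: int = DEFAULT_COOLDOWN_SECONDS) -> bool:
    """Return True if enough time has passed to allow another scaling action."""
    return (now_ts - last_activity_ts) >= cooldown_seconds


# Last scaling activity at t=1000 seconds.
print(cooldown_elapsed(1000, 1120))  # False: only 120 seconds have passed
print(cooldown_elapsed(1000, 1300))  # True: the full 300 seconds have passed
```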

Health Checks

Auto Scaling replaces unhealthy instances using:

EC2 status checks
ELB health checks
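The replacement decision can be sketched as a filter over instance health. The dictionary fields below are hypothetical, and the sketch assumes ELB health checks are enabled for the group, so an instance failing either check is replaced.

```python
def instances_to_replace(instances):
    """Return the IDs of instances that fail either health check.

    Each instance is a dict with hypothetical keys:
    'id', 'ec2_status_ok', and 'elb_healthy'.
    """
    return [
        instance["id"]
        for instance in instances
        if not (instance["ec2_status_ok"] and instance["elb_healthy"])
    ]


fleet = [
    {"id": "i-aaa", "ec2_status_ok": True, "elb_healthy": True},
    {"id": "i-bbb", "ec2_status_ok": False, "elb_healthy": True},
    {"id": "i-ccc", "ec2_status_ok": True, "elb_healthy": False},
]
print(instances_to_replace(fleet))  # ['i-bbb', 'i-ccc']
```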

Benefits of Auto Scaling

Cost optimization → only pay for what you use
High availability → replaces failed instances
Elasticity → handles variable workloads automatically
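To see where the cost saving comes from, compare a fleet sized for peak load all day with one that follows demand hour by hour. The demand profile and the hourly price below are invented for the example and are not AWS prices.

```python
def static_fleet_cost(hourly_demand, hourly_rate):
    """Cost when the fleet is sized for the peak hour all day long."""
    return round(max(hourly_demand) * len(hourly_demand) * hourly_rate, 2)


def scaled_fleet_cost(hourly_demand, hourly_rate):
    """Cost when the fleet size follows demand each hour."""
    return round(sum(hourly_demand) * hourly_rate, 2)


# Hypothetical day: 2 instances needed for 16 hours, 10 instances for 8 hours.
demand = [2] * 16 + [10] * 8
rate = 0.10  # assumed price per instance-hour, not a real AWS price

print(static_fleet_cost(demand, rate))  # 24.0
print(scaled_fleet_cost(demand, rate))  # 11.2
```

In this made-up profile the demand-following fleet costs less than half as much, because the peak lasts only a third of the day.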

Exam Tips for Auto Scaling

Use Target Tracking when possible
Combine with Elastic Load Balancing (ELB) to distribute traffic across the instances
Use multiple Availability Zones for high availability
Auto Scaling is dynamic, unlike manual scaling

4. EC2 Hibernation (Cost Optimization Strategy)

What is Hibernation?

Hibernation allows you to pause an EC2 instance and save its RAM (memory) state to disk.

When restarted:

The instance resumes exactly where it left off

How Hibernation Works

RAM content is saved to the EBS root volume
Instance is stopped
When restarted, memory is restored

Difference Between Stop vs Hibernate

Feature	Stop	Hibernate
RAM saved	❌ No	✅ Yes
Restart time	Normal boot	Faster resume
Application state	Lost	Preserved

Requirements for Hibernation

Supported instance types only
Must use EBS-backed instances
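Root volume must be an encrypted EBS volume with enough free space to hold the RAM contents
Instance RAM must be less than 150 GiB on Linux (up to 16 GiB on Windows)
Hibernation must be enabled when the instance is launched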
Supported operating systems (Linux/Windows)
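The requirements above can be read as a checklist. The sketch below is a study aid with hypothetical parameter names; it treats the RAM limit as strictly under 150 GiB (the limit AWS documents for Linux instances) and does not query AWS for the supported instance types or operating systems.

```python
MAX_RAM_GIB = 150  # Linux limit; RAM must be below this value


def hibernation_blockers(ebs_backed, root_encrypted, ram_gib,
                         instance_type_supported, os_supported):
    """Return a list of reasons an instance cannot hibernate (empty if none)."""
    blockers = []
    if not instance_type_supported:
        blockers.append("instance type not supported")
    if not ebs_backed:
        blockers.append("instance is not EBS-backed")
    if not root_encrypted:
        blockers.append("root volume is not encrypted")
    if ram_gib >= MAX_RAM_GIB:
        blockers.append("RAM is too large")
    if not os_supported:
        blockers.append("operating system not supported")
    return blockers


print(hibernation_blockers(True, True, 16, True, True))    # []
print(hibernation_blockers(True, False, 192, True, True))
# ['root volume is not encrypted', 'RAM is too large']
```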

Use Cases (IT-focused)

Applications that take a long time to initialize
In-memory processing workloads
Development/testing environments with state

Cost Consideration

You do not pay for instance compute time while the instance is hibernated
You still pay for EBS storage, including the space that holds the saved RAM contents
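A quick calculation shows the effect of these two rules. All prices and hour counts below are assumptions chosen for round numbers, not actual AWS rates.

```python
def monthly_cost(compute_hours, hourly_rate, ebs_gib, ebs_rate_per_gib_month):
    """Monthly cost = compute time actually billed + EBS storage (always billed)."""
    compute = compute_hours * hourly_rate
    storage = ebs_gib * ebs_rate_per_gib_month
    return round(compute + storage, 2)


# Assumed figures: $0.10 per instance-hour, 100 GiB volume at $0.08 per GiB-month.
always_on = monthly_cost(730, 0.10, 100, 0.08)    # running all month
hibernating = monthly_cost(200, 0.10, 100, 0.08)  # running 200 hours, hibernated otherwise

print(always_on)    # 81.0
print(hibernating)  # 28.0
```

The storage charge is the same in both cases; only the compute portion shrinks when the instance spends most of the month hibernated.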

Exam Tips for Hibernation

Use when you need fast resume with preserved state
Not suitable for:
- Stateless applications
- Auto Scaling groups (not commonly used together)

5. Auto Scaling vs Hibernation

Feature	Auto Scaling	Hibernation
Purpose	Handle demand changes	Save state & reduce cost
Scaling	Yes	No
Automation	Fully automatic	Manual/limited
High availability	Yes	No
Cost optimization	Dynamic scaling	Idle cost reduction

6. Best Practices for Exam

1. Prefer Auto Scaling + Load Balancer

Ensures:
- High availability
- Fault tolerance
- Cost efficiency

2. Use Multiple AZs

Always distribute instances across AZs
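An Auto Scaling group tries to keep instances evenly balanced across its Availability Zones. The sketch below shows one even split; the function name is hypothetical and the AZ names are examples only.

```python
def spread_across_azs(instance_count, availability_zones):
    """Split instances as evenly as possible across the given AZs."""
    if not availability_zones:
        raise ValueError("at least one Availability Zone is required")
    base, extra = divmod(instance_count, len(availability_zones))
    # The first `extra` zones each take one additional instance.
    return {
        az: base + (1 if index < extra else 0)
        for index, az in enumerate(availability_zones)
    }


print(spread_across_azs(7, ["us-east-1a", "us-east-1b", "us-east-1c"]))
# {'us-east-1a': 3, 'us-east-1b': 2, 'us-east-1c': 2}
```

With this layout, losing any single AZ still leaves most of the fleet running.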

3. Choose Right Scaling Policy

Default → Target Tracking
Predictable workloads → Scheduled Scaling

4. Avoid Over-Provisioning

Let Auto Scaling adjust resources dynamically

5. Use Hibernation for Stateful Workloads

When application state must be preserved

7. Common Exam Scenarios

Scenario 1:

System must handle unpredictable traffic
✅ Use:

Auto Scaling
Target tracking policy

Scenario 2:

System runs only during office hours
✅ Use:

Scheduled scaling

Scenario 3:

Application needs fast restart with saved memory
✅ Use:

EC2 hibernation

Scenario 4:

High availability required
❌ Avoid:

Single instance (vertical scaling)
✅ Use:
Auto Scaling + ELB + Multi-AZ

Final Summary

Scaling = adjusting resources based on demand
Horizontal scaling + Auto Scaling = best practice
Auto Scaling Group (ASG) is the core component
Target Tracking is the recommended default scaling policy
Hibernation is used to save instance state and reduce cost
Combine Auto Scaling + ELB + Multi-AZ for best results