Scaling strategies (for example, auto scaling, hibernation)

Task Statement 4.2: Design cost-optimized compute solutions.

📘 AWS Certified Solutions Architect – Associate (SAA-C03)


1. What is Scaling?

Scaling means adjusting your compute resources (like servers) based on demand.

Why scaling is important:

  • Avoid paying for unused resources (cost optimization)
  • Maintain performance during high traffic
  • Ensure high availability and fault tolerance

2. Types of Scaling in AWS

There are two main types of scaling:

A. Vertical Scaling (Scale Up/Down)

  • Increase or decrease the size of a single instance
  • Example:
    • Move from t3.micro to m5.large

Key points:

  • Requires instance stop/start
  • Limited by maximum instance size
  • Not highly available (single instance)

Exam Tip:

Vertical scaling is NOT preferred for highly available architectures.


B. Horizontal Scaling (Scale Out/In)

  • Add or remove multiple instances
  • Example:
    • Add more EC2 instances behind a load balancer

Key points:

  • Uses multiple instances
  • Works with Elastic Load Balancer (ELB)
  • Supports high availability

Exam Tip:

Horizontal scaling is the preferred approach in AWS.


3. Auto Scaling (Core Concept)

What is Auto Scaling?

Auto Scaling automatically adds or removes EC2 instances based on demand.

Main service:

  • Amazon EC2 Auto Scaling

Components of Auto Scaling

1. Launch Template (or the older Launch Configuration)

Defines:

  • AMI (Operating System)
  • Instance type
  • Security groups
  • Key pair

2. Auto Scaling Group (ASG)

  • Group of EC2 instances
  • Controls scaling behavior

Key settings:

  • Minimum capacity → lowest number of instances the group keeps running
  • Desired capacity → number of instances the group tries to maintain
  • Maximum capacity → upper limit the group never exceeds
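
To make the relationship between the three settings concrete, here is a minimal sketch in Python. It is an illustration only: the `GroupLimits` class and `clamp_desired` function are hypothetical names, not part of any AWS SDK. It shows that whatever capacity is requested, the result stays between the minimum and the maximum.

```python
from dataclasses import dataclass


@dataclass
class GroupLimits:
    """Hypothetical container for an ASG's minimum and maximum capacity."""
    min_size: int
    max_size: int


def clamp_desired(requested: int, limits: GroupLimits) -> int:
    """Keep a requested desired capacity within the group's min/max bounds."""
    if limits.min_size > limits.max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(limits.min_size, min(requested, limits.max_size))


limits = GroupLimits(min_size=2, max_size=6)
print(clamp_desired(1, limits))   # below the minimum, so raised to 2
print(clamp_desired(4, limits))   # within bounds, stays 4
print(clamp_desired(10, limits))  # above the maximum, capped at 6
```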

3. Scaling Policies

These define WHEN to scale.


Types of Scaling Policies

A. Target Tracking Scaling (Recommended)

  • Automatically adjusts capacity to maintain a target metric

Example:

  • Keep CPU utilization at 50%

Key points:

  • Easiest to configure
  • Most commonly used
  • AWS manages scaling decisions
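
The idea behind target tracking can be sketched with a simple proportional model. This is an assumption made for illustration, not the exact algorithm the service runs: if the average metric is above the target, capacity grows in proportion, and the result is kept inside the group's limits. The function name `target_tracking_capacity` and the group and policy names `web-asg` and `cpu-50` are hypothetical. The dictionary shows the shape of the parameters that boto3's `put_scaling_policy` call accepts for a target tracking policy; no AWS call is made here.

```python
import math


def target_tracking_capacity(current_capacity: int, current_metric: float,
                             target_metric: float, min_size: int,
                             max_size: int) -> int:
    """Estimate the capacity that brings the average metric back to target.

    Simplified proportional model, assumed for illustration only.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    needed = math.ceil(current_capacity * current_metric / target_metric)
    return max(min_size, min(needed, max_size))


# Parameters in the shape accepted by boto3's put_scaling_policy.
# "web-asg" and "cpu-50" are hypothetical names.
policy_request = {
    "AutoScalingGroupName": "web-asg",
    "PolicyName": "cpu-50",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
}

# 4 instances at 75% CPU with a 50% target -> 6 instances
print(target_tracking_capacity(4, 75.0, 50.0, min_size=2, max_size=10))
# 4 instances at 20% CPU -> 2 instances (also the group minimum)
print(target_tracking_capacity(4, 20.0, 50.0, min_size=2, max_size=10))
```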

B. Step Scaling

  • Scale based on thresholds

Example:

  • CPU > 70% → add 2 instances
  • CPU > 90% → add 4 instances
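
The two thresholds above can be read as a lookup from metric value to adjustment size. The sketch below is a plain-Python illustration of that lookup. The `STEPS` table and the `step_adjustment` function are hypothetical names, and a real step scaling policy is driven by a CloudWatch alarm rather than a direct function call.

```python
# (threshold, instances to add), checked from the highest threshold down.
# Values mirror the example: CPU > 70% adds 2, CPU > 90% adds 4.
STEPS = [(90.0, 4), (70.0, 2)]


def step_adjustment(cpu_percent: float) -> int:
    """Return how many instances to add for a given CPU utilization."""
    for threshold, instances_to_add in STEPS:
        if cpu_percent > threshold:
            return instances_to_add
    return 0  # below every threshold: no change


print(step_adjustment(95.0))  # 4
print(step_adjustment(75.0))  # 2
print(step_adjustment(50.0))  # 0
```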

C. Simple Scaling (Less used)

  • Makes one adjustment per alarm, then waits for the cooldown period before responding again
  • Slower and less flexible than step or target tracking scaling

D. Scheduled Scaling

  • Scale at specific times

Example:

  • Increase instances during business hours
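
As a rough sketch of the business-hours example, the function below picks a desired capacity from the clock alone. The hours (09:00 to 18:00, Monday to Friday) and the capacities (6 and 2) are assumptions chosen for illustration, and `scheduled_capacity` is a hypothetical name. In a real group, the schedule is defined as scheduled actions on the Auto Scaling group rather than in application code.

```python
from datetime import datetime


def scheduled_capacity(now: datetime, business_capacity: int = 6,
                       off_hours_capacity: int = 2,
                       start_hour: int = 9, end_hour: int = 18) -> int:
    """Return the desired capacity for a point in time.

    Assumes business hours are start_hour to end_hour, Monday to Friday.
    """
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    in_business_hours = start_hour <= now.hour < end_hour
    if is_weekday and in_business_hours:
        return business_capacity
    return off_hours_capacity


print(scheduled_capacity(datetime(2024, 1, 8, 10, 0)))  # Monday morning: 6
print(scheduled_capacity(datetime(2024, 1, 8, 20, 0)))  # Monday evening: 2
print(scheduled_capacity(datetime(2024, 1, 6, 10, 0)))  # Saturday: 2
```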

Scaling Based on Metrics

Auto Scaling uses Amazon CloudWatch metrics, such as:

  • CPU utilization
  • Network traffic
  • Request count (with load balancer)

Cooldown Period

  • Time Auto Scaling waits after a scaling activity before starting another one
  • Prevents rapid scaling in/out
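
The cooldown check can be sketched as a simple time comparison. This is an illustration under stated assumptions: `can_scale` is a hypothetical helper, times are plain seconds, and the 300-second default is used here as an example value.

```python
from typing import Optional


def can_scale(now_seconds: float, last_action_seconds: Optional[float],
              cooldown_seconds: float = 300.0) -> bool:
    """Return True if enough time has passed since the last scaling action."""
    if last_action_seconds is None:
        return True  # nothing has happened yet, so no cooldown applies
    return now_seconds - last_action_seconds >= cooldown_seconds


print(can_scale(1000.0, None))    # True: no previous action
print(can_scale(1000.0, 900.0))   # False: only 100 seconds have passed
print(can_scale(1300.0, 900.0))   # True: 400 seconds have passed
```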

Health Checks

Auto Scaling identifies unhealthy instances and replaces them, based on:

  • EC2 status checks
  • ELB health checks
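
The replacement behavior can be modeled in a few lines. The sketch below is a toy simulation, not the service's implementation: the `Instance` class, the `replace_unhealthy` function, and the instance IDs are all hypothetical. An instance that fails either check is swapped for a fresh one, so the group size stays the same.

```python
from dataclasses import dataclass
from itertools import count
from typing import Iterator, List


@dataclass
class Instance:
    """Hypothetical record of one instance and its two health signals."""
    instance_id: str
    ec2_status_ok: bool = True
    elb_healthy: bool = True


def replace_unhealthy(instances: List[Instance],
                      new_ids: Iterator[str]) -> List[Instance]:
    """Return a fleet of the same size with unhealthy instances replaced."""
    fleet = []
    for instance in instances:
        if instance.ec2_status_ok and instance.elb_healthy:
            fleet.append(instance)
        else:
            # Failed a check: launch a replacement with a new ID.
            fleet.append(Instance(next(new_ids)))
    return fleet


fleet = [
    Instance("i-a"),
    Instance("i-b", ec2_status_ok=False),
    Instance("i-c", elb_healthy=False),
]
new_ids = (f"i-new-{n}" for n in count(1))
print([i.instance_id for i in replace_unhealthy(fleet, new_ids)])
# ['i-a', 'i-new-1', 'i-new-2']
```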

Benefits of Auto Scaling

  • Cost optimization → only pay for what you use
  • High availability → replaces failed instances
  • Elasticity → handles variable workloads automatically

Exam Tips for Auto Scaling

  • Use Target Tracking when possible
  • Always combine with Elastic Load Balancer (ELB)
  • Use multiple Availability Zones for high availability
  • Auto Scaling is dynamic, unlike manual scaling

4. EC2 Hibernation (Cost Optimization Strategy)

What is Hibernation?

Hibernation allows you to pause an EC2 instance and save its RAM (memory) state to disk.

When restarted:

  • The instance resumes exactly where it left off

How Hibernation Works

  • RAM content is saved to the EBS root volume
  • Instance is stopped
  • When restarted, memory is restored

Difference Between Stop vs Hibernate

Feature           | Stop        | Hibernate
RAM saved         | ❌ No       | ✅ Yes
Restart time      | Normal boot | Faster resume
Application state | Lost        | Preserved

Requirements for Hibernation

  • Supported instance types only
  • Must use EBS-backed instances
  • Root volume must be encrypted
  • RAM size ≤ 150 GB
  • Supported operating systems (Linux/Windows)
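
The requirements above can be turned into a checklist. The following sketch is only an illustration: `InstanceSpec` and `hibernation_blockers` are hypothetical names, the caller is assumed to already know whether the instance type and operating system are supported, and the 150 GB limit is taken directly from the list above.

```python
from dataclasses import dataclass
from typing import List

MAX_RAM_GB = 150  # limit taken from the requirements list above


@dataclass
class InstanceSpec:
    """Hypothetical description of an instance for the checklist."""
    instance_type_supported: bool
    ebs_backed: bool
    root_volume_encrypted: bool
    ram_gb: float
    os_supported: bool


def hibernation_blockers(spec: InstanceSpec) -> List[str]:
    """Return the unmet requirements; an empty list means eligible."""
    blockers = []
    if not spec.instance_type_supported:
        blockers.append("instance type not supported")
    if not spec.ebs_backed:
        blockers.append("instance is not EBS-backed")
    if not spec.root_volume_encrypted:
        blockers.append("root volume is not encrypted")
    if spec.ram_gb > MAX_RAM_GB:
        blockers.append("RAM exceeds the supported size")
    if not spec.os_supported:
        blockers.append("operating system not supported")
    return blockers


ok = InstanceSpec(True, True, True, ram_gb=16, os_supported=True)
bad = InstanceSpec(True, True, False, ram_gb=256, os_supported=True)
print(hibernation_blockers(ok))   # []
print(hibernation_blockers(bad))
# ['root volume is not encrypted', 'RAM exceeds the supported size']
```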

Use Cases (IT-focused)

  • Applications that take a long time to initialize
  • In-memory processing workloads
  • Development/testing environments with state

Cost Consideration

  • You do not pay for instance compute when hibernated
  • You still pay for EBS storage
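
A small worked calculation shows where the saving comes from. All numbers below are hypothetical placeholders, not real AWS prices, and `monthly_cost` is an illustrative helper: compute is billed only for hours the instance is running, while the EBS volume is billed for the whole month either way.

```python
def monthly_cost(running_hours: float, hourly_rate: float,
                 volume_gb: float, storage_rate_per_gb_month: float) -> float:
    """Compute charge for running hours plus a full month of EBS storage."""
    compute = running_hours * hourly_rate
    storage = volume_gb * storage_rate_per_gb_month
    return compute + storage


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
HOURLY_RATE = 0.10      # per instance-hour
STORAGE_RATE = 0.08     # per GB-month
VOLUME_GB = 50

always_on = monthly_cost(730, HOURLY_RATE, VOLUME_GB, STORAGE_RATE)
mostly_hibernated = monthly_cost(200, HOURLY_RATE, VOLUME_GB, STORAGE_RATE)

print(f"Always running:     {always_on:.2f}")
print(f"Running 200 hours:  {mostly_hibernated:.2f}")
print(f"Saving:             {always_on - mostly_hibernated:.2f}")
```

The storage part of the bill is identical in both cases; only the compute part shrinks when the instance spends time hibernated.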

Exam Tips for Hibernation

  • Use when you need fast resume with preserved state
  • Not suitable for:
    • Stateless applications
    • Auto Scaling groups (not commonly used together)

5. Auto Scaling vs Hibernation

Feature           | Auto Scaling          | Hibernation
Purpose           | Handle demand changes | Save state & reduce cost
Scaling           | Yes                   | No
Automation        | Fully automatic       | Manual/limited
High availability | Yes                   | No
Cost optimization | Dynamic scaling       | Idle cost reduction

6. Best Practices for Exam

1. Prefer Auto Scaling + Load Balancer

  • Ensures:
    • High availability
    • Fault tolerance
    • Cost efficiency

2. Use Multiple AZs

  • Always distribute instances across AZs

3. Choose Right Scaling Policy

  • Default → Target Tracking
  • Predictable workloads → Scheduled Scaling

4. Avoid Over-Provisioning

  • Let Auto Scaling adjust resources dynamically

5. Use Hibernation for Stateful Workloads

  • When application state must be preserved

7. Common Exam Scenarios

Scenario 1:

System must handle unpredictable traffic
✅ Use:

  • Auto Scaling
  • Target tracking policy

Scenario 2:

System runs only during office hours
✅ Use:

  • Scheduled scaling

Scenario 3:

Application needs fast restart with saved memory
✅ Use:

  • EC2 hibernation

Scenario 4:

High availability required
❌ Avoid:

  • Single instance (vertical scaling)

✅ Use:

  • Auto Scaling + ELB + Multi-AZ

Final Summary

  • Scaling = adjusting resources based on demand
  • Horizontal scaling + Auto Scaling = best practice
  • Auto Scaling Group (ASG) is the core component
  • Target Tracking is the most important scaling policy
  • Hibernation is used to save instance state and reduce cost
  • Combine Auto Scaling + ELB + Multi-AZ for best results