Task Statement 2.2: Design highly available and/or fault-tolerant architectures.

📘 AWS Certified Solutions Architect – Associate (SAA-C03)

1. What does this mean?

In many organizations, there are legacy applications:

Old systems
Monolithic applications
Applications running on a single server
Applications that cannot be modified easily

Problem:

These applications are:

Not designed for the cloud
Not highly available
Not fault tolerant

Goal:

Use AWS services to:

Improve availability
Improve fault tolerance
Improve reliability
WITHOUT changing application code

2. Key Design Principle for the Exam

👉 If you cannot change the application, you must:

Improve reliability outside the application layer
Use infrastructure-level solutions

3. Core Strategies (VERY IMPORTANT for the exam)

3.1 Add Load Balancing

Service:

Elastic Load Balancing (ELB)

What it does:

Distributes traffic across multiple servers (EC2 instances)

Why important:

If one server fails its health check → traffic goes to the remaining healthy servers
No change needed in application

Types:

Application Load Balancer (ALB): Layer 7, HTTP/HTTPS
Network Load Balancer (NLB): Layer 4, TCP/UDP/TLS

Exam Tip:

✔ Use ELB when:

The application runs (or can run) on multiple instances
You need high availability without code changes
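
The routing behaviour described above can be sketched in a few lines. This is a toy model, not how ELB is implemented: the class name, the instance IDs, and the round-robin choice are all assumptions made for illustration.

```python
from itertools import cycle


class LoadBalancerSketch:
    """Toy round-robin balancer that skips targets marked unhealthy."""

    def __init__(self, targets):
        self.health = {t: True for t in targets}  # target -> healthy?
        self._ring = cycle(targets)

    def mark(self, target, healthy):
        # Stand-in for the result of a periodic health check
        self.health[target] = healthy

    def route(self):
        # Try each target at most once per request
        for _ in range(len(self.health)):
            target = next(self._ring)
            if self.health[target]:
                return target
        raise RuntimeError("no healthy targets")


lb = LoadBalancerSketch(["i-a", "i-b", "i-c"])  # hypothetical instance IDs
lb.mark("i-b", False)                            # i-b fails its health check
served = [lb.route() for _ in range(4)]
print(served)  # i-b never receives traffic
```

The application on each instance is untouched; only the layer in front of it decides where requests go.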

3.2 Use Auto Scaling

Service:

EC2 Auto Scaling

What it does:

Automatically adds/removes EC2 instances

Benefits:

Handles traffic spikes
Replaces failed instances automatically

Key Feature:

Health checks (EC2 status checks, optionally ELB health checks) → unhealthy instances are terminated and replaced

Exam Tip:

✔ Best for:

Increasing availability
Self-healing systems
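
The self-healing idea can be sketched as a reconcile step: drop unhealthy instances, then launch replacements until the desired capacity is met. The function, the `launch` callback, and the instance IDs are hypothetical; the real service does this for you.

```python
def reconcile(instances, desired, launch):
    """Return a fleet with unhealthy instances removed and capacity restored.

    instances: dict of instance id -> healthy flag
    desired:   desired capacity of the group
    launch:    callable returning a new instance id (hypothetical stand-in)
    """
    # Keep only the healthy instances
    fleet = {i: True for i, healthy in instances.items() if healthy}
    # Launch replacements until the group is back at desired capacity
    while len(fleet) < desired:
        fleet[launch()] = True
    return fleet


counter = iter(range(100))
new_id = lambda: f"i-new{next(counter)}"

fleet = reconcile({"i-1": True, "i-2": False, "i-3": True}, desired=3, launch=new_id)
print(sorted(fleet))  # i-2 is gone, one replacement was launched
```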

3.3 Deploy Across Multiple Availability Zones

What:

Run application in multiple AZs

Why:

Failure of a single AZ will not stop the application

Works with:

ELB + Auto Scaling

Exam Tip:

✔ Multi-AZ = High Availability (very important concept)
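
A rough calculation shows why a second AZ helps. It assumes AZ failures are independent and that each AZ alone can carry the full load. The 99% figure is an illustrative number, not a published AWS value.

```python
def combined_availability(per_az, az_count):
    """Probability that at least one of az_count independent deployments is up."""
    return 1 - (1 - per_az) ** az_count


single = combined_availability(0.99, 1)  # one AZ
double = combined_availability(0.99, 2)  # two AZs
print(f"1 AZ: {single:.4%}, 2 AZs: {double:.4%}")
```

Under these assumptions the chance of a total outage drops from 1 in 100 to about 1 in 10,000.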

3.4 Use Route 53 for DNS Failover

Service:

Amazon Route 53

What it does:

Routes traffic to healthy endpoints

Features:

Health checks
Failover routing

Use Case:

Primary server fails → Route 53 sends traffic to backup

Exam Tip:

✔ Use DNS-level failover when the application cannot handle failover itself
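
As a sketch, a failover pair is two records with the same name, one marked PRIMARY (with a health check) and one marked SECONDARY. The helper below only builds the request data. The domain, IP addresses, and health check ID are placeholders, and the call that would send it is shown as a comment.

```python
def failover_record(name, ip, role, health_check_id=None, ttl=60):
    """Build one ResourceRecordSet for Route 53 failover routing.

    role is "PRIMARY" or "SECONDARY"; every value passed in is a placeholder.
    """
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{name}-{role.lower()}",
        "Failover": role,
        "TTL": ttl,  # a short TTL lets clients pick up the failover sooner
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id:
        record["HealthCheckId"] = health_check_id
    return record


change_batch = {
    "Changes": [
        {"Action": "UPSERT", "ResourceRecordSet": failover_record(
            "app.example.com.", "192.0.2.10", "PRIMARY", "hc-primary-placeholder")},
        {"Action": "UPSERT", "ResourceRecordSet": failover_record(
            "app.example.com.", "198.51.100.20", "SECONDARY")},
    ]
}

# With boto3 this would be sent roughly as:
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="<your hosted zone id>", ChangeBatch=change_batch)
```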

3.5 Use Amazon CloudFront (Caching Layer)

Service:

Amazon CloudFront

What it does:

Caches content closer to users

Benefits:

Reduces load on backend servers
Improves performance and reliability

Important:

Works without modifying application logic

Exam Tip:

✔ Use when:

Legacy apps are overloaded
Need performance + resilience
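
How much load CloudFront takes off the backend depends on the cache hit ratio. A quick estimate, using made-up traffic numbers:

```python
def origin_requests(total_requests, cache_hit_ratio):
    """Requests that still reach the origin after the edge cache absorbs its share."""
    return round(total_requests * (1 - cache_hit_ratio))


# Illustrative numbers only: 10,000 requests with an 80% cache hit ratio
to_origin = origin_requests(10_000, 0.8)
print(to_origin)
```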

3.6 Add Caching with ElastiCache

Service:

Amazon ElastiCache (Redis/Memcached)

What it does:

Stores frequently accessed data in memory

Benefits:

Reduces database load
Improves response time

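Note:

Usually requires the application to read from and write to the cache, so it is not a pure no-code-change option
If the app already supports Redis or Memcached, pointing it at ElastiCache may only need a configuration change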

Exam Tip:

✔ Use for:

Database performance issues
Read-heavy workloads
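
The usual way to put a cache in front of a database is the cache-aside pattern, sketched below. A plain dict stands in for Redis/Memcached, and the loader function is hypothetical. Note that this logic lives in application code, which is why ElastiCache is rarely a zero-change option.

```python
class CacheAside:
    """Check the cache first; on a miss, read the database and store the result."""

    def __init__(self, load_from_db):
        self._cache = {}           # stand-in for Redis/Memcached
        self._load = load_from_db  # hypothetical database read function
        self.db_reads = 0

    def get(self, key):
        if key in self._cache:     # cache hit: no database call
            return self._cache[key]
        self.db_reads += 1         # cache miss: go to the database
        value = self._load(key)
        self._cache[key] = value
        return value


store = CacheAside(lambda key: f"row-for-{key}")
first = store.get("user:1")   # miss, reads the database
second = store.get("user:1")  # hit, served from memory
print(store.db_reads)
```

The sketch leaves out expiry (TTL) and invalidation, which a real cache needs to avoid serving stale data.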

3.7 Database High Availability

Services:

Amazon RDS Multi-AZ
Read Replicas

What they do:

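Multi-AZ:

Synchronous standby in another AZ with automatic failover

Read Replicas:

Distribute read traffic (asynchronous replication; the app must send reads to the replica endpoint)

Benefits:

No app changes needed for failover (Multi-AZ): the database endpoint name stays the same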

Exam Tip:

✔ Multi-AZ = High availability
✔ Read replicas = Performance scaling
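
Read replicas only help if reads are actually sent to them. A minimal sketch of that split, with hypothetical endpoint names and a deliberately naive rule for spotting reads:

```python
from itertools import cycle


class EndpointRouter:
    """Send writes to the primary endpoint and spread reads across replicas."""

    def __init__(self, writer, readers):
        self.writer = writer
        self._readers = cycle(readers)

    def endpoint_for(self, sql):
        # Naive rule for illustration: only plain SELECTs go to replicas
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._readers)
        return self.writer


router = EndpointRouter(
    writer="primary.db.example",                                  # hypothetical
    readers=["replica-1.db.example", "replica-2.db.example"],    # hypothetical
)
targets = [
    router.endpoint_for("SELECT * FROM orders"),
    router.endpoint_for("UPDATE orders SET status = 'paid'"),
    router.endpoint_for("select count(*) from orders"),
]
print(targets)
```

With Multi-AZ alone none of this is needed, because the application keeps using a single endpoint before and after failover.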

3.8 Use Amazon EBS & Backups

Services:

Amazon EBS
EBS Snapshots

What they do:

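EBS: persistent block storage for EC2 (each volume lives in a single AZ)
Snapshots: point-in-time backups that can be restored as new volumes, including in another AZ

Benefits:

Data durability
Recovery by restoring volumes from snapshots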

Exam Tip:

✔ Snapshots = backup strategy for legacy apps

3.9 Use AWS Systems Manager (SSM)

Service:

AWS Systems Manager

What it does:

Manage servers without SSH/RDP logins (Session Manager, Run Command)
Patch management
Automation

Benefits:

Keeps legacy systems updated
Improves reliability

3.10 Use AWS Elastic Disaster Recovery (DRS)

Service:

AWS Elastic Disaster Recovery

What it does:

Continuously replicates servers (on-premises or cloud-based) to a target AWS Region

Benefits:

Fast recovery from disasters
Works for legacy systems

Exam Tip:

✔ Best for:

Disaster recovery when app cannot be redesigned

3.11 Use AWS Storage Services for Durability

Services:

Amazon S3
Amazon EFS

Benefits:

S3:

Designed for 99.999999999% (11 nines) durability
Store backups, static data

EFS:

Shared file system (NFS) across instances and AZs

Exam Tip:

✔ Replace local storage with managed storage for reliability

4. Important Patterns for the Exam

Pattern 1: Lift-and-Shift Improvement

Move app to EC2
Add ELB + Auto Scaling + Multi-AZ

Pattern 2: External Resilience Layer

Add:
- Load balancer
- Caching
- DNS failover

Pattern 3: Failover Setup

Route 53 + health checks
Secondary environment

5. Key Exam Scenarios

Scenario 1:

Legacy app on single EC2 instance
✔ Solution:

Add ELB + Auto Scaling + Multi-AZ

Scenario 2:

Database is single point of failure
✔ Solution:

RDS Multi-AZ

Scenario 3:

Application overloaded
✔ Solution:

CloudFront + ElastiCache

Scenario 4:

No app modification allowed
✔ Solution:

Use infrastructure services (ELB, Route 53, Auto Scaling)

Scenario 5:

Need disaster recovery
✔ Solution:

AWS Elastic Disaster Recovery

6. Key Exam Tips (VERY IMPORTANT)

✔ If question says:

“Cannot modify application”
👉 Use infrastructure-based solutions

✔ Always think:

Can I fix this with:
- Load balancer?
- Auto Scaling?
- Multi-AZ?
- DNS failover?

✔ Avoid:

Solutions requiring code changes

✔ Prefer:

Managed AWS services

7. Summary (Quick Revision)

To improve reliability of legacy applications without changing code:

Use:

ELB → distribute traffic
Auto Scaling → self-healing
Multi-AZ → high availability
Route 53 → failover
CloudFront → caching
ElastiCache → performance
RDS Multi-AZ → database HA
S3/EFS/EBS → durability
Systems Manager → maintenance
Elastic Disaster Recovery → DR

8. Final Exam Strategy

👉 When you see:

Legacy application
No code changes allowed

✅ Your answer should focus on:

Adding AWS services around the application
Not modifying the application itself