Task Statement 2.2: Design highly available and/or fault-tolerant architectures.
📘 AWS Certified Solutions Architect – Associate (SAA-C03)
1. What does this mean?
In many organizations, there are legacy applications:
- Old systems
- Monolithic applications
- Applications running on a single server
- Applications that cannot be modified easily
Problem:
These applications are:
- Not designed for cloud
- Not highly available
- Not fault tolerant
Goal:
Use AWS services to:
- Improve availability
- Improve fault tolerance
- Improve reliability
- WITHOUT changing application code
2. Key Design Principle for the Exam
👉 If you cannot change the application, you must:
- Improve reliability outside the application layer
- Use infrastructure-level solutions
3. Core Strategies (VERY IMPORTANT for exam)
3.1 Add Load Balancing
Service:
- Elastic Load Balancing (ELB)
What it does:
- Distributes traffic across multiple servers (EC2 instances)
Why important:
- If one server fails its health check → traffic goes to the remaining healthy servers
- No change needed in application
Types:
- Application Load Balancer (ALB): Layer 7, HTTP/HTTPS traffic
- Network Load Balancer (NLB): Layer 4, TCP/UDP traffic
Exam Tip:
✔ Use ELB when:
- Application runs on multiple instances
- Need high availability without code change
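Illustration (not AWS code): a minimal sketch of what health-aware load balancing does conceptually, assuming simple round-robin. The function name `route_requests` and the instance IDs are hypothetical; a real ELB is configured, not programmed this way.

```python
from itertools import cycle

def route_requests(targets, healthy, num_requests):
    """Round-robin num_requests across targets, skipping unhealthy ones."""
    in_service = [t for t in targets if healthy.get(t, False)]
    if not in_service:
        raise RuntimeError("no healthy targets")
    rotation = cycle(in_service)
    return [next(rotation) for _ in range(num_requests)]

targets = ["i-aaa", "i-bbb", "i-ccc"]                    # hypothetical instance IDs
health = {"i-aaa": True, "i-bbb": False, "i-ccc": True}  # i-bbb failed its health check
print(route_requests(targets, health, 4))  # ['i-aaa', 'i-ccc', 'i-aaa', 'i-ccc']
```

The failed instance receives no traffic, and the application itself is unchanged.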
3.2 Use Auto Scaling
Service:
- EC2 Auto Scaling
What it does:
- Automatically adds/removes EC2 instances to match demand
Benefits:
- Handles traffic spikes
- Replaces failed instances automatically
Key Feature:
- Health checks → replaces unhealthy instances
Exam Tip:
✔ Best for:
- Increasing availability
- Self-healing systems
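Illustration (not AWS code): the self-healing idea can be sketched as a reconcile step that drops unhealthy instances and launches replacements until the desired capacity is met. The names `reconcile` and `launch` are hypothetical, and scale-in is deliberately left out to keep the sketch short.

```python
from itertools import count

def reconcile(fleet, desired_capacity, launch):
    """Drop unhealthy instances, then launch replacements up to desired_capacity.

    fleet:  dict of instance id -> True (healthy) / False (unhealthy)
    launch: callable that returns the id of a newly launched instance
    """
    healthy = [iid for iid, ok in fleet.items() if ok]
    while len(healthy) < desired_capacity:
        healthy.append(launch())
    return healthy

_ids = count(1)

def launch():
    # Stand-in for launching an instance from a launch template.
    return f"i-new{next(_ids)}"

print(reconcile({"i-1": True, "i-2": False}, 2, launch))  # ['i-1', 'i-new1']
```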
3.3 Deploy Across Multiple Availability Zones
What:
- Run application in multiple AZs
Why:
- If one AZ fails, instances in the other AZs keep serving traffic
Works with:
- ELB + Auto Scaling
Exam Tip:
✔ Multi-AZ = High Availability (very important concept)
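Why it helps, as rough arithmetic: if each AZ deployment is up with probability a, and failures are assumed to be independent, at least one of n deployments is up with probability 1 - (1 - a)^n. The 99% figure below is an illustrative assumption, not an AWS number.

```python
def combined_availability(per_az, az_count):
    """Probability that at least one of az_count independent deployments is up."""
    return 1 - (1 - per_az) ** az_count

for n in (1, 2, 3):
    print(n, f"{combined_availability(0.99, n):.6f}")
# 1 0.990000
# 2 0.999900
# 3 0.999999
```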
3.4 Use Route 53 for DNS Failover
Service:
- Amazon Route 53
What it does:
- Routes traffic to healthy endpoints
Features:
- Health checks
- Failover routing
Use Case:
- Primary server fails → Route 53 sends traffic to backup
Exam Tip:
✔ DNS-level failover when application cannot handle failover
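Illustration (not AWS code): a simplified sketch of the failover routing decision. The function `resolve` and both IP addresses are hypothetical, and the sketch ignores details such as what happens when the secondary is also unhealthy.

```python
def resolve(primary, secondary, health):
    """Answer with the primary while its health check passes, else the secondary."""
    if health.get(primary, False):
        return primary
    return secondary

PRIMARY = "192.0.2.10"       # hypothetical primary server (documentation IP range)
SECONDARY = "198.51.100.10"  # hypothetical backup server

print(resolve(PRIMARY, SECONDARY, {PRIMARY: True, SECONDARY: True}))   # 192.0.2.10
print(resolve(PRIMARY, SECONDARY, {PRIMARY: False, SECONDARY: True}))  # 198.51.100.10
```

Clients keep using the same DNS name; only the answer changes, which is why the application does not need failover logic of its own.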
3.5 Use Amazon CloudFront (Caching Layer)
Service:
- Amazon CloudFront
What it does:
- Caches content closer to users
Benefits:
- Reduces load on backend servers
- Improves performance and reliability
Important:
- Works without modifying application logic
Exam Tip:
✔ Use when:
- Legacy apps are overloaded
- Need performance + resilience
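Illustration (not AWS code): a toy edge cache showing how caching reduces backend load. The class `EdgeCache`, the 60-second TTL, and the path are assumptions chosen for the example; CloudFront's real behavior depends on cache policies and origin headers.

```python
class EdgeCache:
    """Serve cached responses until they expire; otherwise fetch from the origin."""

    def __init__(self, origin_fetch, ttl_seconds):
        self.origin_fetch = origin_fetch
        self.ttl = ttl_seconds
        self.store = {}        # path -> (body, expires_at)
        self.origin_hits = 0   # how many requests reached the backend

    def get(self, path, now):
        entry = self.store.get(path)
        if entry and entry[1] > now:
            return entry[0]    # cache hit: the backend is not touched
        self.origin_hits += 1
        body = self.origin_fetch(path)
        self.store[path] = (body, now + self.ttl)
        return body

cache = EdgeCache(lambda path: f"content of {path}", ttl_seconds=60)
for t in (0, 10, 30, 61):      # four requests at these times (seconds)
    cache.get("/logo.png", now=t)
print(cache.origin_hits)       # 2
```

Four user requests produce only two origin requests: the first one and the one after the TTL expires.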
3.6 Add Caching with ElastiCache
Service:
- Amazon ElastiCache (Redis/Memcached)
What it does:
- Stores frequently accessed data in memory
Benefits:
- Reduces database load
- Improves response time
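Note:
- The application must read from and write to the cache, so this usually requires code or configuration changes
- Choose it only when some change to the application is acceptable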
Exam Tip:
✔ Use for:
- Database performance issues
- Read-heavy workloads
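Illustration: a minimal cache-aside read, with a plain dict standing in for Redis/Memcached. The names `get_product` and `query_db` are hypothetical. The sketch also shows why this strategy touches the application: the read path itself has to check the cache.

```python
def get_product(product_id, cache, query_db):
    """Cache-aside read: try the cache first, fall back to the database on a miss."""
    if product_id in cache:
        return cache[product_id]
    row = query_db(product_id)   # slow path: hits the database
    cache[product_id] = row      # populate the cache for later reads
    return row

db_calls = []

def query_db(product_id):
    # Stand-in for a real database query.
    db_calls.append(product_id)
    return {"id": product_id, "name": f"product-{product_id}"}

cache = {}
get_product(42, cache, query_db)
get_product(42, cache, query_db)
print(len(db_calls))  # 1
```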
3.7 Database High Availability
Services:
- Amazon RDS Multi-AZ
- Read Replicas
What they do:
Multi-AZ:
- Automatic failover to standby database
Read Replicas:
- Distribute read traffic
Benefits:
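- No app changes needed for failover (Multi-AZ): the database endpoint stays the same
- Read replicas have their own endpoints, so the app's read connections must be pointed at them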
Exam Tip:
✔ Multi-AZ = High availability
✔ Read replicas = Performance scaling
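Illustration (not AWS code): a toy model of why Multi-AZ failover needs no application change. The application always connects to the same endpoint name; what changes is which AZ serves it. The class `MultiAZDatabase`, the endpoint name, and the AZ names are hypothetical.

```python
class MultiAZDatabase:
    """One stable endpoint in front of a primary and a standby in different AZs."""

    def __init__(self, endpoint, primary_az, standby_az):
        self.endpoint = endpoint
        self.primary_az = primary_az
        self.standby_az = standby_az

    def fail_over(self):
        # The standby is promoted; the endpoint name does not change.
        self.primary_az, self.standby_az = self.standby_az, self.primary_az

    def connect(self, endpoint):
        if endpoint != self.endpoint:
            raise ValueError("unknown endpoint")
        return self.primary_az   # the AZ currently serving this endpoint

db = MultiAZDatabase("legacy-db.example.internal", "az-a", "az-b")
print(db.connect("legacy-db.example.internal"))  # az-a
db.fail_over()
print(db.connect("legacy-db.example.internal"))  # az-b
```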
3.8 Use Amazon EBS & Backups
Services:
- Amazon EBS
- EBS Snapshots
What they do:
- EBS: persistent block storage for EC2
- EBS Snapshots: point-in-time backups of volumes, used to restore data
Benefits:
- Data durability
- Fast recovery
Exam Tip:
✔ Snapshots = backup strategy for legacy apps
3.9 Use AWS Systems Manager (SSM)
Service:
- AWS Systems Manager
What it does:
- Manage servers without logging in
- Patch management
- Automation
Benefits:
- Keeps legacy systems updated
- Improves reliability
3.10 Use AWS Elastic Disaster Recovery (DRS)
Service:
- AWS Elastic Disaster Recovery
What it does:
- Continuously replicates source servers (on-premises, another cloud, or another AWS Region) into a recovery Region on AWS
Benefits:
- Fast recovery from disasters
- Works for legacy systems
Exam Tip:
✔ Best for:
- Disaster recovery when app cannot be redesigned
3.11 Use AWS Storage Services for Durability
Services:
- Amazon S3
- Amazon EFS
Benefits:
S3:
- Extremely high durability
- Store backups, static data
EFS:
- Shared file system across instances
Exam Tip:
✔ Replace local storage with managed storage for reliability
4. Important Patterns for the Exam
Pattern 1: Lift-and-Shift Improvement
- Move app to EC2
- Add ELB + Auto Scaling + Multi-AZ
Pattern 2: External Resilience Layer
- Add a load balancer, caching, and DNS failover around the application
Pattern 3: Failover Setup
- Route 53 + health checks
- Secondary environment
5. Key Exam Scenarios
Scenario 1:
Legacy app on single EC2 instance
✔ Solution:
- Run it in an Auto Scaling group across multiple AZs behind an ELB
Scenario 2:
Database is single point of failure
✔ Solution:
- RDS Multi-AZ
Scenario 3:
Application overloaded
✔ Solution:
- CloudFront + ElastiCache
Scenario 4:
No app modification allowed
✔ Solution:
- Use infrastructure services (ELB, Route 53, Auto Scaling)
Scenario 5:
Need disaster recovery
✔ Solution:
- AWS Elastic Disaster Recovery
6. Key Exam Tips (VERY IMPORTANT)
✔ If question says:
- “Cannot modify application”
👉 Use infrastructure-based solutions
✔ Always think:
- Can I fix this with a load balancer, Auto Scaling, Multi-AZ, or DNS failover?
✔ Avoid:
- Solutions requiring code changes
✔ Prefer:
- Managed AWS services
7. Summary (Quick Revision)
To improve reliability of legacy applications without changing code:
Use:
- ELB → distribute traffic
- Auto Scaling → self-healing
- Multi-AZ → high availability
- Route 53 → failover
- CloudFront → caching
- ElastiCache → performance
- RDS Multi-AZ → database HA
- S3/EFS/EBS → durability
- Systems Manager → maintenance
- Elastic Disaster Recovery → DR
Final Exam Strategy
👉 When you see:
- Legacy application
- No code changes allowed
✅ Your answer should focus on:
- Adding AWS services around the application
- Not modifying the application itself
