Task Statement 2.2: Design highly available and/or fault-tolerant architectures.
📘 AWS Certified Solutions Architect – Associate (SAA-C03)
1. What Are Service Quotas?
Definition
Service quotas are limits set by AWS on how many resources you can create or how many operations you can perform in an account.
These limits help:
- Protect AWS infrastructure
- Prevent accidental overuse
- Ensure fair usage across customers
Types of Service Quotas
1. Default Quotas
- Automatically applied when you create an AWS account
- Examples: On-Demand EC2 capacity per Region (measured in vCPUs, not instance count) and VPCs per Region (5 by default)
2. Adjustable Quotas
- Can be increased by submitting a request to AWS
- Examples: EC2 instances, Elastic Load Balancers, and some IAM limits such as roles per account
3. Hard Limits (Non-adjustable)
- Cannot be increased
- Example: certain IAM and networking limits
Important Points for the Exam
- Most quotas apply per Region, per account
- Some apply to the whole account across all Regions (for example, IAM quotas)
- Always check quotas before scaling a workload
- Use AWS tools to monitor and request increases
2. AWS Service Quotas Tool
AWS provides a service called:
👉 Service Quotas
What it does:
- View all quotas
- Request increases
- Set alerts
Integration with CloudWatch
You can:
- Monitor quota usage
- Set alarms when usage is near limit
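The workflow above (view a quota, compare it with usage, request an increase) can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes `client` behaves like `boto3.client("service-quotas")` and uses its `get_service_quota` and `request_service_quota_increase` calls. The helper names, the 80% threshold, and the doubling factor are illustrative assumptions, and the current usage figure has to be supplied by the caller (for example, from a CloudWatch usage metric).

```python
def quota_utilization(current_usage: float, quota_value: float) -> float:
    """Return usage as a percentage of the quota."""
    if quota_value <= 0:
        raise ValueError("quota_value must be positive")
    return 100.0 * current_usage / quota_value


def ensure_headroom(client, service_code, quota_code, current_usage,
                    threshold_pct=80.0, growth_factor=2.0):
    """Request a quota increase when utilization reaches the threshold.

    `client` is assumed to behave like boto3.client("service-quotas").
    Returns the requested value, or None if no request was needed.
    """
    response = client.get_service_quota(ServiceCode=service_code,
                                        QuotaCode=quota_code)
    quota = response["Quota"]
    if quota_utilization(current_usage, quota["Value"]) < threshold_pct:
        return None  # enough headroom, nothing to do
    if not quota.get("Adjustable", False):
        raise RuntimeError(f"Quota {quota_code} is not adjustable")
    desired = quota["Value"] * growth_factor  # hypothetical sizing rule
    client.request_service_quota_increase(ServiceCode=service_code,
                                          QuotaCode=quota_code,
                                          DesiredValue=desired)
    return desired
```

With a real client, the call would look like `ensure_headroom(boto3.client("service-quotas"), "ec2", "L-1216C47A", current_usage=90)`. The quota code shown is given as an example only; look up the code for your own quota in the Service Quotas console before using it.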
3. What Is Throttling?
Definition
Throttling happens when you send too many requests to an AWS service, and AWS limits (slows down or blocks) those requests.
Why Throttling Happens
- You exceed API request limits
- Too many operations in a short time
- Service protection mechanisms
Example in IT Context
A backend application sends:
- Thousands of API calls per second to DynamoDB or EC2
If it exceeds allowed limits:
- AWS returns errors such as ThrottlingException or RequestLimitExceeded
4. How AWS Handles Throttling
When throttling occurs:
- Requests may fail temporarily
- AWS expects you to retry properly
Best Practice: Exponential Backoff
Instead of retrying immediately:
- Wait before retrying
- Increase wait time gradually
Example:
- 1st retry → wait 1 second
- 2nd retry → wait 2 seconds
- 3rd retry → wait 4 seconds
This reduces pressure on AWS services
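The 1, 2, 4 second schedule above can be sketched as a small retry helper. This is an illustrative sketch: the function names and parameters are hypothetical, and the random "jitter" applied to each wait is an addition not described above. Jitter is commonly used so that many clients do not all retry at the same moment.

```python
import random
import time


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 30.0):
    """Return the un-jittered wait before each retry: base, 2*base, 4*base, ... capped."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_backoff(operation, is_retryable, max_retries=3, base=1.0,
                      cap=30.0, sleep=time.sleep, rng=random.random):
    """Call operation(); on a retryable exception, wait and try again.

    Each wait is a random fraction of the exponential delay ("full jitter").
    The last exception is re-raised once max_retries is exhausted.
    """
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception as exc:
            if attempt == max_retries or not is_retryable(exc):
                raise
            sleep(delays[attempt] * rng())
```

Here `backoff_delays(3)` yields the same 1, 2, 4 second sequence as the example, and the `cap` keeps waits from growing without bound on long retry chains.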
5. Designing for High Availability (Exam Focus)
This is the most important part for SAA-C03.
You must design systems that:
- Do not fail due to quotas or throttling
A. Plan Capacity with Quotas
Before deploying:
- Check quotas for:
- EC2 instances
- Load balancers
- RDS connections
- Lambda concurrency
B. Request Quota Increase in Advance
Especially important for:
- Production systems
- High traffic workloads
C. Use Multiple Resources
Instead of:
- One large resource
Use:
- Multiple smaller resources
Example:
- Multiple EC2 instances behind a Load Balancer
D. Use Caching
Reduce API calls by using:
- Amazon CloudFront
- Amazon ElastiCache
This reduces throttling risk
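To illustrate why caching lowers the request rate, here is a minimal in-process sketch. It is a stand-in for the idea only: it does not use the CloudFront or ElastiCache APIs, and the `TTLCache` class and its parameters are hypothetical names chosen for this example.

```python
import time


class TTLCache:
    """Tiny in-process cache: reuse a fetched value until it is ttl_seconds old."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch      # function that makes the real, throttle-prone call
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}       # key -> (expiry_time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # cache hit: no backend call
        value = self._fetch(key) # miss or expired: exactly one backend call
        self._entries[key] = (now + self._ttl, value)
        return value
```

Repeated `get` calls for the same key within the TTL reach the backend only once, which is the effect a managed cache provides at larger scale.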
E. Use Queues
Use:
- Amazon SQS
To:
- Smooth traffic spikes
- Avoid sudden request bursts
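The smoothing effect of a queue can be shown with a small simulation. This sketch does not call Amazon SQS; it models a queue with `collections.deque`, and the function name and "tick" time steps are assumptions made for illustration.

```python
from collections import deque


def drain_with_queue(arrivals_per_tick, max_per_tick):
    """Simulate a queue between producers and a rate-limited backend.

    arrivals_per_tick: number of requests arriving in each time step.
    max_per_tick: how many requests the backend accepts per step.
    Returns the number of requests sent to the backend in each step,
    continuing after the last arrival until the queue is empty.
    """
    if max_per_tick <= 0:
        raise ValueError("max_per_tick must be positive")
    queue = deque()
    sent_per_tick = []
    tick = 0
    next_id = 0
    while tick < len(arrivals_per_tick) or queue:
        arrivals = arrivals_per_tick[tick] if tick < len(arrivals_per_tick) else 0
        for _ in range(arrivals):
            queue.append(next_id)   # buffer the request instead of sending it now
            next_id += 1
        sent = 0
        while queue and sent < max_per_tick:
            queue.popleft()         # forward one buffered request to the backend
            sent += 1
        sent_per_tick.append(sent)
        tick += 1
    return sent_per_tick
```

For example, `drain_with_queue([10, 0, 0], 3)` returns `[3, 3, 3, 1]`: a burst of 10 requests reaches the backend at no more than 3 per step, so nothing is rejected even though the burst exceeds the per-step limit.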
6. Service Quotas in Standby Environments (Very Important)
This is a key exam concept.
What Is a Standby Environment?
A standby environment is a backup system used for failover.
Examples:
- Disaster Recovery setup
- Multi-region architecture
The Problem
In standby:
- Resources may not be fully running
- But quotas still apply
Key Risk
During failover:
- You try to launch resources
- But quota is too low
- System fails
Example Scenario (Exam-style)
Primary Region:
- 100 EC2 instances running
Standby Region:
- Quota allows only 20 instances
During failover:
- You cannot launch 100 instances → failure
Solution (Exam Answer)
You MUST:
1. Pre-configure quotas in the standby Region
- Ensure quotas match the primary Region's capacity
2. Request quota increases BEFORE a failure, because approval is not always immediate
3. Test failover regularly
Key Exam Statement
👉 Always ensure standby environments have sufficient service quotas to handle full failover load.
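The failover scenario above (100 instances needed, quota of 20) amounts to a simple comparison that can be run before any disaster. The sketch below assumes you have already collected peak usage for the primary Region and quota values for the standby Region. The function name and dictionary keys are hypothetical.

```python
def find_quota_gaps(primary_usage, standby_quotas):
    """Compare peak primary usage with standby quotas.

    primary_usage: {quota_name: peak amount in use in the primary Region}
    standby_quotas: {quota_name: quota value in the standby Region}
    Returns {quota_name: shortfall} for every quota too low for full failover.
    A quota missing from standby_quotas is treated as 0.
    """
    gaps = {}
    for name, needed in primary_usage.items():
        available = standby_quotas.get(name, 0)
        if available < needed:
            gaps[name] = needed - available
    return gaps
```

With the numbers from the scenario, `find_quota_gaps({"ec2_instances": 100}, {"ec2_instances": 20})` returns `{"ec2_instances": 80}`, the shortfall to close with a quota increase request. An empty result means the standby Region can absorb the full load as far as these quotas are concerned.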
7. Handling Throttling in Architectures
A. Retry Logic
Applications should:
- Detect throttling errors
- Retry with exponential backoff
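Detecting a throttling error usually means inspecting the error code in the service response. The sketch below assumes a botocore-style error dictionary (the shape of `ClientError.response`). The list of codes is a non-exhaustive sample, and the helper name is hypothetical.

```python
# Non-exhaustive sample of error codes that commonly indicate throttling.
THROTTLING_CODES = frozenset({
    "Throttling",
    "ThrottlingException",
    "RequestLimitExceeded",                    # returned by EC2
    "ProvisionedThroughputExceededException",  # returned by DynamoDB
    "TooManyRequestsException",
    "SlowDown",                                # returned by S3
})


def is_throttling_error(error_response) -> bool:
    """Return True if a botocore-style error response carries a throttling code.

    error_response is assumed to look like ClientError.response:
    {"Error": {"Code": "...", "Message": "..."}, ...}
    """
    code = error_response.get("Error", {}).get("Code", "")
    return code in THROTTLING_CODES
```

Only errors classified this way should be retried with backoff. Retrying other errors, such as access-denied responses, just adds load without any chance of success.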
B. Use SDKs
AWS SDKs:
- Automatically retry throttled requests
- Apply exponential backoff between retries
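As a hedged illustration of tuning SDK retries in Python: botocore's `Config` object accepts a `retries` dictionary with `max_attempts` and `mode` keys. The helper below only builds and validates that dictionary so it runs without boto3 installed. The helper name and default values are assumptions, and the exact semantics of each mode should be checked against the botocore documentation.

```python
VALID_RETRY_MODES = ("legacy", "standard", "adaptive")


def retry_settings(max_attempts: int = 5, mode: str = "standard") -> dict:
    """Build the `retries` dictionary accepted by botocore.config.Config.

    Intended use (requires boto3/botocore, which are not imported here):
        Config(retries=retry_settings(10, "adaptive"))
    """
    if mode not in VALID_RETRY_MODES:
        raise ValueError(f"mode must be one of {VALID_RETRY_MODES}")
    if max_attempts < 0:
        raise ValueError("max_attempts must not be negative")
    return {"max_attempts": max_attempts, "mode": mode}
```

The resulting `Config` would then be passed when creating a client, for example `boto3.client("dynamodb", config=Config(retries=retry_settings()))`.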
C. Rate Limiting
Control how fast your application sends requests
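One common way to control the request rate on the client side is a token bucket. The sketch below is one possible implementation under stated assumptions: the class name, parameters, and injectable clock are illustrative and not part of any AWS SDK.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity   # start full so an initial burst is allowed
        self._last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

An application would call `try_acquire()` before each API request and delay or queue the request when it returns `False`, keeping its own send rate below the service limit instead of relying on the service to reject the excess.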
D. Use Async Processing
Instead of direct calls:
- Use queues (SQS)
- Use event-driven services (Lambda)
8. Common Services with Quotas & Throttling
Know these for the exam:
EC2
- Instance limits per Region
AWS Lambda
- Concurrency limits
- Burst limits
Amazon DynamoDB
- Read/Write capacity limits
- Can throttle if exceeded
API Gateway
- Requests per second limits
Amazon SQS
- Message throughput limits
Amazon RDS
- Connection limits (depend on the DB engine and instance class)
9. Monitoring and Alerts
Use:
- Amazon CloudWatch: monitor usage and set alarms
- Service Quotas dashboard: track limits
10. Exam Tips (Very Important)
1. Always Think Ahead
If a question mentions:
- Scaling
- Failover
- Disaster Recovery
👉 Think: Are quotas sufficient?
2. Standby Region Questions
Correct answer usually includes:
- Increasing quotas in standby region
3. Throttling Questions
Correct solutions:
- Retry with exponential backoff
- Use SQS buffering
- Reduce request rate
4. Wrong Answers Usually Include
- Ignoring quotas
- Immediate retries without delay
- No retry logic
11. Summary
Service Quotas
- Limits on AWS resources
- Must be planned and increased if needed
- Critical for scaling and failover
Throttling
- Happens when too many requests are sent
- Requires retry logic and traffic control
For High Availability
- Pre-configure quotas
- Match standby capacity
- Use retries, queues, and caching
Final Key Takeaway
👉 A well-designed AWS system must handle both limits (quotas) and request pressure (throttling) to remain highly available and fault tolerant.
