Distributed design patterns

Task Statement 2.2: Design highly available and/or fault-tolerant architectures.

📘 AWS Certified Solutions Architect – Associate (SAA-C03)


1. What is a Distributed System?

A distributed system is an architecture where:

  • Components run on multiple machines (nodes)
  • These nodes communicate over a network
  • Workload is shared across multiple resources

In AWS, this usually means:

  • Multiple EC2 instances
  • Multiple Availability Zones (AZs)
  • Multiple Regions
  • Managed services like Amazon SQS, Amazon DynamoDB, Amazon S3

2. Why Distributed Design Patterns are Important

Distributed patterns help achieve:

1. High Availability

The system stays reachable with minimal downtime when a component fails, typically by failing over to redundant resources.

2. Fault Tolerance

The system keeps operating without interruption when a component fails, because failures are isolated and redundant components absorb them.

3. Scalability

System can handle increasing traffic by adding more resources.

4. Resilience

System recovers quickly from failures.


3. Key Distributed Design Patterns (Exam-Focused)

3.1 Load Balancing Pattern

A load balancer distributes incoming traffic across multiple servers.

AWS Services:

  • Elastic Load Balancing (ELB), the managed service that provides:
    • Application Load Balancer (ALB)
    • Network Load Balancer (NLB)

How it works:

  • Traffic comes in
  • Load balancer sends it to healthy instances
  • If an instance fails, traffic is redirected
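
The steps above can be sketched in a few lines of Python. This is only a toy model of the idea, not how ELB is implemented: the class name `RoundRobinBalancer` and the instance IDs are hypothetical, and health status is set by hand instead of by real health checks.

```python
from itertools import count


class RoundRobinBalancer:
    """Toy load balancer: rotates across targets, skipping unhealthy ones."""

    def __init__(self, targets):
        self.targets = list(targets)
        self.healthy = set(self.targets)  # would be updated by health checks
        self._counter = count()

    def mark_unhealthy(self, target):
        self.healthy.discard(target)

    def mark_healthy(self, target):
        if target in self.targets:
            self.healthy.add(target)

    def route(self):
        """Return the next healthy target in rotation."""
        # One full pass over the targets is enough to find a healthy one.
        for _ in range(len(self.targets)):
            index = next(self._counter) % len(self.targets)
            candidate = self.targets[index]
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy targets")


balancer = RoundRobinBalancer(["i-a", "i-b", "i-c"])  # hypothetical instance IDs
first_round = [balancer.route() for _ in range(3)]

balancer.mark_unhealthy("i-b")  # simulate a failed health check
after_failure = [balancer.route() for _ in range(4)]
```

After `i-b` is marked unhealthy, requests keep flowing to the remaining targets without the caller changing anything.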

Exam Tip:

  • Use ALB for HTTP/HTTPS (layer 7)
  • Use NLB for TCP/UDP traffic that needs very high throughput and low latency (layer 4)

3.2 Stateless Architecture Pattern

A system is stateless when:

  • No session data is stored on the server
  • Each request is independent

AWS Implementation:

  • Store session data in:
    • Amazon DynamoDB
    • Amazon ElastiCache (Redis/Memcached)
  • Use multiple EC2 instances behind a load balancer

Why important?

  • Any instance can handle any request
  • Easy to scale horizontally

Exam Tip:

  • Stateless systems = easier to scale and recover
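
A minimal sketch of the stateless idea, assuming an in-memory `SessionStore` class as a stand-in for an external store such as DynamoDB or ElastiCache. All class and method names here are hypothetical; the point is that the servers hold no session data of their own.

```python
class SessionStore:
    """Stand-in for an external session store shared by all servers."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return dict(self._data.get(session_id, {}))

    def put(self, session_id, session):
        self._data[session_id] = dict(session)


class AppServer:
    """Keeps no session state locally; every request reads and writes the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        cart = session.get("cart", []) + [item]
        self.store.put(session_id, {"cart": cart})
        return {"served_by": self.name, "cart": cart}


store = SessionStore()
server_a = AppServer("server-a", store)
server_b = AppServer("server-b", store)

server_a.add_to_cart("session-1", "book")
response = server_b.add_to_cart("session-1", "pen")  # a different server, same session
```

Because the cart lives in the shared store, the second request succeeds even though a different server handles it, which is what lets a load balancer send any request to any instance.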

3.3 Caching Pattern

Caching stores frequently accessed data to reduce latency and backend load.

AWS Services:

  • Amazon CloudFront (CDN)
  • Amazon ElastiCache
  • API Gateway caching

Benefits:

  • Faster response time
  • Reduced load on databases

Exam Tip:

  • Use cache for read-heavy workloads
  • Common caching strategies: lazy loading (cache-aside) for reads, and write-through to keep the cache current on writes
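
To make the caching idea concrete, here is a small sketch of a read path that checks the cache first and only falls back to the backend on a miss (often called cache-aside or lazy loading). The class name `CacheAside`, the TTL value, and the `load_user` function are assumptions for illustration; a real deployment would use ElastiCache or a similar service instead of a Python dict.

```python
import time


class CacheAside:
    """Check the cache first; on a miss, load from the backend and remember it."""

    def __init__(self, load_from_backend, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_backend
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]  # cache hit: backend is not touched
        value = self._load(key)  # cache miss or expired entry
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Drop a key, for example after the backend record changes."""
        self._entries.pop(key, None)


backend_reads = []


def load_user(user_id):
    """Hypothetical slow database read; records each call for inspection."""
    backend_reads.append(user_id)
    return {"id": user_id, "name": user_id.upper()}


cache = CacheAside(load_user, ttl_seconds=30.0)
cache.get("alice")
cache.get("alice")  # served from the cache, no second backend read
```

The TTL bounds how stale a cached value can get; shorter TTLs mean fresher data but more backend reads.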

3.4 Decoupling Pattern

Decoupling separates components so they don’t directly depend on each other.

AWS Services:

  • Amazon SQS (Simple Queue Service)
  • Amazon SNS (Simple Notification Service)
  • Amazon EventBridge

How it works:

  • Producer sends message to queue
  • Consumer processes message later
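
The producer/consumer flow can be sketched with Python's standard `queue.Queue` standing in for an SQS queue. This is only an in-process illustration; the function names and message shape are hypothetical, and a real system would call the SQS API instead.

```python
import queue


def produce(order_queue, order_id):
    """Producer: enqueue the work and return immediately."""
    order_queue.put({"order_id": order_id})


def consume_all(order_queue, handler):
    """Consumer: drain whatever is waiting and return how many were handled."""
    processed = 0
    while True:
        try:
            message = order_queue.get_nowait()
        except queue.Empty:
            return processed
        handler(message)
        order_queue.task_done()
        processed += 1


orders = queue.Queue()  # stand-in for an SQS queue
produce(orders, 101)
produce(orders, 102)  # the producer does not wait for any consumer

handled = []
count_processed = consume_all(orders, handled.append)  # runs later, at its own pace
```

The producer never calls the consumer directly, so the consumer can be offline, slow, or scaled out without the producer noticing.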

Benefits:

  • Systems are independent
  • Failures do not affect the entire system

Exam Tip:

  • Use SQS for asynchronous processing
  • Use SNS for fan-out (one-to-many messaging)

3.5 Microservices Architecture Pattern

Application is broken into small, independent services.

Each service:

  • Has a specific function
  • Can be deployed independently

AWS Services:

  • AWS Lambda
  • Amazon ECS / EKS
  • API Gateway

Benefits:

  • Independent scaling
  • Fault isolation
  • Faster development

3.6 Event-Driven Architecture Pattern

System reacts to events instead of direct requests.

AWS Services:

  • Amazon EventBridge
  • SNS
  • Lambda (event triggers)

How it works:

  • Event occurs (e.g., file upload)
  • Event triggers downstream services
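
As a rough in-process illustration of the flow above, the sketch below uses a tiny event bus where publishers and subscribers only share an event name. The `EventBus` class, the `file.uploaded` event type, and the handler lists are all hypothetical; in AWS this role is played by services such as EventBridge or SNS.

```python
from collections import defaultdict


class EventBus:
    """Minimal publish/subscribe bus keyed by event type."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, detail):
        """Deliver the event to every subscriber; return how many were invoked."""
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler(detail)
        return len(handlers)


bus = EventBus()
thumbnail_jobs = []
search_index_jobs = []

# Two independent downstream services react to the same event.
bus.subscribe("file.uploaded", thumbnail_jobs.append)
bus.subscribe("file.uploaded", search_index_jobs.append)

delivered = bus.publish("file.uploaded", {"key": "photos/cat.png"})
```

The uploader does not know which services exist downstream; adding a third subscriber requires no change to the publisher.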

Exam Tip:

  • Use when:
    • Systems need loose coupling
    • Real-time processing is required

3.7 Failover Pattern

Automatic switching to a backup system when the primary system fails.

AWS Services:

  • Amazon Route 53 (DNS failover)
  • Multi-AZ deployments
  • Multi-Region architectures

Types:

  • Active-Passive
  • Active-Active

Exam Tip:

  • Use Route 53 health checks
  • Use Multi-AZ RDS for automatic failover
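
The active-passive decision itself is simple, as this sketch suggests: try endpoints in priority order and use the first one that passes its health check. Route 53 and RDS make this decision for you; the function name, endpoint names, and the dict-based health status here are assumptions for illustration only.

```python
def pick_endpoint(endpoints, is_healthy):
    """Return the first healthy endpoint, checked in priority order."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    raise RuntimeError("all endpoints are unhealthy")


# Hypothetical health status, as a health checker might report it.
health = {"primary.example.com": True, "standby.example.com": True}
priority = ["primary.example.com", "standby.example.com"]

before = pick_endpoint(priority, health.get)

health["primary.example.com"] = False  # the primary fails its health check
after = pick_endpoint(priority, health.get)
```

In an active-active setup, traffic would instead be spread across all healthy endpoints at once, with unhealthy ones removed from rotation.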

3.8 Replication Pattern

Data is copied across multiple locations.

Types:

  • Synchronous replication
  • Asynchronous replication
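
The difference between the two types can be shown with a small toy model, assuming one primary and one replica held in plain dicts. The `ReplicatedStore` class is hypothetical and ignores networks and failures; it only shows when the replica sees a write.

```python
class ReplicatedStore:
    """Toy primary/replica pair that replicates synchronously or asynchronously."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = {}
        self.replica = {}
        self._pending = []  # writes not yet shipped to the replica

    def write(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # The write is acknowledged only after the replica also has it.
            self.replica[key] = value
        else:
            # The write is acknowledged now; the replica catches up later.
            self._pending.append((key, value))

    def ship_pending(self):
        """Apply queued writes to the replica (the asynchronous catch-up step)."""
        while self._pending:
            key, value = self._pending.pop(0)
            self.replica[key] = value


sync_store = ReplicatedStore(synchronous=True)
sync_store.write("order-1", "paid")

async_store = ReplicatedStore(synchronous=False)
async_store.write("order-1", "paid")
replica_lagging = "order-1" not in async_store.replica  # replica is behind for now
async_store.ship_pending()
```

Synchronous replication trades write latency for a replica that is never behind; asynchronous replication keeps writes fast but allows a window in which the replica is stale.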

AWS Services:

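  • Amazon RDS Multi-AZ (synchronous replication to a standby in another AZ)
  • DynamoDB Global Tables (multi-Region, multi-active replication)
  • S3 Cross-Region Replication (CRR) (asynchronous)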

Benefits:

  • Data durability
  • High availability

3.9 Sharding (Partitioning) Pattern

Data is split into smaller parts and distributed across multiple databases.

Example (AWS):

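  • DynamoDB automatically partitions data by partition key
  • RDS has no built-in sharding, so the application must route each request to the right database (manual sharding)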

Benefits:

  • Improved performance
  • Scalable storage

Exam Tip:

  • Use when a database becomes too large or slow
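
A minimal sketch of how an application might route keys to shards when sharding manually, assuming simple hash-based routing. The function name `shard_for` and the shard count are hypothetical. Note that plain modulo routing remaps most keys when the shard count changes, which is one reason real systems often use consistent hashing or a lookup table instead.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index in [0, shard_count) using a stable hash."""
    # hashlib is used instead of the built-in hash(), which varies per process.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


SHARD_COUNT = 4  # assumed number of database shards
shards = [dict() for _ in range(SHARD_COUNT)]


def put(key, value):
    shards[shard_for(key, SHARD_COUNT)][key] = value


def get(key):
    return shards[shard_for(key, SHARD_COUNT)].get(key)


for user_id in ("user-1", "user-2", "user-3", "user-4", "user-5"):
    put(user_id, {"id": user_id})
```

Every read and write for a given key lands on the same shard, so no single database has to hold or serve the whole data set.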

3.10 Bulkhead Pattern

Isolates parts of a system to prevent total failure.

How it works:

  • Resources are divided into groups
  • Failure in one group does not affect others

AWS Example:

  • Separate EC2 Auto Scaling groups per service
  • Separate queues for different workloads
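
Inside a single application, the same idea can be sketched by giving each workload its own limited pool of slots, so that one saturated workload cannot consume capacity reserved for another. The `Bulkhead` class and the workload names are hypothetical.

```python
import threading


class Bulkhead:
    """Caps concurrent calls for one workload; rejects instead of queueing."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def try_run(self, func, *args):
        """Run func if a slot is free. Returns (accepted, result)."""
        if not self._slots.acquire(blocking=False):
            return False, None  # this compartment is full
        try:
            return True, func(*args)
        finally:
            self._slots.release()


# Separate compartments per workload (sizes are arbitrary for the example).
reports = Bulkhead(max_concurrent=1)
checkout = Bulkhead(max_concurrent=1)


def busy_report():
    # While this report holds the only "reports" slot, probe both compartments.
    second_report = reports.try_run(lambda: "report")
    checkout_call = checkout.try_run(lambda: "order placed")
    return second_report, checkout_call


accepted, (second_report, checkout_call) = reports.try_run(busy_report)
```

While the reports compartment is full, a second report is rejected, but checkout still runs because it has its own slots.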

3.11 Circuit Breaker Pattern

Prevents a system from repeatedly trying a failing operation.

How it works:

  • Detect failure
  • Stop sending requests temporarily
  • Retry after some time
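
The three steps above map onto a small state machine. The sketch below is one simple way to write it in application code, not a standard library or AWS API: the class names, thresholds, and timeout are assumptions, and the clock is injectable only to make the behavior easy to test.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast without touching the dependency.
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: half-open, allow one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            trial_failed = self._opened_at is not None
            if trial_failed or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success: close the circuit and reset the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each call to the fragile dependency in `breaker.call(...)` and treat `CircuitOpenError` as a signal to use a fallback or return an error quickly.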

AWS Context:

  • Used in application logic
  • Complements SDK retries with backoff, which handle transient errors but do not stop calls to a dependency that keeps failing

4. Common AWS Distributed Architectures

4.1 Multi-AZ Architecture

  • Deploy resources across multiple AZs
  • Example:
    • ALB + EC2 + RDS Multi-AZ

4.2 Multi-Region Architecture

  • Deploy across multiple Regions
  • Used for:
    • Disaster recovery
    • Global applications

4.3 Serverless Distributed Architecture

  • Uses:
    • AWS Lambda
    • API Gateway
    • DynamoDB
  • Fully managed, auto-scaling

5. Key Exam Concepts to Remember

1. Loose Coupling

  • Services should not depend directly on each other
  • Use SQS/SNS/EventBridge

2. Idempotency

  • Repeating the same operation has the same effect as performing it once
  • Important for retries in distributed systems
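
One common way to get idempotency is to have the caller attach a unique key to each logical request and have the service remember the result for that key. The sketch below assumes this approach; the `PaymentService` class, the starting balance, and the key format are hypothetical.

```python
class PaymentService:
    """Charges at most once per idempotency key, even if the request is retried."""

    def __init__(self, starting_balance):
        self.balance = starting_balance
        self._results = {}  # idempotency key -> result of the first attempt

    def charge(self, idempotency_key, amount):
        if idempotency_key in self._results:
            # Retry of a request already handled: return the stored result.
            return self._results[idempotency_key]
        self.balance -= amount
        result = {"charged": amount, "balance": self.balance}
        self._results[idempotency_key] = result
        return result


service = PaymentService(starting_balance=100)
first = service.charge("request-42", 30)
retry = service.charge("request-42", 30)  # e.g. the client timed out and retried
```

The retry returns the same result as the first attempt and the balance is only reduced once, so the client can retry safely after a timeout.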

3. Retry Mechanisms

  • Systems must handle transient failures
  • Use exponential backoff
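
A minimal sketch of retrying with exponential backoff and random jitter. The function name and default values are assumptions; the AWS SDKs ship their own retry logic, so this only illustrates the idea. The `sleep` and `rng` parameters exist so the delays can be inspected without actually waiting.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=0.1, max_delay=5.0,
                       sleep=time.sleep, rng=random.random):
    """Call func until it succeeds or max_attempts is reached."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The delay ceiling doubles each attempt, up to max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Jitter: wait a random fraction of the ceiling so that many
            # clients retrying at once do not all hit the service together.
            sleep(ceiling * rng())


attempts = []


def flaky_call():
    """Hypothetical operation that fails twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient failure")
    return "ok"


delays = []
outcome = retry_with_backoff(flaky_call, sleep=delays.append, rng=lambda: 1.0)
```

Retries like this are only safe when the operation is idempotent, since the same request may be executed more than once.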

4. Consistency Models

  • Strong consistency (immediate consistency)
  • Eventual consistency (data updates propagate over time)

Example:

  • DynamoDB reads are eventually consistent by default; a strongly consistent read can be requested per read with the ConsistentRead parameter

6. When to Use Each Pattern (Exam Tips)

Requirement              | Best Pattern
-------------------------|--------------------------
High availability        | Multi-AZ, Failover
Scalability              | Load balancing, Stateless
Asynchronous processing  | SQS
Event-based system       | EventBridge, SNS
Data replication         | Multi-AZ, CRR
Fast read performance    | Caching
Microservices            | Lambda, ECS

7. Important AWS Services for Distributed Systems

  • Amazon EC2 – compute instances
  • Amazon SQS – message queues
  • Amazon SNS – pub/sub messaging
  • Amazon DynamoDB – NoSQL database
  • Amazon RDS – relational database
  • Amazon S3 – object storage
  • Amazon CloudFront – content delivery
  • Elastic Load Balancing – traffic distribution
  • AWS Lambda – serverless compute

8. Exam Strategy Tips

  • Identify keywords in questions:
    • “scalable” → stateless, load balancing
    • “decoupled” → SQS/SNS
    • “failover” → Route 53, Multi-AZ
    • “high throughput” → DynamoDB, sharding
  • Look for failure scenarios
  • Choose services that:
    • Are managed
    • Support auto-scaling
    • Provide redundancy

9. Final Summary

Distributed design patterns help you:

  • Build highly available systems
  • Handle failures gracefully
  • Scale horizontally
  • Improve performance and resilience

In AWS, most distributed patterns are implemented using:

  • Managed services (SQS, SNS, DynamoDB, Lambda)
  • Multi-AZ and Multi-Region architectures
  • Load balancing and caching