Selecting the appropriate backup and/or archival solution

Task Statement 4.1: Design cost-optimized storage solutions.

📘 AWS Certified Solutions Architect – Associate (SAA-C03)


1. Understanding Backup vs Archive

Before choosing a solution, it’s important to know the difference:

| Concept | Purpose | Frequency / Access | Cost | Example Use Case (IT-focused) |
|---|---|---|---|---|
| Backup | Protect data against loss (accidental deletion, corruption) | Frequent or regular | Higher than archival | Backing up databases, application servers, or virtual machines |
| Archive | Long-term storage for data that's rarely accessed | Rarely | Low | Storing old logs, audit records, historical data for compliance |

Key exam point: Backup is for recovery; archive is for retention. AWS often emphasizes cost vs access frequency.


2. AWS Backup Options

AWS provides multiple services for backups and archives. Choosing the right one depends on data type, access frequency, and retention requirements.

a. Amazon S3 (Simple Storage Service)

  • S3 Standard: High durability, frequent access, low latency
    • Use for current, frequently accessed data.
  • S3 Standard-Infrequent Access (S3 Standard-IA): Cheaper, for data accessed occasionally
  • S3 One Zone-IA: Even cheaper, stores data in one availability zone (less resilient)
  • S3 Glacier / Glacier Deep Archive: Lowest cost, designed for long-term archival; retrieval takes minutes to hours
    • Glacier Flexible Retrieval: minutes (expedited) to hours (standard or bulk)
    • Glacier Deep Archive: standard retrieval within 12 hours

Exam Tip: If asked “low cost + long-term storage,” think S3 Glacier or Deep Archive.
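To make the cost-vs-access tradeoff concrete, here is a minimal, purely illustrative decision helper (the thresholds are assumptions, not AWS rules) that maps an access pattern to the `StorageClass` values the S3 API accepts:

```python
# Illustrative decision helper (NOT an AWS API): maps access frequency and
# retention to the StorageClass strings accepted by S3. Thresholds are
# assumptions chosen for this sketch.
def choose_storage_class(accesses_per_month: float, retention_days: int) -> str:
    if accesses_per_month >= 1:
        return "STANDARD"       # frequent access, low latency
    if retention_days <= 90:
        return "STANDARD_IA"    # infrequent access, shorter retention
    if retention_days <= 365:
        return "GLACIER"        # rare access, minutes-to-hours retrieval
    return "DEEP_ARCHIVE"       # long-term compliance archives

print(choose_storage_class(30, 30))     # hot data -> STANDARD
print(choose_storage_class(0.1, 2555))  # 7-year audit logs -> DEEP_ARCHIVE
```

The exam scenarios follow the same logic: read off the stated access frequency and retention period, then pick the cheapest class that still satisfies them.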


b. AWS Backup

  • Centralized backup service across AWS resources
  • Supports:
    • EBS volumes (Elastic Block Store)
    • RDS databases
    • DynamoDB tables
    • EFS file systems
    • Storage Gateway
  • Features:
    • Automated backup scheduling
    • Lifecycle policies to move backups to cheaper storage
    • Point-in-time restore for recovery

Exam Tip: AWS Backup is often the recommended managed service for enterprise-wide backup strategy.
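The scheduling and lifecycle features above come together in a backup plan document. A sketch of the shape AWS Backup's `create_backup_plan` API expects (the plan name, vault name, schedule, and day counts are placeholders):

```python
# Backup plan document in the shape expected by AWS Backup's
# create_backup_plan API. Names, schedule, and day counts are placeholders.
backup_plan = {
    "BackupPlanName": "daily-with-cold-storage",       # placeholder name
    "Rules": [
        {
            "RuleName": "daily",
            "TargetBackupVaultName": "Default",        # placeholder vault
            "ScheduleExpression": "cron(0 5 * * ? *)", # daily at 05:00 UTC
            "Lifecycle": {
                "MoveToColdStorageAfterDays": 30,  # move to cheaper storage
                "DeleteAfterDays": 365,            # then expire the backup
            },
        }
    ],
}

# With AWS credentials configured, this would be submitted as:
# import boto3
# boto3.client("backup").create_backup_plan(BackupPlan=backup_plan)
```

Note that AWS requires the delete date to be at least 90 days after the cold-storage transition, which the values above satisfy.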


c. Database-specific backups

  • RDS Snapshots: Managed backups of relational databases
    • Automatic snapshots or manual snapshots
    • Can be copied to other regions (for disaster recovery)
  • DynamoDB On-Demand Backup: Full backup of NoSQL tables
  • Redshift Snapshots: For data warehouses

Exam Tip: Remember: automated backups + manual snapshots = flexible recovery.
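The cross-region copy mentioned above can be sketched with RDS's `copy_db_snapshot` call; all identifiers, account numbers, and regions below are placeholders:

```python
# Parameters for RDS copy_db_snapshot, used to replicate a snapshot to
# another region for disaster recovery. All identifiers are placeholders.
copy_params = {
    # Cross-region copies reference the source snapshot by its ARN.
    "SourceDBSnapshotIdentifier": (
        "arn:aws:rds:us-east-1:123456789012:snapshot:prod-db-2024-01-01"
    ),
    "TargetDBSnapshotIdentifier": "prod-db-2024-01-01-dr-copy",
    "SourceRegion": "us-east-1",  # boto3 uses this to presign the request
}

# With credentials configured, run the copy from the DESTINATION region:
# import boto3
# boto3.client("rds", region_name="us-west-2").copy_db_snapshot(**copy_params)
```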


d. EBS Snapshots

  • Point-in-time backup of an EBS volume
  • Stored in S3 behind the scenes
  • Can be automated using AWS Backup or custom scripts
  • Supports incremental backups → cheaper and faster

Exam Tip: Incremental backups only save changes since the last snapshot.


3. Key Factors for Selecting a Backup/Archive Solution

When selecting a solution, AWS wants you to think cost-optimization and requirements. Here’s a simple checklist:

  1. Frequency of Access
    • Frequently accessed → S3 Standard or Standard-IA
    • Rarely accessed → Glacier / Deep Archive
  2. Recovery Time Objective (RTO)
    • How fast you need to restore data
    • S3 Glacier Deep Archive has longer RTO, so not suitable for rapid recovery
  3. Retention Period
    • Short-term backups → S3 Standard / AWS Backup
    • Long-term archival → Glacier / Deep Archive
  4. Compliance & Legal Requirements
    • Data must remain unaltered → Use S3 Object Lock / WORM policies
  5. Cost Optimization
    • Frequent access + high durability → S3 Standard
    • Rare access → S3 IA or Glacier (much cheaper)
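For the compliance requirement in item 4, S3 Object Lock enforces WORM retention. A sketch of the configuration in the shape S3's `put_object_lock_configuration` API expects (the bucket name and 7-year period are placeholders):

```python
# S3 Object Lock (WORM) configuration in the shape expected by the
# put_object_lock_configuration API. The 7-year period is a placeholder.
object_lock_config = {
    "ObjectLockEnabled": "Enabled",
    "Rule": {
        "DefaultRetention": {
            "Mode": "COMPLIANCE",  # retention cannot be shortened or removed
            "Years": 7,            # e.g. a 7-year audit retention mandate
        }
    },
}

# import boto3
# boto3.client("s3").put_object_lock_configuration(
#     Bucket="audit-logs-bucket",  # placeholder; Object Lock must be
#     ObjectLockConfiguration=object_lock_config,  # enabled at bucket creation
# )
```

COMPLIANCE mode is the stricter of the two lock modes: unlike GOVERNANCE mode, no user (including the root account) can shorten or remove the retention period.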

4. Example Backup/Archival Architectures (IT-focused)

| Data Type | Recommended AWS Storage | Reason |
|---|---|---|
| Production DB | RDS with automated backups + snapshots | Fast recovery, automated |
| Application logs (30 days) | S3 Standard → S3 IA | Logs accessed occasionally, cost savings |
| Audit logs (7 years) | S3 Glacier Deep Archive | Rare access, very low cost, compliance |
| EBS volumes | EBS Snapshots via AWS Backup | Incremental backups, easy recovery |
| Large file share (NFS) | EFS + AWS Backup | Managed backup, easy to restore files |

Exam Tip: Often, AWS gives a scenario with data access frequency + retention. Map it to S3 storage class + AWS Backup.


5. Lifecycle Policies for Cost Optimization

  • AWS allows automatic transition between storage classes to save money:
    • Example: Move logs from S3 Standard → S3 IA → Glacier → Deep Archive over time
  • Use lifecycle rules to automatically delete or archive old data
  • Exam Tip: Lifecycle policies = cost optimization tool, often tested in SAA-C03.
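The Standard → IA → Glacier → Deep Archive progression above can be expressed as a lifecycle configuration in the shape S3's `put_bucket_lifecycle_configuration` API expects (the prefix, day counts, and bucket name are placeholders):

```python
# Lifecycle rule in the shape expected by S3's
# put_bucket_lifecycle_configuration API. Prefix and day counts are
# placeholders; pick values that match your access pattern and retention.
lifecycle_config = {
    "Rules": [
        {
            "ID": "tier-down-logs",
            "Filter": {"Prefix": "logs/"},   # apply only to this prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
            "Expiration": {"Days": 2555},    # delete after ~7 years
        }
    ]
}

# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-log-bucket", LifecycleConfiguration=lifecycle_config)
```

Each transition's `Days` value must be larger than the previous one, and the expiration must come after the last transition, so objects tier down in order and are deleted only at the end.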

6. Best Practices for the Exam

  1. Understand RTO vs RPO
    • RTO (Recovery Time Objective): How quickly you need data restored
    • RPO (Recovery Point Objective): How much data you can afford to lose
  2. Cost vs Access Tradeoff
    • Frequent access = higher cost
    • Rare access = lower cost
  3. Use Managed Services
    • AWS Backup, RDS snapshots, DynamoDB backups → less manual work, easier for exams
  4. Automate
    • Lifecycle policies + automated backups = exam-friendly design
  5. Multi-region or Multi-AZ
    • Important for disaster recovery scenarios
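RPO follows directly from backup frequency: if backups run every N hours, data written just after a backup can be lost for up to N hours before the next backup captures it. A small illustration of that arithmetic:

```python
# Worst-case RPO equals the interval between backups: the most data you can
# lose is everything written since the last backup completed.
def worst_case_rpo_hours(backups_per_day: int) -> float:
    return 24 / backups_per_day

print(worst_case_rpo_hours(4))   # backups every 6 hours -> 6.0
print(worst_case_rpo_hours(24))  # hourly backups -> 1.0
```

So a requirement like "at most 1 hour of data loss" translates to backing up (or replicating) at least hourly, regardless of which AWS service performs the backup.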

7. Key AWS Services to Remember for Exam

| Service | Purpose |
|---|---|
| S3 Standard / IA / Glacier / Deep Archive | Object storage & archiving |
| AWS Backup | Centralized backup management |
| RDS Snapshots | Relational database backups |
| DynamoDB Backup & Restore | NoSQL database backups |
| EBS Snapshots | Volume-level backups |
| EFS + AWS Backup | File system backups |

Summary:

  • Backup = frequent, quick recovery; Archive = long-term, low cost
  • Use AWS Backup + snapshots for automation
  • S3 storage classes are your main tool for cost optimization
  • Lifecycle policies save money over time
  • Always consider RTO, RPO, access frequency, and retention