Selecting the appropriate backup and/or archival solution

Task Statement 4.1: Design cost-optimized storage solutions.

📘 AWS Certified Solutions Architect – Associate (SAA-C03)


1. Understanding Backup vs Archive

Before choosing a solution, it’s important to know the difference:

| Concept | Purpose | Frequency / Access | Cost | Example Use Case (IT-focused) |
|---|---|---|---|---|
| Backup | Protect data against loss (accidental deletion, corruption) | Frequent or regular | Higher than archival | Backing up databases, application servers, or virtual machines |
| Archive | Long-term storage for data that's rarely accessed | Rarely | Low | Storing old logs, audit records, historical data for compliance |

Key exam point: Backup is for recovery; archive is for retention. AWS often emphasizes cost vs access frequency.


2. AWS Backup Options

AWS provides multiple services for backups and archives. Choosing the right one depends on data type, access frequency, and retention requirements.

a. Amazon S3 (Simple Storage Service)

  • S3 Standard: High durability, frequent access, low latency
    • Use for current, frequently accessed data.
  • S3 Standard-Infrequent Access (S3 Standard-IA): Cheaper, for data accessed occasionally
  • S3 One Zone-IA: Even cheaper, stores data in one availability zone (less resilient)
  • S3 Glacier / Glacier Deep Archive: Lowest cost, designed for long-term archival; retrieval takes minutes to hours
    • Glacier Flexible Retrieval: minutes (expedited) to hours (standard or bulk)
    • Glacier Deep Archive: standard retrieval within 12 hours

Exam Tip: If asked “low cost + long-term storage,” think S3 Glacier or Deep Archive.
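To make the cost-vs-access tradeoff concrete, here is a minimal, purely illustrative decision helper (the thresholds are assumptions, not AWS rules) that maps an access pattern to the `StorageClass` values the S3 API accepts:

```python
# Illustrative decision helper (NOT an AWS API): maps access frequency and
# retention to the StorageClass strings accepted by S3. Thresholds are
# assumptions chosen for this sketch.
def choose_storage_class(accesses_per_month: float, retention_days: int) -> str:
    if accesses_per_month >= 1:
        return "STANDARD"       # frequent access, low latency
    if retention_days <= 90:
        return "STANDARD_IA"    # infrequent access, shorter retention
    if retention_days <= 365:
        return "GLACIER"        # rare access, minutes-to-hours retrieval
    return "DEEP_ARCHIVE"       # long-term compliance archives

print(choose_storage_class(30, 30))     # hot data -> STANDARD
print(choose_storage_class(0.1, 2555))  # 7-year audit logs -> DEEP_ARCHIVE
```

The exam scenarios follow the same logic: read off the stated access frequency and retention period, then pick the cheapest class that still satisfies them.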


b. AWS Backup

  • Centralized backup service across AWS resources
  • Supports:
    • EBS volumes (Elastic Block Store)
    • RDS databases
    • DynamoDB tables
    • EFS file systems
    • Storage Gateway
  • Features:
    • Automated backup scheduling
    • Lifecycle policies to move backups to cheaper storage
    • Point-in-time restore for recovery

Exam Tip: AWS Backup is often the recommended managed service for enterprise-wide backup strategy.
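The scheduling and lifecycle features above come together in a backup plan document. A sketch of the shape AWS Backup's `create_backup_plan` API expects (the plan name, vault name, schedule, and day counts are placeholders):

```python
# Backup plan document in the shape expected by AWS Backup's
# create_backup_plan API. Names, schedule, and day counts are placeholders.
backup_plan = {
    "BackupPlanName": "daily-with-cold-storage",       # placeholder name
    "Rules": [
        {
            "RuleName": "daily",
            "TargetBackupVaultName": "Default",        # placeholder vault
            "ScheduleExpression": "cron(0 5 * * ? *)", # daily at 05:00 UTC
            "Lifecycle": {
                "MoveToColdStorageAfterDays": 30,  # move to cheaper storage
                "DeleteAfterDays": 365,            # then expire the backup
            },
        }
    ],
}

# With AWS credentials configured, this would be submitted as:
# import boto3
# boto3.client("backup").create_backup_plan(BackupPlan=backup_plan)
```

Note that AWS requires the delete date to be at least 90 days after the cold-storage transition, which the values above satisfy.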


c. Database-specific backups

  • RDS Snapshots: Managed backups of relational databases
    • Automatic snapshots or manual snapshots
    • Can be copied to other regions (for disaster recovery)
  • DynamoDB On-Demand Backup: Full backup of NoSQL tables
  • Redshift Snapshots: For data warehouses

Exam Tip: Remember: automated backups + manual snapshots = flexible recovery.
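The cross-region copy mentioned above can be sketched with RDS's `copy_db_snapshot` call; all identifiers, account numbers, and regions below are placeholders:

```python
# Parameters for RDS copy_db_snapshot, used to replicate a snapshot to
# another region for disaster recovery. All identifiers are placeholders.
copy_params = {
    # Cross-region copies reference the source snapshot by its ARN.
    "SourceDBSnapshotIdentifier": (
        "arn:aws:rds:us-east-1:123456789012:snapshot:prod-db-2024-01-01"
    ),
    "TargetDBSnapshotIdentifier": "prod-db-2024-01-01-dr-copy",
    "SourceRegion": "us-east-1",  # boto3 uses this to presign the request
}

# With credentials configured, run the copy from the DESTINATION region:
# import boto3
# boto3.client("rds", region_name="us-west-2").copy_db_snapshot(**copy_params)
```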


d. EBS Snapshots

  • Point-in-time backup of an EBS volume
  • Stored in S3 behind the scenes
  • Can be automated using AWS Backup or custom scripts
  • Supports incremental backups → cheaper and faster

Exam Tip: Incremental backups only save changes since the last snapshot.


3. Key Factors for Selecting a Backup/Archive Solution

When selecting a solution, AWS wants you to think cost-optimization and requirements. Here’s a simple checklist:

  1. Frequency of Access
    • Frequently accessed → S3 Standard or Standard-IA
    • Rarely accessed → Glacier / Deep Archive
  2. Recovery Time Objective (RTO)
    • How fast you need to restore data
    • S3 Glacier Deep Archive has longer RTO, so not suitable for rapid recovery
  3. Retention Period
    • Short-term backups → S3 Standard / AWS Backup
    • Long-term archival → Glacier / Deep Archive
  4. Compliance & Legal Requirements
    • Data must remain unaltered → Use S3 Object Lock / WORM policies
  5. Cost Optimization
    • Frequent access + high durability → S3 Standard
    • Rare access → S3 IA or Glacier (much cheaper)
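For the compliance requirement in item 4, S3 Object Lock enforces WORM retention. A sketch of the configuration in the shape S3's `put_object_lock_configuration` API expects (the bucket name and 7-year period are placeholders):

```python
# S3 Object Lock (WORM) configuration in the shape expected by the
# put_object_lock_configuration API. The 7-year period is a placeholder.
object_lock_config = {
    "ObjectLockEnabled": "Enabled",
    "Rule": {
        "DefaultRetention": {
            "Mode": "COMPLIANCE",  # retention cannot be shortened or removed
            "Years": 7,            # e.g. a 7-year audit retention mandate
        }
    },
}

# import boto3
# boto3.client("s3").put_object_lock_configuration(
#     Bucket="audit-logs-bucket",  # placeholder; Object Lock must be
#     ObjectLockConfiguration=object_lock_config,  # enabled at bucket creation
# )
```

COMPLIANCE mode is the stricter of the two lock modes: unlike GOVERNANCE mode, no user (including the root account) can shorten or remove the retention period.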

4. Example Backup/Archival Architectures (IT-focused)

| Data Type | Recommended AWS Storage | Reason |
|---|---|---|
| Production DB | RDS with automated backups + snapshots | Fast recovery, automated |
| Application logs (30 days) | S3 Standard → S3 IA | Logs accessed occasionally, cost savings |
| Audit logs (7 years) | S3 Glacier Deep Archive | Rare access, very low cost, compliance |
| EBS volumes | EBS Snapshots via AWS Backup | Incremental backups, easy recovery |
| Large file share (NFS) | EFS + AWS Backup | Managed backup, easy to restore files |

Exam Tip: Often, AWS gives a scenario with data access frequency + retention. Map it to S3 storage class + AWS Backup.


5. Lifecycle Policies for Cost Optimization

  • AWS allows automatic transition between storage classes to save money:
    • Example: Move logs from S3 Standard → S3 IA → Glacier → Deep Archive over time
  • Use lifecycle rules to automatically delete or archive old data
  • Exam Tip: Lifecycle policies = cost optimization tool, often tested in SAA-C03.
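The Standard → IA → Glacier → Deep Archive progression above can be expressed as a lifecycle configuration in the shape S3's `put_bucket_lifecycle_configuration` API expects (the prefix, day counts, and bucket name are placeholders):

```python
# Lifecycle rule in the shape expected by S3's
# put_bucket_lifecycle_configuration API. Prefix and day counts are
# placeholders; pick values that match your access pattern and retention.
lifecycle_config = {
    "Rules": [
        {
            "ID": "tier-down-logs",
            "Filter": {"Prefix": "logs/"},   # apply only to this prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
            "Expiration": {"Days": 2555},    # delete after ~7 years
        }
    ]
}

# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-log-bucket", LifecycleConfiguration=lifecycle_config)
```

Each transition's `Days` value must be larger than the previous one, and the expiration must come after the last transition, so objects tier down in order and are deleted only at the end.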

6. Best Practices for the Exam

  1. Understand RTO vs RPO
    • RTO (Recovery Time Objective): How quickly you need data restored
    • RPO (Recovery Point Objective): How much data you can afford to lose
  2. Cost vs Access Tradeoff
    • Frequent access = higher cost
    • Rare access = lower cost
  3. Use Managed Services
    • AWS Backup, RDS snapshots, DynamoDB backups → less manual work, easier for exams
  4. Automate
    • Lifecycle policies + automated backups = exam-friendly design
  5. Multi-region or Multi-AZ
    • Important for disaster recovery scenarios
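RPO follows directly from backup frequency: if backups run every N hours, data written just after a backup can be lost for up to N hours before the next backup captures it. A small illustration of that arithmetic:

```python
# Worst-case RPO equals the interval between backups: the most data you can
# lose is everything written since the last backup completed.
def worst_case_rpo_hours(backups_per_day: int) -> float:
    return 24 / backups_per_day

print(worst_case_rpo_hours(4))   # backups every 6 hours -> 6.0
print(worst_case_rpo_hours(24))  # hourly backups -> 1.0
```

So a requirement like "at most 1 hour of data loss" translates to backing up (or replicating) at least hourly, regardless of which AWS service performs the backup.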

7. Key AWS Services to Remember for Exam

| Service | Purpose |
|---|---|
| S3 Standard / IA / Glacier / Deep Archive | Object storage & archiving |
| AWS Backup | Centralized backup management |
| RDS Snapshots | Relational database backups |
| DynamoDB Backup & Restore | NoSQL database backups |
| EBS Snapshots | Volume-level backups |
| EFS + AWS Backup | File system backups |

Summary:

  • Backup = frequent, quick recovery; Archive = long-term, low cost
  • Use AWS Backup + snapshots for automation
  • S3 storage classes are your main tool for cost optimization
  • Lifecycle policies save money over time
  • Always consider RTO, RPO, access frequency, and retention