Task Statement 4.2: Validate and audit security by using network monitoring and logging services.
📘 AWS Certified Advanced Networking – Specialty
📌 What is an Alert Mechanism?
An alert mechanism is a system that:
- Monitors your AWS resources continuously
- Detects unusual or risky behavior
- Sends notifications automatically
👉 In simple terms:
It tells you when something is wrong or needs attention in your AWS network.
☁️ Core Service: Amazon CloudWatch
The main service used for alerting is:
- Amazon CloudWatch
CloudWatch helps you:
- Collect metrics (numerical data like CPU usage, packet drops)
- Collect logs (text-based records of events)
- Create alarms based on conditions
🚨 What is a CloudWatch Alarm?
A CloudWatch Alarm watches a metric and:
- Triggers an action when a condition is met
👉 Example (IT-based):
- Monitor network traffic on an EC2 instance
- If traffic suddenly spikes beyond a limit → trigger alert
🧠 How CloudWatch Alarms Work
Step-by-step process:
1. Metric Collection
- AWS services send metrics (e.g., CPU utilization, network packets, errors)
2. Set Threshold
- You define a rule like:
- “Trigger if CPU > 80%”
- “Trigger if dropped network packets > 1000”
3. Evaluation
- CloudWatch evaluates the metric once per period (e.g., every 1 minute)
4. Alarm State Change
- A CloudWatch alarm has 3 states:
| State | Meaning |
|---|---|
| OK | Metric is within the threshold |
| ALARM | Threshold breached (problem detected) |
| INSUFFICIENT_DATA | Not enough data to decide (e.g., the alarm was just created or the metric is not reporting) |
5. Action Triggered
- A notification or automated response runs when the alarm changes state
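The steps above can be sketched as a tiny simulation. This is a simplified model, not how CloudWatch is implemented: it assumes one datapoint per period, an evaluation window of a single period, and treats a missing datapoint (`None`) as INSUFFICIENT_DATA. The function name `run_alarm` is made up for illustration.

```python
from typing import List, Optional, Sequence, Tuple

OK = "OK"
ALARM = "ALARM"
INSUFFICIENT_DATA = "INSUFFICIENT_DATA"


def run_alarm(
    stream: Sequence[Optional[float]], threshold: float
) -> List[Tuple[int, str, str]]:
    """Return (period_index, old_state, new_state) for every state change.

    Simplified model: each period yields one datapoint (or None when the
    metric did not report), and a single datapoint decides the state.
    """
    state = INSUFFICIENT_DATA  # a freshly created alarm has no data yet
    transitions = []
    for index, value in enumerate(stream):
        if value is None:
            new_state = INSUFFICIENT_DATA
        elif value > threshold:
            new_state = ALARM
        else:
            new_state = OK
        # Actions are tied to state changes, not to every breaching datapoint.
        if new_state != state:
            transitions.append((index, state, new_state))
            state = new_state
    return transitions


if __name__ == "__main__":
    cpu_percent = [None, 40, 55, 91, 95, 60]
    for change in run_alarm(cpu_percent, threshold=80):
        print(change)
```

Note that the two breaching datapoints in a row (91, 95) produce only one transition into ALARM, so an attached action would run once, not twice.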
📊 Types of Metrics Used in Alerts
1. AWS Service Metrics
Automatically provided:
- EC2 (CPU, network traffic)
- ELB (request count, latency)
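- VPC networking components such as NAT gateways and Transit Gateway (bytes, packets, drops). VPC Flow Logs are logs, not metrics, so they need a metric filter before an alarm can use them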
2. Custom Metrics
- You can send your own application/network data
- Example:
- Number of failed login attempts
- Custom firewall logs
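As a rough sketch, a custom metric for failed logins could be published with a payload like the one below. The namespace `Custom/Security`, the metric name, and the `Source` dimension are illustrative assumptions, not AWS-defined names. The function only builds the arguments; sending them requires boto3 and AWS credentials.

```python
from datetime import datetime, timezone


def failed_login_metric(count: int, source: str) -> dict:
    """Build keyword arguments for CloudWatch PutMetricData.

    Namespace, metric name, and dimension are hypothetical examples.
    """
    return {
        "Namespace": "Custom/Security",
        "MetricData": [
            {
                "MetricName": "FailedLoginAttempts",
                "Dimensions": [{"Name": "Source", "Value": source}],
                "Timestamp": datetime.now(timezone.utc),
                "Value": float(count),
                "Unit": "Count",
            }
        ],
    }


if __name__ == "__main__":
    payload = failed_login_metric(count=7, source="vpn-gateway")
    print(payload["MetricData"][0]["MetricName"], payload["MetricData"][0]["Value"])
    # With boto3 installed and credentials configured, the payload could be sent as:
    #   boto3.client("cloudwatch").put_metric_data(**payload)
```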
🔔 Alarm Actions (What Happens When Triggered)
When an alarm enters ALARM state, it can trigger:
1. Notifications (Most Important for Exam)
Using:
- Amazon SNS
SNS sends:
- Email
- SMS
- HTTP/HTTPS notifications
👉 Example:
- Network anomaly detected → send email to security team
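A minimal sketch of an alarm that notifies an SNS topic might look like this. The alarm name, threshold, and topic ARN are placeholders chosen for illustration; the parameter names follow the CloudWatch PutMetricAlarm API. The period is 300 seconds on the assumption that the instance uses basic (5-minute) monitoring.

```python
def network_spike_alarm(instance_id: str, topic_arn: str, threshold_bytes: float) -> dict:
    """Build keyword arguments for CloudWatch PutMetricAlarm.

    The alarm watches inbound bytes on one EC2 instance and notifies
    the given SNS topic when it enters the ALARM state.
    """
    return {
        "AlarmName": f"network-in-spike-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "NetworkIn",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 3,
        "DatapointsToAlarm": 3,
        "Threshold": float(threshold_bytes),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic that fans out to email/SMS/HTTPS
    }


if __name__ == "__main__":
    params = network_spike_alarm(
        instance_id="i-0123456789abcdef0",  # placeholder instance ID
        topic_arn="arn:aws:sns:us-east-1:123456789012:security-alerts",  # placeholder ARN
        threshold_bytes=5_000_000_000,
    )
    print(params["AlarmName"])
    # With boto3 and credentials: boto3.client("cloudwatch").put_metric_alarm(**params)
```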
2. Automated Actions
CloudWatch can trigger:
- EC2 actions (stop, terminate, reboot, recover)
- Auto Scaling actions
- Lambda functions
Using:
- AWS Lambda
👉 Example:
- High traffic detected → automatically scale out servers
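One common pattern is to subscribe a Lambda function to the alarm's SNS topic. The handler below is a sketch under that assumption: it parses the alarm message that SNS delivers and returns a decision instead of calling any AWS API, so the actual remediation step is left as a placeholder. The action labels `scale_out` and `none` are invented for the example.

```python
import json


def handler(event, context):
    """Lambda entry point for SNS-delivered CloudWatch alarm notifications.

    Returns a list of decisions; a real function would call the relevant
    AWS API (for example Auto Scaling) where the placeholder comment is.
    """
    decisions = []
    for record in event["Records"]:
        # SNS wraps the alarm details as a JSON string in the Message field.
        message = json.loads(record["Sns"]["Message"])
        if message["NewStateValue"] == "ALARM":
            action = "scale_out"  # placeholder: trigger remediation here
        else:
            action = "none"
        decisions.append({"alarm": message["AlarmName"], "action": action})
    return decisions


if __name__ == "__main__":
    sample_message = {
        "AlarmName": "high-traffic",
        "NewStateValue": "ALARM",
        "OldStateValue": "OK",
        "NewStateReason": "Threshold Crossed",
    }
    sample_event = {"Records": [{"Sns": {"Message": json.dumps(sample_message)}}]}
    print(handler(sample_event, None))
```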
🧩 Types of CloudWatch Alarms
1. Simple Alarm
- Based on one metric
- Example:
- CPU > 80%
2. Composite Alarm (Important for Exam)
- Combines multiple alarms
- Triggers only when a rule expression over the states of the underlying alarms (AND, OR, NOT) evaluates to true
👉 Example:
- Alarm triggers only if:
- CPU high AND network traffic high
This reduces false positives
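The "CPU high AND network traffic high" example could be expressed as a composite alarm rule roughly like this. The child alarm names and the topic ARN are hypothetical; the `ALARM("name")` syntax and the `AlarmRule` parameter come from the PutCompositeAlarm API.

```python
def all_in_alarm(*alarm_names: str) -> str:
    """Build a rule that is true only when every named alarm is in ALARM."""
    return " AND ".join(f'ALARM("{name}")' for name in alarm_names)


def composite_alarm(name: str, rule: str, topic_arn: str) -> dict:
    """Build keyword arguments for CloudWatch PutCompositeAlarm."""
    return {
        "AlarmName": name,
        "AlarmRule": rule,
        "AlarmActions": [topic_arn],
    }


if __name__ == "__main__":
    rule = all_in_alarm("cpu-high", "network-high")  # hypothetical child alarms
    params = composite_alarm(
        "cpu-and-network-high",
        rule,
        "arn:aws:sns:us-east-1:123456789012:security-alerts",  # placeholder ARN
    )
    print(params["AlarmRule"])
    # With boto3 and credentials: boto3.client("cloudwatch").put_composite_alarm(**params)
```

Only the composite alarm needs a notification action here; the child alarms can stay silent, which is where the noise reduction comes from.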
⏱️ Key Alarm Configuration Settings
1. Period
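- Length of time over which the metric is aggregated into one datapoint (e.g., 1 minute)
2. Evaluation Periods
- Number of most recent periods (datapoints) considered when deciding the alarm state
👉 Example:
- 3 evaluation periods:
- By default, CPU must exceed the threshold for 3 consecutive periods
3. Datapoints to Alarm
- Number of breaching datapoints within the evaluation periods needed to enter ALARM ("M out of N")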
👉 Helps avoid triggering on temporary spikes
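To see why "M out of N" filters out short spikes, here is a small sketch of the check. It is a simplification that ignores missing datapoints, and the function name is made up for this example.

```python
from typing import Sequence


def breaches_m_of_n(
    datapoints: Sequence[float],
    threshold: float,
    evaluation_periods: int,
    datapoints_to_alarm: int,
) -> bool:
    """Return True when at least M of the last N datapoints exceed the threshold.

    N is evaluation_periods and M is datapoints_to_alarm.
    """
    window = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in window if value > threshold)
    return breaching >= datapoints_to_alarm


if __name__ == "__main__":
    single_spike = [50, 95, 60, 55]
    sustained = [50, 95, 60, 90]
    # 2 out of 3: one spike is ignored, two breaches in the window trigger.
    print(breaches_m_of_n(single_spike, 80, evaluation_periods=3, datapoints_to_alarm=2))
    print(breaches_m_of_n(sustained, 80, evaluation_periods=3, datapoints_to_alarm=2))
```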
4. Threshold Types
Static Threshold
- Fixed value (e.g., CPU > 80%)
Dynamic Threshold (Advanced)
Using:
- CloudWatch Anomaly Detection
👉 Uses ML to build a band of expected values from the metric's history and alarms when the metric moves outside that band
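A sketch of an anomaly detection alarm definition is shown below. It assumes a NAT gateway's outbound bytes as the monitored metric; the alarm name, IDs, and band width of 2 are illustrative choices. Instead of a fixed `Threshold`, the alarm points `ThresholdMetricId` at an `ANOMALY_DETECTION_BAND` expression.

```python
def anomaly_alarm(nat_gateway_id: str, band_width: int = 2) -> dict:
    """Build PutMetricAlarm arguments that use an anomaly detection band.

    band_width is the number of standard deviations used for the band.
    """
    return {
        "AlarmName": f"nat-egress-anomaly-{nat_gateway_id}",  # hypothetical name
        "ComparisonOperator": "GreaterThanUpperThreshold",
        "EvaluationPeriods": 3,
        "ThresholdMetricId": "band",  # no static Threshold key is set
        "Metrics": [
            {
                "Id": "m1",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/NATGateway",
                        "MetricName": "BytesOutToDestination",
                        "Dimensions": [
                            {"Name": "NatGatewayId", "Value": nat_gateway_id}
                        ],
                    },
                    "Period": 300,
                    "Stat": "Sum",
                },
            },
            {
                "Id": "band",
                "Expression": f"ANOMALY_DETECTION_BAND(m1, {band_width})",
                "Label": "Expected range",
                "ReturnData": True,
            },
        ],
    }


if __name__ == "__main__":
    params = anomaly_alarm("nat-0123456789abcdef0")  # placeholder gateway ID
    print(params["Metrics"][1]["Expression"])
    # With boto3 and credentials: boto3.client("cloudwatch").put_metric_alarm(**params)
```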
🌐 Network-Specific Alerting (VERY IMPORTANT FOR THIS EXAM)
For networking, alarms are used with:
1. VPC Flow Logs
- Monitor traffic patterns
- Detect:
- Unauthorized access attempts
- Unusual traffic spikes
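Flow logs are log records, so an alarm needs a metric derived from them first. Assuming the flow logs are delivered to CloudWatch Logs in the default format, a metric filter that counts rejected connections could be defined roughly as follows. The filter name, metric name, and namespace are made-up examples.

```python
# Space-delimited pattern for the default flow log format: every field is
# named positionally and only the action field is constrained.
REJECT_PATTERN = (
    "[version, account, eni, source, destination, srcport, destport, "
    'protocol, packets, bytes, windowstart, windowend, action="REJECT", '
    "flowlogstatus]"
)


def rejected_traffic_filter(log_group: str) -> dict:
    """Build keyword arguments for CloudWatch Logs PutMetricFilter."""
    return {
        "logGroupName": log_group,
        "filterName": "rejected-connections",  # hypothetical name
        "filterPattern": REJECT_PATTERN,
        "metricTransformations": [
            {
                "metricName": "RejectedConnections",  # hypothetical name
                "metricNamespace": "Custom/VPCFlowLogs",  # hypothetical namespace
                "metricValue": "1",  # add 1 per matching record
                "defaultValue": 0,  # report 0 when nothing matches
            }
        ],
    }


if __name__ == "__main__":
    params = rejected_traffic_filter("/vpc/flow-logs")  # placeholder log group
    print(params["filterPattern"])
    # With boto3 and credentials: boto3.client("logs").put_metric_filter(**params)
```

A regular CloudWatch alarm can then watch `RejectedConnections` like any other metric.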
2. Load Balancers
- Monitor:
- Request count
- Latency
- Error rates
3. NAT Gateway
- Monitor:
- Packet drops
- Connection errors
4. Transit Gateway
- Monitor:
- Data transfer anomalies
- Routing issues (e.g., packets dropped because of a blackhole route or no matching route)
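The network metrics above map to concrete CloudWatch namespaces and metric names. The sketch below collects a few of them in a lookup table and builds an "alert on any occurrence" alarm definition. The dictionary keys and alarm naming are invented for illustration, and treating missing data as not breaching is an assumption that suits counters that may not report while they are zero.

```python
# key: (namespace, metric name, dimension name)
NETWORK_METRICS = {
    "nat_port_exhaustion": ("AWS/NATGateway", "ErrorPortAllocation", "NatGatewayId"),
    "nat_packet_drops": ("AWS/NATGateway", "PacketsDropCount", "NatGatewayId"),
    "tgw_no_route_drops": ("AWS/TransitGateway", "PacketDropCountNoRoute", "TransitGateway"),
    "tgw_blackhole_drops": ("AWS/TransitGateway", "PacketDropCountBlackhole", "TransitGateway"),
    "alb_5xx": ("AWS/ApplicationELB", "HTTPCode_ELB_5XX_Count", "LoadBalancer"),
}


def any_occurrence_alarm(kind: str, resource_id: str, topic_arn: str) -> dict:
    """Build PutMetricAlarm arguments that fire when the counter is above zero."""
    namespace, metric_name, dimension = NETWORK_METRICS[kind]
    return {
        "AlarmName": f"{kind}-{resource_id}",  # hypothetical naming scheme
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [{"Name": dimension, "Value": resource_id}],
        "Statistic": "Sum",
        "Period": 60,
        "EvaluationPeriods": 5,
        "DatapointsToAlarm": 3,
        "Threshold": 0.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # assumption: no data means no errors
        "AlarmActions": [topic_arn],
    }


if __name__ == "__main__":
    params = any_occurrence_alarm(
        "tgw_blackhole_drops",
        "tgw-0123456789abcdef0",  # placeholder transit gateway ID
        "arn:aws:sns:us-east-1:123456789012:network-alerts",  # placeholder ARN
    )
    print(params["Namespace"], params["MetricName"])
```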
🔐 Security Monitoring with Alerts
Alerts help detect:
- Port scanning attempts
- DDoS indicators
- Unusual outbound traffic
- Sudden increase in denied traffic
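As an illustration of what "port scanning" looks like in the data, the heuristic below reads flow log records in the default 14-field format and flags sources whose rejected traffic touches many distinct destination ports. The threshold of 20 ports is an arbitrary assumption, and this is a teaching sketch, not a replacement for GuardDuty.

```python
from collections import defaultdict
from typing import Dict, Iterable

# Positions in the default flow log format:
# version account-id interface-id srcaddr dstaddr srcport dstport
# protocol packets bytes start end action log-status
SRCADDR, DSTPORT, ACTION = 3, 6, 12


def suspected_scanners(lines: Iterable[str], min_ports: int = 20) -> Dict[str, int]:
    """Map source address -> number of distinct rejected destination ports."""
    ports_by_source = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) != 14 or fields[ACTION] != "REJECT":
            continue  # skip malformed, NODATA/SKIPDATA, and accepted records
        ports_by_source[fields[SRCADDR]].add(fields[DSTPORT])
    return {
        source: len(ports)
        for source, ports in ports_by_source.items()
        if len(ports) >= min_ports
    }


if __name__ == "__main__":
    template = (
        "2 123456789012 eni-0abc {src} 10.0.0.10 44321 {port} 6 1 40 "
        "1700000000 1700000060 {action} OK"
    )
    records = [
        template.format(src="203.0.113.5", port=port, action="REJECT")
        for port in range(1, 26)
    ]
    records.append(template.format(src="198.51.100.7", port=443, action="ACCEPT"))
    print(suspected_scanners(records))
```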
👉 Alerts integrate with:
- AWS CloudTrail (API activity)
- Amazon GuardDuty (threat detection)
🔄 Integration with Event-Based Alerting
CloudWatch works with:
- Amazon EventBridge
👉 EventBridge allows:
- Real-time event-based alerts (not just metrics)
Example:
- IAM policy change detected → trigger alert immediately
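The IAM example could be written as an EventBridge event pattern along these lines, assuming CloudTrail is recording management events. The list of API names is a sample, not a complete set of policy-changing calls. The `matches` helper is a deliberately simplified local matcher (exact values and nested objects only) so the pattern can be tried without AWS; it does not implement the full EventBridge matching rules.

```python
import json

IAM_POLICY_CHANGE_PATTERN = {
    "source": ["aws.iam"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["iam.amazonaws.com"],
        "eventName": [
            "PutRolePolicy",
            "AttachRolePolicy",
            "DetachRolePolicy",
            "DeleteRolePolicy",
            "CreatePolicyVersion",
        ],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must exist in the event and the
    event value must be one of the listed values (or match a nested pattern)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


if __name__ == "__main__":
    sample_event = {
        "source": "aws.iam",
        "detail-type": "AWS API Call via CloudTrail",
        "detail": {"eventSource": "iam.amazonaws.com", "eventName": "AttachRolePolicy"},
    }
    print(matches(IAM_POLICY_CHANGE_PATTERN, sample_event))
    # With boto3 and credentials, the pattern could back a rule such as:
    #   boto3.client("events").put_rule(
    #       Name="iam-policy-change",  # hypothetical rule name
    #       EventPattern=json.dumps(IAM_POLICY_CHANGE_PATTERN),
    #   )
```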
🧪 Best Practices (EXAM CRITICAL)
✅ 1. Avoid False Alarms
- Use:
- Multiple evaluation periods
- Composite alarms
✅ 2. Use Meaningful Thresholds
- Don’t set thresholds too low (alert noise) or too high (missed incidents)
✅ 3. Enable Notifications for Critical Systems
- Always connect alarms to SNS
✅ 4. Monitor Network-Specific Metrics
- Focus on:
- Packet drops
- Latency
- Traffic spikes
✅ 5. Use Anomaly Detection for Advanced Monitoring
- Helps detect unusual behavior that a fixed threshold would miss
✅ 6. Automate Responses Where Possible
- Use Lambda for immediate action
⚠️ Common Exam Traps
❌ Confusing logs vs metrics
→ Alarms evaluate metrics, not raw logs; to alarm on log content, first create a metric filter
❌ Ignoring evaluation periods
→ An alarm changes state only after the configured number of datapoints breach the threshold, so it is not instant
❌ Not using SNS
→ An alarm with no action changes state silently; attach an SNS topic so someone is notified
❌ Forgetting composite alarms
→ Important for reducing alert noise
🧠 Quick Summary (Exam Revision)
- CloudWatch alarms monitor metrics
- Trigger when thresholds are crossed
- States: OK, ALARM, INSUFFICIENT_DATA
- Use SNS for notifications
- Use Lambda for automation
- Use composite alarms to reduce false alerts
- Critical for network monitoring and security validation
