AWS Certified Cloud Practitioner
Post 22 of 25
AWS Cloud Practitioner #22: CloudWatch - Monitoring and Logs
Master CloudWatch: metrics, alarms, logs, dashboards, and how to monitor AWS resources effectively.
🎯 What You'll Learn Today
- Explain CloudWatch metrics and alarms
- Configure logging with CloudWatch Logs
- Create dashboards for visualization
- Set up effective alerts
- Differentiate CloudWatch vs. CloudTrail
Amazon CloudWatch
What is it? A monitoring service for AWS resources and applications.
CloudWatch collects and tracks:
✅ Metrics (CPU, network; memory via the CloudWatch agent)
✅ Logs (application, system)
✅ Events (resource changes)
✅ Alarms (thresholds)
Use cases:
- Monitor EC2 CPU utilization
- Track application errors
- Alert if database connections spike
- Dashboard with system health
CloudWatch Metrics
What are they? Time-series data points (measurements).
Default Metrics (Free)
EC2:
- CPUUtilization
- NetworkIn/NetworkOut
- DiskReadOps/DiskWriteOps
- StatusCheckFailed
RDS:
- CPUUtilization
- DatabaseConnections
- FreeStorageSpace
- ReadLatency/WriteLatency
ELB:
- RequestCount
- TargetResponseTime
- HealthyHostCount
S3:
- BucketSizeBytes
- NumberOfObjects
Lambda:
- Invocations
- Duration
- Errors
- Throttles
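Default frequency: 5 minutes for EC2 basic monitoring (other services publish at their own intervals; S3 storage metrics, for example, are reported once a day)
Custom Metrics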
Application-specific metrics:
Examples:
- Active users count
- Orders per minute
- Payment processing time
- API response times
Push to CloudWatch:
aws cloudwatch put-metric-data \
  --namespace "MyApp" \
  --metric-name "ActiveUsers" \
  --value 150 \
  --timestamp 2025-11-16T10:00:00Z
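The same push can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes `client` is a `boto3.client("cloudwatch")` object, and `build_datum`, `publish`, and `RecordingClient` are hypothetical helper names introduced only for illustration. The stand-in client lets the example run without AWS credentials.

```python
from datetime import datetime, timezone


def build_datum(name, value, unit="Count", when=None, high_resolution=False):
    """Build one entry for the MetricData list of PutMetricData."""
    return {
        "MetricName": name,
        "Value": float(value),
        "Unit": unit,
        "Timestamp": when or datetime.now(timezone.utc),
        # 1 = high-resolution (1-second) metric, 60 = standard resolution
        "StorageResolution": 1 if high_resolution else 60,
    }


def publish(client, namespace, datum):
    """Send the datum; `client` is assumed to be boto3.client("cloudwatch")."""
    return client.put_metric_data(Namespace=namespace, MetricData=[datum])


class RecordingClient:
    """Hypothetical stand-in so the sketch runs without AWS credentials."""

    def __init__(self):
        self.calls = []

    def put_metric_data(self, **kwargs):
        self.calls.append(kwargs)
        return {}


client = RecordingClient()  # swap for boto3.client("cloudwatch") in real use
when = datetime(2025, 11, 16, 10, 0, tzinfo=timezone.utc)
publish(client, "MyApp", build_datum("ActiveUsers", 150, when=when))
print(client.calls[0]["MetricData"][0]["MetricName"])
```

Keeping the payload construction separate from the API call makes the metric definition easy to check locally before anything is sent to AWS.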
Frequency: Up to 1-second resolution
Cost: $0.30 per metric/month
CloudWatch Alarms
What are they? Automated actions based on metric thresholds.
Alarm states:
- OK: Within threshold
- ALARM: Threshold exceeded
- INSUFFICIENT_DATA: Not enough data
Actions when ALARM:
- SNS notification (email/SMS)
- Auto Scaling action (add instances)
- EC2 action (stop, terminate, reboot, recover)
- Systems Manager action
Creating an Alarm
# CPU > 80% alarm
aws cloudwatch put-metric-alarm \
  --alarm-name cpu-mon \
  --alarm-description "CPU exceeds 80%" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-team
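How `--period`, `--evaluation-periods`, and `--threshold` combine can be illustrated with a small simulation. This is a simplified model written for this post, not CloudWatch's actual implementation: the function name `alarm_state` is hypothetical, and real alarms also have configurable handling of missing data that is ignored here.

```python
def alarm_state(datapoints, threshold=80.0, evaluation_periods=2):
    """Return the alarm state for a list of per-period averages (oldest first).

    Simplified model: every one of the most recent `evaluation_periods`
    datapoints must be above the threshold (GreaterThanThreshold).
    """
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


# Each value is the average CPU over one 300-second period.
print(alarm_state([45.0]))              # INSUFFICIENT_DATA
print(alarm_state([45.0, 91.0]))        # OK: only one period breached
print(alarm_state([45.0, 91.0, 88.5]))  # ALARM: two consecutive breaches
print(alarm_state([91.0, 88.5, 80.0]))  # OK: 80.0 is not greater than 80
```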
# Evaluation:
Period: 5 minutes
Evaluation periods: 2
Trigger: Average CPU > 80% for 2 consecutive 5-minute periods (10 minutes total)
Action: Send SNS notification
Common Alarms
1. High CPU:
EC2 CPUUtilization > 80%
Action: Scale out (add instances)
2. Low Healthy Hosts:
ELB HealthyHostCount < 2
Action: Alert ops team
3. High Error Rate:
Lambda Errors > 10 per minute
Action: Page on-call engineer
4. Billing:
EstimatedCharges > $1000 (billing metrics are published in us-east-1 only)
Action: Email finance team
5. RDS Storage:
FreeStorageSpace < 10 GB (the metric itself is reported in bytes)
Action: Increase storage / alert DBA
CloudWatch Logs
What is it? Centralized log management.
Log sources:
- EC2 instances (application logs)
- Lambda functions (console.log)
- RDS (error logs, slow query logs)
- CloudTrail (API logs)
- VPC Flow Logs (network traffic)
- Route 53 (DNS queries)
Benefits:
✅ Centralized (all logs in one place)
✅ Searchable
✅ Retention policies
✅ Metric filters (log → metrics → alarms)
Log Groups and Streams
Hierarchy:
Log Group: /aws/lambda/my-function
├── Log Stream: 2025/11/16/[$LATEST]abc123
│   └── Events:
│       - "START RequestId: xyz"
│       - "Processing order #1234"
│       - "END RequestId: xyz"
│
└── Log Stream: 2025/11/16/[$LATEST]def456
    └── Events: ...
Log Group: Collection of related streams
Log Stream: Sequence of log events from source
Log Event: Single log entry
Sending Logs
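# Lambda automatically sends logs
import logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # the root logger defaults to WARNING, which would drop info() calls

def lambda_handler(event, context):
    logger.info(f"Processing order {event['orderId']}")
    # Output goes to CloudWatch Logs automatically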
# EC2: Install CloudWatch Agent
sudo yum install -y amazon-cloudwatch-agent
# Configure:
{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [{
          "file_path": "/var/log/app.log",
          "log_group_name": "/aws/ec2/my-app",
          "log_stream_name": "{instance_id}"
        }]
      }
    }
  }
}
Metric Filters
Create metric from logs:
Example: Count ERROR logs
Log entry:
"[ERROR] Database connection failed"
Metric filter:
Pattern: "[ERROR]" (quoted so the brackets are matched literally)
Metric: AppErrors
Namespace: MyApp
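To make the idea concrete, here is a local simulation of what such a filter does: it emits a count of 1 for every log event containing the term `[ERROR]`, aggregated per minute. This is a sketch under stated assumptions, not the CloudWatch implementation. The function name `count_matches_per_minute` and the sample events are hypothetical, and a plain substring test stands in for the real filter pattern syntax.

```python
from collections import Counter
from datetime import datetime


def count_matches_per_minute(events, term="[ERROR]"):
    """Count log events containing `term`, bucketed by minute.

    `events` is a list of (timestamp, message) tuples. This mimics a
    metric filter that emits the value 1 for every matching event.
    """
    counts = Counter()
    for timestamp, message in events:
        if term in message:
            counts[timestamp.replace(second=0, microsecond=0)] += 1
    return counts


events = [
    (datetime(2025, 11, 16, 10, 0, 5), "[ERROR] Database connection failed"),
    (datetime(2025, 11, 16, 10, 0, 40), "[INFO] Retrying"),
    (datetime(2025, 11, 16, 10, 0, 59), "[ERROR] Database connection failed"),
    (datetime(2025, 11, 16, 10, 1, 2), "[ERROR] Timeout"),
]
app_errors = count_matches_per_minute(events)
for minute, count in sorted(app_errors.items()):
    print(minute.isoformat(), count)
```

Each per-minute count corresponds to one datapoint of the AppErrors metric, which is what an alarm would then compare against its threshold.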
Result:
CloudWatch metric AppErrors increments on each ERROR
→ Create alarm: AppErrors > 10/min
→ Send notification
CloudWatch Dashboards
What are they? Visual interfaces for metrics.
Dashboard components:
- Line graphs (CPU over time)
- Number widgets (current value)
- Stacked area (multiple metrics)
- Logs widget (query results)
Example dashboard:
┌─────────────────────────────────┐
│ EC2 CPU Utilization (24h) │
│ [Line graph] │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ Active Users: 1,234 │
└─────────────────────────────────┘
┌─────────────────────────────────┐
│ Error Rate (last hour) │
│ [Stacked area] │
└─────────────────────────────────┘
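A dashboard like the one above is defined by a JSON body containing a list of widgets. The following is a rough sketch of how that body could be assembled in Python. The helper name `metric_widget`, the dashboard name `my-app-overview`, the region, and the custom metric names `ActiveUsers` and `AppErrors` in the `MyApp` namespace are assumptions for illustration. The final upload via boto3's `put_dashboard` is shown only as a comment, so the sketch runs without AWS credentials.

```python
import json


def metric_widget(title, metrics, y, view="timeSeries", stacked=False,
                  period=300, stat="Average"):
    """Build one metric widget for a dashboard body."""
    properties = {
        "title": title,
        "metrics": metrics,  # each entry: [namespace, metric name, dim name, dim value, ...]
        "view": view,
        "period": period,
        "stat": stat,
        "region": "us-east-1",  # assumed region
    }
    if stacked:
        properties["stacked"] = True
    return {"type": "metric", "x": 0, "y": y, "width": 24, "height": 6,
            "properties": properties}


widgets = [
    metric_widget("EC2 CPU Utilization (24h)",
                  [["AWS/EC2", "CPUUtilization", "InstanceId", "i-1234567890abcdef0"]],
                  y=0),
    metric_widget("Active Users", [["MyApp", "ActiveUsers"]], y=6, view="singleValue"),
    metric_widget("Error Rate (last hour)", [["MyApp", "AppErrors"]], y=12,
                  stacked=True, period=60, stat="Sum"),
]
dashboard_body = json.dumps({"widgets": widgets})

# With boto3 this would be sent as:
# boto3.client("cloudwatch").put_dashboard(
#     DashboardName="my-app-overview", DashboardBody=dashboard_body)
print(len(json.loads(dashboard_body)["widgets"]))
```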
Use cases:
- NOC (Network Operations Center)
- Daily standup meetings
- Real-time monitoring
CloudWatch vs. CloudTrail
| Aspect | CloudWatch | CloudTrail |
|---|---|---|
| Purpose | Monitoring | Auditing |
| Data | Performance metrics | API calls |
| Question | How is it performing? | Who did what? |
| Use case | CPU high, errors | User deleted DB |
| Alerts | Performance thresholds | Security events |
Example distinction:
CloudWatch:
"EC2 instance CPU at 95%"
"Lambda errors spiking"
"RDS connections increasing"
CloudTrail:
"User juan@empresa.com terminated instance i-123"
"IAM policy modified at 10:30 AM"
"S3 bucket made public"
Often used together:
CloudTrail logs → CloudWatch Logs → Metric filter → Alarm
Example: Alert if the root account is used
Pricing
Metrics:
- Default metrics: FREE
- Custom metrics: $0.30/metric/month
- Detailed monitoring (1-min): $2.10/instance/month
Alarms:
- Standard: $0.10/alarm/month
- High-resolution: $0.30/alarm/month
Logs:
- Ingestion: $0.50/GB
- Storage: $0.03/GB/month
- Data scan (Insights): $0.005/GB
Dashboards:
- 3 dashboards FREE
- $3/dashboard/month after that
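As a worked example, the list prices above can be combined into a rough monthly estimate. This is a back-of-the-envelope sketch only: the function name `monthly_cost` and the usage figures are hypothetical, the prices are the ones quoted in this post (actual prices vary by region and over time), and the only free allowance it accounts for is the 3 free dashboards.

```python
# Prices as listed above (USD); actual prices vary by region and change over time.
PRICES = {
    "custom_metric": 0.30,     # per metric per month
    "standard_alarm": 0.10,    # per alarm per month
    "log_ingestion_gb": 0.50,  # per GB ingested
    "log_storage_gb": 0.03,    # per GB-month stored
    "dashboard": 3.00,         # per dashboard per month beyond the free ones
}
FREE_DASHBOARDS = 3


def monthly_cost(custom_metrics, standard_alarms, ingested_gb, stored_gb, dashboards):
    """Estimate a monthly bill from the list prices.

    Only the free dashboards are deducted; other free allowances are ignored.
    """
    billable_dashboards = max(0, dashboards - FREE_DASHBOARDS)
    total = (
        custom_metrics * PRICES["custom_metric"]
        + standard_alarms * PRICES["standard_alarm"]
        + ingested_gb * PRICES["log_ingestion_gb"]
        + stored_gb * PRICES["log_storage_gb"]
        + billable_dashboards * PRICES["dashboard"]
    )
    return round(total, 2)


estimate = monthly_cost(custom_metrics=20, standard_alarms=15,
                        ingested_gb=50, stored_gb=50, dashboards=5)
print(f"${estimate:.2f}")  # $40.00
```

In this hypothetical workload, log ingestion ($25.00) accounts for more than half of the total, which is why retention and log volume tend to matter more than the number of alarms.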
Free Tier:
- 10 custom metrics
- 10 alarms
- 5 GB logs ingestion
- 5 GB logs storage
Best Practices
1. Set meaningful alarms:
✅ Critical: Page on-call
✅ Warning: Email team
❌ Don't over-alert (alarm fatigue)
2. Use dashboards:
Create for each team/service
Share URL for visibility
3. Log retention:
Balance cost vs. compliance
30 days for most, 90+ for compliance
4. Metric filters:
Convert logs to metrics
Track business KPIs
5. Use namespaces:
Organize custom metrics
MyApp/Production, MyApp/Dev
6. Tag resources:
Easy filtering in dashboards
Cost allocation
7. Use CloudWatch Logs Insights:
Query logs at scale
Faster than grep
📝 Exam Preparation
Key Points
CloudWatch:
- 📌 Monitoring: Metrics, logs, alarms
- 📌 Default metrics: CPU, network, disk (5-min)
- 📌 Custom metrics: Application-specific
- 📌 Alarms: Automated actions on thresholds
Components:
- 📌 Metrics: Time-series data
- 📌 Logs: Centralized logging
- 📌 Alarms: Threshold-based actions
- 📌 Dashboards: Visualization
vs. CloudTrail:
- 📌 CloudWatch: Performance (HOW)
- 📌 CloudTrail: Auditing (WHO/WHAT)
Practice Questions
Question 1:
What is the default frequency of CloudWatch metrics for EC2?
A) 1 minute B) 5 minutes C) 15 minutes D) 1 hour
Answer: B) 5 minutes
Default CloudWatch metrics for EC2 are published every 5 minutes (free). Detailed monitoring (1-min) costs extra.
Question 2:
Which service monitors resource performance?
A) CloudTrail B) CloudWatch C) Config D) Inspector
Answer: B) CloudWatch
CloudWatch monitors performance (CPU, memory, errors). CloudTrail is for auditing (API calls).
🎓 Summary
- CloudWatch: Monitoring service (metrics, logs, alarms)
- Metrics: Performance measurements (CPU, network)
- Alarms: Automated actions on thresholds
- Logs: Centralized logging
- Dashboards: Visualization
- vs. CloudTrail: Performance vs. Auditing
⏭️ Next Post
Post #23: AWS Organizations - Multi-Account Management
Tags: #AWS #CloudPractitioner #CloudWatch #Monitoring #Logs #Alarms #Metrics #Certification