Create a CloudWatch CPU usage alert

Introduction

In this project, we create a CPU usage alert using Amazon CloudWatch. CloudWatch is a monitoring and observability service by AWS that provides data and actionable insights to monitor applications, understand and respond to system-wide performance changes, optimize resource utilization, and get a unified view of operational health.

Why is it useful in the real world?

  • Proactive Monitoring: An alert can notify us when an EC2 instance’s CPU usage goes above a threshold, helping us respond quickly to potential performance bottlenecks.
  • Cost-Effective: CloudWatch offers a free tier for metrics and alarms, making it a practical choice for small or experimental projects.
  • Automatic Scaling and Remediation: Alerts can trigger automatic scaling or other remedial actions, ensuring high availability and reliability.

Prerequisites

Required Tools & Accounts

AWS Account: A Free Tier account is sufficient (no credits required).

AWS CLI: Installed and configured on your local machine (with valid credentials).

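IAM Permissions: Administrator-level access (for example, the AdministratorAccess managed policy), or at minimum the following actions:
  • cloudwatch:PutMetricAlarm
  • cloudwatch:DescribeAlarms (to verify the alarm from the CLI)
  • ec2:DescribeInstances
  • sns:CreateTopic (if you plan to set up email/SMS notifications)
  • sns:Subscribe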

Enabled AWS Services

  • CloudWatch service (enabled by default in an AWS account).
  • Amazon SNS (Simple Notification Service) if you want to get notified via email or SMS.

Step-by-Step Implementation

The steps below show how to do each part through both the AWS Console (GUI) and the AWS CLI (terminal). We will create an alarm for CPU usage on a specific EC2 instance.


Step 1: Identify the EC2 Instance

Console (GUI) Approach
Go to the AWS Management Console and navigate to EC2.
Under Instances, locate the EC2 instance you want to monitor.
Take note of the Instance ID (e.g., i-1234567890abcdef0).

CLI Approach
Open your terminal and run:

aws ec2 describe-instances --query "Reservations[*].Instances[*].{InstanceID:InstanceId,State:State.Name}" --output table
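For the later steps it can help to have the instance ID available in a shell variable. The following is a minimal sketch, assuming you only care about instances in the running state; the helper name running_instance_ids is made up for illustration and is not part of the AWS CLI.

```shell
# List the IDs of all running instances, one per line.
# (running_instance_ids is an illustrative helper name, not an AWS command.)
running_instance_ids() {
  aws ec2 describe-instances \
    --filters "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].InstanceId" \
    --output text | tr '\t' '\n'   # text output is tab-separated
}

# Example: keep the first running instance for the next steps.
# INSTANCE_ID=$(running_instance_ids | head -n 1)
```

The function only wraps the describe-instances call, so it uses whatever region and credentials your AWS CLI profile is configured with.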

Step 2: Create or Verify an SNS Topic (Optional but Recommended)

We recommend setting up an SNS topic for notifications so that we can receive an email or SMS when the alarm triggers.

Console (GUI) Approach
In the AWS Management Console, navigate to Simple Notification Service (SNS).
Click Topics on the left-hand menu, then choose Create topic.
Select Standard type, give it a Name (e.g., MyCPUAlertTopic), and click Create topic.
Click on the newly created topic and choose Create subscription.
Select Protocol (e.g., Email), enter a valid Endpoint (your email address), and click Create subscription.

CLI Approach
Create the SNS topic:

aws sns create-topic --name MyCPUAlertTopic

This returns an ARN (e.g., arn:aws:sns:us-east-1:123456789012:MyCPUAlertTopic).
Subscribe to the topic:
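A minimal sketch of the subscribe call, assuming an email subscription and the topic ARN returned by create-topic. The function name subscribe_email, the example ARN, and the address are placeholders.

```shell
# Subscribe an email address to an SNS topic.
# $1: topic ARN returned by `aws sns create-topic`
# $2: email address that should receive alarm notifications
# (subscribe_email is an illustrative helper name.)
subscribe_email() {
  aws sns subscribe \
    --topic-arn "$1" \
    --protocol email \
    --notification-endpoint "$2"
}

# Example call (placeholder values, replace with your own):
# subscribe_email "arn:aws:sns:us-east-1:123456789012:MyCPUAlertTopic" "you@example.com"
```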

Check your email and confirm the subscription. SNS does not deliver notifications to an address until its subscription is confirmed.

Step 3: Create a CloudWatch Alarm for CPU Usage

Now that we have the instance ID and (optionally) an SNS topic, we can create the alarm.

Console (GUI) Approach
Go to CloudWatch in the AWS console.
From the left-hand menu, select Alarms and then click Create alarm.
Click Select metric. In EC2 Metrics, choose Per-Instance Metrics.
Find and select the metric CPUUtilization for your specific instance ID.
Configure the alarm:
Metric Name: CPUUtilization
Statistic: Average (or Maximum, depending on preference)
Period: 5 minutes (for example)
Threshold: 80 (i.e., 80% CPU usage), for example
In the Actions section, select your SNS topic (e.g., MyCPUAlertTopic) or choose No action if you only want the alarm to appear in the console.
Alarm name: HighCPUAlarm
Click Next, review the settings, and finally click Create alarm.

CLI Approach
Use the put-metric-alarm command. Replace values as needed:
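One possible form of the call is sketched below, assuming the settings used in the console walkthrough: Average statistic, a 300-second period, a single evaluation period, and a threshold of 80 compared with "greater than or equal". The function name create_cpu_alarm and the description text are illustrative; the instance ID and topic ARN are passed in as arguments.

```shell
# Create (or overwrite) the HighCPUAlarm alarm for one EC2 instance.
# $1: EC2 instance ID, e.g. i-1234567890abcdef0
# $2: SNS topic ARN to notify when the alarm fires
# (create_cpu_alarm is an illustrative helper name.)
create_cpu_alarm() {
  aws cloudwatch put-metric-alarm \
    --alarm-name HighCPUAlarm \
    --alarm-description "Average CPU at or above 80% for 5 minutes" \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=$1" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 80 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions "$2"
}

# Example call (placeholder values, replace with your own):
# create_cpu_alarm "i-1234567890abcdef0" "arn:aws:sns:us-east-1:123456789012:MyCPUAlertTopic"
```

Running put-metric-alarm again with the same alarm name updates the existing alarm rather than creating a second one, so the sketch can be re-run while you tune the threshold.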

This command creates an alarm named HighCPUAlarm that triggers when average CPU usage is at or above 80% for a 5-minute period.

Step 4: Verify the Alarm in CloudWatch

Console (GUI) Approach
In the CloudWatch console, go to Alarms.
Find HighCPUAlarm in the list and confirm its state is OK (it may show Insufficient data for the first few minutes).

CLI Approach
Run:
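A sketch of the check, assuming the alarm name from Step 3. The --query expression trims the output to the name and state, and the function name show_alarm_state is illustrative.

```shell
# Print the name and current state of one alarm as a small table.
# $1: alarm name, e.g. HighCPUAlarm
# (show_alarm_state is an illustrative helper name.)
show_alarm_state() {
  aws cloudwatch describe-alarms \
    --alarm-names "$1" \
    --query "MetricAlarms[*].{Name:AlarmName,State:StateValue}" \
    --output table
}

# Example call:
# show_alarm_state HighCPUAlarm
```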

Confirm the alarm is listed and shows StateValue as OK or INSUFFICIENT_DATA (it may briefly show INSUFFICIENT_DATA while it gathers metrics).

Verifying and Testing the Project

To confirm everything works correctly:

Artificially Increase CPU Usage
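Log in to your EC2 instance, then run a CPU-intensive command such as stress --cpu 1 --timeout 300 (if you have the stress utility installed). This simulates high CPU usage for 5 minutes. On an instance with more than one vCPU, set --cpu to the number of vCPUs; a single worker loads only one core, so the instance-wide average may stay below the threshold.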
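If the stress utility is not available, a plain shell loop can produce similar load. This is a rough sketch, assuming a shell whose date command supports the +%s format (GNU and BSD date both do); burn_cpu is a made-up helper name.

```shell
# Keep one core busy for the given number of seconds.
# $1: duration in seconds
# (burn_cpu is an illustrative helper name; relies on `date +%s`.)
burn_cpu() {
  end=$(( $(date +%s) + $1 ))
  while [ "$(date +%s)" -lt "$end" ]; do
    :   # no-op; the loop and the repeated date calls are the load
  done
}

# Example: load one core for 10 minutes in the background.
# burn_cpu 600 &
```

Start one background copy per vCPU to push the instance-wide average up, and consider running for longer than one alarm period so that at least one full period sees sustained load.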

Check Alarm State
Observe the alarm in the CloudWatch console. It should change to ALARM once average usage reaches or exceeds your threshold for the specified period.
If you set up an SNS topic, you should also receive an email (after confirming the subscription) or an SMS notification.
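To exercise only the notification path without generating load, CloudWatch also lets you set an alarm's state by hand. The sketch below assumes your IAM identity has the cloudwatch:SetAlarmState permission, which is not in the prerequisites list; the function name force_alarm_state and the reason text are illustrative.

```shell
# Temporarily force an alarm into a given state to test its actions.
# $1: alarm name, e.g. HighCPUAlarm
# $2: target state: OK, ALARM, or INSUFFICIENT_DATA
# (force_alarm_state is an illustrative helper name.)
force_alarm_state() {
  aws cloudwatch set-alarm-state \
    --alarm-name "$1" \
    --state-value "$2" \
    --state-reason "Manual test of alarm actions"
}

# Example call:
# force_alarm_state HighCPUAlarm ALARM
```

The forced state is temporary: the alarm goes back to its real state at its next evaluation, so this checks the SNS wiring rather than the threshold itself.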

Stop the CPU Stress
End the CPU-intensive process to let usage drop. The alarm should eventually return to OK state.

Common Issues and Troubleshooting

  • Alarm Not Triggering:
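    Check that the period and evaluation periods are set appropriately. With a 5-minute period the alarm evaluates one averaged data point per period, so a short spike can be averaged out, and it can take 5 to 10 minutes for the state to reflect a sustained increase in CPU usage.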
  • SNS Subscription Not Confirmed:
    If notifications do not arrive, ensure the SNS subscription is confirmed (via the link sent to your email).
  • Insufficient IAM Permissions:
    If you receive an error, confirm that your IAM user/role can create CloudWatch alarms and has access to SNS.
  • Instance Doesn’t Show Up Under Metrics:
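    Make sure the instance is running. Basic monitoring, which is on by default, publishes CPUUtilization at 5-minute intervals; enable EC2 detailed monitoring only if you want 1-minute granularity (the free tier includes a limited number of detailed monitoring metrics, but double-check the current CloudWatch pricing).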
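Two of the checks above can be run from the CLI. The helpers below are a sketch with made-up function names; the first lists a topic's subscriptions, and the second enables detailed monitoring, which may incur charges beyond the free tier.

```shell
# Show each subscription on a topic. An unconfirmed subscription
# reports "PendingConfirmation" in place of a subscription ARN.
# $1: topic ARN
# (list_topic_subscriptions is an illustrative helper name.)
list_topic_subscriptions() {
  aws sns list-subscriptions-by-topic \
    --topic-arn "$1" \
    --query "Subscriptions[*].{Endpoint:Endpoint,Arn:SubscriptionArn}" \
    --output table
}

# Turn on detailed (1-minute) monitoring for one instance.
# $1: EC2 instance ID
# (enable_detailed_monitoring is an illustrative helper name.)
enable_detailed_monitoring() {
  aws ec2 monitor-instances --instance-ids "$1"
}

# Example calls (placeholder values, replace with your own):
# list_topic_subscriptions "arn:aws:sns:us-east-1:123456789012:MyCPUAlertTopic"
# enable_detailed_monitoring "i-1234567890abcdef0"
```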

Conclusion

We have successfully created a CloudWatch CPU usage alarm in AWS. We learned how to configure an SNS topic for notifications, set a metric threshold, and verify the alarm through both the console and the CLI. We also covered key troubleshooting tips to ensure the alarm functions correctly. By following these steps, we have taken an important step toward proactive and cost-effective monitoring of our AWS infrastructure.

What is Cloud Computing?

Cloud computing delivers computing resources (servers, storage, databases, networking, and software) over the internet, allowing businesses to scale and pay only for what they use, eliminating the need for physical infrastructure.


  • AWS: The most popular cloud platform, offering scalable compute, storage, AI/ML, and networking services.
  • Azure: A strong enterprise cloud with hybrid capabilities and deep Microsoft product integration.
  • Google Cloud (GCP): Known for data analytics, machine learning, and open-source support.