CloudWatch: Triggering Events, Collecting Logs, and Creating Alarms

Prakhar Agarwal
14 min read · Jul 4, 2020

Amazon CloudWatch is an Amazon Web Services offering that provides real-time monitoring of AWS resources and of customer applications running on Amazon infrastructure.

What does Amazon CloudWatch do?

1) Creates alarms and sends notifications
2) Collects, monitors, and stores log files
3) Sends system events from AWS resources to AWS Lambda, SNS, etc.
4) Collects and tracks key metrics

Which resources does AWS CloudWatch monitor?

1) EC2
2) Data stored in S3
3) Elastic Load Balancer
4) Databases (AWS RDS)
5) Various other AWS services

In this blog, I am going to share use cases for each of these services along with the steps to implement them. So, without further ado, let's get started and look at the services provided by CloudWatch one by one:

1. Create Alarms and Send Notifications

In this section, we are going to set up an alarm using SNS (Simple Notification Service) and CloudWatch so that we get an email notification as soon as the CPU utilization of our instance drops to 25% or below.

To create a notification, you first need to create a topic in SNS and subscribe to it with your email ID.

1-a) Go to the SNS service, then Topics, and click Create Topic. Enter a topic name and description and click Create Topic.
On the next page, click Create Subscription.

1-b) On the Create Subscription page, select the protocol. Here I want the notification as email, so I select Email. Provide a valid email address in the Endpoint box.

Click Create Subscription.
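If you would rather script steps 1-a and 1-b than click through the console, the same thing can be sketched with boto3. This is a minimal sketch, not the method used in this blog: the function names are made up for illustration, and the region default simply mirrors the ap-south-1 region used later in this post.

```python
def build_subscription_request(topic_arn, email):
    """Return the keyword arguments for sns.subscribe() with an email endpoint."""
    return {
        "TopicArn": topic_arn,
        "Protocol": "email",   # same choice as the Protocol dropdown in the console
        "Endpoint": email,     # the address that receives the confirmation mail
        "ReturnSubscriptionArn": True,
    }


def create_topic_with_email(topic_name, email, region="ap-south-1"):
    """Create an SNS topic and subscribe an email address to it (hypothetical helper)."""
    import boto3  # imported lazily so the helper above also works without boto3 installed

    sns = boto3.client("sns", region_name=region)
    topic_arn = sns.create_topic(Name=topic_name)["TopicArn"]
    sns.subscribe(**build_subscription_request(topic_arn, email))
    return topic_arn
```

Calling create_topic_with_email("cpu-alerts", "you@example.com") would need valid AWS credentials; the topic name and address here are placeholders.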

1-c) Now go to the CloudWatch service and select Metrics from the left menu pane. Choose EC2 and then Per-Instance Metrics.

You will get a list of the metrics available for EC2. Look for CPUUtilization; it will be listed once for each EC2 instance (in case you have more than one).

Select the instance for which you want to get the notification. Then go to the Graphed metrics tab and click the alarm icon under Actions.

1-d) On the next page, select the threshold.

Here, I set the threshold to ≤ 25. Click Next.

1-e) On the configuration screen, select the option to use an existing SNS topic and choose the SNS topic we created earlier. Click Next.

Give the alarm a name and description and click Next to review the settings. Once done, click the Create alarm button. Your alarm is created.

NOTE: The SNS subscription will stay in the Pending confirmation state until you verify the provided email address, and no notification is delivered until then.
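The alarm from steps 1-c to 1-e can also be expressed as a call to CloudWatch's put_metric_alarm API. The sketch below only illustrates the shape of that call: the alarm name, the 5-minute period, and the single evaluation period are assumptions, since the console screenshots for those settings are not reproduced here.

```python
def build_cpu_alarm(instance_id, topic_arn, threshold=25.0):
    """Return keyword arguments for cloudwatch.put_metric_alarm()."""
    return {
        "AlarmName": "low-cpu-" + instance_id,          # hypothetical naming scheme
        "AlarmDescription": "CPU utilization at or below threshold",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                                  # assumed: 5-minute datapoints
        "EvaluationPeriods": 1,                         # assumed: alarm on the first breach
        "Threshold": threshold,
        "ComparisonOperator": "LessThanOrEqualToThreshold",  # the "<= 25" condition
        "AlarmActions": [topic_arn],                    # notify the SNS topic
    }


def create_cpu_alarm(instance_id, topic_arn, region="ap-south-1"):
    """Create the alarm in CloudWatch (needs AWS credentials)."""
    import boto3  # lazy import keeps build_cpu_alarm usable without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(**build_cpu_alarm(instance_id, topic_arn))
```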

2. Collect, Monitor, and Store Log Files

To collect logs and metrics, we will install the CloudWatch agent on our EC2 instance. The agent exports the logs we specify to the CloudWatch service and provides measurable and actionable data about system performance.

You will first need to ensure your instances are managed by AWS Systems Manager. To do this, let's create a role with specific permissions and attach it to our EC2 instance.

2-a) Create a role.

Go to the IAM service. Select Roles from the left navigation menu and click Create Role.

Select AWS Service, then EC2, from the 'Choose a Use Case' section.

Click Next: Permissions.

On the next screen, search for and select the three policies below:

CloudWatchAgentAdminPolicy → allows the CloudWatch agent to save its configuration file to the Systems Manager Parameter Store
CloudWatchAgentServerPolicy → allows the agent to write data to CloudWatch
AmazonEC2RoleforSSM → gives Systems Manager permission to manage our EC2 instance (AWS has since deprecated this policy in favor of AmazonSSMManagedInstanceCore)

Then click the Next: Tags button. Add tags if you want to and proceed to the Review step.
Give the role a name and description and click Create Role. Your role is created.
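For reference, here is roughly what the console does behind the scenes in step 2-a, sketched with boto3. Treat it as an illustration under assumptions: the helper name is hypothetical, and the explicit instance-profile calls are there because, as far as I know, the console creates an instance profile for EC2 roles automatically while the API does not.

```python
import json

# Trust policy that lets EC2 instances assume the role
# (the "AWS Service -> EC2" use case in the console).
EC2_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# The three AWS managed policies selected in step 2-a.
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/CloudWatchAgentAdminPolicy",
    "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy",
    "arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM",
]


def create_agent_role(role_name):
    """Create the role, attach the policies, and wrap it in an instance profile."""
    import boto3  # lazy import so the constants above can be inspected without boto3

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(EC2_TRUST_POLICY),
    )
    for arn in MANAGED_POLICY_ARNS:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
    # An EC2 instance is given a role through an instance profile of the same name.
    iam.create_instance_profile(InstanceProfileName=role_name)
    iam.add_role_to_instance_profile(InstanceProfileName=role_name, RoleName=role_name)
```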

2-b) Attach the role to the EC2 instance.

Go to the EC2 dashboard, select your EC2 instance, and in the Actions menu choose Attach/Replace IAM Role under Instance Settings.

On the next page, select the role that we just created and apply the changes. By attaching it, you are allowing your EC2 instance to be managed by the Systems Manager service.

2-c) Configure the AWS CLI with an admin user profile

We will need this later when we work with the CloudWatch agent.

#command to configure AWS CLI default-profile
aws configure
#Enter Access key and Secret key of the admin user
AWS Access Key ID [None]: R*******E
AWS Secret Access Key [None]: L*******h
#Leave region as default and press enter
Default region name [ap-south-1]:
#Enter json
Default output format [None]: json

2-d) Install the CloudWatch agent on EC2

Go to the Systems Manager service and choose the Quick Setup option from the left menu.
On the Quick Setup page, leave the 'Permission Required' section as default. Under 'Quick Setup options', select 'Install and configure the CloudWatch agent' (set the rest as per your requirements). In the 'Targets' section, choose instances manually and select your instance. Submit the changes.

On the next page you will see the status of the 'Quick Setup options' selected in the previous step as pending. After a few seconds this will change to success.

To check the installation, connect to the EC2 instance and check the status:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status

NOTE: The default location for the CloudWatch agent is /opt/aws/

This should report the status as running.

NOTE: If you do not see the aws folder inside the opt directory, wait a few more minutes. Sometimes it takes a bit more time.

2-e) Start the CloudWatch agent configuration wizard

Now that the installation is done, it's time to run the CloudWatch agent configuration wizard. This will create the agent configuration file. Start the wizard by entering the following command:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

Next you will see a set of questions. Based on your answers, the wizard creates a configuration file that tells the agent which logs and metrics to collect.
The configuration wizard looks like this:

==============================================================
Welcome to the AWS CloudWatch Agent Configuration Manager
==============================================================
On which OS are you planning to use the agent?
1. linux
2. windows
default choice: [1]:
1
Trying to fetch the default region based on ec2 metadata...
Are you using EC2 or On-Premises hosts?
1. EC2
2. On-Premises
default choice: [1]:
1
Which user are you planning to run the agent?
1. root
2. cwagent
3. others
default choice: [1]:
1
Do you want to turn on StatsD daemon?
1. yes
2. no
default choice: [1]:
2
Do you want to monitor metrics from CollectD?
1. yes
2. no
default choice: [1]:
2
Do you want to monitor any host metrics? e.g. CPU, memory, etc.
1. yes
2. no
default choice: [1]:
1
Do you want to monitor cpu metrics per core? Additional CloudWatch charges may apply.
1. yes
2. no
default choice: [1]:
2
Do you want to add ec2 dimensions (ImageId, InstanceId, InstanceType, AutoScalingGroupName) into all of your metrics if the info is available?
1. yes
2. no
default choice: [1]:
2
Would you like to collect your metrics at high resolution (sub-minute resolution)? This enables sub-minute resolution for all metrics, but you can customize for specific metrics in the output json file.
1. 1s
2. 10s
3. 30s
4. 60s
default choice: [4]:
Which default metrics config do you want?
1. Basic
2. Standard
3. Advanced
4. None
default choice: [1]:
Current config as follows:
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "root"
  },
  "metrics": {
    "metrics_collected": {
      "disk": {
        "measurement": ["used_percent"],
        "metrics_collection_interval": 60,
        "resources": ["*"]
      },
      "mem": {
        "measurement": ["mem_used_percent"],
        "metrics_collection_interval": 60
      }
    }
  }
}
Are you satisfied with the above config? Note: it can be manually customized after the wizard completes to add additional items.
1. yes
2. no
default choice: [1]:
Do you have any existing CloudWatch Log Agent (http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AgentReference.html) configuration file to import for migration?
1. yes
2. no
default choice: [2]:
2
Do you want to monitor any log files?
1. yes
2. no
default choice: [1]:
1
Log file path:
/var/log/nginx/*.log
Log group name:
default choice: [*.log]
nginx-log
Log stream name:
default choice: [{instance_id}]
blogger-instance
Do you want to specify any additional log files to monitor?
1. yes
2. no
default choice: [1]:
1
Log file path:
/home/ubuntu/.pm2/logs/*.log
Log group name:
default choice: [*.log]
pm2 logs
Log stream name:
default choice: [{instance_id}]
blogger-instance
Do you want to specify any additional log files to monitor?
1. yes
2. no
default choice: [1]:
2
Saved config file to /opt/aws/amazon-cloudwatch-agent/bin/config.json successfully.
Current config as follows:
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "root"
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/nginx/*.log",
            "log_group_name": "nginx-log",
            "log_stream_name": "blogger-instance"
          },
          {
            "file_path": "/home/ubuntu/.pm2/logs/*.log",
            "log_group_name": "pm2 logs",
            "log_stream_name": "blogger-instance"
          }
        ]
      }
    }
  },
  "metrics": {
    "metrics_collected": {
      "disk": {
        "measurement": ["used_percent"],
        "metrics_collection_interval": 60,
        "resources": ["*"]
      },
      "mem": {
        "measurement": ["mem_used_percent"],
        "metrics_collection_interval": 60
      }
    }
  }
}
Please check the above content of the config.
The config file is also located at /opt/aws/amazon-cloudwatch-agent/bin/config.json.
Edit it manually if needed.
Do you want to store the config in the SSM parameter store?
1. yes
2. no
default choice: [1]:
What parameter store name do you want to use to store your config? (Use 'AmazonCloudWatch-' prefix if you use our managed AWS policy)
default choice: [AmazonCloudWatch-linux]
Trying to fetch the default region based on ec2 metadata...
Which region do you want to store the config in the parameter store?
default choice: [ap-south-1]
Which AWS credential should be used to send json config to parameter store?
1. A************** (From SDK)
2. Other
default choice: [1]:
Successfully put config to parameter store AmazonCloudWatch-linux.
Program exits now.

NOTE: You can provide the location of application logs in the wizard as well. Look for the question 'Do you want to monitor any log files?'. In the case above, I have given the pm2 and nginx logs to be collected by the agent. Similarly, you can give it any log files.

Once done, you can check the config file on the AWS console as well as on your EC2 instance and edit it if required.

The location on the EC2 instance is
/opt/aws/amazon-cloudwatch-agent/bin/config.json

On the AWS Management Console, look in the
Parameter Store menu under the Systems Manager service
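If you ever need to add more log files later, you do not have to rerun the wizard; the "logs" section of the config is plain JSON and easy to generate. The helper below is a small sketch (its name and arguments are made up for illustration) that rebuilds the same structure the wizard printed above.

```python
import json


def build_agent_config(log_files, run_as_user="root", interval=60):
    """Build an agent config with a "logs" section like the one the wizard printed.

    log_files is a list of (file_path, log_group_name, log_stream_name) tuples.
    """
    collect_list = [
        {"file_path": path, "log_group_name": group, "log_stream_name": stream}
        for path, group, stream in log_files
    ]
    return {
        "agent": {"metrics_collection_interval": interval, "run_as_user": run_as_user},
        "logs": {"logs_collected": {"files": {"collect_list": collect_list}}},
    }


config = build_agent_config([
    ("/var/log/nginx/*.log", "nginx-log", "blogger-instance"),
    # "{instance_id}" is the wizard's default stream name placeholder
    ("/var/log/syslog", "syslog", "{instance_id}"),
])
print(json.dumps(config, indent=2))
```

The /var/log/syslog entry is only an example path. The printed JSON can be pasted into config.json or into the Parameter Store entry.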

2-f) Start the CloudWatch agent

Use the command below to start the agent:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:configuration-parameter-store-name -s

Here, configuration-parameter-store-name is AmazonCloudWatch-linux by default (if you chose the default in the configuration wizard; otherwise provide the name you gave there).

In this command, -a fetch-config causes the agent to load the latest version of the CloudWatch agent configuration file, and -s starts the agent. If everything completes successfully, you will see this:

/opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source ssm:AmazonCloudWatch-linux --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
Region: ap-south-1
credsConfig: map[]
Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/ssm_AmazonCloudWatch-linux.tmp
Start configuration validation...
/opt/aws/amazon-cloudwatch-agent/bin/config-translator --input /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --input-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --output /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
2020/06/08 19:50:24 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/ssm_AmazonCloudWatch-linux.tmp ...
Valid Json input schema.
I! Detecting runasuser...
No csm configuration found.
Configuration validation first phase succeeded
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
Configuration validation second phase succeeded
Configuration validation succeeded

And we are done.

Now you can check your logs on the AWS Management Console. Go to the CloudWatch service and select Log groups from the left menu. You will see the log groups you named in the configuration wizard.

3. Send System Events from AWS Resources to AWS Lambda

Amazon CloudWatch Events is a part of Amazon CloudWatch that delivers a near real-time stream of system events. It allows you to monitor and respond to changes in your AWS resources by means of rules that route events to one or more targets.

Use Case 1) Start and stop instances at a particular time

Let's assume a scenario where you are running a full-fledged website hosted on Amazon with 4 to 5 virtual servers, all of them active throughout the day. Lately, you notice that traffic is high during the daytime but considerably lower at night. At night, CPU utilization is below 50% and the servers are idle most of the time. However, you are paying the same amount as during the day even though they are underused at night.

For this scenario, you can create Lambda functions to shut down a few of your instances at night and start them again in the morning. It would obviously be tedious to do this manually every time, so to automate it we will use CloudWatch Events.

1-a) Create a policy to control the EC2 instance.

Go to the IAM service. Select Policies from the left navigation menu and click Create Policy.

Under Service, select EC2.
In the Actions section, search for and select StartInstances and StopInstances.
In Resources, you can select all resources, or choose manually and then select your instance.

Click Review Policy. On this page, give the policy a name and description and click Create Policy.
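The visual editor in step 1-a produces a JSON policy document. Assuming you chose only the two actions above, it should look roughly like what this sketch generates; the helper name is hypothetical and the instance ARN in the example is a placeholder.

```python
import json


def build_start_stop_policy(instance_arns=None):
    """Return an IAM policy document allowing StartInstances and StopInstances.

    Pass a list of instance ARNs to restrict the policy; with no argument it
    applies to all resources, like the "All resources" option in the console.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["ec2:StartInstances", "ec2:StopInstances"],
                "Resource": instance_arns or "*",
            }
        ],
    }


# Placeholder account ID and instance ID, for illustration only.
policy = build_start_stop_policy(
    ["arn:aws:ec2:ap-south-1:123456789012:instance/i-0123456789abcdef0"]
)
print(json.dumps(policy, indent=2))
```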

1-b) Assign this policy to a role

Now we will assign this policy to a role with Lambda as the use case.

Go to the main IAM dashboard and select Roles from the left menu. Click Create Role.
Choose Lambda as the use case on this page.
On the next page, search for and select the policy created above.
Give the role any tags you like and click Next to review.
Provide a name and description for the role, check your settings, and create the role.

Our Lambda role with the start and stop EC2 instance policy has been created successfully.

1-c) Create the Lambda functions

We will create two Lambda functions, one to stop and one to start EC2 instances.

First we create the stop-instance function. Go to the Lambda service and click Create function.
Choose Author from scratch and select the role that we created above.

Click Create Function.

In the code editor panel, paste:

const aws = require('aws-sdk');

exports.handler = (event, context, callback) => {
  // Create an EC2 client for the region passed in the event
  const ec2 = new aws.EC2({ region: event.instanceRegion });

  // Stop the instance and report the result through the callback
  ec2.stopInstances({ InstanceIds: [event.instanceId] }).promise()
    .then(() => callback(null, `Successfully stopped ${event.instanceId}`))
    .catch(err => callback(err));
};

and click Save. Now we have to provide this function with the instanceRegion and instanceId values.

Click Configure test event

and provide JSON values for both fields:

{
  "instanceRegion": "YOUR_REGION_NAME",
  "instanceId": "YOUR_EC2_INSTANCE_ID"
}

Click Create and then click Test to test the function. You should see the message 'Successfully stopped <YOUR_EC2_INSTANCE_ID>', and you can verify the same by going to your EC2 dashboard.

Following the same process, create a start-instance function with this code:

const aws = require('aws-sdk');

exports.handler = (event, context, callback) => {
  // Create an EC2 client for the region passed in the event
  const ec2 = new aws.EC2({ region: event.instanceRegion });

  // Start the instance and report the result through the callback
  ec2.startInstances({ InstanceIds: [event.instanceId] }).promise()
    .then(() => callback(null, `Successfully started ${event.instanceId}`))
    .catch(err => callback(err));
};
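If you prefer the Python runtime for Lambda, the two functions could be sketched like this with boto3. This is an alternative illustration, not the code used in this blog: the set_instance_state helper and its optional ec2 argument are my own additions so that the logic can be tried without touching a real instance.

```python
def set_instance_state(event, action, ec2=None):
    """Start or stop the instance named in the event and return a status message."""
    if ec2 is None:
        import boto3  # available by default in the Lambda Python runtime

        ec2 = boto3.client("ec2", region_name=event["instanceRegion"])

    instance_id = event["instanceId"]
    if action == "stop":
        ec2.stop_instances(InstanceIds=[instance_id])
        return "Successfully stopped " + instance_id
    if action == "start":
        ec2.start_instances(InstanceIds=[instance_id])
        return "Successfully started " + instance_id
    raise ValueError("action must be 'start' or 'stop'")


def stop_handler(event, context):
    """Lambda entry point for the stop function."""
    return set_instance_state(event, "stop")


def start_handler(event, context):
    """Lambda entry point for the start function."""
    return set_instance_state(event, "start")
```

Each handler expects the same instanceRegion and instanceId keys as the Node.js version.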

Our functions are now ready, so let's automate them.

1-d) Automate the Lambda functions through CloudWatch Events

So far we have used the Lambda functions to start and stop the instance by hand. Now we will automate this with the help of CloudWatch Events.

I want my instance to stop at 2030 hours every day and restart the next morning at 0630 hours.

To do this, go to the CloudWatch service and click Rules under Events in the left menu. Create a rule with Schedule as the event source and a cron expression for the stop time. Note that cron expressions in CloudWatch Events are evaluated in UTC.

Here, I have scheduled this event to trigger at 2030 hours, and the target is the stop-instance Lambda function. Expand Configure input and, in Constant (JSON text), pass the parameters to the Lambda function in JSON format:
{ "instanceRegion": "ap-south-1", "instanceId": "EC2_INSTANCE_ID" }

Click Configure Details and give this event a name and description before saving it.

Similarly, create an event to start the instance, and we have successfully automated the process.
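Because schedule expressions are evaluated in UTC, the local times have to be converted first. The sketch below assumes the 2030 and 0630 times are in IST (UTC+5:30), which is my guess based on the ap-south-1 region used in this post; adjust the offset for your own time zone. The helper names are made up for illustration.

```python
import json


def daily_cron_utc(hour, minute, utc_offset_minutes=0):
    """Convert a local daily time into a CloudWatch Events cron() expression in UTC."""
    total = (hour * 60 + minute - utc_offset_minutes) % (24 * 60)
    utc_hour, utc_minute = divmod(total, 60)
    # Fields: minutes hours day-of-month month day-of-week year
    return "cron({0} {1} * * ? *)".format(utc_minute, utc_hour)


def build_lambda_target(lambda_arn, region, instance_id):
    """Build one target entry for events.put_targets(); Input is the constant JSON text."""
    return {
        "Id": "1",
        "Arn": lambda_arn,
        "Input": json.dumps({"instanceRegion": region, "instanceId": instance_id}),
    }


IST_OFFSET = 5 * 60 + 30  # assumption: local time is IST

print(daily_cron_utc(20, 30, IST_OFFSET))  # stop schedule
print(daily_cron_utc(6, 30, IST_OFFSET))   # start schedule
```

The resulting strings go into the Cron expression box of the rule, or into the ScheduleExpression parameter of put_rule if you script it with boto3.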

Use Case 2) Trigger an email notification if an instance is terminated or stopped

Consider a scenario where one of your instances goes down and you are not aware of it. User experience suffers badly, and the traffic you would have served at that time moves to a competitor. Had you been notified at that moment, you could probably have done something to keep that traffic from going elsewhere. We can achieve this using a CloudWatch event together with the SNS topic that we created in the "Create Alarms and Send Notifications" section.

Create a new rule with EC2 as the service, the instance state-change notification as the event type, and the SNS topic as the target.

Here, I have set the rule to trigger whenever any of my instances changes its state to stopping, stopped, shutting-down, or terminated, so that I get notified on the topic created in SNS.
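Under the hood, the rule is just a JSON event pattern. The sketch below builds the pattern for the states mentioned above, along with a deliberately simplified matcher (my own toy function, not how CloudWatch evaluates patterns internally) to show which events would pass.

```python
def build_state_change_pattern(states, instance_ids=None):
    """Build an event pattern for EC2 instance state-change notifications."""
    detail = {"state": list(states)}
    if instance_ids:
        detail["instance-id"] = list(instance_ids)  # optional: restrict to given instances
    return {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
        "detail": detail,
    }


def matches(pattern, event):
    """Simplified check: every pattern field must list the event's value."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


pattern = build_state_change_pattern(["stopping", "stopped", "shutting-down", "terminated"])
```

With boto3, the pattern would be passed as EventPattern=json.dumps(pattern) to put_rule, with the SNS topic ARN as the target. As far as I know, the console also grants CloudWatch Events permission to publish to the topic, which you would have to add yourself when scripting.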

Use Case 3) Automatically update IP addresses without using Elastic IPs

You have a domain name in Amazon Route 53 pointing to an Amazon EC2 instance. However, if the instance is stopped and started, its public IP address changes. This breaks the A record, since it now points to the wrong IP address.

One solution to this problem is to allocate an Elastic IP and associate it with your instance. In this case the A record will not break. But AWS has a default limit of 5 Elastic IP addresses per region. You can request a limit increase, but what if you need lots of them?

Another solution is to use a Lambda function and a CloudWatch event. Let's see how this solution works:

3-a) Create a policy.

Go to the IAM service and create a new policy. In the JSON editor, paste the policy below:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "route53:ChangeResourceRecordSets"
      ],
      "Resource": "arn:aws:route53:::hostedzone/<HOSTED_ZONE_ID>"
    },
    {
      "Effect": "Allow",
      "Action": [
        "route53:GetChange"
      ],
      "Resource": "arn:aws:route53:::change/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Resource": "*"
    }
  ]
}

Note that the above policy specifically allows the role to perform "ChangeResourceRecordSets" on the Route 53 zone with the specified hosted zone ID, and to perform "DescribeInstances" on all EC2 instances. The logging actions are required so that the Lambda function can write to CloudWatch Logs.

3-b) Create a new role for the Lambda use case and attach the above policy to it

Please refer to step 1-b of Use Case 1 for the steps.

3-c) Create a Lambda function

Create a Python-based Lambda function and paste the code below:

from __future__ import print_function
import boto3

# ID of the Route 53 hosted zone that contains the record to update
HOSTED_ZONE_ID = '<HOSTED_ZONE_ID>'


def lambda_handler(event, context):
    ec2 = boto3.resource('ec2')
    route53 = boto3.client('route53')

    # The state-change event carries the ID of the affected instance
    instance_id = event['detail']['instance-id']
    print("Processing: {0}".format(instance_id))

    # Look up the current public IP address of the instance
    instance = ec2.Instance(instance_id)
    instance_ip = instance.public_ip_address

    # UPSERT creates the A record if it is missing, otherwise updates it
    dns_changes = {
        'Changes': [
            {
                'Action': 'UPSERT',
                'ResourceRecordSet': {
                    'Name': "<RECORD_SET_NAME>",
                    'Type': 'A',
                    'ResourceRecords': [
                        {
                            'Value': instance_ip
                        }
                    ],
                    'TTL': 300
                }
            }
        ]
    }

    print("Updating Route53 record to: {0}".format(instance_ip))
    response = route53.change_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        ChangeBatch=dns_changes
    )
    return {'status': response['ChangeInfo']['Status']}

Here, when the EC2 instance state-change event is triggered, details about the event are passed to the Lambda function. These include details about the affected EC2 instance, such as its identifier. This allows us to look up further details about the instance, including its public IP address and its tags.
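To make the shape of that event concrete, here is a trimmed-down sample of a state-change event and two small helpers that mirror what the Lambda function does with it. The sample values (instance ID, record name, IP address) are placeholders, and the helper names are hypothetical; a real event carries a few more top-level fields such as account, time, and resources.

```python
# Trimmed sample of an EC2 state-change event as delivered to the Lambda function.
SAMPLE_EVENT = {
    "version": "0",
    "detail-type": "EC2 Instance State-change Notification",
    "source": "aws.ec2",
    "region": "ap-south-1",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}


def instance_id_from_event(event):
    """Pull the affected instance ID out of the event payload."""
    return event["detail"]["instance-id"]


def build_change_batch(record_name, ip_address, ttl=300):
    """Build the ChangeBatch argument for route53.change_resource_record_sets()."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip_address}],
                },
            }
        ]
    }


print(instance_id_from_event(SAMPLE_EVENT))
print(build_change_batch("www.example.com.", "203.0.113.10"))
```

You can paste SAMPLE_EVENT (with your own instance ID) into a Lambda test event to try the function before wiring up the rule.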

3-d) Create a CloudWatch event

Go to CloudWatch and create a new rule for the EC2 instance state-change event, limited to the running state and to your instance, with the Lambda function as the target.

When the given instance's state changes from stopped to running, this event will trigger our Lambda function.

Now, to test it, stop your EC2 instance and start it again. You will see that the record in Route 53 is updated with the new public IP. You can also look at the logs in CloudWatch log groups.

With that, we have come to the end of this CloudWatch tutorial, where we discussed the services that CloudWatch provides and configured them step by step through the use cases.

I hope you all learned something from this. Keep following!
