AWS Lambda: Real-World Use Cases for the DevOps Engineer

Nov 14, 2019

 

Since the introduction of AWS Lambda in 2014, which marked the beginning of the serverless evolution, the development world has benefited from unprecedented levels of velocity and agility. As a result, developers have been able to focus on building new features and innovating, without having to provision or maintain complex infrastructure.

For businesses, this has translated into the ability to grow rapidly and reach a global scale in a matter of minutes, all without having to scale the underlying infrastructure themselves. For DevOps teams, AWS Lambda has the added benefits of reducing operational overhead and increasing the productivity of DevOps/SRE engineers.

This article will examine real-world use cases in which AWS Lambda can be combined with other serverless services to automate daily tasks and processes for DevOps teams, thus saving significant time and effort. The sections below cover just a few of the scenarios in which AWS Lambda can be used.

Cost Optimization

Building continuous integration/deployment workflows, as well as efficiently reproducing bugs found in production, requires that multiple environments be maintained. This, however, can be costly. To reduce AWS infrastructure costs, EC2 instances that would otherwise run 24/7 unnecessarily (e.g., sandbox and staging environments) should be shut down outside of regular business hours.

Figure 1 below shows an automated process for starting and stopping instances according to a time-based schedule in order to reduce expenses. It is a perfect example of using the serverless approach.

Figure 1: Scheduling EC2 instance start/stop

In this case, you can configure a scheduled rule (a cron expression) on Amazon CloudWatch Events to trigger a Lambda function. The function’s handler will scan the metadata of all EC2 instances and identify those with an “Environment” tag, ignoring those without one. Once all instances with the target tag have been identified, the Lambda function will use the Amazon EC2 API to start or stop them at the designated time.
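The handler logic described above could look roughly like the following. This is a minimal sketch, not the article's actual function: it assumes the scheduled rule passes a constant JSON input such as {"action": "stop"} (the "action" key is a hypothetical convention), and it uses the boto3 EC2 client.

```python
"""Sketch: start or stop EC2 instances that carry an "Environment" tag."""


def find_tagged_instance_ids(ec2, tag_key="Environment", states=("running",)):
    """Return IDs of instances that have `tag_key` and are in one of `states`."""
    paginator = ec2.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[
            {"Name": "tag-key", "Values": [tag_key]},
            {"Name": "instance-state-name", "Values": list(states)},
        ]
    )
    return [
        instance["InstanceId"]
        for page in pages
        for reservation in page["Reservations"]
        for instance in reservation["Instances"]
    ]


def apply_schedule(ec2, action):
    """Stop running (or start stopped) tagged instances; return the affected IDs."""
    if action == "stop":
        ids = find_tagged_instance_ids(ec2, states=("running",))
        if ids:
            ec2.stop_instances(InstanceIds=ids)
    elif action == "start":
        ids = find_tagged_instance_ids(ec2, states=("stopped",))
        if ids:
            ec2.start_instances(InstanceIds=ids)
    else:
        raise ValueError(f"unsupported action: {action!r}")
    return ids


def handler(event, context):
    # boto3 is imported lazily so the helpers above can be exercised without AWS.
    import boto3

    action = event["action"]  # assumed constant input set on the scheduled rule
    return {"action": action, "instances": apply_schedule(boto3.client("ec2"), action)}
```

With this layout, two scheduled rules (one for the evening, one for the morning) can target the same function and differ only in their input.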

Another way to reduce infrastructure costs is to use Lambda functions to release unassigned Elastic IPs and delete unused Amazon Elastic Block Store (EBS) volumes. Because these resources can account for a significant portion of total AWS infrastructure costs, cleaning up orphaned resources can result in major savings.

Figure 2: Removing unattached EC2 resources

 

As in the scheduling scenario, the Lambda function relies on the EC2 API, this time to release unassigned Elastic IPs and delete unattached EBS volumes.
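As an illustration of this cleanup, here is a small sketch using the boto3 EC2 client. The `dry_run` flag is a safety switch of this example, not an AWS parameter: by default the functions only report what they would remove.

```python
"""Sketch: find (and optionally remove) orphaned Elastic IPs and EBS volumes."""


def release_unassociated_addresses(ec2, dry_run=True):
    """Return allocation IDs of Elastic IPs that are not associated with anything."""
    orphaned = []
    for address in ec2.describe_addresses()["Addresses"]:
        if "AssociationId" in address:
            continue  # still attached to an instance or network interface
        allocation_id = address.get("AllocationId")
        if allocation_id is None:
            continue  # no allocation ID to release by; skipped in this sketch
        if not dry_run:
            ec2.release_address(AllocationId=allocation_id)
        orphaned.append(allocation_id)
    return orphaned


def delete_unattached_volumes(ec2, dry_run=True):
    """Return IDs of EBS volumes in the "available" (unattached) state."""
    paginator = ec2.get_paginator("describe_volumes")
    pages = paginator.paginate(Filters=[{"Name": "status", "Values": ["available"]}])
    orphaned = []
    for page in pages:
        for volume in page["Volumes"]:
            if not dry_run:
                ec2.delete_volume(VolumeId=volume["VolumeId"])
            orphaned.append(volume["VolumeId"])
    return orphaned


def handler(event, context):
    import boto3  # lazy import keeps the helpers testable without AWS access

    ec2 = boto3.client("ec2")
    dry_run = event.get("dry_run", True)  # hypothetical event field
    return {
        "addresses": release_unassociated_addresses(ec2, dry_run),
        "volumes": delete_unattached_volumes(ec2, dry_run),
    }
```

Deleting a volume is irreversible, so running in report-only mode first (or snapshotting volumes before deletion) is a reasonable precaution.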

Logs & Events Analysis

To enable compliance and enhance governance, tracking AWS activity is a must for organizations of any size. Doing so helps you continuously monitor the security of your AWS environment and detect insecure or undesirable activity in real time. This not only strengthens protection against infrastructure breaches; it can also save thousands of dollars. To this end, AWS offers AWS CloudTrail, a service that captures and stores a log of the actions and activity in your AWS account. It enables you to collect events from all AWS regions in a single S3 bucket.

Figure 3: AWS serverless events analysis

The pipeline above starts at the S3 bucket: each time a new log file lands in the bucket, an S3 event is emitted, and Lambda functions can be configured to react to those events. The Lambda functions then parse the logged events and store them in Elasticsearch for indexing. Interactive charts can also be created to visualize these events in a dynamic Kibana dashboard:

Figure 4: CloudTrail log for near real-time analysis in Kibana
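The parsing step of this pipeline might be sketched as follows. The sketch assumes the standard CloudTrail delivery format (gzip-compressed JSON with a top-level "Records" array) and prepares a newline-delimited body for the Elasticsearch bulk API. The index name, the selection of fields, and the omission of the actual HTTP request to Elasticsearch are choices of this example.

```python
"""Sketch: turn a CloudTrail log file from S3 into an Elasticsearch bulk body."""
import gzip
import json
from urllib.parse import unquote_plus

# Fields kept for indexing; a real pipeline may keep the full record.
KEPT_FIELDS = ("eventTime", "eventName", "eventSource", "awsRegion", "sourceIPAddress")


def parse_cloudtrail_log(gzipped_bytes):
    """Decompress one CloudTrail log file and return a slim document per record."""
    payload = json.loads(gzip.decompress(gzipped_bytes))
    return [
        {field: record.get(field) for field in KEPT_FIELDS}
        for record in payload.get("Records", [])
    ]


def to_bulk_body(documents, index_name="cloudtrail"):
    """Build the newline-delimited action/document pairs the bulk API expects."""
    lines = []
    for document in documents:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(document))
    return "\n".join(lines) + "\n" if lines else ""


def handler(event, context):
    import boto3  # lazy import keeps the parsing helpers testable offline

    s3 = boto3.client("s3")
    documents = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        documents.extend(parse_cloudtrail_log(body))
    # Sending this body to the cluster's _bulk endpoint is left out of the sketch.
    return to_bulk_body(documents)
```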

In addition, alerts can be set up for specific events (e.g., someone accessing your environment from an unauthorized location or IP address range) so that you are notified in near real-time.
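One way such an alert rule could be expressed is a check of each event's source address against an allow-list of networks. The sketch below uses only the Python standard library; the CIDR ranges are documentation placeholders, and the "sourceIPAddress" field name follows the CloudTrail record format.

```python
"""Sketch: flag events whose source IP falls outside an allow-list of networks."""
import ipaddress

# Hypothetical allow-list; replace with your office or VPN ranges.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_unauthorized(source_ip, allowed=ALLOWED_NETWORKS):
    """Return True if `source_ip` is a valid address outside every allowed network."""
    try:
        address = ipaddress.ip_address(source_ip)
    except ValueError:
        # The field can hold an AWS service name instead of an IP; not flagged here.
        return False
    return not any(address in network for network in allowed)


def find_suspicious_events(events, allowed=ALLOWED_NETWORKS):
    """Return the events that originate from outside the allowed networks."""
    return [
        event for event in events
        if is_unauthorized(event.get("sourceIPAddress", ""), allowed)
    ]
```

The matching events can then be forwarded to whatever notification channel the team already uses.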

Another way to perform log and event analysis is through VPC Flow Logs. This feature records information about the IP traffic going to and from the elastic network interfaces (ENIs) in your Virtual Private Cloud (VPC) and exports the logs to Amazon CloudWatch Logs. All logs are then aggregated and streamed to Amazon Kinesis Data Streams. Kinesis triggers Lambda functions (workers), which analyze the logs for events or patterns and send a notification to third-party platforms like Slack or PagerDuty if abnormal activity is detected.

Finally, Lambda posts the dataset to Amazon Elasticsearch Service, which comes with Kibana pre-installed, so that network traffic and logs can be visualized and analyzed through dynamic, interactive dashboards in near real-time.
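A worker function in this flow could be sketched as below. It assumes the records arrive through a CloudWatch Logs subscription (base64-encoded in the Kinesis event, gzip-compressed JSON underneath) and that the flow logs use the default 14-field format. The "anomaly" here is deliberately simple: a source address with many rejected connections. The threshold is arbitrary, and the Slack or PagerDuty notification is left out.

```python
"""Sketch: count rejected connections per source in VPC Flow Logs from Kinesis."""
import base64
import gzip
import json
from collections import Counter

# Field order of the default VPC Flow Logs record format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr", "srcport",
    "dstport", "protocol", "packets", "bytes", "start", "end", "action", "log_status",
)


def decode_kinesis_record(record):
    """Return the raw log lines carried by one Kinesis record."""
    raw = base64.b64decode(record["kinesis"]["data"])
    payload = json.loads(gzip.decompress(raw))
    if payload.get("messageType") != "DATA_MESSAGE":
        return []  # control messages carry no log events
    return [log_event["message"] for log_event in payload["logEvents"]]


def parse_flow_log(line):
    """Split one flow log line into a dict, or return None if it is malformed."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def rejected_sources(event, threshold=3):
    """Return {source address: count} for sources with >= `threshold` REJECTs."""
    counts = Counter()
    for record in event["Records"]:
        for line in decode_kinesis_record(record):
            flow = parse_flow_log(line)
            if flow and flow["action"] == "REJECT":
                counts[flow["srcaddr"]] += 1
    return {source: n for source, n in counts.items() if n >= threshold}


def handler(event, context):
    findings = rejected_sources(event)
    # A real worker would notify a chat or paging tool here; this one only reports.
    return {"suspicious_sources": findings}
```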

The following diagram summarizes the entire ETL pipeline:

Figure 5: Real-time anomaly detection in VPC Flow Logs

Note: After buffering the incoming logs from the source, Kinesis will write the data to an S3 bucket for backup. The bucket can be configured with a lifecycle policy that archives old and unused logs to Amazon Glacier for long-term retention (useful for organizations with compliance and auditing requirements).
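For illustration, a lifecycle policy of this kind can be applied with the boto3 S3 client. The sketch below is an assumption-laden example: the 30-day threshold, the rule ID, and the empty prefix (meaning the whole bucket) are placeholders to adapt.

```python
"""Sketch: an S3 lifecycle rule that archives objects to Glacier after N days."""


def glacier_lifecycle(days=30, prefix=""):
    """Build a lifecycle configuration that transitions objects to Glacier."""
    return {
        "Rules": [
            {
                "ID": f"archive-after-{days}-days",
                "Filter": {"Prefix": prefix},  # empty prefix matches every object
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


def apply_lifecycle(s3, bucket, days=30, prefix=""):
    """Apply the rule to `bucket`; this replaces any existing lifecycle rules."""
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=glacier_lifecycle(days, prefix),
    )
```

Because the call overwrites the bucket's current lifecycle configuration, existing rules need to be merged into the new one before applying it.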

Automated Backups

Scheduled tasks and jobs are a perfect fit for Lambda. Instead of keeping an EC2 instance up and running 24/7, AWS Lambda can be used to create backups, generate daily/weekly reports, and execute batch jobs. The following schematic diagram describes how to use AWS Lambda to perform automated backups of a MySQL cluster for disaster recovery purposes:

Figure 6: Automated RDS snapshots with AWS Lambda

A cron job will trigger a Lambda function every night at midnight. The function will issue an Amazon RDS API call to take a snapshot of the current state of the MySQL cluster’s data and store the result in an S3 bucket. A lifecycle policy will be applied to the target bucket to move old snapshots to Glacier after a certain number of days.
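The snapshot call itself could be sketched as follows. The example assumes a single RDS DB instance and uses the boto3 `create_db_snapshot` call; for an Aurora cluster, `create_db_cluster_snapshot` would be the counterpart. The environment variable name and the timestamped naming scheme are hypothetical choices of this example.

```python
"""Sketch: take a timestamped RDS snapshot from a scheduled Lambda function."""
import os
from datetime import datetime, timezone


def snapshot_identifier(db_identifier, now=None):
    """Build a snapshot name such as "staging-mysql-2019-11-14-00-00"."""
    now = now or datetime.now(timezone.utc)
    return f"{db_identifier}-{now:%Y-%m-%d-%H-%M}"


def take_snapshot(rds, db_identifier, now=None):
    """Request a snapshot of the given DB instance and return its identifier."""
    snapshot_id = snapshot_identifier(db_identifier, now)
    rds.create_db_snapshot(
        DBSnapshotIdentifier=snapshot_id,
        DBInstanceIdentifier=db_identifier,
    )
    return snapshot_id


def handler(event, context):
    import boto3  # lazy import keeps the helpers testable without AWS access

    db_identifier = os.environ["DB_INSTANCE_IDENTIFIER"]  # assumed configuration
    return {"snapshot": take_snapshot(boto3.client("rds"), db_identifier)}
```

The API call returns as soon as the snapshot is requested; the snapshot itself completes asynchronously.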

ChatOps

A Natural Language Understanding (NLU) service, such as Amazon Lex, can be used to build interactive bots that trigger Lambda functions for intent fulfillment in response to voice commands or text. The following diagram describes a use case for building a Reaper bot with AWS Lambda:

Figure 7: Building a Reaper Bot on Slack to manage K8s clusters

A user can issue a /deploy command in a Slack channel, passing the number of nodes of the desired Kubernetes cluster as a parameter. The Slack API will invoke the Lex bot, which will carry out language recognition and transform the text into an intent. The intent will trigger a Lambda function that, in turn, will use the Amazon EKS API to deploy a new K8s cluster with the specified number of nodes.
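The fulfillment function at the end of this chain might be sketched as follows. It assumes the Amazon Lex (V1) Lambda event and response shapes ("currentIntent", "slots", and a "Close" dialog action). The intent name "DeployCluster", the slot name "NodeCount", the node limit, and the `deploy` callable that would wrap the actual Amazon EKS calls are all hypothetical.

```python
"""Sketch: fulfill a "deploy cluster" intent coming from a Lex bot."""

MAX_NODES = 10  # arbitrary guard against expensive typos


def close(message, state="Fulfilled"):
    """Build a Lex response that ends the conversation with a plain-text message."""
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": state,
            "message": {"contentType": "PlainText", "content": message},
        }
    }


def handle_intent(event, deploy):
    """Validate the requested node count, then hand it to `deploy`.

    `deploy` is any callable that takes the node count, provisions the cluster
    (for example through the Amazon EKS API), and returns the cluster name.
    """
    intent = event["currentIntent"]
    if intent["name"] != "DeployCluster":
        return close("Sorry, I don't know that command.", "Failed")

    raw_count = (intent.get("slots") or {}).get("NodeCount")
    try:
        node_count = int(raw_count)
    except (TypeError, ValueError):
        return close("Please give the number of nodes as a whole number.", "Failed")

    if not 1 <= node_count <= MAX_NODES:
        return close(f"The node count must be between 1 and {MAX_NODES}.", "Failed")

    cluster_name = deploy(node_count)
    return close(f"Deploying cluster {cluster_name} with {node_count} nodes.")
```

Keeping the provisioning logic behind a callable means the conversational part can be tested without touching any real cluster.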

Alerting & Notifications

Another practical use case is real-time monitoring of Lambda functions. Before going serverless in production, it is necessary to understand how monitoring and alerting work with Lambda functions. That’s where Thundra comes into play. It consumes and analyzes the log feeds streamed by Lambda functions to Amazon CloudWatch Logs. It thus offers greater visibility and deeper insights, allowing performance bottlenecks in a serverless platform to be identified immediately.

Figure 8: Simplified Serverless insights with Thundra's alerts

Thundra's alerting feature also sends out immediate alerts when a query on memory usage returns abnormal results. A classic example is performing heuristic analysis of Lambda function logs to avoid over-allocating memory and to find the right balance between memory and execution timeout. Organizations can thus reduce and optimize their monthly Lambda function costs.
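To make the memory heuristic concrete, here is a generic sketch that is not Thundra's implementation. It reads the REPORT lines that Lambda writes to CloudWatch Logs after each invocation and flags a function as over-allocated when its peak memory use stays below a chosen share of the configured size. The 50% threshold is an arbitrary assumption.

```python
"""Sketch: detect memory over-allocation from Lambda REPORT log lines."""
import re

# Matches the memory fields of a REPORT line (fields are separated by tabs).
REPORT_MEMORY = re.compile(
    r"Memory Size: (?P<size>\d+) MB\s+Max Memory Used: (?P<used>\d+) MB"
)


def memory_samples(lines):
    """Return (configured MB, used MB) pairs for every REPORT line found."""
    samples = []
    for line in lines:
        if not line.startswith("REPORT"):
            continue
        match = REPORT_MEMORY.search(line)
        if match:
            samples.append((int(match["size"]), int(match["used"])))
    return samples


def is_over_allocated(lines, max_utilization=0.5):
    """Return True if peak memory utilization stays below `max_utilization`."""
    samples = memory_samples(lines)
    if not samples:
        return False  # no data, no verdict
    peak = max(used / size for size, used in samples)
    return peak < max_utilization
```

Lowering the memory setting also lowers the CPU share a function receives, so a change suggested by this check is worth validating against execution time.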

Final Note

Another benefit that AWS Lambda brings to the table is its support for a wide range of runtime environments (polyglot), such as Go and Python. This flattens the learning curve for non-developers like DevOps and cloud engineers. Thanks to the AWS Lambda Runtime API and layers, operations teams can write their automation scripts in whichever programming language gets the job done. They can then deploy them to AWS Lambda in seconds with minimal effort and operational overhead, all while maintaining a secure and stable infrastructure.