One of the greatest benefits of serverless is the ability to develop microservices, mobile applications, and APIs quickly and inexpensively without the complications involved with managing servers.
A serverless approach can also boost your team’s productivity, and AWS services like AWS Lambda can handle scaling and high availability while often costing less than running a server.
However, serverless does present some challenges. Monitoring, debugging, and security, for example, are all handled differently than they would be in a traditional server environment.
Thundra is an AWS Partner Network (APN) Advanced Technology Partner with the AWS DevOps Competency. Our cloud-native observability tool helps you test, debug, monitor, and troubleshoot AWS Lambda functions and their environment.
Thundra is available on AWS Marketplace as a software-as-a-service (SaaS) offering. It supports five of the most popular runtimes, with observability libraries written for Java, Node.js, Python, C#, and Go.
By letting you view your functions from different perspectives (duration, errors, cost, and resource consumption), Thundra helps users pinpoint the root cause of errors in AWS Lambda functions and other resources. This makes it easier to reduce errors, cold starts, and timeouts.
Thundra enables you to understand the issues behind the errors in stateless environments, and track the health of third-party APIs and resources effortlessly.
This post will demonstrate how Substantial, a software consultancy, used Thundra to quickly set up monitoring of AWS Lambda functions and other AWS services. I will walk you through how Substantial optimized their monitoring, debugging, and troubleshooting efforts with Thundra’s full serverless observability platform.
Serverless Observability is Crucial
Substantial builds on Amazon Web Services (AWS) using serverless architecture to keep operational costs low and maintain high availability.
Their product, a Trello Power-Up called Hello Epics, is fully serverless and supported by a team of approximately 40 developers, DevOps, and data and analytics team members who collaborate to design, build, and operate this product.
Substantial needed visibility into the AWS Lambda environment to understand issues users were experiencing and to monitor performance over time. They also wanted granular cost monitoring, which would put them completely in control of their bill.
Modern applications in today’s software systems have become highly distributed and more event-driven than ever. Writing tests to identify known unknowns using known metrics has become insufficient. Instead, Substantial required an approach that combined both monitoring and testing of system data.
Thundra offers a distinctive view of the serverless architecture, called the “architecture view,” which helps Substantial spot issues quickly. Once Thundra was plugged into Substantial’s AWS account, it automatically discovered and drew their serverless architecture diagram, making errors and slowdowns visible at a glance.
Thundra’s color-coded architecture diagram made it easy for Substantial to understand the severity of the issues, enabling problems to be addressed faster.
Figure 1 – Thundra can discover bottlenecks and detect errors in your serverless architecture.
Thundra gave Substantial the confidence to take on customer projects, like delivering a top-to-bottom software system in serverless using AWS Lambda functions.
The deep visibility of serverless architecture Thundra offers helped Substantial discover bottlenecks with respect to health, latency, and costs. It also let Substantial run design, development, and execution processes for serverless applications hassle-free.
To check the health of the serverless architecture, Substantial used the architecture view to discover exactly where an error occurred and what the blast radius of the problem was. Thundra also lets Substantial analyze an issue further by inspecting every invocation that contains the error and by tracing the values of local variables.
Discovering the Root Cause for Slowdowns
There are many reasons why an AWS Lambda function’s execution can take longer than expected. Thundra provides a bird’s eye view of your functions, allowing you to see the distribution of invocations over time. This is particularly useful for detecting outliers in the absence of system errors.
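As a rough illustration of what outlier detection over a duration distribution involves, the sketch below computes a nearest-rank percentile of invocation durations and flags the invocations above it. The function names and the sample data are hypothetical and are not part of Thundra's API; a real tool would work over far larger windows of data.

```javascript
// Hypothetical helper: nearest-rank percentile of a list of durations (ms).
// p is expected to be an integer between 1 and 100.
function percentile(values, p) {
  if (values.length === 0) throw new Error("no values to summarize");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Flag invocations whose duration is above the chosen percentile.
function findSlowInvocations(invocations, p = 95) {
  const threshold = percentile(invocations.map((i) => i.durationMs), p);
  return invocations.filter((i) => i.durationMs > threshold);
}

// Made-up sample: four typical invocations and one slow one, none with errors.
const sampleInvocations = [
  { id: "inv-1", durationMs: 100 },
  { id: "inv-2", durationMs: 110 },
  { id: "inv-3", durationMs: 105 },
  { id: "inv-4", durationMs: 120 },
  { id: "inv-5", durationMs: 900 },
];

console.log(findSlowInvocations(sampleInvocations, 80));
```

The slow invocation here raised no error, which is the case where a distribution view is most useful.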
Thundra helped Substantial identify that the company was dealing with timeouts due to a misconfiguration of Amazon DynamoDB for some transactions. By making a small configuration change in their connection to DynamoDB, these timeouts were completely eliminated.
Figure 2 – Thundra allows you to see the distribution of invocations over time.
Thundra also helps measure latencies added by APIs in transactions. This is important if your system interacts with third-party APIs like Stripe or Auth0.
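To make the idea of latency added by an external API concrete, here is a minimal sketch of timing a single asynchronous call by hand. The `timed` helper and the stand-in `fakeThirdPartyCall` are hypothetical and are not Thundra's instrumentation; an observability library would collect this kind of measurement for you.

```javascript
// Hypothetical helper: measures how long an async call takes and reports it,
// whether the call succeeds or throws.
async function timed(label, fn, report = console.log) {
  const start = process.hrtime.bigint();
  try {
    return await fn();
  } finally {
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    report({ label, elapsedMs });
  }
}

// Stand-in for a call to an external service; it only waits and resolves.
function fakeThirdPartyCall() {
  return new Promise((resolve) => setTimeout(() => resolve("charged"), 20));
}

timed("payment-api", fakeThirdPartyCall).then((result) => console.log(result));
```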
Controlling Costs at the Highest Level of Granularity
Serverless is highly distributed and designed to be cheaper because of its pay-per-use model. However, failing to fine-tune your system can lead to increased costs you may not even be aware of.
Thundra’s detailed tracing shows where functions spend their time, which gave Substantial a way to optimize the processing time of their AWS Lambda functions.
Substantial wanted an observability tool that doesn’t introduce indirect costs of its own. Thundra offers two features that address this: intelligent sampling and asynchronous monitoring.
The sampling feature allows Substantial to reduce the amount of data sent to Thundra. This let them use their data allotment more efficiently while keeping the data transferred out of the account to a minimum.
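Processing time matters for the bill because AWS Lambda's duration charge scales with configured memory multiplied by billed execution time (GB-seconds). The sketch below is a back-of-the-envelope estimate under that assumption; the function name is hypothetical, the per-GB-second rate is a placeholder to be checked against current AWS pricing, and request charges and free tier are ignored.

```javascript
// Assumption: duration charges scale with GB-seconds (memory x billed time).
// The rate is passed in rather than hard-coded; look up the current price.
function estimateDurationCost({ invocations, avgDurationMs, memoryMb, pricePerGbSecond }) {
  const gbSeconds = invocations * (avgDurationMs / 1000) * (memoryMb / 1024);
  return gbSeconds * pricePerGbSecond;
}

// Illustrative numbers only: one million invocations at 512 MB.
const PLACEHOLDER_RATE = 0.0000166667; // placeholder; verify before relying on it

const costBefore = estimateDurationCost({
  invocations: 1000000,
  avgDurationMs: 800,
  memoryMb: 512,
  pricePerGbSecond: PLACEHOLDER_RATE,
});

const costAfter = estimateDurationCost({
  invocations: 1000000,
  avgDurationMs: 400, // same workload after removing wasted time
  memoryMb: 512,
  pricePerGbSecond: PLACEHOLDER_RATE,
});

console.log({ costBefore, costAfter });
```

Under this model, halving the average duration halves the duration charge, which is why knowing where a function spends its time translates directly into savings.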
In addition to basic sampling methods (such as by count and time interval), data can be sampled more intelligently. For example, it’s possible to sample erroneous invocations or those that are performing poorly.
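The sampling strategies described above can be sketched as small predicate functions that decide whether an invocation's data should be kept. This is an illustration of the idea only: the function names, the invocation shape, and the thresholds are hypothetical and do not reflect Thundra's actual configuration or API.

```javascript
// Count-based: keep the first invocation, then every nth one after it.
function createCountSampler(n) {
  let seen = 0;
  return () => seen++ % n === 0;
}

// Time-based: keep at most one invocation per interval.
// The clock is injectable so the behavior can be checked deterministically.
function createTimeSampler(intervalMs, now = Date.now) {
  let lastKept = -Infinity;
  return () => {
    const t = now();
    if (t - lastKept >= intervalMs) {
      lastKept = t;
      return true;
    }
    return false;
  };
}

// Error- and latency-aware: always keep failed or slow invocations,
// and defer to a basic sampler for the healthy ones.
function createSmartSampler({ slowThresholdMs, fallback }) {
  return (invocation) =>
    invocation.error != null ||
    invocation.durationMs > slowThresholdMs ||
    fallback(invocation);
}

const shouldSend = createSmartSampler({
  slowThresholdMs: 1000,
  fallback: createCountSampler(10),
});

console.log(shouldSend({ durationMs: 2500, error: null })); // slow, so it is kept
```

Because the fallback is only consulted for healthy invocations, errors and slow calls never use up the count-based budget.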
To achieve zero added latency with Thundra, Substantial took advantage of Thundra’s asynchronous monitoring feature. With this method, monitoring data is sent asynchronously after the invocation finishes: the data flows through Amazon CloudWatch instead of being sent in a separate HTTP call.
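The general pattern behind this kind of asynchronous reporting can be sketched as a handler wrapper that writes one structured log line when the invocation finishes, rather than calling a remote endpoint from inside the function. On AWS Lambda, anything a Node.js function writes to standard output ends up in Amazon CloudWatch Logs. The wrapper name and the record fields below are hypothetical, this is not Thundra's implementation, and the step that forwards the log data onward is not shown.

```javascript
// Hypothetical wrapper: times the handler and, once it has finished,
// writes a single JSON line instead of making an HTTP call.
// The writer is injectable; it defaults to console.log (stdout).
function withAsyncMonitoring(handler, write = console.log) {
  return async (event, context) => {
    const start = Date.now();
    let error = null;
    try {
      return await handler(event, context);
    } catch (err) {
      error = err;
      throw err; // the caller still sees the original failure
    } finally {
      write(
        JSON.stringify({
          type: "monitoring", // made-up marker for downstream filtering
          durationMs: Date.now() - start,
          error: error ? String(error.message || error) : null,
        })
      );
    }
  };
}

// Example: wrapping an ordinary handler.
const handler = withAsyncMonitoring(async (event) => {
  return { statusCode: 200, body: "ok" };
});
```

The function's own work is unchanged; the only extra cost inside the invocation is one log write.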
Tracing Serverless Transactions End-to-End
In today’s world of microservices, more applications are becoming distributed. Applications are made up of separate services combined to work as a full system. Often, to scale these distributed microservice-based systems on AWS, you need to replicate your services across Amazon Elastic Compute Cloud (Amazon EC2) instances or AWS Lambda functions.
Since metrics, traces, and logs are also distributed, Thundra helps by surfacing observability data for your distributed application in one central, easy-to-understand location. This gave Substantial developers the power to search for specific transactions and to understand their performance.
Thundra automatically discovers the distributed data traces and chain of invocations inside your serverless architecture—including latencies and data flows between functions—and visualizes them in an architectural diagram, as seen in Figure 3 below.
This also gives you a view of what’s happening in the functions method by method and line by line. This combination of distributed and local tracing is called full tracing.
The full tracing capability enables Lambda users to monitor functions from above (indicating which services and resources they interact with) and in-depth (for example, the ability to see the value of a local variable at a specific line during execution).
Figure 3 – Discovering erroneous parts of your distributed business logic with full tracing.
Responding to Issues Faster
Application performance management tools help users stay on top of problems through troubleshooting, and reduce mean time to repair (MTTR). This is particularly hard in serverless because it’s not straightforward to understand what part of your system is impacted.
Thundra’s flexible querying capability allows you to view your system from different perspectives, including health, latency, resource usage, cost, and more. Narrowing that data down to a particular subset requires variables such as tags.
Thundra incorporates custom tags to make filtering even more flexible. Once Substantial had set up monitoring tailored to their serverless architecture, they were able to track many of their KPIs with Thundra’s actionable alerts.
When a real-time event meets the alert conditions in the query, Thundra creates an alert event. Whenever a critical condition requiring attention occurs, a notification is sent. Clicking on the notification reveals the event violating the alert condition in the Thundra console.
Figure 4 – Details page of an alert event in the Thundra console.
Thundra also displays violating invocations within the specific time range, and it’s possible to delve deeper into the invocation trace charts where the violation occurred.
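Conceptually, an alert of this kind combines a tag filter with a condition on a metric. The sketch below shows that evaluation in miniature; the rule format, field names, and events are all made up for illustration and do not represent Thundra's query language or alert API.

```javascript
// Hypothetical alert rule: a tag filter plus a threshold on one metric.
const rule = {
  name: "checkout-errors",
  tags: { service: "checkout" },
  metric: "errorCount",
  threshold: 5,
};

// True when the event carries every tag the rule asks for.
function matchesTags(event, tags) {
  return Object.entries(tags).every(([key, value]) => event.tags?.[key] === value);
}

// Returns one alert record per event that violates the rule.
function evaluate(alertRule, events) {
  return events
    .filter((e) => matchesTags(e, alertRule.tags) && e[alertRule.metric] > alertRule.threshold)
    .map((e) => ({ rule: alertRule.name, eventId: e.id, value: e[alertRule.metric] }));
}

// Made-up events.
const events = [
  { id: "e1", tags: { service: "checkout" }, errorCount: 7 },
  { id: "e2", tags: { service: "checkout" }, errorCount: 2 },
  { id: "e3", tags: { service: "search" }, errorCount: 9 },
];

console.log(evaluate(rule, events));
```

Keeping the tag filter separate from the metric condition is what lets the same threshold be reused across differently tagged parts of a system.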
Installing Thundra in Your AWS Account
The following sample project entails developing a backend API for a mobile application. The project can be downloaded from GitHub.
The first step is subscribing to Thundra on AWS Marketplace. Go to the Thundra app, sign in to the console, and get your API key, which will be used during the configuration settings.
The sample project uses Thundra’s Node.js Lambda Layer, which instruments your code without requiring any changes to your AWS Lambda function.
Here is the step-by-step guide:
Step 1: Installation
In the thundra-examples-lambda-nodejs/thundra-sam-mobilebackend directory:
Step 2: Configuration
Open thundra-examples-lambda-nodejs/thundra-sam-mobilebackend/deploy.yaml and set your API key:
Default: #TODO: enter your API key here
Step 3: Deploy
In the thundra-examples-lambda-nodejs/thundra-sam-mobilebackend/ directory:
sam package --template-file deploy.yaml --s3-bucket <YOUR-DEPLOYMENT-BUCKET-HERE> --output-template-file deploy.yaml
sam deploy --template-file ./deploy.yaml --stack-name <YOUR-STACK-NAME-HERE> --capabilities CAPABILITY_IAM
Step 4: Invoke
In the thundra-examples-lambda-nodejs/thundra-sam-mobilebackend directory:
curl -X GET
Step 5: Monitor your function on Thundra
From the Thundra console, your Lambda functions can be seen on the functions page.
Figure 5 – AWS Lambda function details in the Thundra console.
The invocation details are also visible on the functions page.
Figure 6 – Method trace chart of an invocation in the Thundra console.
The complexity of building microservices while benefiting from the latest cloud technologies often proves to be a time burden for developers. Both engineering and DevOps teams can face observability challenges during the development or production stages that can adversely impact cost and efficiency.
Thundra’s serverless observability platform for testing, debugging, monitoring, and troubleshooting your serverless applications offers rich visualizations of aggregated metrics, logs, and traces.
For the team at Substantial, Thundra gives full insight into their serverless stack with distributed and local tracing, and lets them stay on top of incidents with actionable alerts.
According to Aaron Jensen, principal developer at Substantial, "Thundra is now a must-have in our toolkit when building AWS Lambda Serverless applications. It gives us the insight we need to keep things running smoothly for our customers and clients."
Subscribe to Thundra to achieve full visibility into your serverless applications.