Reducing AWS Lambda Costs with Thundra
Serverless technology is praised for multiple reasons, one of which is cost efficiency. And it’s true: many cloud providers offer a generous free tier and as a result, numerous companies don’t pay a cent to run their infrastructure.
But sometimes, “function as a service” offerings, like AWS Lambda, can inflate the bill in unexpected ways. If everything goes smoothly, they’re as cheap as it gets, but unforeseen edge cases can cause costs to spin out of control. Flaky function code can lead to multiple invocation retries, each of which means additional charges, and a suboptimal choice of memory for a function can make it run much longer than needed.
Luckily, there is a range of methods we can use to avoid these problems, and Thundra provides the insights needed to implement them.
1. Optimize Memory and CPU Allocation
The first method is to optimize the memory and CPU allocation for a Lambda function. Lambda function settings only let you set the memory explicitly; the CPU is allocated “proportional to the memory configured.” This means that the more memory you allocate, the more CPU power the function gets.
It might seem most cost-efficient to turn the memory as low as possible, but this isn’t always the case. For CPU-intensive Lambda functions, it can be cheaper to dial up the memory, which can lead to shorter run times and in turn, cheaper invocations overall. Sometimes a high-memory allocation that completes quickly is cheaper than a low-memory allocation that runs for a long time.
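To make this trade-off concrete, here is a minimal sketch of the arithmetic. The prices are assumptions (commonly listed x86 rates; check the current AWS pricing page for your region), and the durations are invented for a hypothetical CPU-bound function.

```python
# Assumed prices in USD; verify them against the current AWS Lambda pricing page.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_REQUEST = 0.20 / 1_000_000


def invocation_cost(memory_mb: int, duration_ms: float) -> float:
    """Return the approximate cost in USD of a single invocation."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST


# Hypothetical measurements: four times the memory, but the run finishes
# more than four times faster.
low = invocation_cost(memory_mb=128, duration_ms=4000)
high = invocation_cost(memory_mb=512, duration_ms=900)

print(f"128 MB for 4000 ms: ${low:.10f}")
print(f"512 MB for  900 ms: ${high:.10f}")
print("Higher memory is cheaper" if high < low else "Lower memory is cheaper")
```

With these made-up numbers the 512 MB configuration is both cheaper and faster. With real measurements the result can easily go the other way, which is why measuring matters.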
A rule of thumb: I/O-bound functions are often better served with low memory allocation and CPU-bound functions with high memory allocation.
One tool that can help tremendously is AWS Lambda Power Tuning, which runs a Lambda function at each memory setting in a range you choose and suggests the most efficient one.
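Power Tuning is deployed as a Step Functions state machine, so a tuning run is started by passing it a JSON input. The sketch below builds such an input and, optionally, starts the execution with boto3. The ARNs are placeholders, and the field names follow the project's README as far as we know them, so double-check them against the version you deploy.

```python
import json


def build_power_tuning_input(function_arn: str, power_values: list[int], num: int = 50) -> dict:
    """Build the execution input for the Power Tuning state machine."""
    return {
        "lambdaARN": function_arn,
        "powerValues": power_values,   # memory settings (MB) to test
        "num": num,                    # invocations per memory setting
        "payload": {},                 # event passed to the function under test
        "parallelInvocation": True,
        "strategy": "cost",            # optimize for cost rather than speed
    }


def start_power_tuning(state_machine_arn: str, tuning_input: dict) -> str:
    """Start a tuning run; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without it

    client = boto3.client("stepfunctions")
    response = client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(tuning_input),
    )
    return response["executionArn"]


tuning_input = build_power_tuning_input(
    "arn:aws:lambda:us-east-1:123456789012:function:my-function",  # placeholder ARN
    [128, 256, 512, 1024, 2048],
)
print(json.dumps(tuning_input, indent=2))
```

Keep in mind that every test invocation is billed like a normal one, so a modest value for num is usually enough.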
Important note: Sometimes cost efficiency isn’t a good thing because you end up with a Lambda function that takes forever to complete. Sure, it might be cheap now, but maybe your users don’t want to wait.
The Thundra Console can help you find the Lambda functions that could benefit from optimization. A function that is called only once in a while and doesn’t require a user to wait for it to complete is probably not worth optimizing.
2. Remove Broken Invocations
The next problem is too many invocation retries of your Lambda functions. When you invoke a Lambda function asynchronously, the Lambda service retries the invocation if an error occurs (twice by default). Errors can have numerous causes, such as buggy code or invalid events. If the service keeps retrying with the same event, every attempt may fail, but you still pay for each one.
There are multiple ways to solve this. If there is buggy code, the obvious route is to boost code quality. Improving testing practices and code reviews are two ways to eliminate bugs in existing code and in newly written code as well. Thinking more about the architecture upfront, utilizing design patterns, and adhering to best practices can also improve code quality and minimize bugs.
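For invalid events in particular, one possible pattern is to validate the event first and return normally when it can never succeed, so the service has no error to retry. The sketch below assumes a hypothetical order event with two required fields; the field names and the process_order helper are made up for illustration.

```python
import json
import logging

logger = logging.getLogger(__name__)

REQUIRED_FIELDS = ("order_id", "amount")  # hypothetical event schema


def process_order(event: dict) -> dict:
    """Placeholder for the real business logic."""
    return {"status": "processed", "order_id": event["order_id"]}


def handler(event, context):
    missing = [
        field for field in REQUIRED_FIELDS
        if not isinstance(event, dict) or field not in event
    ]
    if missing:
        # Returning normally counts as a successful invocation, so an event
        # that can never succeed is not retried (and not billed again).
        logger.error("Dropping invalid event, missing %s: %s", missing, json.dumps(event))
        return {"status": "rejected", "missing": missing}

    # Exceptions raised here still trigger retries, which is what we want
    # for transient failures.
    return process_order(event)
```

Rejected events are only logged here. In a real system you would probably also forward them somewhere they can be inspected later.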
The “Functions” page of the Thundra Console can help you identify problematic Lambda functions that may need improvement. You can sort the functions by their average, median, or 99th percentile duration and start investigating broken functions, as shown below.
Figure 1: Filter queries for functions view
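Why look at the 99th percentile and not just the average? The following sketch uses invented durations to show how a handful of slow invocations barely moves the average or the median but stands out clearly at the 99th percentile.

```python
import statistics

# Hypothetical durations in milliseconds: mostly fast, with a few slow outliers.
durations_ms = [120] * 95 + [130, 140, 2900, 3100, 3300]

average = statistics.mean(durations_ms)
median = statistics.median(durations_ms)
# quantiles(n=100) returns 99 cut points; the last one is the 99th percentile.
p99 = statistics.quantiles(durations_ms, n=100)[98]

print(f"average={average:.0f} ms, median={median:.0f} ms, p99={p99:.0f} ms")
```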
Let’s dive into one of the functions whose 99th percentile duration is problematic. When we click on the function name, we see the invocation list for that function. In order to take the correct samples, let’s sort the invocations according to duration and exclude cold-started invocations.
Figure 2: Function invocations
Let’s try to understand the reason for the problem by checking the trace chart of that invocation. You can see at a glance that the first “WRITE” operation to DynamoDB is taking almost 3 seconds and slowing down the invocation.
Figure 3: Trace of function invocation
Let’s use offline debugging to debug what happened in the application. We can see if the values sent to DynamoDB are causing a slowdown by checking them in the Thundra Console.
Figure 4: Offline debugging view of function invocation
3. Minimize Waiting for Services
Lambda functions that idle while communicating with other services are another potential problem. They calculate something and send it somewhere—an internal service or an external API—and then wait for an answer. Depending on the status of the service, this can go quickly or take a long time, and if it does go slowly, you’re paying for the time but not getting anything in return.
The Thundra Console shows you which services your Lambda functions are talking to and how long these services take to complete the requests you’re sending them. Often you can’t do much on the Lambda side of things to optimize this behavior directly. We can look at the same function we investigated in the previous section to see what services are causing more bottlenecks (see below).
Figure 5: Function detail view
The heat map shows that DynamoDB is causing the function to slow down, since it's slower for longer invocations. You can also see that 50 percent of the selected invocations are cold-started.
Throwing more memory at the problem won’t solve anything. The right way is to look at the services you’re talking to or to reconsider the means of communication entirely.
If you have direct access to the service, you can try simple improvements there. For example, more resources (such as RAM, CPU, or instances) could be enough to produce the desired results.
If you don’t have access or the service is already at its limit, the only way to optimize is to rethink how you communicate with it. In situations like this, asynchronous communication is often a good idea.
Many AWS services are able to trigger events that can be handled by a Lambda function. This means that you can send a request to a service, then stop the function right away and wait for the service to start it again via an event trigger that indicates that the service has finished processing your request.
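A minimal sketch of this split might look as follows. It assumes a hypothetical service that accepts a job, returns immediately, and later emits a completion event with a "detail" object; submit_job and the event fields are stand-ins, not a real API.

```python
def submit_job(payload: dict) -> str:
    """Hypothetical stand-in for a call that only queues work and returns an ID."""
    return f"job-{payload['document_id']}"


def start_handler(event, context):
    """First function: hand off the work and exit instead of waiting."""
    job_id = submit_job(event)
    return {"job_id": job_id, "status": "submitted"}


def completion_handler(event, context):
    """Second function: triggered by the service's completion event."""
    detail = event["detail"]  # assumed event shape
    return {"job_id": detail["job_id"], "result": detail["result"], "status": "done"}
```

Neither function is running while the service does its work, so that waiting time is not billed.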
Also, many third-party services that our Lambda functions usually talk to via HTTP offer webhook functionality: You give them a URL, and they will call it when they’re done. This URL could be hosted by API Gateway or EventBridge, which route the call to one of your Lambda functions.
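On the receiving end, a webhook handler behind an API Gateway proxy integration could be sketched like this. The "job_id" and "result" fields are hypothetical; the real payload depends on the third-party service.

```python
import base64
import json


def webhook_handler(event, context):
    """Handle a webhook call routed through an API Gateway proxy integration."""
    body = event.get("body") or "{}"
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")

    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = None
    if not isinstance(payload, dict):
        return {"statusCode": 400, "body": json.dumps({"error": "expected a JSON object"})}

    # Hypothetical fields; use whatever the calling service actually sends.
    print(f"Job {payload.get('job_id')} finished with result {payload.get('result')}")
    return {"statusCode": 200, "body": json.dumps({"received": True})}
```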
Your last resort here is some kind of polling with the help of Step Functions. You stop your Lambda function, then restart it within a given interval to check if the service has new data for you.
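Such a polling loop can be described as a small state machine in Amazon States Language. The sketch below builds the definition as a Python dictionary; the function ARN is a placeholder, and it assumes a hypothetical status-check function that returns an object with a boolean "done" field.

```python
import json

# Placeholder ARN for a hypothetical function that checks the service once
# and returns {"done": true} or {"done": false}.
CHECK_FUNCTION_ARN = "arn:aws:lambda:us-east-1:123456789012:function:check-status"

polling_state_machine = {
    "Comment": "Poll a slow service without keeping a Lambda function running",
    "StartAt": "CheckStatus",
    "States": {
        "CheckStatus": {
            "Type": "Task",
            "Resource": CHECK_FUNCTION_ARN,
            "Next": "IsDone",
        },
        "IsDone": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.done", "BooleanEquals": True, "Next": "Finished"}
            ],
            "Default": "WaitBeforeRetry",
        },
        # The wait happens inside Step Functions, not inside a running function.
        "WaitBeforeRetry": {"Type": "Wait", "Seconds": 30, "Next": "CheckStatus"},
        "Finished": {"Type": "Succeed"},
    },
}

print(json.dumps(polling_state_machine, indent=2))
```

This sketch loops until the service reports completion. In practice you would likely add an attempt counter or a timeout so a stuck service cannot keep the loop alive forever.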
4. Avoid Lambda Functions
The last method is to avoid Lambda functions entirely. After all, a function that doesn’t exist can’t cost any money! The background here is that some AWS services, like API Gateway and AppSync, can communicate with other services via VTL templates.
VTL stands for Velocity Template Language, and in AWS it’s used to transform JSON data; that is, to make the event or request from one service look like something another service can understand.
API Gateway, for example, can take a JSON payload via HTTP and transform it into something DynamoDB can store. This eliminates the need for a Lambda function to serve as “glue” between these services, and in turn, eliminates the cost and latency associated with such a function.
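As a rough illustration, the sketch below holds a possible mapping template as a string and, next to it, the transformation a glue function would otherwise perform by hand. The table name "Notes" and the "text" field are assumptions, and the template is a sketch of the idea rather than a tested integration.

```python
# Sketch of an API Gateway mapping template (VTL) for a DynamoDB PutItem
# integration. Table and attribute names are hypothetical.
MAPPING_TEMPLATE = """
{
  "TableName": "Notes",
  "Item": {
    "id":   {"S": "$context.requestId"},
    "text": {"S": "$util.escapeJavaScript($input.path('$.text'))"}
  }
}
"""


def glue_lambda_equivalent(body: dict, request_id: str) -> dict:
    """The PutItem request a glue Lambda function would otherwise build by hand."""
    return {
        "TableName": "Notes",
        "Item": {
            "id": {"S": request_id},
            "text": {"S": body["text"]},
        },
    }


print(glue_lambda_equivalent({"text": "hello"}, "req-123"))
```

If the template does everything the function did, the function can go, and its invocation cost and latency go with it.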
If you know the services to which you can link via VTL, the “Architecture Overview” page in the Thundra Console (below) can help you identify Lambda functions that can be replaced.
Figure 6: Architecture overview
Richard Boyd, a developer advocate at AWS, has written extensively about API Gateway service integrations in his blog.
Thundra Helps You Avoid Unpleasant Billing Surprises
Many parts of a serverless system can be improved to prevent a surprise at the end of the month when the infrastructure bills come in.
Lambda functions may seem prohibitively expensive compared to the direct costs of an EC2 instance, but serverless architectures have emancipated themselves from the lift-and-shift paradigm of the past and therefore have their own design considerations.
Thundra’s architecture overview lets you locate Lambda functions and services that could benefit from cost reduction. Thundra also makes the optimization process more pleasant by providing the tools to get details on the Lambda functions and their invocations.
The methods discussed in this article can help you keep costs low and occasionally even improve performance, but you should keep in mind that it’s not always about squeezing out the last dime—especially when the user experience starts to suffer.
With Thundra’s generous free plan, you can detect the cost bottlenecks in your application, manage overall application behavior, and sustain application health moving forward. You can sign up directly from the Thundra Console or start your subscription through AWS Marketplace so that if you ever switch to a paid plan, the cost will be added to your AWS bill.