
POSTED Jul, 2020 · IN troubleshooting

Troubleshooting a problem with Thundra at 3 am

Written by Suna Tarıyan


Product Manager @Thundra


We have all been stuck on a serverless application issue with no solution in sight because the problem isn’t clearly presenting itself. The distributed and asynchronous nature of serverless systems presents a unique set of problems due to its inherent complexity, so serverless applications call for a different troubleshooting approach than traditional monolithic systems. Often you are left with a pile of logs and no meaningful insights, which leads to frustration and confusion about the problem at hand.

There are a variety of tools and services that allow you to debug serverless functions from your local machine. Most of these solutions and other third-party libraries do a good job of creating a replica of your serverless architecture. However, there are times when this doesn’t meet your needs because you need to see events happening in real time within the actual system. No matter how sophisticated these libraries get, a mock system cannot truly reproduce the performance and behaviour of a third-party service or of the specific resources that are essential parts of your live application.

In these situations, Thundra can step in and give you the lens you need to view the problem closely and test your application for bottlenecks. Let’s explore how Thundra can provide the observability required to troubleshoot and debug serverless functions.

A quick look at online debugging

Thundra’s online debugging allows you to step through your serverless application code in your local IDE while your function executes in real time. You can add breakpoints and inspect the state of your function while execution pauses at those breakpoints. This lets you look inside the black box (Lambda) in real time with minimal effort.

For demonstration purposes, we’ll look at how online debugging works with a simple scenario: submitting a contact form triggers a Lambda function that sends an email via AWS SES and stores the form data in a DynamoDB table. API Gateway is the trigger for our function here.

To set up a Lambda function for debugging, with your application already instrumented with Thundra, add the following environment variables.

  • thundra_agent_lambda_debugger_auth_token: <YOUR_TOKEN> - This is obtained from the Thundra console.
  • thundra_api: <YOUR_KEY> - This can also be obtained from the console.
  • thundra_agent_lambda_debugger_broker_host: debug.thundra.io - The broker establishes communication between your Lambda function and your IDE debugger. There are 4 regions where the host is available. Choose a value closest to the region you are working with.
  • thundra_agent_lambda_debugger_enable: true/false (optional) - Set this to enable or disable the debugger manually. It is useful when the auth_token environment variable is present and you want to turn debugging off.
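As a rough illustration, the variables listed above can be collected into the string-to-string mapping that a Lambda function's environment configuration expects. This is only a sketch: the helper name `build_debugger_env` and its parameters are hypothetical, and the variable names are copied as they appear in this post.

```python
# Hypothetical helper (not part of Thundra): gathers the debugger settings
# described in this post into a dict of environment variables.
def build_debugger_env(auth_token, api_key, broker_host="debug.thundra.io", enable=None):
    if not auth_token or not api_key:
        raise ValueError("auth token and API key are both required")
    env = {
        "thundra_agent_lambda_debugger_auth_token": auth_token,
        "thundra_api": api_key,
        "thundra_agent_lambda_debugger_broker_host": broker_host,
    }
    # The enable flag is optional; only emit it when explicitly requested.
    if enable is not None:
        env["thundra_agent_lambda_debugger_enable"] = "true" if enable else "false"
    return env


if __name__ == "__main__":
    # Placeholder values; real ones come from the Thundra console.
    for name, value in build_debugger_env("<YOUR_TOKEN>", "<YOUR_KEY>", enable=True).items():
        print(f"{name}={value}")
```

The resulting mapping could then be copied into whatever deployment configuration you use for the function.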

Before we begin debugging, we’ll also need to install Thundra’s plugin in our local IDE. Thundra provides native plugins for VSCode and IntelliJ IDEA. For other IDEs, there is a portable client that can be set up.

Our function is deployed via AWS SAM, and VSCode is the IDE we are working with. Ensure that the function’s timeout value is high enough (around 300 seconds or more) for online debugging, since execution pauses at each breakpoint.

Start the debugger from the command palette. We invoke the function by sending data via API Gateway from the console.

[Image: command palette]

The problem at hand is that the function sends out the email via SES, but the form data doesn’t show up in the DynamoDB table. The function pauses its execution at the breakpoints you’ve previously placed.

[Image: DynamoDB table]

By observing the variable values, we can see that the email validation function fails because the email address received was itself invalid. This is why the if condition guarding the insertion into the DynamoDB table is never satisfied.
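The post does not show the function's source, so the following is only a hypothetical reconstruction of the kind of handler that would behave this way. The names `handle_contact_form`, `is_valid_email`, `send_email` and `put_item` are assumptions; in a real function the two callables would wrap the SES and DynamoDB calls, and here they are injected so the sketch runs without AWS access.

```python
import json
import re

# Deliberately simple check: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(address):
    return bool(EMAIL_PATTERN.match(address or ""))


def handle_contact_form(event, send_email, put_item):
    """Hypothetical handler: send_email and put_item stand in for SES and DynamoDB."""
    form = json.loads(event["body"])

    # The email is sent regardless of whether the address is valid...
    send_email(form)

    # ...but the record is stored only when validation passes. With an
    # invalid address this branch is skipped silently, with no log line.
    stored = False
    if is_valid_email(form.get("email")):
        put_item(form)
        stored = True

    return {"statusCode": 200, "body": json.dumps({"stored": stored})}


if __name__ == "__main__":
    sent, saved = [], []
    event = {"body": json.dumps({"email": "jane@example", "message": "hi"})}
    print(handle_contact_form(event, sent.append, saved.append))
    print(len(sent), "email(s) sent,", len(saved), "record(s) stored")
```

A breakpoint on the `if` line would show the malformed address in `form` and make the skipped insert obvious, which is the situation described above.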

This process is similar to debugging the code locally, but it happens in real time. When you are dealing with a complex function that makes multiple service calls, finding bugs like this would otherwise be difficult and time-consuming, as there would be no log trace to indicate what was going wrong inside the function.

Other ways to troubleshoot

Thundra scans your application to present different perspectives and information that can help you resolve various issues within your code. The following are some of the ways you can troubleshoot serverless functions.

  • Offline debugging
    • This primarily involves viewing code snippets that were recorded when the function was invoked. The idea is to step through the code and inspect the variable values and any API calls that Lambda may have made during its execution.
    • You’re essentially troubleshooting your application from the inside after the invocation has finished.
    • In the left panel, navigate to Functions -> choose the function you want to inspect -> select an invocation from the list. You should see a bug icon next to the Method; click it to view the code.
    • Offline debugging is available for Java, Python and Node.js runtimes. For more details on supported library versions and on enabling this feature, refer to the documentation.

  • Unique traces
    • Thundra records the interaction between the services involved when a specific transaction occurs in your application.
    • For example, consider the refund process within an eCommerce application. The refund service flow remains the same for every user and every product, so it is considered a unique trace that Thundra will show you when you wish to investigate issues specifically with this flow.
    • With unique traces, you can narrow your troubleshooting down to the specific area of the application that needs attention.
    • The unique traces view can be reached via the left panel of the console.
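To make the idea of a unique trace more concrete, one way to picture it is as grouping invocations by the ordered sequence of services they touch. The sketch below is a conceptual illustration only: it does not describe how Thundra computes unique traces, and the record shape (`id`, `services`) and the function name `group_by_flow` are assumptions.

```python
from collections import defaultdict


def group_by_flow(invocations):
    """Group invocation ids by the ordered sequence of services they passed through."""
    flows = defaultdict(list)
    for invocation in invocations:
        # A tuple of service names acts as the signature of the flow.
        flows[tuple(invocation["services"])].append(invocation["id"])
    return dict(flows)


if __name__ == "__main__":
    # Made-up sample data for illustration.
    sample = [
        {"id": "a1", "services": ["API Gateway", "refund-fn", "DynamoDB"]},
        {"id": "b2", "services": ["API Gateway", "contact-fn", "SES"]},
        {"id": "c3", "services": ["API Gateway", "refund-fn", "DynamoDB"]},
    ]
    for flow, ids in group_by_flow(sample).items():
        print(" -> ".join(flow), ids)
```

Here both refund invocations share one signature, so they would be examined together as a single flow.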

Conclusion

We have now seen some of the essential ways Thundra can help you troubleshoot and debug serverless applications. As you explore the platform further, you’ll see that the information Thundra captures is presented in different views that give meaningful insights into how your distributed system performs in real time.

This matters beyond troubleshooting. Even when you’re performing integration tests on a new serverless product in development, it’s imperative that the tests run in an environment similar to production so that issues are caught as early as possible. A fail-fast approach pays off in product quality over the long run. In such situations, Thundra’s observability and troubleshooting tools can have a positive impact on the performance of your system in the short and long term.