Identifying and debugging the parts of your code that cause bottlenecks in a serverless application can be difficult. For a regular application, you have many debugging and monitoring tools that let you observe the application while it is running, which makes it easier to find faulty and inefficient code. Serverless, however, is still a young technology and lacks comparable debugging and tracing tools. As first-hand users of AWS Lambda functions, we understand the agony of not being able to trace and debug your functions, especially when there are errors or unexpected behavior. Considering how rapidly the popularity of serverless is growing, it is imperative that programmers have the means for smooth, unimpeded development. This is where Thundra comes in. As its slogan, “Full Observability for AWS Lambda”, suggests, Thundra aims to give you full visibility into what is going on in your application and make the programmer’s life easier.
Currently, the most widely used programming languages for serverless development in the community are Node.js and Python. In response to the community’s preference, Thundra, which started off with Java, has expanded its list of supported languages by adding Node.js, Python, and Go.
In this blog post, I am going to talk about how you can instrument your Python Lambda functions with Thundra’s Python agent (version 1.3.0 and onwards). Using Thundra’s tracing capabilities, you can see trace charts, metrics, and logs in a more explanatory and user-friendly manner, including the execution time of your functions and any errors that occurred during their invocation. Before instrumenting your function, make sure you have followed the instructions under the Python section of the Thundra docs to wrap it with the Python agent.
Thundra uses the OpenTracing API to implement instrumentation, so you can manually instrument your code by following the OpenTracing API instructions. If you are already familiar with OpenTracing, you can easily adopt Thundra’s instrumentation. You can learn more about OpenTracing from this blog.
How do you instrument your function manually?
To perform tracing, you need a tracer object and span objects. The tracer starts the spans.
When a function is called, the tracer starts an active span. If another function is called inside this function, another span is started for the callee and added as a child of the caller’s existing span. This generates a span tree. To trace your functions with Thundra, first create a Thundra tracer and start the active span as follows:
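    # Import path assumed; check the Thundra docs for your agent version
    from thundra.opentracing.tracer import ThundraTracer

    def function():
        tracer = ThundraTracer.getInstance()
        with tracer.start_active_span(operation_name='function', finish_on_close=True) as scope:
            # do things
            pass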
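To make the span tree idea concrete, here is a minimal sketch that uses only the standard library. ToyTracer, ToySpan, and render are hypothetical stand-ins written for this illustration; they are not part of Thundra or OpenTracing, and for brevity the toy start_active_span yields the span directly instead of a scope. It shows how keeping track of the active span turns nested calls into parent and child spans.

```python
import time
from contextlib import contextmanager


class ToySpan:
    """Hypothetical minimal span: a name, tags, timing, and child spans."""

    def __init__(self, operation_name, parent=None):
        self.operation_name = operation_name
        self.parent = parent
        self.children = []
        self.tags = {}
        self.start_time = time.time()
        self.finish_time = None
        if parent is not None:
            parent.children.append(self)

    def set_tag(self, key, value):
        self.tags[key] = value

    def finish(self):
        self.finish_time = time.time()


class ToyTracer:
    """Hypothetical tracer: remembers the active span so nested spans become children."""

    def __init__(self):
        self.active_span = None
        self.root_spans = []

    @contextmanager
    def start_active_span(self, operation_name):
        span = ToySpan(operation_name, parent=self.active_span)
        if span.parent is None:
            self.root_spans.append(span)
        self.active_span = span
        try:
            yield span
        finally:
            # Finish the span and make its parent active again
            span.finish()
            self.active_span = span.parent


def render(span, depth=0):
    """Return the span tree as a list of indented lines."""
    lines = ["  " * depth + span.operation_name]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


tracer = ToyTracer()


def callee():
    with tracer.start_active_span("callee"):
        pass


def caller():
    with tracer.start_active_span("caller"):
        callee()
        callee()


caller()
print("\n".join(render(tracer.root_spans[0])))
```

Running this prints "caller" followed by two indented "callee" lines, one per nested call. A real tracer does the same bookkeeping for you and additionally reports the finished spans.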
You can also add tags to active spans manually. For instance, you may use tags to report errors, as in the following example:
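    def function_with_error():
        tracer = ThundraTracer.getInstance()
        with tracer.start_active_span(operation_name='function_with_error', finish_on_close=True) as scope:
            try:
                # do things
                pass
            except Exception as e:
                scope.span.set_tag('error', True)
                scope.span.set_tag('error.kind', type(e).__name__)
                scope.span.set_tag('error.message', str(e))
            # No explicit finish is needed: leaving the with block closes the
            # scope, and finish_on_close=True finishes the span.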
The fields for errors are also specified by OpenTracing.
By adding errors to span tags, you can see them in the Thundra Web Console.
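If you tag errors in many places, it may help to collect the tag names in one helper. The following is a small sketch under assumptions: error_tags, tag_error, and RecordingSpan are hypothetical names introduced here, and RecordingSpan only stands in for a real span so the example runs without the agent. The only span method it relies on is set_tag(key, value).

```python
def error_tags(exc):
    """Build the error tags for an exception (tag names as used in the example above)."""
    return {
        "error": True,
        "error.kind": type(exc).__name__,  # exception class name, e.g. "ValueError"
        "error.message": str(exc),         # human-readable message
    }


def tag_error(span, exc):
    """Copy the error tags onto any object that offers set_tag(key, value)."""
    for key, value in error_tags(exc).items():
        span.set_tag(key, value)


class RecordingSpan:
    """Hypothetical stand-in for a span; it only stores the tags it receives."""

    def __init__(self):
        self.tags = {}

    def set_tag(self, key, value):
        self.tags[key] = value


span = RecordingSpan()
try:
    int("not a number")
except ValueError as e:
    tag_error(span, e)

print(span.tags["error"], span.tags["error.kind"])
```

Inside a real traced function you would pass scope.span to tag_error in the except block instead of the stand-in.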
Example Use Case
Let’s see Thundra in action now! Our example application consists of a user service layer and a user repository layer, which provide “get_user” and “save_user” operations. Let’s examine the save_user operation.
Above is an example UserService class. We are going to examine the “save_user” method, which first checks whether a user with the given id already exists. Then, after validating that the id does not start with a number, it saves the new user.
Above is the UserRepository class, which has “save_user_to_repo” and “find_user” methods. We will trace the “save_user” operation of “UserService”. The flow starts with the “save_user” method of the UserService class, which saves the new user after some checks (UserService.get_user_with_id -> UserRepository.find_user_with_id -> UserService.validate_id -> UserRepository.save_user_to_repo). By instrumenting the above code, we get trace charts that help us analyze our functions in depth. You can observe which functions take longer than others and detect anomalies. For instance, in the trace chart below, validate_id() takes almost as long as find_user_with_id(), even though find_user_with_id() performs an I/O operation to fetch the user and validate_id() does not. validate_id() should be faster, since it only checks whether a regular expression matches, so it might be worth checking your regex.
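As a rough illustration of the flow just described, here is a minimal sketch of how the two classes might look once instrumented. Only the method names come from the flow above; the method bodies, the in-memory dictionary, the digit regex, and the RecordingTracer stand-in are assumptions made so the example runs on its own. With Thundra you would obtain the tracer from ThundraTracer instead.

```python
import re
from contextlib import contextmanager


class RecordingTracer:
    """Hypothetical stand-in tracer: records the order in which spans are started."""

    def __init__(self):
        self.started = []

    @contextmanager
    def start_active_span(self, operation_name, finish_on_close=True):
        self.started.append(operation_name)
        yield


tracer = RecordingTracer()


class UserRepository:
    def __init__(self):
        self._users = {}  # assumption: an in-memory dict replaces the real data store

    def find_user_with_id(self, user_id):
        with tracer.start_active_span(operation_name="find_user_with_id", finish_on_close=True):
            return self._users.get(user_id)

    def save_user_to_repo(self, user_id, user):
        with tracer.start_active_span(operation_name="save_user_to_repo", finish_on_close=True):
            self._users[user_id] = user


class UserService:
    def __init__(self, repository):
        self.repository = repository

    def get_user_with_id(self, user_id):
        with tracer.start_active_span(operation_name="get_user_with_id", finish_on_close=True):
            return self.repository.find_user_with_id(user_id)

    def validate_id(self, user_id):
        with tracer.start_active_span(operation_name="validate_id", finish_on_close=True):
            # Valid ids must not start with a digit
            return re.match(r"\d", user_id) is None

    def save_user(self, user_id, user):
        with tracer.start_active_span(operation_name="save_user", finish_on_close=True):
            if self.get_user_with_id(user_id) is not None:
                raise ValueError("user already exists")
            if not self.validate_id(user_id):
                raise ValueError("id must not start with a number")
            self.repository.save_user_to_repo(user_id, user)


service = UserService(UserRepository())
service.save_user("alice1", {"name": "Alice"})
print(" -> ".join(tracer.started))
```

The printed order is save_user -> get_user_with_id -> find_user_with_id -> validate_id -> save_user_to_repo, which is the sequence of spans you would expect to find in the trace chart for a successful save.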
With the trace chart, you can see which functions take longer than expected and whether an error occurred during the invocations. You can add any data you want to span tags so that you can see it as raw data later in the trace chart. Example usage is shown below:
    def function():
        tracer = ThundraTracer.getInstance()
        with tracer.start_active_span(operation_name='function', finish_on_close=True) as scope:
            scope.span.set_tag(key, value)  # key and value stand for any data you want to attach
            # do things
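Since manual instrumentation repeats the same few lines in every function, one option is to wrap them in a decorator. This is only a sketch, not part of the Thundra agent: traced is a hypothetical helper, and StubTracer, StubScope, and StubSpan are stand-ins that let the example run without the agent. The decorator relies only on the two calls used throughout this post, start_active_span(operation_name=..., finish_on_close=True) and scope.span.set_tag(key, value).

```python
import functools
from contextlib import contextmanager


def traced(tracer, operation_name=None, tags=None):
    """Hypothetical helper: run the wrapped function inside an active span with the given tags."""
    def decorator(func):
        name = operation_name or func.__name__

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with tracer.start_active_span(operation_name=name, finish_on_close=True) as scope:
                for key, value in (tags or {}).items():
                    scope.span.set_tag(key, value)
                return func(*args, **kwargs)
        return wrapper
    return decorator


# Stand-ins so the sketch runs on its own; a real tracer would replace StubTracer.
class StubSpan:
    def __init__(self, operation_name):
        self.operation_name = operation_name
        self.tags = {}

    def set_tag(self, key, value):
        self.tags[key] = value


class StubScope:
    def __init__(self, span):
        self.span = span


class StubTracer:
    def __init__(self):
        self.finished_spans = []

    @contextmanager
    def start_active_span(self, operation_name, finish_on_close=True):
        span = StubSpan(operation_name)
        try:
            yield StubScope(span)
        finally:
            self.finished_spans.append(span)


stub = StubTracer()


@traced(stub, tags={"layer": "service"})
def greet(name):
    return "hello " + name


print(greet("world"))
print(stub.finished_spans[0].operation_name, stub.finished_spans[0].tags)
```

The design choice here is to pass tags as a dictionary rather than keyword arguments, so that dotted tag names such as 'error.kind' remain usable.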
Now that you have seen a demo of Python instrumentation, it is time to try it out yourself! You can play with it in Thundra’s free public beta and explore other functionalities as well. If you have any questions about Thundra and serverless observability, join our Slack channel. For now, we only offer manual instrumentation, but stay tuned: we’ll be coming back with new additions to our Python agent very soon!