A Different Lens to Monitor your Serverless Architecture: Operations Search!

Oct 21, 2019

 

Fewer than two months remain until re:Invent 2019, and there is no doubt that serverless will be the hottest topic this year, just as it was last year. Comparing where Thundra stood at last year’s re:Invent with where it is now, it’s clear that we have improved our product based on the feedback we gathered from our customers and the serverless community. Examples include distributed tracing, which lets people track their serverless transactions, and advanced alerting based on flexible queries, among many others.

After many updates that addressed issues from the function perspective, our customers asked us for the ability to look at the serverless stack from another angle. One piece of feedback sparked the idea for the feature we are announcing today: “There is no doubt that Thundra has the most flexible way of investigating issues from the function perspective, but we need to be able to get alerted on Thundra when our DynamoDB starts to throttle.” We’re glad to introduce this new angle on your serverless architecture. From now on, you don’t need to limit yourself to the lens of serverless functions. Instead, you can explore anywhere in your serverless stack, however you want, thanks to Thundra’s flexible queries for your operations.

So, what’s new?

As you may know, you can run flexible queries on your functions, invocations, or traces in Thundra. This lets you build different views of your system, share a view with your colleagues, or save it as an alert to stay informed. However, all of these capabilities were centered on functions. You could not filter or search from the perspective of, say, DynamoDB. For example, you couldn’t search for the operations on any resource that got an IllegalAccessException from any function. Until today!

You’re now able to filter your operations much more flexibly. For example, you can type a query like this:


ResourceType=AWS-SQS AND Duration > 300 AND Name IN (event-processing-start-lambda-node-lab,event-processing-start-lambda-node-layer) ORDER BY StartTime DESC

In this way, you can filter the considerably slow SQS operations for a specific set of queues. Navigating through AWS services is cool, but what about third-party APIs or non-AWS resources? The following query filters the Twilio operations in which you had a specific type of error:


Erroneous=true AND ResourceType=Twilio AND ErrorType=NOT_FOUND ORDER BY StartTime DESC
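
To make the meaning of these two queries concrete, here is a minimal Python sketch that applies the same conditions to a handful of in-memory records. It does not call Thundra and is not Thundra’s implementation. The `Operation` record shape, the millisecond units, the `send-sms` and `some-other-queue` names, and the sample data are all assumptions made for illustration; only the field names and filter values come from the queries above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Operation:
    # Hypothetical record shape; fields mirror the query fields shown above.
    resource_type: str
    name: str
    duration: float                 # assumed to be milliseconds
    start_time: int                 # assumed to be an epoch timestamp
    erroneous: bool = False
    error_type: Optional[str] = None


def slow_sqs_operations(ops, names, min_duration=300):
    """ResourceType=AWS-SQS AND Duration > 300 AND Name IN (...) ORDER BY StartTime DESC"""
    matches = [
        op for op in ops
        if op.resource_type == "AWS-SQS"
        and op.duration > min_duration
        and op.name in names
    ]
    # ORDER BY StartTime DESC: newest operations first.
    return sorted(matches, key=lambda op: op.start_time, reverse=True)


def twilio_not_found_errors(ops):
    """Erroneous=true AND ResourceType=Twilio AND ErrorType=NOT_FOUND ORDER BY StartTime DESC"""
    matches = [
        op for op in ops
        if op.erroneous
        and op.resource_type == "Twilio"
        and op.error_type == "NOT_FOUND"
    ]
    return sorted(matches, key=lambda op: op.start_time, reverse=True)


# Made-up sample data, only to exercise the filters.
ops = [
    Operation("AWS-SQS", "event-processing-start-lambda-node-lab", 450, 1000),
    Operation("AWS-SQS", "event-processing-start-lambda-node-lab", 120, 2000),
    Operation("AWS-SQS", "event-processing-start-lambda-node-layer", 310, 3000),
    Operation("AWS-SQS", "some-other-queue", 900, 4000),
    Operation("Twilio", "send-sms", 80, 5000, erroneous=True, error_type="NOT_FOUND"),
    Operation("Twilio", "send-sms", 75, 6000),
]

queue_names = {
    "event-processing-start-lambda-node-lab",
    "event-processing-start-lambda-node-layer",
}

slow = slow_sqs_operations(ops, queue_names)
errors = twilio_not_found_errors(ops)

print([op.start_time for op in slow])    # [3000, 1000]
print([op.start_time for op in errors])  # [5000]
```

Every condition joined by AND has to hold for an operation to match, and the ordering clause only changes the order in which the matches are listed.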

With these queries, you can dive as deep as you need into your system. Let’s say you have found what you are looking for. So, what’s next? We keep a door open to the rest of Thundra so you can continue exploring the operation. You can navigate to the invocation that the operation is part of, and then to the distributed trace in which it resides. If the operation is erroneous, you can also see its error stack.

Figure 1: Viewing an erroneous S3 operation with the error message and the error stack.

Figure 2: Jumping into the traces of the operations filtered.

Figure 3: Jumping to the distributed tracing view of the operations.

When you have created a view that you or your teammates want to check later, you can save it as a query. You can also set it as your default query so that it is the first thing you see when you log in to Thundra.

This is cool, what’s next?

We are very happy to be the only application that lets you dive deep into your resources along with your functions. As you can expect from our customer-obsessed approach to product development, we will keep making it better. In the next couple of days, you’ll be able to save these queries as alerts and get notified when anything unusual happens in your system, not only in your functions. As with all the other events we currently have, you’ll be able to forward these alerts to Opsgenie, PagerDuty, VictorOps, Slack, your inbox, or any webhook.

We also have other cool news about an upcoming update: we are now working on a feature that will let you see metrics for business transactions composed of several Lambda invocations and resource executions. For example, you’ll be able to monitor crucial metrics such as cost, health, and throughput for a user-signup-transaction that includes Auth0, Stripe, and DynamoDB operations and several Lambda functions. We believe this will help you improve your business transactions. This is going to be dope!

Summing up

re:Invent 2019 is approaching, and vendors and companies are pushing hard to show off their coolest product ideas and enhancements. Enabling our users to inspect other AWS and third-party resources was one more item checked off our list. We are looking forward to announcing more cool features before re:Invent. I’m excited about many things that are on the way, and I can only say “Stay tuned!” for now. If you want to chat with us about serverless at any stage, you can ping us on Twitter (@thundraio) or join our Slack channel. You can sign up for Thundra or try the live demo to see Thundra in action.