Posted Nov 2022 in Serverless

Tips and tricks on optimizing your API performance

Written by Serkan Özal

Founder and CTO of Thundra


Understanding API performance

APIs play a critical role in the overall performance of an application, because they are how customers reach its products and services. APIs should be correct, fast, secure, available, and reliable. Developers can assess API performance with performance tests (functional and load tests) and monitor it by tracking metrics such as the number of requests to each endpoint, the time an endpoint takes to respond, and the amount of data returned.
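As a rough illustration of the three metrics mentioned above (request count, response time, and response size), here is a minimal Python sketch. The `tracked` decorator, the in-memory `metrics` store, and the `/products` handler are hypothetical names made up for this example; a real service would normally export such numbers to a monitoring tool instead of keeping them in a dictionary.

```python
import time
from collections import defaultdict
from statistics import mean

# Hypothetical in-memory store: endpoint -> list of (seconds, response bytes).
metrics = defaultdict(list)

def tracked(endpoint):
    """Record latency and response size for every call to a handler."""
    def decorator(handler):
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            body = handler(*args, **kwargs)
            elapsed = time.perf_counter() - start
            metrics[endpoint].append((elapsed, len(body)))
            return body
        return wrapper
    return decorator

@tracked("/products")
def list_products():
    # Stand-in for a real handler; returns the response body as bytes.
    return b'[{"id": 1, "name": "widget"}]'

def summary(endpoint):
    """Summarize request count, average latency, and bytes returned."""
    samples = metrics[endpoint]
    return {
        "requests": len(samples),
        "avg_seconds": mean(seconds for seconds, _ in samples),
        "total_bytes": sum(size for _, size in samples),
    }

for _ in range(3):
    list_products()
print(summary("/products"))
```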

This post will explore optimizing API performance, especially for large production environments, and cover some best practices in cloud environments for DevOps engineers as well.

Importance of API Performance

Monitoring and performance testing up front are better than reworking your API after performance issues appear. If those issues reach customers, they can damage the reputation of your software and even your company. Troubleshooting and debugging performance issues in production is also time-consuming and tedious.

All of this means that optimizing the performance of your API is essential to staying competitive. It also delivers lower latency, better results, and higher customer confidence.

Factors Affecting API Performance

Some of the major factors that affect the performance of APIs are:

  • Network latency: No matter how good your API design is or how well your API code is written, network bandwidth and speed still limit how quickly responses reach clients.
  • Large payload: API designs and contracts often cause large amounts of data, some of it unneeded, to be sent back and forth in requests and responses.
  • Server-side load: Having too many clients sending requests to your API can spike your server load, and if your server is not able to handle this load, it can result in poor API performance.
  • Complex database queries: Most of the time, your APIs are fetching results from different databases; writing complex queries decreases the performance of database operations and in turn, affects API performance.

Other factors such as security, infrastructure design, third-party components, and application design can also impact the performance of your API.

Methods to Optimize API Performance


Caching

One of the best ways to improve API performance is to cache content so it can be served faster. If requests frequently produce the same response, that response can be cached to avoid repeated database queries and provide quick access.

There are two main caching mechanisms:

  • Client-side or browser caching makes saved responses available immediately on subsequent requests, instead of sending a new request to the server. This makes web pages load faster and improves the user experience.
  • Server-side caching uses a centralized cache to serve users the data they need without hitting web servers and database servers. This reduces load on the application server and improves page load times.


Compression

REST allows various data formats like JSON, XML, HTML, and more. With any of these formats, compressing the data can improve an API's performance, at the cost of additional CPU time to compress and uncompress it. The gzip and deflate techniques compress data transmitted between the server and the client.

Two HTTP headers are involved in a gzip-encoded response:

    • Accept-Encoding: A request header in which the client lists the compression formats it can accept.
Accept-Encoding: gzip, compress
    • Content-Encoding: A response header in which the server names the compression it applied to the body before returning it to the client.
200 OK
Content-Type: text/html
Content-Encoding: gzip
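
To make the header exchange concrete, the following Python sketch compresses a response body only when the client's Accept-Encoding value lists gzip. The `build_response` helper and the sample payload are hypothetical. The header parsing is deliberately simplified (it ignores quality values such as `q=0`), and most web frameworks and reverse proxies can handle this for you.

```python
import gzip
import json

# Sample JSON body with repetitive structure, which compresses well.
payload = json.dumps(
    [{"id": i, "name": f"product-{i}", "in_stock": True} for i in range(500)]
).encode("utf-8")

def build_response(body, accept_encoding):
    """Return (headers, body), gzip-compressing only if the client allows it."""
    headers = {"Content-Type": "application/json"}
    accepted = [
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    ]
    if "gzip" in accepted:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body

headers, body = build_response(payload, "gzip, deflate")
print(headers["Content-Encoding"], len(payload), "->", len(body))
```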

Asynchronous Operations

Building asynchronous APIs improves performance, as it allows time-consuming requests to keep executing in the background while other requests are being processed. When there are thousands of records to fetch from the server, response time becomes excessively high. In such cases, asynchronous execution is recommended in a multi-threaded environment to retrieve data from the server.
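
As a small sketch of the idea, the Python example below fetches several pages of records concurrently. Note that it uses `asyncio` coroutines on a single thread rather than a multi-threaded setup, and that `fetch_page` is a hypothetical stand-in that merely sleeps to imitate a slow database or HTTP call.

```python
import asyncio
import time

async def fetch_page(page):
    # Stand-in for a slow I/O call such as a database or HTTP request.
    await asyncio.sleep(0.1)
    return [f"record-{page}-{i}" for i in range(3)]

async def fetch_all(pages):
    # Start every fetch at once and wait for all of them together.
    results = await asyncio.gather(*(fetch_page(p) for p in pages))
    return [record for page in results for record in page]

start = time.perf_counter()
records = asyncio.run(fetch_all(range(5)))
elapsed = time.perf_counter() - start
print(len(records), f"{elapsed:.2f}s")
```

Because the five waits overlap, the total time should be close to that of one fetch rather than the sum of all five.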


Autoscaling

APIs can be deployed on autoscaling infrastructure so that server instances are added or removed as demand changes. Autoscaling ensures enough instances are running to handle the current number of API requests.

Cloud-service providers like AWS provide a service that allows you to configure autoscaling for selected AWS services of an application. This entails adding computing power to handle increased load on the application and removing the added capacity when it's no longer needed.
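
To give a feel for how a scaling decision can be made, here is a simplified proportional rule in Python. It is not the algorithm of AWS or any other provider; the function name, the metric (for example, average CPU percentage per instance), and the instance limits are all assumptions for illustration.

```python
import math

def desired_instances(current, metric_value, target_value, min_n=1, max_n=20):
    """Proportional scaling: keep the per-instance metric near target_value."""
    wanted = math.ceil(current * metric_value / target_value)
    # Clamp to the configured minimum and maximum fleet size.
    return max(min_n, min(max_n, wanted))

# 4 instances at 90% CPU with a 60% target: scale out.
print(desired_instances(4, 90, 60))
# 4 instances at 20% CPU with a 60% target: scale in.
print(desired_instances(4, 20, 60))
```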

Reducing the Number of Calls

To optimize APIs, design your API contracts so that clients don't need to make many API calls, either internally or externally. When an endpoint under-fetches, a single call doesn't return a complete result, so you need to make follow-up calls, each of which adds the overhead that comes with an API call.

The best way to prevent this is to plan in advance the kind of results you’ll need from your API endpoints and finalize the request/response contracts accordingly.
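
One common way to apply this is a batch endpoint that returns several items in a single call. The Python sketch below simulates it with in-memory functions; `get_user`, `get_users`, and the URL shapes in the docstrings are hypothetical, and the `calls` counter stands in for network round trips.

```python
calls = 0
USERS = {1: "Ada", 2: "Grace", 3: "Linus"}

def get_user(user_id):
    """One round trip per user (hypothetical GET /users/{id})."""
    global calls
    calls += 1
    return {"id": user_id, "name": USERS[user_id]}

def get_users(user_ids):
    """One round trip for many users (hypothetical GET /users?ids=1,2,3)."""
    global calls
    calls += 1
    return [{"id": uid, "name": USERS[uid]} for uid in user_ids]

one_by_one = [get_user(uid) for uid in (1, 2, 3)]
calls_one_by_one = calls

calls = 0
batched = get_users([1, 2, 3])
calls_batched = calls

print(calls_one_by_one, calls_batched)
```

Both approaches return the same data, but the batched contract pays the per-call overhead once instead of once per item.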

Optimizing Databases

Databases play a huge role in API performance. If your databases are not properly structured, then no matter what you do, your API performance will always be poor. There are a few steps you can follow to optimize a database:

  • Try to normalize the database tables when dealing with relational databases.
  • Make proper indexes based on your searches to make database search operations faster.
  • Plan in advance the kind of results you will need from the database, and structure the database accordingly.
  • Keep queries simple, and avoid unnecessary joins.
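
The effect of an index can be checked directly in the query plan. The following Python sketch uses the standard-library `sqlite3` module with an in-memory database; the `orders` table, its columns, and the index name are made up for the example, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

query = "SELECT id, total FROM orders WHERE customer_id = ?"

def plan(sql):
    """Return SQLite's query plan description for the given statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
    return " ".join(row[-1] for row in rows)

before = plan(query)  # without an index: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)   # with the index: a targeted search

print(before)
print(after)
```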

Using a High-Speed JSON Serializer

JSON is the primary format for exchanging data between service providers and service clients because it is lightweight and consumes less network bandwidth than formats such as XML.

JSON serialization can significantly impact the performance of your API, so it is important to select the right serializer. It should be fast and produce a small payload. Various serializers are available. If you can move away from JSON, Protobuf (a binary serialization format, not a JSON serializer) is one of the fastest options.
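
Before switching serializers, it helps to measure both criteria, speed and payload size, on your own data. The Python sketch below does this with the standard-library `json` module only; the `record` is a made-up sample, and the same measurement could be repeated with any third-party serializer you are evaluating.

```python
import json
import timeit

record = {"id": 1, "name": "widget", "tags": ["a", "b"], "price": 9.5}

# Default output puts a space after each separator.
default_json = json.dumps(record)
# Compact separators drop that whitespace and shrink the payload.
compact_json = json.dumps(record, separators=(",", ":"))

print(len(default_json), len(compact_json))

# Rough speed measurement; absolute numbers depend on the machine.
seconds = timeit.timeit(
    lambda: json.dumps(record, separators=(",", ":")), number=10_000
)
print(f"{seconds:.4f}s for 10,000 serializations")
```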

Moving to Serverless Architecture

Serverless architecture can solve your performance problems if they stem from hard-to-scale infrastructure, because you have less infrastructure to manage and it scales up and down easily. Moving to serverless can also improve latency and flexibility, although the move is harder late in the development process because it requires refactoring code. Make the move in the initial planning phase of your development process, or later only if there is no simpler option.

Monitoring API Performance with Thundra APM

Thundra APM is a monitoring and management tool for distributed services. It provides end-to-end visibility and management of an application to efficiently monitor, debug, and troubleshoot. Thundra APM is the first and only tool providing management and monitoring services for various distributed applications spread across containers, serverless architecture, and virtual machines.

Whether you're a developer or a DevOps engineer, Thundra can help you gain useful insights about your application and become better at what you do.

Some of the key features of Thundra APM are:

  • A lightweight agent providing automated instrumentation and distributed tracing of your applications on VMs, serverless, and containers without having to write any code or do any additional work
  • Assistance with end-to-end tracing of distributed applications and troubleshooting the root cause of your errors
  • A record and replay feature for offline debugging, enabling you to step over each line of code after the execution and track the value of local variables
  • Assistance with live serverless debugging via an online debugger that runs on your cloud environment


Conclusion

Optimizing API performance is an essential practice if you want to deliver a highly available and responsive application. Several factors can affect this performance, so proper measures need to be taken to reduce their impact and boost API performance.

Various optimization techniques can be implemented: data caching and compression, processing requests asynchronously, reducing the number of server or database calls, and moving to a serverless architecture.

A monitoring and troubleshooting tool like Thundra APM helps you monitor API performance and find and correct the factors causing degraded performance.