Serverless adoption rates have been climbing ever since the technology was brought into the spotlight with the release of AWS Lambda in 2014. That is because serverless makes an offer that cloud developers simply cannot resist, providing the following benefits:
- Server management is abstracted away to the vendor
- Pay-as-you-go model where you only pay for what you use
- Automatically scalable and highly available
These benefits are achieved by the characteristics that define the technology. Serverless applications are stateless distributed systems that scale to the needs of the system, providing event-based and async models of development. This has worked in favor of the technology, resulting in a desirable solution for the cloud.
However, does this offer always live up to what it is perceived as?
On closer inspection, there is no doubt that serverless adoption also exposes developers to anti-patterns specific to the model. This is especially concerning given the high adoption rates of serverless. As more of the industry moves to reap the benefits, we must be wary of what works and what does not. Serverless is definitely beneficial; however, misusing it could leave a sour taste, pushing the industry away from the technology.
The purpose of this piece is therefore to highlight the anti-patterns that plague serverless architectures and how they may be avoided, enabling the success of serverless applications and promoting their adoption.
The Blemish in the Shine of Async
Serverless applications tend to work best when asynchronous, a concept that was preached by Eric Johnson in his talk at ServerlessDays Istanbul titled “Thinking Async with Serverless”. He later went on to present a longer version of the talk at ServerlessDays Nashville.
Nevertheless, the same asynchronous characteristic that is revered is also the source of one of the greatest anti-patterns. To understand why, remember that one of the benefits of serverless is the pay-as-you-go model. When a function or service is waiting for a response from another function or service that it has called asynchronously, the first function sits idle, simply waiting for the second function’s response.
This is the result of converting from monolith to serverless architectures without paying attention to detail. For example, in a monolithic system, a method may want to perform a read/write operation on DynamoDB. To avoid blocking the control flow while waiting for the operation, the call may be made asynchronously, allowing the method to call another method to perform some other task while still awaiting the DynamoDB response at the end of the method. The second method may, in turn, begin S3 operations.
This logic cannot be carried over to serverless in the same manner. Intuitively, each method can be mapped to its own serverless function, but it must be remembered that these functions can time out, or simply finish their remaining tasks and sit idle waiting for callbacks.
As a result, the function that is in the idle state will also be charged since it is still technically active. There is still a worker node servicing the function with all the needed underlying architecture as the function simply waits.
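To make the cost of idle waiting concrete, the following is a minimal sketch of duration-based billing. The per-GB-second rate, the memory size, and the timings are all illustrative assumptions rather than quoted vendor prices; only the ratio between the two results matters.

```python
# Illustrative only: this rate is an assumed placeholder, not a quoted vendor price.
ASSUMED_PRICE_PER_GB_SECOND = 0.0000167


def invocation_cost(duration_ms: float, memory_mb: int,
                    price_per_gb_second: float = ASSUMED_PRICE_PER_GB_SECOND) -> float:
    """Cost of one invocation when billing is duration x allocated memory."""
    gb_seconds = (duration_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * price_per_gb_second


def idle_share(work_ms: float, wait_ms: float) -> float:
    """Fraction of the billed duration spent waiting rather than working."""
    return wait_ms / (work_ms + wait_ms)


if __name__ == "__main__":
    # Hypothetical function: 50 ms of real work, 450 ms waiting on a downstream call.
    busy = invocation_cost(50, 512)
    total = invocation_cost(50 + 450, 512)
    print(f"cost if it only did its own work: {busy:.10f}")
    print(f"cost including the idle wait:     {total:.10f}")
    print(f"share of the bill that is idle:   {idle_share(50, 450):.0%}")
```

Under these assumed numbers, the waiting function is billed for ten times the duration of its actual work.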
This problem is further exacerbated when chaining functions together. This is the process whereby one function makes an async call to another function and waits for a response, while the second function calls yet another function or makes a read/write operation to a storage service. This increases the risk of unreliability, as the first function might time out. It is even worse when functions make calls to storage services outside the vendor’s ecosystem, or to on-prem storage services.
The solution is not to abandon asynchronous patterns, because the issue lies not in async calls but in the way such calls are incorporated. For example, when decomposing the monolith, it is often the case that controller functions end up managing the transfer of data. This leads to unnecessary costs and also makes functions less reliable because of possible timeouts.
The solution, in this case, is simple and involves rethinking the control flow. The function structure above could therefore be transformed into the structure seen below:
As can be seen from the image above, there now exist three functions performing a trivial task, where each function is triggered by an event from the prior function in the flow. Having three separate functions would be considered inefficient on any platform apart from serverless. However, it must be remembered that with serverless, costs depend on the time of execution and not CPU resources. Hence, if an EC2 instance had been provisioned for the purpose, an alternative to the structure in the image above might have been preferred.
It is thus seen that asynchronous design, when done the right way, can be immensely beneficial, reducing execution times while supporting parallelization where needed. However, when not given much thought, going async can be detrimental not only to the needs of the system but to the whole benefit model of serverless.
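The event-triggered flow described here can be sketched in a few lines. This is a simplified local simulation under stated assumptions: the in-memory queue stands in for a managed queue or event bus, and the handler names and event types (`validate_order`, `order.received`, and so on) are hypothetical. The point it illustrates is that each function does its own work, emits an event, and returns, so no function is ever billed while waiting on another.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Event = Tuple[str, dict]  # (event type, payload)

# In-memory stand-in for a managed queue or event bus (an assumption for this sketch).
event_queue: Deque[Event] = deque()
results: List[dict] = []


def emit(event_type: str, payload: dict) -> None:
    """Publish an event and return immediately, without waiting for a consumer."""
    event_queue.append((event_type, payload))


def validate_order(payload: dict) -> None:
    # Does its own work, emits an event, and exits.
    emit("order.validated", {**payload, "valid": True})


def store_order(payload: dict) -> None:
    emit("order.stored", {**payload, "stored": True})


def notify_customer(payload: dict) -> None:
    # Last step in the flow: records the final payload instead of emitting.
    results.append({**payload, "notified": True})


# Each handler is triggered by the event that the previous one emitted.
ROUTES: Dict[str, Callable[[dict], None]] = {
    "order.received": validate_order,
    "order.validated": store_order,
    "order.stored": notify_customer,
}


def run(first_event: Event) -> List[dict]:
    """Drain the queue, invoking one short-lived handler per event."""
    results.clear()
    event_queue.append(first_event)
    while event_queue:
        event_type, payload = event_queue.popleft()
        ROUTES[event_type](payload)
    return list(results)


if __name__ == "__main__":
    print(run(("order.received", {"id": 1})))
```

In a real deployment the routing table would be replaced by the platform's own triggers; the handlers themselves would stay just as short.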
Sharing Is NOT Caring
The goal of building with serverless is to dissect the business logic in a manner that results in independent and highly decoupled functions. This, however, is easier said than done, and developers often run into scenarios where libraries, business logic, or even just basic code has to be shared between functions, leading to a form of dependency and coupling that works against the serverless architecture.
Functions depending on one another through a shared code base and logic lead to an array of problems. The most prominent is that it hampers scalability. As your systems scale and functions constantly rely on one another, there is an increased risk of errors, downtime, and latency. The entire premise of microservices was to avoid these issues. Additionally, one of the selling points of serverless is its scalability. Coupling functions together via shared logic and a shared codebase therefore undermines the system, not only in terms of microservices but also in terms of the core serverless value of scalability.
This can be visualized in the image below, as a change in the data logic of function A will lead to necessary changes in how data is communicated and processed in function B. Even function C may be affected depending on the exact use case.
Before diving into the solutions, it must be conceded that some use cases may have no recourse but to share logic and code bases. Such issues spring up in applications of machine learning, where large libraries have to be shared across the various functions used to process test, validation, and training datasets. The process calls for the same model obtained from the training data set to be validated and reinforced using the validation and test data sets.
In most cases, the need to share code libraries and logic not only was an antipattern but also ran into the technical limits of serverless functions. For example, AWS Lambda functions have a hard limit of 512MB on /tmp storage. That means when developers are building their AWS Lambda function code, they must always be aware of this limit and how they are using it. After all, the /tmp directory is meant for temporary storage; once the serverless worker node is torn down, the data within /tmp is no longer available.
AWS recently solved this problem with the release of a much-coveted Amazon EFS and AWS Lambda integration. This new integration allows functions to access a shared library or data via an integrated Amazon EFS instance. Nevertheless, this does not justify making functions dependent on one another. Just because something is now achievable does not mean it is the most effective solution, considering the pitfalls resulting from the antipattern mentioned above.
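One way to stay aware of temporary storage is to check the free space before writing a large file. The sketch below uses only the Python standard library; the helper names are hypothetical, and it assumes the temporary directory sits on its own size-limited filesystem, so that the free space reported reflects the limit discussed above.

```python
import shutil
import tempfile
from typing import Optional


def free_tmp_bytes(tmp_dir: Optional[str] = None) -> int:
    """Free bytes on the filesystem that holds the temporary directory."""
    path = tmp_dir or tempfile.gettempdir()
    return shutil.disk_usage(path).free


def fits_in_tmp(required_bytes: int, tmp_dir: Optional[str] = None) -> bool:
    """True if a file of `required_bytes` would currently fit in temp storage."""
    return required_bytes <= free_tmp_bytes(tmp_dir)


if __name__ == "__main__":
    # Hypothetical artifact size: a 200 MB model file to be unpacked into temp storage.
    model_size = 200 * 1024 * 1024
    if fits_in_tmp(model_size):
        print("enough temporary storage for the artifact")
    else:
        print("not enough temporary storage; stream the artifact or shrink it")
```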
Therefore the solution, and probably a very basic yet effective solution, is constant awareness in building the system architecture. Coupling and interdependency are not new problems that have resulted from the novelty of serverless. Solutions to promote awareness are already present and implemented by various teams across the industry.
For example, one of the most popular solutions is DRY (Don’t Repeat Yourself), a concept first coined in 1999 by Andrew Hunt and David Thomas in their book, The Pragmatic Programmer. An alternative approach to DRY is, comically, the WET (Write Everything Twice) principle.
Overall, tightly coupling functions together can potentially roll back all the benefits gained from microservices and serverless. Awareness of how the cloud architecture is being built is the only way the issue can be effectively avoided. Breaking up business cases into separate functions may not always be conceptually easy, but it is an activity that must be carried out nevertheless, and with caution.
How Small is Too Small
Building upon the notion of breaking large compact business cases into smaller independent functions, there is a possibility of reaching a level of granularity that eventually proves detrimental. Breaking down monolith systems definitely has its benefits, but there is also some overhead that has to be conceded. There eventually comes a point where the overhead would exceed the benefit and hence it is imperative that this point is found.
One of the greatest overheads that can be expected is the need for communication between these separate entities. This is expected since serverless is event-driven. Therefore, there is a need to ensure that events can flow between the different components in the architecture, as opposed to the control flow already being contained within a large monolith system.
The need to communicate events between individual functions leads to thinking about webhooks and APIs, and therefore to an increase in engineering effort, security risks, and latency. As the number of functions scales, these concerns are multiplied.
The main goal of serverless is to abstract the complex underlying architecture, allowing the prime focus to be on business logic. However, it is clear that once the push towards breaking down the business logic into individual functions reaches a certain point, the overhead negates the benefits, and the practice becomes an antipattern.
The overhead of using serverless functions is well acknowledged by cloud vendors. For example, AWS released its serverless event bus service, Amazon EventBridge. The service alleviates the problems associated with communicating events between functions and even allows third-party tools to send events to the AWS architecture. Nevertheless, this does not fully resolve the problem.
To understand the solution, it must first be understood why the problem may emerge in the first place. The ease of developing with serverless functions is already well known. Building functions is extremely easy, and hence developers are prone to continuously creating functions and over-engineering.
The solution is to begin thinking of architectural design from the start of the development process with a deep understanding of the business logic. This can be achieved by analyzing the expected manner in which customers will use the application, the desired performance, and the use-cases.
The goal is to understand what flows the customer is expected to take and which areas of the application are expected to experience higher load. A clear understanding of these requirements will help determine where functions are actually needed and the scope of those functions. As a result, it is imperative to work closely with product managers, or anyone else mapping out the goals and user flows of the business logic.
Recursion Is Not A Friend
Recursion is a concept that is integral to computer science, often leading to lower compute and runtime complexities as expressed in the “Big O notation” that is sprinkled throughout relevant literature. However, in serverless, recursion can have unforeseen repercussions, which are exacerbated by the platform’s scalability working against it, especially when the recursive algorithm leads to an infinite loop.
When programming for containers or other CPU-centric instances, the core problem lies in maxing out the CPU as recursion loads the processor. One function continuously triggering itself could lead to an exponential burst in the number of functions being triggered in various threads.
When considering serverless, maxing out the CPU instance is not an issue thanks to automatic scaling. Theoretically, an infinite number of worker nodes can be spun up, scaling to the demands of even the most rigorous recursive algorithms. The problem, however, lies in the cost incurred, as runaway recursive functions would have the effect of a DoW (Denial of Wallet) attack.
It should be remembered that with serverless, compute power is not what is billed; invocations and run times are. As a result, recursion could very easily lead to an explosion in the costs charged by the cloud vendor, running counter to the cost-effectiveness that swayed the decision in favor of serverless in the first place.
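The growth in invocations can be made concrete with a small calculation. This is a sketch under a simplifying assumption: every invocation triggers the same number of further invocations (the hypothetical `fan_out` parameter) and the recursion stops after `depth` levels.

```python
def total_invocations(fan_out: int, depth: int) -> int:
    """Invocations when every call triggers `fan_out` further calls, `depth` levels deep.

    Level 0 is the initial invocation, so the total is
    1 + fan_out + fan_out**2 + ... + fan_out**depth.
    """
    return sum(fan_out ** level for level in range(depth + 1))


if __name__ == "__main__":
    for fan_out in (1, 2, 3):
        count = total_invocations(fan_out, depth=10)
        print(f"fan-out {fan_out}, depth 10: {count} billed invocations")
```

With a fan-out of 1 the bill grows linearly with depth (11 invocations at depth 10), but a fan-out of 2 already yields 2,047 invocations and a fan-out of 3 yields 88,573, each billed separately.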
The obvious solution is to be aware of recursive algorithms in the codebase when building serverless applications. However, there may be some applications that still require recursive operations. For example, in the applications of machine learning, it would be beneficial to train a model repetitively until a certain accuracy is reached on the training data or validation data. The question, however, is how many recursions can be afforded?
The biggest threat with recursive functions is, as mentioned, the possibility of creating an infinite loop. In most cases, this is unintended behavior, as no application aims for infinite loops. Therefore, if recursion is absolutely needed, test rigorously to avoid the problem altogether.
Furthermore, you may also pass data between your functions to keep a recursion count and use it to have fail-safe switches in order to stop the running function when the recursion count reaches a certain number. This would allow the system to be aware of the number of times recursion occurs and also allow a configurable limit that can be changed over time keeping in mind cost and other factors of your serverless application.
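A recursion counter with a fail-safe switch might look like the sketch below. The event field `recursion_depth`, the `done` flag, and the `invoke` callable are assumptions made for illustration; in a real deployment `invoke` would wrap whatever mechanism the platform provides for triggering the next function, and the limit would come from configuration.

```python
from typing import Callable

DEFAULT_MAX_DEPTH = 5  # assumed limit; in practice this would be configurable


def handle(event: dict, invoke: Callable[[dict], dict],
           max_depth: int = DEFAULT_MAX_DEPTH) -> dict:
    """Process one step and re-invoke only while under the recursion limit."""
    depth = event.get("recursion_depth", 0)

    if depth >= max_depth:
        # Fail-safe switch: stop instead of triggering yet another invocation.
        return {"status": "stopped", "recursion_depth": depth}

    if event.get("done"):
        # The work has converged, so no further recursion is needed.
        return {"status": "finished", "recursion_depth": depth}

    # Pass the incremented counter along with the rest of the payload.
    return invoke({**event, "recursion_depth": depth + 1})


def local_invoke(event: dict) -> dict:
    """Local stand-in for the platform call that triggers the function again."""
    return handle(event, local_invoke)


if __name__ == "__main__":
    # This job never reports "done", so the guard is what ends the chain.
    print(handle({"job": "train"}, local_invoke))
```

Because the counter travels inside the event itself, the guard keeps working even though each invocation is stateless.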
Serverless is definitely revolutionizing how applications are built in the cloud. However, with this new model and architecture come its own unique antipatterns.
Without care, it is relatively easy to fall for these antipatterns, wiping out all the benefits gained by choosing serverless in the first place. In fact, depending on the severity of the problem, a serverless architecture could prove more harmful than beneficial to the business application.
Nevertheless, the benefits of the technology strongly promote its adoption, which explains the strong adoption rates seen across the cloud community over the years. It is therefore necessary to be actively aware of how software is being built with serverless, keeping an eye out for antipatterns. After all, as the great saying goes, with great power comes great responsibility!