
AWS Step Functions - Doing Serverless is Easier Than You Think

Jan 31, 2019


AWS introduced AWS Step Functions at re:Invent 2016, yet few blogs and articles since then have explained Step Functions and its use cases. In this blog, I will try to bring together the relevant material from the articles I have read, and add to it, to get you started with AWS Step Functions. I will also walk through a demo function that highlights the significant capabilities of Step Functions and helps you decide where to use them.

What is AWS Step Functions?

AWS Step Functions is a service that lets you orchestrate the modules in your application. To elaborate, it allows you to design and build the flow of execution between those modules in a simple manner. A developer can thus focus solely on ensuring that each module performs its intended task, without having to worry about connecting each module to the others. Using Step Functions, you model your workflow as easy-to-understand state machines, where each state can represent one of your modules. What makes Step Functions so powerful is its ability to automatically trigger each state as configured, detect errors in each step of execution, retry when there are errors and, above all, let users visualize the state machine both while designing it and while it is executing.

The state machine is defined in the Amazon States Language, a JSON-based language. The following is a simple example of a state machine, taken from the `Hello World` template provided by AWS:

{
    "Comment": "A Hello World example of the Amazon States Language using a Pass state",
    "StartAt": "HelloWorld",
    "States": {
        "HelloWorld": {
            "Type": "Pass",
            "Result": "Hello World!",
            "End": true
        }
    }
}

The corresponding state machine for the above JSON code:

[Figure: state machine diagram for the Hello World example]

`States` and `StartAt` are mandatory fields in the state machine. `States` is a non-empty object whose keys are the state names and whose values are the state definitions, and `StartAt` is a string whose value must exactly match the name of one of those states.
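The two mandatory fields can be checked mechanically. The following is only a minimal sketch, assuming (as in the Hello World JSON above) that `States` is an object keyed by state name; the helper name `check_top_level` is made up for illustration and is not part of any AWS tooling.

```python
import json

def check_top_level(definition):
    """Return a list of problems with StartAt/States (empty when both look fine)."""
    states = definition.get("States")
    if not isinstance(states, dict) or not states:
        return ["'States' must be a non-empty object"]
    start_at = definition.get("StartAt")
    if not isinstance(start_at, str):
        return ["'StartAt' must be a string"]
    if start_at not in states:
        return [f"'StartAt' names an unknown state: {start_at}"]
    return []

# The Hello World definition from above, condensed.
hello_world = json.loads("""
{
    "Comment": "A Hello World example of the Amazon States Language using a Pass state",
    "StartAt": "HelloWorld",
    "States": {
        "HelloWorld": {"Type": "Pass", "Result": "Hello World!", "End": true}
    }
}
""")

print(check_top_level(hello_world))  # prints []
```

A real validator would also check every state's `Type`, `Next` and `End` fields; this sketch looks only at the two fields discussed here.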

States

States hold the logic of the application. The output of one state is the input to the next, and the user specifies the input for the first state at the beginning of the execution. In the above example, the state type is “Pass”. A Pass state passes its input to its output, unless a `Result` is declared, as it is here, in which case the `Result` is what gets passed on. The following state types are available:

Pass: Simply passes input to output. If the state declaration contains both `Result` and `ResultPath`, the value of `Result` is added to the input at the location that `ResultPath` names, i.e.

$.<field named by ResultPath>: <value of Result>

where $ is the input ($ is not just an arbitrary notation, we will talk about it later).

For example, if our state definition is:

"HelloWorld": {
    "Type": "Pass",
    "Result": "Hello World!",
    "ResultPath": "$.out",
    "End": true
}


And input is:

{
    "input": "I am an input"
}


The corresponding output will be:

{
    "input": "I am an input",
    "out": "Hello World!"
}


Task: This state requires a “Resource” field in its declaration, containing a URI that points to the task to be executed. Make sure that the state’s timeout is not less than the execution time of the task. The default timeout is 60 seconds if none is set in the state declaration.

Choice: Adds branching logic to the state machine. The machine decides which state to run next depending on the output of the previous state. There must be a non-empty `Choices` field with a set of rules for choosing the next state.

Wait: Delays the state machine for the amount of time mentioned in the state declaration.

Succeed: Terminates state machine successfully.

Fail: Terminates state machine with a failure.

Parallel: Allows parallel execution of states in the machine. There must be a non-empty `Branches` field with a set of branches to be executed in parallel. The interpreter waits until all branches terminate before moving on to the next state. If even one branch fails, the whole state fails and the other branches are terminated.
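To make the Parallel semantics above more concrete, here is a rough Python analogy. It is not how Step Functions is implemented. The branch functions `check_name` and `check_ssn` and the sample user are hypothetical, and each branch is modelled as a plain function that receives the state's input.

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(branches, state_input):
    """Run every branch on the same input and return the outputs in branch order."""
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = [pool.submit(branch, state_input) for branch in branches]
        # result() re-raises a failing branch's exception, so one failure
        # fails the whole call. Unlike Step Functions, the remaining threads
        # are not terminated; they simply run to completion.
        return [future.result() for future in futures]

# Hypothetical stand-ins for the branches of a Parallel state.
def check_name(user):
    return bool(user["name"].strip())

def check_ssn(user):
    return len(user["ssn"].replace("-", "")) == 9

user = {"name": "Jane Doe", "ssn": "123-45-6789"}
print(run_parallel([check_name, check_ssn], user))  # prints [True, True]
```

The point to take away is the shape of the result: the state waits for every branch and produces one array with one entry per branch, in the order the branches were declared.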

Errors and Retries


Step Functions offers capabilities such as error-handling and attempting to execute a state again if it fails.

To catch an error in a state, you need a non-empty `Catch` field containing a set of objects, each listing the errors to catch and the state to transition to when one of them occurs. Step Functions has a set of built-in errors, which you can find in the appendix here. Users can also catch their own application-specific errors, for example a Lambda throwing `java.lang.NullPointerException`, but the names of such errors should not begin with “States”. Below is an example from the template provided by AWS:

"HelloWorld": {
    "Type": "Task",
    "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME",
    "Catch": [
        {
            "ErrorEquals": ["CustomError"],
            "Next": "CustomErrorFallback"
        },
        {
            "ErrorEquals": ["States.TaskFailed"],
            "Next": "ReservedTypeFallback"
        },
        {
            "ErrorEquals": ["States.ALL"],
            "Next": "CatchAllFallback"
        }
    ],
    "End": true
}

 


To catch any unexpected error, the developer can use `States.ALL`, as the last entry in the above example does.
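The way a `Catch` list picks the next state can be sketched in a few lines of Python. This is a simplified model that assumes catchers are tried in order, that names are compared exactly, and that `States.ALL` matches everything. Any broader matching that reserved names such as `States.TaskFailed` may have is not modelled, and the function name is made up for illustration.

```python
def next_state_for_error(catchers, error_name):
    """Return the 'Next' of the first catcher matching error_name, else None."""
    for catcher in catchers:
        names = catcher["ErrorEquals"]
        if error_name in names or "States.ALL" in names:
            return catcher["Next"]
    return None  # nothing caught the error, so the execution would fail

# The Catch list from the template above.
catchers = [
    {"ErrorEquals": ["CustomError"], "Next": "CustomErrorFallback"},
    {"ErrorEquals": ["States.TaskFailed"], "Next": "ReservedTypeFallback"},
    {"ErrorEquals": ["States.ALL"], "Next": "CatchAllFallback"},
]

print(next_state_for_error(catchers, "CustomError"))
# prints CustomErrorFallback
print(next_state_for_error(catchers, "java.lang.NullPointerException"))
# prints CatchAllFallback
```

Because the list is scanned from top to bottom, the catch-all entry belongs at the end; placed first, it would shadow the more specific catchers.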

“Task” and “Parallel” states can have a field called `Retry` that describes how many retry attempts should be made for a specific error and how long to wait between them. The example below is again taken from the template:

"HelloWorld": {
    "Type": "Task",
    "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME",
    "Retry": [
        {
            "ErrorEquals": ["CustomError"],
            "IntervalSeconds": 1,
            "MaxAttempts": 2,
            "BackoffRate": 2.0
        },
        {
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 30,
            "MaxAttempts": 2,
            "BackoffRate": 2.0
        },
        {
            "ErrorEquals": ["States.ALL"],
            "IntervalSeconds": 5,
            "MaxAttempts": 5,
            "BackoffRate": 2.0
        }
    ],
    "End": true
}


`BackoffRate` is the factor by which the delay is multiplied on each subsequent retry. For instance, a `BackoffRate` of 2 and an `IntervalSeconds` of 1 mean that the delays before successive retries will be 1s, 2s, 4s, 8s and so on, until `MaxAttempts` retries have been made.
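The retry schedule is simple enough to compute by hand, but a short sketch makes the arithmetic explicit. The function name `retry_delays` is made up for illustration; its three parameters mirror the `IntervalSeconds`, `MaxAttempts` and `BackoffRate` fields of a retrier.

```python
def retry_delays(interval_seconds, max_attempts, backoff_rate):
    """Delay in seconds before each retry: interval * rate ** (retry index)."""
    return [interval_seconds * backoff_rate ** attempt
            for attempt in range(max_attempts)]

print(retry_delays(1, 4, 2.0))   # prints [1.0, 2.0, 4.0, 8.0]
print(retry_delays(30, 2, 2.0))  # prints [30.0, 60.0]
```

The second call corresponds to the `States.TaskFailed` retrier in the template above: two retries, the first after 30 seconds and the second 60 seconds later.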

Input and Output Processing

All inputs to a state are in the form of JSON text. If you remember, I used the $ symbol above to represent the input object in the `Pass` state description. That is because the input to a state is bound to the $ symbol. Similarly, the output of each state is also bound to the $ symbol (overriding its previous value) and becomes the input of the next state.

Fortunately, the developer can control the inputs of a state. In the state declaration, the `InputPath` field specifies exactly which part of the input should be passed to the state. Similarly, the `ResultPath` field lets the developer add the result of the state to the existing $ instead of overriding it. Finally, `OutputPath` selects a field from the resulting $ and makes it the output of the state.

Let's have a look at an example. Suppose you have the following state:

"HelloWorld": {
    "Type": "Pass",
    "InputPath": "$.msg",
    "ResultPath": "$.result",
    "End": true
}


We pass the following input to the state:

{
    "x": "Useless variable X",
    "y": "Useless variable Y",
    "msg": "Hello World"
}


The state will consider only the “msg” field of the input as its ‘real’ input and, being a `Pass` state, produce it unchanged as its result. Since we have `ResultPath` as well, that result is added to the input the state originally received. The output of the state will be:

{
    "x": "Useless variable X",
    "y": "Useless variable Y",
    "msg": "Hello World",
    "result": "Hello World"
}


Now let’s assume we had the following in our state declaration as well:

"OutputPath": "$.x"


Then the state would take the output shown above and select the value of "x" from it. The output would be:

"Useless variable X"

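The order in which the three fields are applied can be mimicked with a small Python sketch of a `Pass` state. It is a deliberately limited model: it understands only `$` and top-level `$.field` paths, and the names `select` and `run_pass_state` are made up for illustration.

```python
import copy

def select(path, data):
    """Resolve '$' or a top-level '$.field' path against data."""
    if path == "$":
        return data
    return data[path[2:]]  # strip the leading '$.'

def run_pass_state(state, raw_input):
    # 1. InputPath picks the part of the raw input that the state works on.
    effective_input = select(state.get("InputPath", "$"), raw_input)
    # 2. A Pass state's result is its Result field, or else its effective input.
    result = state.get("Result", effective_input)
    # 3. ResultPath says where the result is placed in the raw input.
    result_path = state.get("ResultPath", "$")
    if result_path == "$":
        combined = result
    else:
        combined = copy.deepcopy(raw_input)
        combined[result_path[2:]] = result
    # 4. OutputPath picks the part of that combination to hand on.
    return select(state.get("OutputPath", "$"), combined)

state = {"Type": "Pass", "InputPath": "$.msg", "ResultPath": "$.result", "End": True}
data = {"x": "Useless variable X", "y": "Useless variable Y", "msg": "Hello World"}

print(run_pass_state(state, data)["result"])               # prints Hello World
print(run_pass_state({**state, "OutputPath": "$.x"}, data))  # prints Useless variable X
```

Leaving out `ResultPath` makes the result replace the input entirely, which is why the plain Hello World state machine outputs just its `Result`.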

Fun Part

I am a person who learns by getting my hands dirty in the sand rather than by just watching others build a sand castle. Therefore, I made a demo step function to understand Step Functions better. The following step function is not comprehensive, but it demonstrates the `Parallel` and `Choice` functionality of Step Functions. The state machine diagram is as follows:

[Figure: state machine diagram of the demo step function]

Description

The machine above is a simple one, defined just for the purpose of understanding Step Functions. The first state fills a DynamoDB table with predefined users. Then we get all the users from the DynamoDB table and iterate through them, validating each one and checking whether they are happy with the product. If a customer is not happy, we send them an email inquiring about their problem; otherwise we just move on to the next customer. The JSON for the above state machine can be found here.

The Parallel state checks whether the name and the social security number are correct for each user. Since those checks do not depend on each other, we can run them in parallel. There are 3 Choice states in the above machine, each making a different check (CheckUser, CheckValid, and CheckRating).

Users are passed to the `Pass` state in the form of an array:

{
    "users": [...]
}


The state disregards the first element of the array in each iteration. Below is the snippet of the `Pass` state:

"Pass": {
    "InputPath": "$.users[1:]",
    "Type": "Pass",
    "Next": "CheckUser",
    "ResultPath": "$.users"
}


`InputPath` removes the first element of the array on each iteration, which is why the first element of the array is initially a dummy value. The machine terminates when it reaches the last element of the `users` array, which is the string “DONE”. The snippet below shows the termination condition:

"CheckUser": {
    "Type": "Choice",
    "Choices": [
        {
            "Variable": "$.users[0]",
            "StringEquals": "DONE",
            "Next": "Success"
        }
    ],
    "Default": "ValidateUser"
}
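The interplay of the `Pass` and `CheckUser` states amounts to a loop over the array, which can be sketched in Python as follows. This is only an approximation of the control flow: it assumes the machine returns to `Pass` after handling each user, that the array always ends with the "DONE" sentinel, and the sample users are invented.

```python
def iterate_users(users):
    """Yield each real user, mimicking the Pass -> CheckUser loop."""
    state_input = {"users": users}
    while True:
        # "Pass": InputPath "$.users[1:]" combined with ResultPath "$.users"
        # drops the head of the array.
        state_input = {**state_input, "users": state_input["users"][1:]}
        # "CheckUser": go to Success when the head is the sentinel ...
        if state_input["users"][0] == "DONE":
            return
        # ... otherwise fall through to the default branch (ValidateUser).
        yield state_input["users"][0]

users = ["dummy", {"name": "Alice"}, {"name": "Bob"}, "DONE"]
print(list(iterate_users(users)))  # prints [{'name': 'Alice'}, {'name': 'Bob'}]
```

The dummy first element exists only because the `Pass` state drops the head before `CheckUser` ever looks at it; without the dummy, the first real user would be skipped.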


`Success` is the terminal state. For the more detailed machine, visit the demo function on GitHub. As you can see, Step Functions makes complex workflows easy to follow, and modularization makes it easy to find bugs in your application.

Conclusion

In this blog, I have highlighted some basic concepts of AWS Step Functions and shown with a demo function how Step Functions can simplify complex workflows through easy-to-understand visuals and a straightforward JSON-based language. Jeff Barr, Chief Evangelist for AWS, describes how The Coca-Cola Company adopted Step Functions to ease their workflow in this blog, which is quite interesting. In my upcoming blog, I will show how Thundra can help you gain insights into AWS Lambda functions orchestrated by Step Functions, so stay tuned.