Posted Nov 2021 in Testing

Be prepared for the worst with Chaos Testing Through Foresight

Written by Serkan Özal


Founder and CTO of Thundra


Nowadays, testing is crucial for the success of any product. No matter what product you offer, customers have seemingly endless options to choose from. This means keeping bugs out of your production environment and maintaining a smooth user experience so you can stay competitive and retain customers.

But functional and performance testing may not be enough. Contemporary software applications have a number of third-party libraries or cloud services integrated into them. What can you do if there’s an issue with an external service? What will you do if your application cannot connect to these services at critical times, like on sale days?

Chaos testing is one way to prevent those things from happening. It’s a method for testing your application’s response to catastrophic situations so you can see how it will behave. Based on the results, you may add a mechanism to handle unexpected events so that you can minimize the consequences.

Setting up test environments and monitoring for chaos engineering from scratch is no trivial task. It can take a whole team of system administrators months to implement. Luckily, with the help of cloud monitoring services like Thundra Foresight, that task can be done with a few lines of code.

In this post, we’ll build a simple application to show how Thundra Foresight helps efficiently implement chaos testing.

Creating the Service

To get started, we’ll write a service to manage products. This sample service is built using AWS Lambda and DynamoDB and we’ll run it on our local computer with the help of LocalStack.

Step 1: Define Makefile

First, we need to define our Makefile so it's easy to install dependencies and run commands while developing our application. You can see how we did this below.

export DOCKER_BRIDGE ?= $(shell (uname -a | grep Linux > /dev/null) && echo 172.17.0.1 || echo docker.for.mac.localhost)
export SERVICES = serverless,cloudformation,sts,sqs,dynamodb,s3,sns
export AWS_ACCESS_KEY_ID ?= test
export AWS_SECRET_ACCESS_KEY ?= test
export AWS_DEFAULT_REGION ?= us-east-1
export START_WEB ?= 1
export THUNDRA_APIKEY={thundra-api-key}
export THUNDRA_AGENT_TEST_PROJECT_ID={thundra-project-id}
export THUNDRA_AGENT_TRACE_INSTRUMENT_TRACEABLECONFIG = com.cuongld.*.*[traceLineByLine=true]

usage:          	## Show this help
    @fgrep -h "##" $(MAKEFILE_LIST) | fgrep -v fgrep | sed -e 's/\\$$//' | sed -e 's/##//'

install:        	## Install dependencies
    npm install
    which serverless || npm install -g serverless
    which localstack || pip install localstack
    which awslocal   || pip install awscli-local

build:          	## Build app
    echo "Building Serverless app ..."
    mvn clean package -DskipTests

test:           	## Test app
    echo "Testing Serverless app ..."
    mvn clean test

deploy:         	## Deploy the app locally
    echo "Deploying Serverless app to local environment ..."
    SLS_DEBUG=1 serverless deploy --stage local --region ${AWS_DEFAULT_REGION}

start:          	## Build, deploy and start the app locally
    @make build;
    @make deploy;

deploy-forwarded:   ## Deploy the app locally in forwarded mode
    echo "Deploying Serverless app to local environment ..."
    LAMBDA_FORWARD_URL=http://${DOCKER_BRIDGE}:8080 SLS_DEBUG=1 serverless deploy --stage local --region ${AWS_DEFAULT_REGION} --artifact null.zip

start-embedded: 	## Deploy and start the app embedded in forwarded mode from Localstack
    @make deploy-forwarded;

.PHONY: usage install build test deploy start deploy-forwarded start-embedded

Step 2: Define the serverless.yml File

Second, we need to define the serverless.yml file to configure our API route and DynamoDB table, as seen below.

service: demo-thundra-lambda

plugins:
  - serverless-deployment-bucket
  - serverless-localstack # only activated when stage is "local"

custom:
  stage: ${opt:stage, "local"}
  region: ${opt:region, "us-east-1"}
  artifact: ${opt:artifact, "./target/donald-le-demo-thundra.jar"}
  deploymentBucketName: ${self:service}-deployment-bucket-${self:custom.stage}
  tableName: ${self:service}-PRODUCTS_TABLE-${self:custom.stage}
  localstack:
    stages:
      # list of stages for which the plugin should be enabled
      - local
    debug: true
    autostart: true

package:
  artifact: ${self:custom.artifact}

provider:
  name: aws
  runtime: java8
  stage: ${self:custom.stage}
  region: ${self:custom.region}
  memorySize: ${opt:memory, 512}
  timeout: ${opt:timeout, 60}
  deploymentBucket:
    name: ${self:custom.deploymentBucketName}
  environment:
    PRODUCTS_TABLE_NAME: ${self:custom.tableName}
    THUNDRA_APIKEY: ${env:THUNDRA_APIKEY}

functions:
  createProduct:
    handler: com.cuongld.handler.Product
    events:
      - http:
          path: /products
          method: post


resources:
  Resources:
    appDatabase:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: ${self:custom.tableName}
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
          - AttributeName: name
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
          - AttributeName: name
            KeyType: RANGE
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1

Step 3: Code the Service

Now that we’ve configured our files, we need to write the code for our service with the API controllers, the product, and DynamoDB models. You can find the full source code on GitHub.

Step 4: Write Integration Tests

Now it’s time to write our integration tests for the product management service and create new product tests. We’ll use JUnit for the test.

public class CreateRequestTest extends LocalStackTest {

    @Test
    public void testCreateNewRequest() throws IOException {
        LambdaServer.registerFunctionEnvironmentInitializer(
                ChaosInjector.createDynamoDBChaosInjector("createProduct"));

        CloseableHttpClient httpclient = HttpClients.createDefault();
        HttpPost postMethod = new HttpPost(lambdaUrl);
        System.out.println(postMethod);

        CloseableHttpResponse rawResponse = httpclient.execute(postMethod);
        assertThat(rawResponse.getStatusLine().getStatusCode()).isEqualTo(HttpStatus.SC_OK);
    }
}

Step 5: Create a New Project in Thundra

To get started, go to Thundra and log in or sign up with your GitHub or Google account, then select Foresight.

Figure 1: Thundra applications

Once you’re set up with Foresight, choose the “Spring Boot with Maven” option, since we’ll use Maven as a build tool for our application. Then choose manual setup, since we’ll run and test the service in our local environment.

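Notice that an API key and a project ID are generated automatically and made available to you. They go into the Maven build configuration, in place of the {thundra-api-key} and {thundra-project-id} placeholders: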

<build>
        <finalName>${project.artifactId}</finalName>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>${maven.shade.plugin.version}</version>
                <configuration>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                    <transformers>
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                    </transformers>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>${maven-surefire-plugin.version}</version>
                <configuration>
                    <argLine>
                        -javaagent:${settings.localRepository}/io/thundra/agent/thundra-agent-bootstrap/${thundra.agent.version}/thundra-agent-bootstrap-${thundra.agent.version}.jar -Dthundra.apiKey={thundra-api-key} -Dthundra.agent.test.project.id={thundra-project-id}
                    </argLine>
                </configuration>
            </plugin>
        </plugins>
    </build>

Once the setup is complete, run the product management service and the integration test.

Monitoring Your Application in Thundra

You’re now set up with a Thundra account, tests, and a sample application to test out. If you navigate to Thundra Foresight, you’ll see the previous test run in the Latest Test Runs list.

The result should appear as having passed since we haven’t injected any errors yet.

Figure 2: Trace map of a successful test

To inject an error into the product management service, we'll use ErrorInjectorSpanListener and define the error injection method. Specifically, to see the benefits of chaos testing, we'll inject an error that simulates a lost connection to DynamoDB, with an injection percentage of 70%. This can be seen below:

ErrorInjectorSpanListener errorListener =
        ErrorInjectorSpanListener.
                builder().
                errorType(AmazonDynamoDBException.class).
                errorMessage("Error due to lost connection to Amazon DynamoDB").
                injectPercentage(70).
                build();

FilteringSpanListener errorFilteringSpanListener =
        FilteringSpanListener.
                builder().
                listener(errorListener).
                filter(FilteringSpanListener.
                        filterBuilder().
                        className("AWS-DynamoDB").
                        build()).
                build();

TraceSupport.registerSpanListener(errorFilteringSpanListener);
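
To make the idea of an injection percentage concrete, here is a minimal plain-Java sketch of the same technique. It is not Thundra's implementation: the class name PercentageFaultInjector and its call method are hypothetical, and the sketch throws a standard IllegalStateException instead of AmazonDynamoDBException so that it needs no external libraries.

```java
import java.util.Random;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Thundra): fails a wrapped call a fixed percentage of the time. */
final class PercentageFaultInjector {
    private final int injectPercentage; // 0..100
    private final Random random;

    PercentageFaultInjector(int injectPercentage, Random random) {
        if (injectPercentage < 0 || injectPercentage > 100) {
            throw new IllegalArgumentException("injectPercentage must be between 0 and 100");
        }
        this.injectPercentage = injectPercentage;
        this.random = random;
    }

    /** Runs the real call, or throws instead of running it for roughly injectPercentage% of invocations. */
    <T> T call(Supplier<T> realCall) {
        // nextInt(100) is uniform over 0..99, so this branch is taken injectPercentage% of the time.
        if (random.nextInt(100) < injectPercentage) {
            throw new IllegalStateException("Error due to lost connection to Amazon DynamoDB");
        }
        return realCall.get();
    }

    public static void main(String[] args) {
        PercentageFaultInjector injector = new PercentageFaultInjector(70, new Random());
        int attempts = 10_000;
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                injector.call(() -> "saved"); // stands in for a DynamoDB write
            } catch (IllegalStateException e) {
                failures++;
            }
        }
        System.out.println("Injected failures: " + failures + " of " + attempts);
    }
}
```

Running main should report a failure count close to 70% of the attempts. The span listener above has the advantage that the application code does not need to be wrapped by hand.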

We'll now call the inject method from the test function. If we don't want to inject the error, we simply comment out the call.

@Test
public void testCreateNewRequest() throws IOException {
    LambdaServer.registerFunctionEnvironmentInitializer(
            ChaosInjector.createDynamoDBChaosInjector("createProduct"));

    CloseableHttpClient httpclient = HttpClients.createDefault();
    HttpPost postMethod = new HttpPost(lambdaUrl);
    System.out.println(postMethod);

    CloseableHttpResponse rawResponse = httpclient.execute(postMethod);
    assertThat(rawResponse.getStatusLine().getStatusCode()).isEqualTo(HttpStatus.SC_OK);
}
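
Commenting the call in and out works for a demo, but a switch that does not require editing the test may be more convenient. The following is one possible sketch using a JVM system property. The class ChaosToggle and the property name chaos.enabled are hypothetical names chosen for this example and are not part of Thundra.

```java
/** Hypothetical helper: runs the chaos registration only when a system property is set. */
final class ChaosToggle {
    // Assumed property name for this sketch; any name would work.
    static final String PROPERTY = "chaos.enabled";

    /** Returns true only when the JVM was started with -Dchaos.enabled=true. */
    static boolean isEnabled() {
        return Boolean.parseBoolean(System.getProperty(PROPERTY, "false"));
    }

    /** Runs the given registration step only when chaos injection is switched on. */
    static void ifEnabled(Runnable registerInjector) {
        if (isEnabled()) {
            registerInjector.run();
        }
    }

    public static void main(String[] args) {
        ifEnabled(() -> System.out.println("Chaos injector registered"));
        System.out.println("Chaos enabled: " + isEnabled());
    }
}
```

In the test, the registration call would then be wrapped in ChaosToggle.ifEnabled(...), and the flag could be passed to the test JVM alongside the other -D options in the Surefire argLine.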

Now when you go to monitor your application in Thundra, you’ll see it’s failed due to a lost connection with DynamoDB.

Setting up the infrastructure to collect metrics from the application is a big task, requiring effort from both system administrators and developers. But with Thundra and its supported libraries in multiple languages, it's easy to set up chaos testing. Thundra supports multiple databases and third-party services—from MySQL and Redis to Lambda functions—so you can start testing right away.

Figure 3: Test failure after error injection

 

Figure 4: Test failure with 500 error

The 500 error shown above is returned by the catch block of our handleRequest method when the connection to AWS DynamoDB is lost, which you can see in the code below:


@Override
public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent request, Context context) {
    try {
        logger.info("Request --> " + request);
        if ("/products".equals(request.getPath()) && "POST".equals(request.getHttpMethod())) {
            return startNewRequest();
        } else {
            return new APIGatewayProxyResponseEvent().
                    withStatusCode(404);
        }
    } catch (Exception e) {
        logger.error("Error occurred handling message. Exception is ", e);
        return new APIGatewayProxyResponseEvent().
                withStatusCode(500).
                withBody(e.getMessage());
    }
}

Figure 5: Trace map showing failed test due to lost connection to DynamoDB

Since Foresight shows us what the failure is, we'll add a mechanism that makes the product management service retry automatically when a timeout or another exception occurs, rather than returning the 500 Internal Server Error right away.

public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent request, Context context) {
        int count = 0;
        int maxTries = 10;
        while (true) {
            try {
                logger.info("Request --> " + request);
                if ("/products".equals(request.getPath()) && "POST".equals(request.getHttpMethod())) {
                    return startNewRequest();
                } else {
                    return new APIGatewayProxyResponseEvent().
                            withStatusCode(404);
                }
            } catch (Exception e) {
                if (++count == maxTries) {
                    // Out of attempts: report the failure to the caller.
                    logger.error("Error occurred handling message. Exception is ", e);
                    return new APIGatewayProxyResponseEvent().
                            withStatusCode(500).
                            withBody(e.getMessage());
                }
                // Wait one second before the next attempt.
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }
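
The retry loop above mixes retry bookkeeping with request routing. One possible refactoring, shown here only as a sketch, is to move the loop into a small reusable helper. The class Retry and the method withRetries are hypothetical names introduced for this example. The helper uses only the Java standard library.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: retries an action a fixed number of times with a fixed pause in between. */
final class Retry {

    /** Calls the action until it succeeds or maxTries attempts have failed, then rethrows the last error. */
    static <T> T withRetries(int maxTries, long delayMillis, Callable<T> action) throws Exception {
        if (maxTries < 1) {
            throw new IllegalArgumentException("maxTries must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxTries) {
                    try {
                        Thread.sleep(delayMillis); // pause before the next attempt
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt(); // preserve the interrupt and stop retrying
                        throw ie;
                    }
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated flaky dependency: fails twice, then succeeds.
        String result = withRetries(10, 10, () -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated lost connection");
            }
            return "created";
        });
        System.out.println(result + " after " + calls[0] + " attempts"); // prints "created after 3 attempts"
    }
}
```

With such a helper, the handler could wrap only the startNewRequest() call in withRetries and keep a single catch block that returns the 500 response once all attempts have failed.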

After running the application, you'll be relieved to see that the tests have now passed. This is thanks to the retry mechanism, which repeats the request when an exception is thrown instead of immediately returning the 500 error code.
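
A quick calculation shows why retrying is so effective here. The sketch below assumes that each attempt fails independently with the injected probability of 70%, which is a simplification. The class RetryOdds is a hypothetical name for this example.

```java
import java.util.Locale;

/** Hypothetical helper: estimates how often a request still fails after a number of retries. */
final class RetryOdds {

    /** Probability that every one of maxTries independent attempts fails. */
    static double allAttemptsFail(double failureRate, int maxTries) {
        return Math.pow(failureRate, maxTries);
    }

    public static void main(String[] args) {
        double failureRate = 0.70; // the injection percentage used in this post
        for (int tries : new int[] {1, 3, 5, 10}) {
            System.out.printf(Locale.ROOT, "%2d tries -> %.1f%% chance of a 500 response%n",
                    tries, 100 * allAttemptsFail(failureRate, tries));
        }
    }
}
```

Under that assumption, ten attempts reduce the chance of returning a 500 from 70% to about 2.8% (0.7 to the power of 10), so an occasional failed run is still possible even with the retry in place.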

Figure 6: Passing test after adding the retry mechanism

Conclusion

The example scenario we’ve shown here might seem trivial, but in real-world use cases, things get a lot messier. Chaos testing is the only way to know how our products will behave if there’s an unexpected issue. As shown here, observability is key for remediating issues, whether they're familiar or brand new. Finding the source of failed tests is a pain point for any developer, and a buggy product is a pain for any customer.

Prepare your products for the unknown with Thundra Foresight. Your engineers, customers, and bottom line will all benefit.