

DRY (Don’t Repeat Yourself) on the cloud with Pulumi

Jul 2, 2020

 


Any enterprise working on the cloud is likely to use Infrastructure as Code, as it simplifies the deployment process and makes iterating on serverless applications easier to manage. There are several open-source tools out there that work with most cloud providers, each with its own implementation process.

In this article we are going to dive into the workings of one particular tool, Pulumi. Pulumi is multi-language, multi-cloud, and fully extensible in both its engine and its ecosystem of packages. What sets Pulumi apart from its industry counterparts, like Terraform, which also supports multiple cloud platforms, is the ability to define infrastructure in a general-purpose language of your choice for any supported cloud provider. Pulumi supports Node.js (JavaScript, TypeScript, or any Node.js-compatible language), Python, .NET Core, and Go.

Let’s explore how Pulumi works by building an AWS project that deploys a Lambda function, which is invoked whenever a video file (.mp4) is uploaded to an S3 bucket. The function uses a Lambda layer containing a statically compiled ffmpeg binary to capture an image frame from the video and put it into the same S3 bucket.

Overview of Steps

  1. Setting up Pulumi CLI
  2. Create an initial project
  3. Coding the infrastructure design
  4. Deployment

Setting up Pulumi CLI

After creating an account, setting up Pulumi is a simple process. Depending on your OS, run one of the following:

  • curl -fsSL https://get.pulumi.com | sh for Linux
  • choco install pulumi for Windows using Chocolatey package manager
  • brew install pulumi for macOS

In this project we are going to use Python as the implementation language. Pulumi requires Python 3.6 or later. If you do not have Python, follow the installation and setup guide from here.

We’ll also need the AWS CLI set up with a profile that has permissions allowing Pulumi to provision the stack. If you already have the AWS CLI configured, Pulumi will use your default configuration settings, so you can skip this part of the setup.

In case you have multiple AWS profiles, run

export AWS_PROFILE=<profile_name>

or, after the project has been set up (which we will do in the next step), run

pulumi config set aws:profile <profilename>


Create an initial project

Create a new directory with any name and run the following command from inside that directory.

pulumi new aws-python

You will be asked to enter your project name, description, stack name, and the region where you want to deploy the project. The command will also create a virtual environment and automatically install the dependencies listed in the requirements.txt file.

Note: If you run into issues when Pulumi attempts to create the virtual environment, you can create one manually named venv and then install the dependencies from the requirements.txt file yourself (pip install -r requirements.txt).

Your project will be created and will then appear among your stacks on the Pulumi console.

Pulumi Projects page

Let’s review some of the generated project files:

  • Pulumi.yaml  - Defines the project.
  • Pulumi.dev.yaml - Configuration values for the stack we initialized.
  • __main__.py - The program that defines our stack resources. 
  • requirements.txt - Lists the project’s Python dependencies; any new dependencies your project needs should be added to this file.

Coding the infrastructure design

Pulumi uses a desired-state model for managing infrastructure. You declare infrastructure in your program by allocating resource objects whose properties correspond to the desired state of that infrastructure.

A language host runs your program to compute the desired state for a stack’s infrastructure. The deployment engine then compares this desired state with the stack’s current state and determines which resources need to be created, updated, or deleted.

Now for our program, we’ll provision the resources in the following manner.

__main__.py:

# An AWS Python Pulumi program that creates a thumbnail from a video file and stores into S3

import pulumi
from pulumi_aws import s3, lambda_, iam

# Create bucket to store ffmpeg library to be used for Lambda layer
ffmpeg_layerBucket = s3.Bucket('ffmpeg-bucket')

# Create bucket that acts as a trigger for Lambda when a video is uploaded
video_thumbnail = s3.Bucket('video-thumb-bucket')

# Upload ffmpeg library to bucket
upload_ffmpeg_lib = s3.BucketObject(
        resource_name='python-layer-ffmpeg.zip',
        bucket=ffmpeg_layerBucket.id,
        source=pulumi.FileAsset("./python-layer-ffmpeg.zip")
        )

# Creating Lambda Layer with the .zip located in the S3 bucket
lambda_layer_ffmpeg = lambda_.LayerVersion(
    resource_name="ffmpeg-layer",
    s3_bucket=ffmpeg_layerBucket.id,
    s3_key=upload_ffmpeg_lib.key,
    layer_name="ffmpeg-layer",
    description="This layer retrieves a thumbnail from a video",
    compatible_runtimes=["python3.6"],
)

# Create Lambda IAM lambda_role
lambda_role = iam.Role(
    resource_name='lambda-ffmpeg-iam-role',
    assume_role_policy="""{
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Principal": {
                    "Service": "lambda.amazonaws.com"
                },
                "Effect": "Allow",
                "Sid": ""
            }
        ]
    }"""
)

# Attach policy to the role
lambda_role_policy = iam.RolePolicy(
    resource_name='lambda-ffmpeg-iam-policy',
    role=lambda_role.id,
    policy="""{
        "Version": "2012-10-17",
        "Statement": [
            {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "s3:*"
            ],
            "Resource": "*"
        }
        ]
    }"""
)

# Create Lambda function and attach the created role 
lambda_ffmpeg_fn = lambda_.Function(
    resource_name='ffmpeg-thumbnail',
    role=lambda_role.arn,
    runtime="python3.6",
    handler="lambda_handler_function.lambda_handler",
    code=pulumi.AssetArchive({
        '.': pulumi.FileArchive('./ffmpeg_code')
    }),
    layers=[lambda_layer_ffmpeg.arn],
    timeout=30,
    memory_size=512
)

# Give bucket permission to invoke Lambda
lambda_event = lambda_.Permission(
    resource_name="lambda_video_event",
    action="lambda:InvokeFunction",
    principal="s3.amazonaws.com",
    source_arn=video_thumbnail.arn,
    function=lambda_ffmpeg_fn.arn
)

# Bucket notification that triggers Lambda on object-created events for .mp4 files
bucket_notification = s3.BucketNotification(
    resource_name="s3_notification",
    bucket=video_thumbnail.id,
    lambda_functions=[{
        "lambda_function_arn": lambda_ffmpeg_fn.arn,
        "events": ["s3:ObjectCreated:*"],
        "filter_suffix": ".mp4"
    }],
    # Ensure the invoke permission exists before the notification is configured
    opts=pulumi.ResourceOptions(depends_on=[lambda_event])
)

# Export created assets (buckets, lambda function, lambda layer)
pulumi.export('ffmpeg_bucket_name', ffmpeg_layerBucket.id)
pulumi.export('thumb_bucket_name', video_thumbnail.id)
pulumi.export('lambda function', lambda_ffmpeg_fn.arn)
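The inline JSON policy documents above are easy to get wrong: a stray comma or quote only surfaces as an error at deployment time. One alternative (a sketch, not part of the original program) is to build the same assume-role policy as a Python dict and serialize it with the standard json module, so the structure is validated by the interpreter:

```python
import json

# Build the Lambda assume-role policy as a Python dict and serialize it;
# json.dumps guarantees the resulting string is well-formed JSON
assume_role_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Action": "sts:AssumeRole",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Effect": "Allow",
        "Sid": ""
    }]
})

print(assume_role_policy)
```

The resulting string can be passed to iam.Role’s assume_role_policy argument exactly like the raw string literal used above.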

Place the following Lambda function code, named lambda_handler_function.py, inside a directory called ffmpeg_code at the root level of your project.

Lambda code:

import boto3
import os
import uuid
from urllib.parse import unquote_plus

s3 = boto3.client("s3")

def lambda_handler(event, context):
        
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = unquote_plus(record['s3']['object']['key'])
        tmpkey = key.replace('/', '')

        download_path = '/tmp/{}{}'.format(uuid.uuid4(), tmpkey)

        print("downloading file")

        s3.download_file(bucket, key, download_path)

        print("file downloaded into tmp")

        frame = '/tmp/{}framed.png'.format(uuid.uuid4())
        
        print("running ffmpeg")

        # Capture a frame at 2min 09sec of the video
        os.system('/opt/python/ffmpeg -i "{}" -ss 00:02:09.000 -vframes 1 "{}"'.format(download_path, frame))
        
        try:
            print("uploading thumbnail to S3")
            s3.upload_file(frame, bucket, '{}framed.png'.format(uuid.uuid4()))
        except Exception as e:
            print(e)
            raise

    return True
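The unquote_plus call matters because S3 URL-encodes object keys in event payloads: spaces arrive as + and other special characters as %XX escapes, so the raw key won’t match the actual object name. A quick standalone illustration:

```python
from urllib.parse import unquote_plus

# Keys in S3 event records are URL-encoded; decode them before
# passing the key back to the S3 API
raw_key = 'videos/my+holiday+clip.mp4'
key = unquote_plus(raw_key)
print(key)  # videos/my holiday clip.mp4
```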

Lastly, place the python-layer-ffmpeg.zip file at the root level. This archive contains the compiled ffmpeg binary that we upload to S3 to create the Lambda layer; Lambda extracts layer contents under /opt, which is why the handler invokes /opt/python/ffmpeg. The source of the library was obtained from here.

The directory of the project structure should look something like this.

An AWS Python Pulumi program


Deploying the stack

Activate your Python virtual environment by running this command from the root level of your project.

source venv/bin/activate

Then run,

pulumi up

pulumi up initiates the deployment process. This command can also be used to perform updates to an existing stack.

 Your terminal will look something like this after a successful deployment.

Pulumi terminal

Pulumi’s deployment engine determines the order in which resources must be provisioned to reach the desired infrastructure state. The Lambda function is created only after the Lambda layer exists, and likewise, the Lambda layer is created only after the ffmpeg library has been uploaded to S3.

Notice that every resource is created with a random alphanumeric suffix appended to the name you specify in the program. Pulumi auto-names resources to avoid name conflicts with existing stacks. You can override this by providing the resource’s physical name attribute when creating it.
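For example, passing the bucket argument (the physical name) pins an S3 bucket to a fixed name instead of an auto-generated one. A minimal sketch, using a hypothetical bucket name:

```python
from pulumi_aws import s3

# 'my-company-video-thumbs' is a hypothetical physical name; S3 bucket
# names are global, so it must be unique to you. Fixed names forgo
# Pulumi's auto-naming and can collide with existing resources.
video_thumbnail = s3.Bucket('video-thumb-bucket',
                            bucket='my-company-video-thumbs')
```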

To tear down the resources in the cloud, run

pulumi destroy

If you also wish to remove the stack from Pulumi’s console, then run

pulumi stack rm

Conclusion

Pulumi makes it quite simple and straightforward to deploy resources to the cloud. Compared with popular cloud infrastructure/configuration management tools like Terraform, Ansible, AWS CloudFormation, or Chef, the key aspect that makes Pulumi a compelling choice is that there is no new language to learn in order to design your cloud infrastructure. Even if you have existing deployments with other IaC tools, Pulumi can integrate with them. For more information, refer to their documentation from here.

Pulumi is capable of handling most use cases, but like all software it has limitations, and certain complex edge cases may be better served by opting for another tool or service. However, Pulumi does make the jump to its platform quite appealing by giving users the ability to code in a programming language of their choice.