Can You Decrease CPU Load by Increasing Memory Size on AWS Lambda?

Oct 23, 2018

 

If your AWS Lambda application is experiencing terrible latencies and delivering a frustrating user experience, you may target high CPU loads as the main problem to solve.

Your first instinct might be to increase the memory size in your AWS Lambda configuration. If your system scaled vertically, this would be a proven solution. But in your Lambda environment, does increasing memory size really decrease your CPU load and improve application performance?

I performed an experiment to find out. The answer is no: increasing memory does *not* decrease your CPU load. But it does improve performance!

 

The Experiment

TL;DR: Skip to section 4, The Effect of Increased Memory, and come back if you are interested in more detail.

 

1. How was the experiment conducted?

In the past, I’ve performed tests to calibrate Thundra’s CPU load calculations. This experiment was run in a very similar fashion with the specific goal to see how memory size affects CPU load.

I tested the effect of memory size on CPU stats for every memory option between 512 MB and 3008 MB. For each memory size, I invoked three different Lambda mini applications 100 times each.

For each application, I gathered CPU statistics. Then I averaged the data collected over 100 runs to calculate mean CPU values for each memory size.

Here, you can check my test code:
https://github.com/plazma-prizma/lambda-cpu-load-test

The code is written in Golang. However, the behavior of your Lambda functions won’t change from language to language so you can expect similar results for other languages as well.

 

2. How do you accurately calculate CPU usage?

I used the formula from this Stack Overflow discussion to calculate CPU usage.

 

Basically, I read the /proc/stat and /proc/[pid]/stat files to get an accurate calculation of the process’s CPU usage. We find the total CPU time elapsed and then subtract the iowait and idle time from it.
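The calculation can be sketched in Go. This is a minimal, self-contained version of the idea, assuming the standard field layout of the first "cpu" line in /proc/stat; the before/after sample values here are made up for illustration, not taken from the experiment:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cpuTimes holds the aggregate counters from the first line of /proc/stat
// (user, nice, system, idle, iowait, irq, softirq, steal), in clock ticks.
type cpuTimes struct {
	user, nice, system, idle, iowait, irq, softirq, steal float64
}

func parseCPULine(line string) cpuTimes {
	fields := strings.Fields(line)[1:] // drop the leading "cpu" label
	v := make([]float64, 8)
	for i := range v {
		v[i], _ = strconv.ParseFloat(fields[i], 64)
	}
	return cpuTimes{v[0], v[1], v[2], v[3], v[4], v[5], v[6], v[7]}
}

func (c cpuTimes) total() float64 {
	return c.user + c.nice + c.system + c.idle + c.iowait + c.irq + c.softirq + c.steal
}

func main() {
	// Hypothetical before/after snapshots of the "cpu" line.
	before := parseCPULine("cpu 100 0 50 1000 10 0 0 0")
	after := parseCPULine("cpu 180 0 70 1600 20 0 0 0")

	dTotal := after.total() - before.total()
	dIdle := after.idle - before.idle
	dIowait := after.iowait - before.iowait

	// System-wide usage: the share of elapsed CPU time that was
	// neither idle nor waiting on I/O.
	sysUsage := (dTotal - dIdle - dIowait) / dTotal
	fmt.Printf("sys_usage = %.3f\n", sysUsage) // prints "sys_usage = 0.141"
}
```

The per-process number works the same way, except the numerator is the utime + stime deltas read from /proc/[pid]/stat.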

You can visit here to learn more about what these terms mean.

 

3. How do you increase CPU usage intentionally?

I ran three different mini applications to see the effect on CPU usage and understand how to load up the Lambda environment. The applications were:

  • A method that only sleeps, to set a baseline CPU load.
  • A method that runs goroutines while sleeping, to create intermittent load on the system.
  • A method that calculates the square root of the numbers from zero to one billion, to put heavy continuous load on the system.

Remember that I measured /proc/stat before and after each run and calculated the delta for each field (the d prefix below).
The output below shows the mean results of each application from the 512 MB tests.

 

3.1. Sleep

 

time.Sleep(time.Second * 3)

 

When I ran this application, I saw that it only increased the idle time of the process. This was as expected.

 

"dUtime",0.02
"dStime",0.02
"dUser",0.84
"dSystem",0.07
"dIdle",598.72
"dNice",0
"dIowait",0.12
"dIrq",0
"dSoftirq",0
"dSteal",0
"dGuest",0
"dGuest_nice",0
"sys_usage",0.00151868163932768
"proc_usage",6.66668518523663e-05

 

As you can see, all of the time that the CPU sleeps is counted as idle time, and our CPU usage is very low, around 0.15% system-wide. We intentionally wasted our resources in this case.
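To see where those two usage numbers come from, we can plug the mean deltas above into the formula from section 2. A quick sketch (the constants are just the table values; fields that were zero are omitted):

```go
package main

import "fmt"

func main() {
	// Mean deltas from the 512 MB sleep run above.
	dUser, dSystem, dIdle, dIowait := 0.84, 0.07, 598.72, 0.12
	dUtime, dStime := 0.02, 0.02

	// Total elapsed CPU time across all fields (the others were zero).
	dTotal := dUser + dSystem + dIdle + dIowait

	// System-wide usage: everything that is neither idle nor iowait.
	sysUsage := (dTotal - dIdle - dIowait) / dTotal
	// Per-process usage: our process's own user + system time.
	procUsage := (dUtime + dStime) / dTotal

	fmt.Printf("sys_usage  = %.6f\n", sysUsage)  // close to the table's sys_usage
	fmt.Printf("proc_usage = %.6f\n", procUsage) // close to the table's proc_usage
}
```

The results don’t match the table to the last digit because the table averages the per-run ratios rather than applying the formula to averaged deltas, but they agree to within rounding.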

 

3.2. Goroutine

Let’s try our second approach. The code below runs a main thread that sleeps while also creating goroutines that spin until a done signal arrives.

 

done := make(chan int)
for i := 0; i < runtime.NumCPU(); i++ {
	go func() {
		for {
			select {
			case <-done:
				return
			default:
			}
		}
	}()
}
time.Sleep(time.Second * 3)
close(done)

 

Ok, here I must admit the tests in this experiment are not the most scientific, because we don’t have a control group. But since we are comparing CPU load, I think they are good enough for our purposes.

When we run this code, we see that the idle time decreases and the time spent in user mode increases. That’s the beauty of using goroutines: while your main thread sleeps, you are still able to use the CPU to execute your tasks.

 

"dUtime",93.52
"dStime",0.02
"dUser",94.3
"dSystem",0.01
"dIdle",515.53
"dNice",0
"dIowait",0.1
"dIrq",0
"dSoftirq",0
"dSteal",0
"dGuest",0
"dGuest_nice",0
"sys_usage",0.154619305835262
"proc_usage",0.153360715528494

 

Our results show that CPU usage is about 15%, which means we haven’t really loaded up the system.

3.3. Sqrt

In our last test, we ran a program to calculate the square root of all numbers from 0 to 1 billion.

 

for i := 0.0; i < 1000000000; i++ {
	math.Sqrt(i)
}

Here are the results when we run that code:

 

"dUtime",130.01
"dStime",0.03
"dUser",129.88
"dSystem",5.59
"dIdle",706.13
"dNice",2.61
"dIowait",2.71
"dIrq",0
"dSoftirq",0
"dSteal",0
"dGuest",0
"dGuest_nice",0
"sys_usage",0.163065294399567
"proc_usage",0.153541282894587

 

What we see here is that the user mode time is high, but dIdle is high too!

Our process is not being executed by the CPU at full capacity; it still sits idle from time to time. This is the first interesting finding we have.
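The deltas back this up. Plugging the sqrt run’s mean values into the usage formula from section 2 shows that roughly 83% of the elapsed CPU time was idle, even though this is our most CPU-hungry workload (a quick sketch using the table values above):

```go
package main

import "fmt"

func main() {
	// Mean deltas from the 512 MB sqrt run above.
	dUser, dSystem, dIdle := 129.88, 5.59, 706.13
	dNice, dIowait := 2.61, 2.71

	dTotal := dUser + dSystem + dIdle + dNice + dIowait

	// Fraction of elapsed CPU time spent idle.
	idleShare := dIdle / dTotal
	// System-wide usage, as in section 2.
	sysUsage := (dTotal - dIdle - dIowait) / dTotal

	fmt.Printf("idle share = %.2f\n", idleShare) // prints "idle share = 0.83"
	fmt.Printf("sys_usage  = %.2f\n", sysUsage)  // prints "sys_usage  = 0.16"
}
```

One plausible contributor, not verified here, is that this loop is single-threaded, so any CPU capacity beyond one core can only show up as idle time.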

 

4. The Effect of Increased Memory

In the previous chapter, we learned that when our process sleeps, the CPU stays idle. Goroutines, on the other hand, run while the main thread sleeps and keep the CPU busier.

Additionally, we learned what happens when we execute a CPU-bound job, like the square root calculation task. In this case, the CPU doesn’t run continuously and in fact stays idle more than we would expect.

Next, I repeated these tasks while increasing the memory size on AWS Lambda.

 

CPU Load of square root test function by memory size

 

As you can see from the graph, the Sqrt task increases the CPU load as memory size increases!

Wait a second - why is this happening? Logically, if we’ve been allocated a more powerful CPU, shouldn’t the load be decreased?

Let’s try this again with the goroutine test case.

 

CPU Load of goroutine test function by memory size

 

The goroutine test function’s CPU load also increased as memory size increased. In fact, at 2048 MB its CPU load increased even more than the sqrt function’s did. We even reached 90% CPU load at 3008 MB.

What is going on here? Let’s look more closely at the data (see the table below).

When we increased the memory size, we saw an increase in dUtime and a decrease in dIdle. That’s why the CPU load increases in both cases.

Think of it this way: by increasing memory size, we don’t get a better physical CPU, but we do get better CPU utilization.

The CPU doesn’t sit idle waiting to execute our tasks; it executes them as fast as it can.

To sum up, increasing the memory size doesn’t make our CPU load decrease. If your application is experiencing latency or bad performance, you can increase the memory size to help improve performance. But don’t expect to see a decrease in CPU load; instead, expect to see improved CPU utilization for your application.


Here you can inspect all of the data I collected. Remember that dUtime and dStime are the statistics from our single process, while dUser and dSystem are statistics summarizing all processes. In our Lambda environment these numbers are very similar because there aren’t any processes running other than our own.

 

 

I’m open to any suggestions to improve the test, and I encourage anyone to try it themselves. Leave your comments to discuss these results. I hope this article is helpful, and I encourage you to try out Thundra to help understand how your AWS Lambda applications are performing.