

AWS Lambda Power Tuning


AWS Lambda Power Tuning is a state machine powered by AWS Step Functions that helps you optimize your Lambda functions for cost and/or performance in a data-driven way.

The state machine is designed to be easy to deploy and fast to execute. It's also language-agnostic, so you can optimize any Lambda function in your account.

Basically, you can provide a Lambda function ARN as input and the state machine will invoke that function with multiple power configurations (from 128MB to 10GB, you decide which values). Then it will analyze all the execution logs and suggest the best power configuration to minimize cost and/or maximize performance.

Please note that the input function will be executed in your AWS account - performing real HTTP requests, SDK calls, cold starts, etc. The state machine also supports cross-region invocations and you can enable parallel execution to generate results in just a few seconds.

What does the state machine look like?

It's pretty simple and you can visually inspect each step in the AWS Management Console.

(state machine diagram)

What results can I expect from Lambda Power Tuning?

The state machine will generate a visualization of average cost and speed for each power configuration.

For example, this is what the results look like for two CPU-intensive functions, which become cheaper AND faster with more power:

(visualization: first CPU-intensive function)

How to interpret the chart above: execution time goes from 35s with 128MB to less than 3s with 1.5GB, while being 14% cheaper to run.

(visualization: second CPU-intensive function)

How to interpret the chart above: execution time goes from 2.4s with 128MB to 300ms with 1GB, for the very same average cost.

How to deploy the state machine

There are five options for deploying the tool using Infrastructure as Code (IaC).

  1. Deploying the app via the AWS Serverless Application Repository (SAR) - the easiest option
  2. Using the AWS SAM CLI
  3. Using the AWS CDK
  4. Using Terraform by HashiCorp and SAR
  5. Using native Terraform

Read more about the deployment options here.

State machine configuration (at deployment time)

The CloudFormation template (used for options 1 to 4) accepts the following parameters:

| Parameter | Description |
| --- | --- |
| **PowerValues**<br>type: list of numbers<br>default: `[128,256,512,1024,1536,3008]` | These power values (in MB) will be used as the default in case no `powerValues` input parameter is provided at execution time |
| **visualizationURL**<br>type: string<br>default: `lambda-power-tuning.show` | The base URL for the visualization tool; you can bring your own visualization tool |
| **totalExecutionTimeout**<br>type: number<br>default: `300` | The timeout in seconds applied to all functions of the state machine |
| **lambdaResource**<br>type: string<br>default: `*` | The `Resource` used in IAM policies; it's `*` by default, but you could restrict it to a prefix or a specific function ARN |
| **permissionsBoundary**<br>type: string | The ARN of a permissions boundary (policy), applied to all functions of the state machine |
| **payloadS3Bucket**<br>type: string | The S3 bucket name used for large payloads (>256KB); if provided, it's added to a custom managed IAM policy that grants read-only permission to the S3 bucket; more details below in the S3 payloads section |
| **payloadS3Key**<br>type: string<br>default: `*` | The S3 object key used for large payloads (>256KB); the default value grants access to all S3 objects in the bucket specified with `payloadS3Bucket`; more details below in the S3 payloads section |
| **layerSdkName**<br>type: string | The name of the SDK layer, in case you need to customize it (optional) |
| **logGroupRetentionInDays**<br>type: number<br>default: `7` | The number of days to retain log events in the Lambda log groups (before this parameter existed, log events were retained indefinitely) |
| **securityGroupIds**<br>type: list of SecurityGroup IDs | List of Security Groups to use in every Lambda function's VPC Configuration (optional); please note that your VPC should be configured to allow public internet access (via NAT Gateway) or include VPC Endpoints to the Lambda service |
| **subnetIds**<br>type: list of Subnet IDs | List of Subnets to use in every Lambda function's VPC Configuration (optional); please note that your VPC should be configured to allow public internet access (via NAT Gateway) or include VPC Endpoints to the Lambda service |
| **stateMachineNamePrefix**<br>type: string<br>default: `powerTuningStateMachine` | Allows you to customize the name of the state machine; maximum 43 characters, only alphanumeric (plus `-` and `_`). The last portion of the `AWS::StackId` will be appended to this value, so the full name will look like `powerTuningStateMachine-89549da0-a4f9-11ee-844d-12a2895ed91f`. Note: `StateMachineName` has a maximum of 80 characters and 36+1 from the `StackId` are appended, allowing 43 for a custom prefix |

Please note that the total execution time should stay below the totalExecutionTimeout you configure, which defaults to 300 seconds (5 minutes). You can easily estimate the required value based on the average duration of your function. For example, if your function's average execution time is 5 seconds and you haven't enabled parallelInvocation, you should set totalExecutionTimeout to at least num * 5: 50 seconds if num=10, 500 seconds if num=100, and so on. If you have enabled parallelInvocation, you usually don't need to tune totalExecutionTimeout unless your average execution time is above 5 minutes. If you've configured a sleep between invocations (sleepBetweenRunsMs), include it in your calculations as well: for example, with num=50, an average duration of 2 seconds, and a 500ms sleep, you'd need at least 50 * 2.5 = 125 seconds.
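
For reference, here's a minimal sketch of a CloudFormation parameters file overriding some of these defaults, assuming you pass it to the AWS CLI (e.g. `aws cloudformation create-stack --parameters file://params.json`) or adapt the same values to SAR, SAM, CDK, or Terraform. All values below are purely illustrative:

```json
[
  { "ParameterKey": "PowerValues", "ParameterValue": "128,256,512,1024,3008" },
  { "ParameterKey": "totalExecutionTimeout", "ParameterValue": "900" },
  { "ParameterKey": "lambdaResource", "ParameterValue": "arn:aws:lambda:us-east-1:123456789012:function:my-team-*" },
  { "ParameterKey": "logGroupRetentionInDays", "ParameterValue": "14" }
]
```

In this format, list-type parameters such as PowerValues are typically passed as a single comma-delimited string.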

How to execute the state machine

You can execute the state machine manually or programmatically; see the documentation here.

State machine input (at execution time)

Each execution of the state machine requires an input where you can define the following parameters:

| Parameter | Description |
| --- | --- |
| **lambdaARN** (required)<br>type: string | Unique identifier of the Lambda function you want to optimize |
| **num** (required)<br>type: integer | The number of invocations for each power configuration (minimum 5, recommended: between 10 and 100) |
| **powerValues**<br>type: string or list of integers | The list of power values to be tested; if not provided, the default values configured at deploy-time are used; you can provide any power values between 128MB and 10,240MB (⚠️ New AWS accounts have reduced concurrency and memory quotas, 3008MB max) |
| **payload**<br>type: string, object, or list | The static payload that will be used for every invocation (object or string); when using a list, a weighted payload is expected in the shape of `[{"payload": {...}, "weight": X}, {"payload": {...}, "weight": Y}, {"payload": {...}, "weight": Z}]`, where the weights `X`, `Y`, and `Z` are treated as relative weights (not percentages); more details below in the Weighted Payloads section |
| **payloadS3**<br>type: string | A reference to Amazon S3 for large payloads (>256KB), formatted as `s3://bucket/key`; it requires read-only IAM permissions, see the `payloadS3Bucket` and `payloadS3Key` deployment parameters above and find more details in the S3 payloads section |
| **parallelInvocation**<br>type: boolean<br>default: `false` | If true, all the invocations will be executed in parallel (note: depending on the value of `num`, you may experience throttling when setting `parallelInvocation` to true) |
| **strategy**<br>type: string<br>default: `"cost"` | It can be `"cost"`, `"speed"`, or `"balanced"`; if you use `"cost"` the state machine will suggest the cheapest option (disregarding its performance), while if you use `"speed"` the state machine will suggest the fastest option (disregarding its cost); when using `"balanced"` the state machine will choose a compromise between `"cost"` and `"speed"` according to the parameter `"balancedWeight"` |
| **balancedWeight**<br>type: number<br>default: `0.5` | Parameter that expresses the trade-off between cost and time; the value is between 0 and 1, where 0.0 is equivalent to the `"speed"` strategy and 1.0 is equivalent to the `"cost"` strategy |
| **autoOptimize**<br>type: boolean<br>default: `false` | If true, the state machine will apply the optimal configuration at the end of its execution |
| **autoOptimizeAlias**<br>type: string | If provided - and only if `autoOptimize` is true - the state machine will create or update this alias with the new optimal power value |
| **dryRun**<br>type: boolean<br>default: `false` | If true, the state machine will execute the input function only once and it will disable every functionality related to logs analysis, auto-tuning, and visualization; the dry-run mode is intended for testing purposes, for example to verify that IAM permissions are set up correctly |
| **preProcessorARN**<br>type: string | It must be the ARN of a Lambda function; if provided, the function will be invoked before every invocation of `lambdaARN`; more details below in the Pre/Post-processing functions section |
| **postProcessorARN**<br>type: string | It must be the ARN of a Lambda function; if provided, the function will be invoked after every invocation of `lambdaARN`; more details below in the Pre/Post-processing functions section |
| **discardTopBottom**<br>type: number<br>default: `0.2` | By default, the state machine will discard the top/bottom 20% of "outliers" (the fastest and slowest invocations) to filter out the effects of cold starts that would bias the overall averages; you can customize this parameter by providing a value between 0 and 0.4, with 0 meaning no results are discarded and 0.4 meaning that 40% of the top/bottom results are discarded (i.e. only 20% of the results are considered) |
| **sleepBetweenRunsMs**<br>type: integer | If provided, the time in milliseconds that the tuner function will sleep/wait after invoking your function, but before carrying out the Post-Processing step (if configured); this can be used if you have aggressive downstream rate limits you need to respect; by default this is set to 0 and the function won't sleep between invocations; setting this value has no effect if running the invocations in parallel |
| **disablePayloadLogs**<br>type: boolean<br>default: `false` | If provided and set to a truthy value, suppresses payload from error messages and logs; if `preProcessorARN` is provided, this also suppresses the output payload of the pre-processor |
| **includeOutputResults**<br>type: boolean<br>default: `false` | If provided and set to true, the average cost and average duration for every power value configuration will be included in the state machine output |

Here's a typical execution input with basic parameters:

```json
{
    "lambdaARN": "your-lambda-function-arn",
    "powerValues": [128, 256, 512, 1024],
    "num": 50,
    "payload": {}
}
```
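
Building on the basic example above, here's a sketch of a more advanced input that combines weighted payloads, parallel invocation, and the balanced strategy; the ARN, payloads, and weights below are purely illustrative:

```json
{
    "lambdaARN": "your-lambda-function-arn",
    "powerValues": [128, 256, 512, 1024, 1536, 3008],
    "num": 50,
    "payload": [
        { "payload": { "test": "event1" }, "weight": 50 },
        { "payload": { "test": "event2" }, "weight": 15 },
        { "payload": { "test": "event3" }, "weight": 35 }
    ],
    "parallelInvocation": true,
    "strategy": "balanced",
    "balancedWeight": 0.3,
    "includeOutputResults": true
}
```

If your payload exceeds 256KB, replace the inline payload with a reference like "payloadS3": "s3://your-bucket/your-key.json" (bucket and key are placeholders) and make sure read-only access is granted via the payloadS3Bucket and payloadS3Key deployment parameters.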

State Machine Output

The state machine will return the following output:

```json
{
  "results": {
    "power": "128",
    "cost": 0.0000002083,
    "duration": 2.9066666666666667,
    "stateMachine": {
      "executionCost": 0.00045,
      "lambdaCost": 0.0005252,
      "visualization": "https://lambda-power-tuning.show/#<encoded_data>"
    },
    "stats": [{ "averagePrice": 0.0000002083, "averageDuration": 2.9066666666666667, "value": 128}, ... ]
  }
}
```

More details on each value:

  • results.power: the optimal power configuration (RAM)
  • results.cost: the corresponding average cost (per invocation)
  • results.duration: the corresponding average duration (per invocation)
  • results.stateMachine.executionCost: the AWS Step Functions cost corresponding to this state machine execution (fixed value for "worst" case)
  • results.stateMachine.lambdaCost: the AWS Lambda cost corresponding to this state machine execution (depending on num and average execution time)
  • results.stateMachine.visualization: if you visit this autogenerated URL, you will be able to visualize and inspect average statistics about cost and performance; important note: average statistics are NOT shared with the server since all the data is encoded in the URL hash (example), which is available only client-side
  • results.stats: the average duration and cost for every tested power value configuration (only included if includeOutputResults is set to a truthy value)

Data visualization

You can visually inspect the tuning results to identify the optimal tradeoff between cost and performance.

(visualization screenshot)

The data visualization tool has been built by the community: it's a static website deployed via AWS Amplify Console and it's free to use. If you don't want to use the visualization tool, you can simply ignore the visualization URL provided in the execution output. No data is ever shared or stored by this tool.

Website repository: matteo-ronchetti/aws-lambda-power-tuning-ui

Optionally, you could deploy your own custom visualization tool and configure the CloudFormation Parameter named visualizationURL with your own URL.

Additional features, considerations, and internals

Here you can find out more about some advanced features of this project, its internals, and some considerations about security and execution cost.

Contributing

Feature requests and pull requests are more than welcome!

How to get started with local development?

For this repository, install dev dependencies with npm install. You can run tests with npm test, linting with npm run lint, and coverage with npm run coverage. Unit tests will run automatically on every commit and PR.
