Development Best Practices (The Ultimate Guide to AWS Lambda Development Chapter 1)

George Mao
7 min read · Jan 12, 2024


Chapter 1 of this guide focuses on themes you should follow during your development lifecycle. There are 7 themes we will walk through. Let’s go!

If you have questions or would like to discuss, join us in Discord at #BelieveInServerless!

Don’t load until needed (Lazy Load)

“Load everything in global scope so you can reuse during warm invokes.”

In general this is good advice, but you should consider how much of your code base needs the globally loaded variable. For example, consider global loading of the AWS SDK. You normally do this statically at the top of your function. If your function has two distinct code paths and only one of those paths actually needs to use the AWS SDK — you should consider lazy loading the SDK. The more code paths you have, the more this will help.

Compare these two pseudo-code samples. The first one statically loads the AWS SDK v3 DynamoDB client in global scope on every cold invoke of the function. Invocations that land in the else block still pay that cold start penalty, even though they never use the SDK.

// Load the AWS SDK in global scope for warm reuse
const { DynamoDBClient, ListTablesCommand } = require("@aws-sdk/client-dynamodb");
const ddbClient = new DynamoDBClient({ region: "us-east-2" });

exports.handler = async function (event, context) {
    // Code path 1 runs about 50% of the time and uses DDB.
    // event.usesDynamo is a hypothetical flag standing in for that condition.
    if (event.usesDynamo) {
        const command = new ListTablesCommand({});
        const results = await ddbClient.send(command);
    }
    else {
        // This path does not use the DDB SDK
        console.log("This path does not use the DDB SDK");
    }

    return "Whatever!";
};

This second example performs a lazy load only when the SDK is needed (i.e., when code path 1 is executed).


let DynamoDBClient, ListTablesCommand;
let ddbClient;

exports.lambdaHandler = async (event, context) => {

    // Code path 1 uses DDB.
    // event.usesDynamo is a hypothetical flag standing in for that condition.
    if (event.usesDynamo) {
        if (!ddbClient) { // Don't lazy load more than once
            ({ DynamoDBClient, ListTablesCommand } = require("@aws-sdk/client-dynamodb"));
            ddbClient = new DynamoDBClient({ region: "us-east-2" });
        }
        const command = new ListTablesCommand({});
        const results = await ddbClient.send(command);
    }
    else {
        // This path does not use the DDB SDK. Don't load it here.
        console.log("This path does not use the DDB SDK");
    }

    return "Whatever!";
};

You can do this for all sorts of global variables, such as secrets, SDKs, and other parameters that may require API or network calls.
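The same pattern can be wrapped in a small reusable helper. The following is a minimal sketch, not a definitive implementation: `lazy`, `getConfig`, and the `event.needsConfig` flag are hypothetical names introduced for illustration, and the loader here is a stand-in for a real secret fetch or SDK client construction.

```javascript
// Hypothetical helper: wraps an expensive loader so it runs at most once
// per execution environment, and only when a code path first asks for it.
function lazy(loader) {
  let cached; // holds the promise created by the first call
  return function get() {
    if (cached === undefined) {
      cached = Promise.resolve()
        .then(loader)
        .catch((err) => {
          cached = undefined; // allow a retry on the next call
          throw err;
        });
    }
    return cached;
  };
}

// Stand-in for a secret fetch, parameter lookup, or SDK client construction.
let loadCount = 0;
const getConfig = lazy(async () => {
  loadCount += 1;
  return { tableName: "example-table" };
});

async function handler(event) {
  // event.needsConfig is a hypothetical flag for "this path needs the value"
  if (event.needsConfig) {
    const config = await getConfig();
    return config.tableName;
  }
  return "skipped";
}
```

Caching the promise rather than the resolved value means two calls that overlap during the first load share a single in-flight request instead of starting two.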

Use the right SDK (Hint: Upgrade!)

Most of the AWS SDKs have evolved through multiple major versions. Generally, the older versions of the SDKs were designed before Lambda existed, which means they are not designed to be fully modular and performant in Lambda. Make sure you use the latest version of the SDK for your runtime. This can significantly reduce package size and improve performance. Keep in mind that major version upgrades are not backward compatible. You will need to rewrite code.

  • Node: Use SDK v3, not v2 (v3 is included in the Lambda runtime as of the node18 runtime)
  • Java: Use SDK v2, not v1 (not included in Lambda runtimes)
  • Python: Boto3 is the only SDK available (bundled in Lambda runtimes)

Use the newer(est) Lambda Runtimes!

I know you have Java 8 somewhere. It runs 15–20% slower than Java 17+. Upgrade, please :)

All of the latest Lambda runtimes (node20, java21, python3.12) have been upgraded to Amazon Linux 2023. This base image has been optimized and reduced from ~100 MB to ~40 MB.

Import only what is needed

This one is simple … assuming you’re already using the right SDK version. Only include things needed to actually execute the function.

No documentation, no sample code, no extra libraries, and no debugging dependencies.

Example: If you’re setting up dev, testing, and build dependencies, make sure they don’t end up in your deployable package. Here’s a Node package.json that pulls unnecessary dev dependencies into the build.

{
    "name": "firstFn",
    "version": "1.0.0",
    "description": "Serverless Lambda Nodejs example",
    "main": "src/index.mjs",
    "devDependencies": {
        "c8": "^7.13.0"
    },
    "dependencies": {
        "mocha": "^10.2.0", // This should be in devDependencies
        "esbuild": "^0.19.5", // This should be in devDependencies
        "aws-xray-sdk-core": "3.5.3",
        "@aws-sdk/client-dynamodb": "^3.347.1",
        "https-proxy-agent": "^7.0.2",
        "@aws-sdk/node-http-handler": "^3.3.3",
        "@aws-lambda-powertools/logger": "^1.14.2"
    }
}
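One way to catch this kind of mistake early is a small check in your build pipeline. The sketch below is only an illustration: the `DEV_ONLY` list and the `findMisplacedDevDependencies` function are hypothetical, and you would need to maintain the list for your own tooling.

```javascript
// Hypothetical list of packages that are only useful at build or test time.
const DEV_ONLY = new Set(["mocha", "esbuild", "c8", "jest", "eslint", "typescript"]);

// Returns the names under "dependencies" that look like dev-only tooling.
function findMisplacedDevDependencies(pkg) {
  const runtimeDeps = Object.keys(pkg.dependencies || {});
  return runtimeDeps.filter((name) => DEV_ONLY.has(name)).sort();
}

// A trimmed copy of the package.json above, as a plain object.
const pkg = {
  dependencies: {
    mocha: "^10.2.0",
    esbuild: "^0.19.5",
    "aws-xray-sdk-core": "3.5.3",
    "@aws-sdk/client-dynamodb": "^3.347.1",
  },
  devDependencies: { c8: "^7.13.0" },
};

console.log(findMisplacedDevDependencies(pkg)); // prints [ 'esbuild', 'mocha' ]
```

Once the packages are moved into devDependencies, installing with `npm ci --omit=dev` before packaging keeps them out of the deployable artifact.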

Use Configuration/Integration over Code

In general, your architectures should prefer native AWS integrations to pass data between components, rather than custom code. For example, consider a use case where you’re writing data to DynamoDB and need to process changes in your data. You could write your function to:

  1. Write to DynamoDB
  2. Notify a second component that data has arrived
  3. Spin up (or allocate) compute required to process the data
  4. Query and perform the processing
Tightly coupling AWS services with custom code

Instead, you should simplify and reduce this architecture:

  1. Write to DynamoDB
  2. Enable the built-in DynamoDB stream feature and let AWS push change data to the stream
  3. Lambda polls the stream on your behalf and delivers records to a processing Lambda function
Simplified architecture using AWS features

This lets you remove multiple components and get rid of a DynamoDB Query/Get call, resulting in cost-efficient scaling and better performance.

Whenever you can, let AWS handle all of the logistics of pushing and storing data. Let AWS handle polling it and delivering it to you.
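To make the simplified architecture concrete, here is a minimal sketch of what the processing function could look like. It assumes the stream is configured with a view type that includes new images (NEW_IMAGE or NEW_AND_OLD_IMAGES) and that items carry a `CustomerName` string attribute. Both are assumptions for illustration, as is the shape of the returned value.

```javascript
// Minimal sketch of a processing function for a DynamoDB stream event.
// The changed item arrives inside each record, so no Query/Get call is needed.
async function handler(event) {
  const processed = [];
  for (const record of event.Records) {
    // eventName is INSERT, MODIFY, or REMOVE
    if (record.eventName === "REMOVE") {
      continue; // deletions carry no NewImage
    }
    // Attribute values arrive in DynamoDB JSON, for example { S: "George" }
    const image = record.dynamodb.NewImage;
    processed.push({
      customer: image.CustomerName.S, // CustomerName is an assumed attribute
      change: record.eventName,
    });
  }
  return processed;
}
```

The function contains only processing logic. Notification, polling, and batching are left to the stream and the Lambda event source mapping.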

Another example: Instead of writing code to notify or invoke downstream systems, push the notification to a messaging service such as SNS, SQS, or Kinesis. Let the built-in Lambda integration deliver messages to your function (for SQS and Kinesis, Lambda polls on your behalf). Use filters to determine whether a message should be delivered at all.

Filter, Filter, FILTER!

Most AWS messaging services that work with Lambda now support event filtering. There’s no need to deliver bad records, or to fan out every message to all consumers when each consumer is designed for specific types of messages. Message payloads can also change because of upstream changes, and a bug can produce messages that can’t be processed. Filtering for processable messages helps prevent brownouts of your Lambda function and its downstream services.

Filter out bad messages to reduce Lambda invocations
Fan out only the right messages to the correct consumer

Only deliver proper, relevant messages to your downstream. This reduces cost and improves performance.

Note that the filter configuration differs depending on the service the filter is configured on, because each service uses a different payload syntax. The filter needs to match that service’s syntax. You start with a service key from this list:

The JSON syntax looks like this:

{
    "Filters": [
        {
            "Pattern": {
                "[insert filtering key here]": {
                    "Further define the JSON for pattern matching"
                }
            }
        }
    ]
}

Here’s an example for a Filter on a DynamoDB stream (which uses the “dynamodb” key):

{
    "Filters": [
        {
            "Pattern": {
                "dynamodb": {
                    "Keys": {
                        "CustomerName": {
                            "S": [ "George" ]
                        }
                    }
                }
            }
        }
    ]
}

Here’s one for SQS, which uses the “body” key:

{
    "Filters": [
        {
            "Pattern": {
                "body": {
                    "SomeJSONKey": [ "Some value" ]
                }
            }
        }
    ]
}

You can use comparison operators as well. See the supported list here.
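As a sketch of a comparison operator, the snippet below builds filter criteria for an SQS event source that only passes messages whose JSON body has a `Temperature` value above a threshold. The `Temperature` field and the `buildSqsFilterCriteria` function are hypothetical. Note one difference from the JSON shown above: when you set filters through the Lambda API or an SDK, each `Pattern` is passed as a JSON string, which is why it goes through `JSON.stringify` here.

```javascript
// Builds a FilterCriteria object for an SQS event source mapping.
// "Temperature" is an assumed field in the JSON message body.
function buildSqsFilterCriteria(minTemperature) {
  const pattern = {
    body: {
      // Numeric comparison: match only when Temperature > minTemperature
      Temperature: [{ numeric: [">", minTemperature] }],
    },
  };
  // The API expects each Pattern as a JSON string, not a nested object
  return { Filters: [{ Pattern: JSON.stringify(pattern) }] };
}

const filterCriteria = buildSqsFilterCriteria(30);
console.log(JSON.stringify(filterCriteria, null, 2));
```

The resulting object is the kind of value you would supply as `FilterCriteria` when creating or updating an event source mapping.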

Establish & Reuse

Anything that you intend to use throughout your Lambda lifecycle or is relatively static should be loaded once and reused many times. The general logic should be like this:

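// This will execute once during cold start.
// Top-level await requires an ES module (for example, index.mjs).
let aSecret = await fetchSecret();

export const lambdaHandler = async (event, context) => {
    // Check if the secret has been loaded.
    // Check if it's valid, not expired, etc.
    // Load it if needed.
    if (!isValid(aSecret)) {
        aSecret = await fetchSecret();
    }

    // Proceed with function logic
    // ...
};

async function fetchSecret() {
    // Fetch and load secrets from an API, DB, AWS service, etc.
}

function isValid(secret) {
    // Hypothetical check: replace with your own validity or expiry logic
    return secret !== undefined;
}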

If your function executes 10 times in a row in the same execution environment, this code calls your downstream secret store only once. Without logic like this in place, the code calls your secret store on all 10 invocations, multiplied by your function concurrency.
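The "valid, not expired" check can be made concrete with a simple time-to-live. This is a minimal sketch under stated assumptions: the 5 minute `TTL_MS`, the `getSecret` helper, and the counting `fetchSecret` stand-in are all hypothetical and exist only to show the caching behavior.

```javascript
const TTL_MS = 5 * 60 * 1000; // hypothetical 5 minute lifetime
let cached = null; // { value, expiresAt } once a secret has been fetched
let fetchCount = 0;

// Stand-in for a call to a secret store; counts how often it is called.
async function fetchSecret() {
  fetchCount += 1;
  return `secret-v${fetchCount}`;
}

// Returns the cached secret, refetching only when missing or expired.
// "now" is a parameter so the expiry logic can be exercised without waiting.
async function getSecret(now = Date.now()) {
  if (cached === null || now >= cached.expiresAt) {
    const value = await fetchSecret();
    cached = { value, expiresAt: now + TTL_MS };
  }
  return cached.value;
}

async function lambdaHandler(event) {
  const secret = await getSecret();
  // Proceed with function logic that uses the secret
  return secret.length > 0;
}
```

Warm invokes inside the TTL window reuse the cached value, and the first invoke after expiry pays for one refresh.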

Summary

The TL;DR:

  • Lazy load where possible
  • Use the latest SDKs
  • Use the latest Runtimes
  • Import only what is needed
  • Prefer Configuration over Code
  • Filter all events to Lambda
  • Establish and Reuse

Join us on Discord at #BelieveInServerless to talk about these. Let me know if there are more themes you follow!


George Mao

Distinguished Engineer @ Capital One leading all things Serverless | Ex -AWS WW Serverless Tech Lead.