All You Need to Remember Before Going Serverless with AWS Lambda
AWS (Amazon Web Services) Lambda is a high-scale, provision-free serverless compute offering based on functions. It's easy to use and seems relatively cheap, so many teams are trying it out. Before you do, there are a few things worth keeping in mind.
1. Lambda Cold Start
👉 What exactly is a cold start?
By definition, "A cold start refers to the set-up time required to get a serverless application's environment up and running when it is invoked for the first time within a defined period."
When you create a new Lambda function, a container is created for it, and subsequent executions take place in that container for an indefinite period of time, at the discretion of the cloud provider.
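Container reuse can be observed directly from code: anything at module level runs once per container, while the handler runs on every invocation. A minimal sketch (the handler name, event shape, and return fields are illustrative, not an AWS API):

```python
import time

# Module-level code runs once per container (i.e. on a cold start);
# this state survives across invocations on the same warm container.
_container_started_at = time.time()
_invocation_count = 0

def handler(event, context):
    # The handler body runs on every invocation.
    global _invocation_count
    _invocation_count += 1
    return {
        "cold_start": _invocation_count == 1,
        "container_age_seconds": round(time.time() - _container_started_at, 3),
        "invocation": _invocation_count,
    }
```

Invoking the same warm container twice would report `cold_start` only on the first call; a new container starts the count over.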
👉 When does it happen?
Cold starts happen every time a new function is created, but they can also happen when no warm containers are available, or when the version of your application code changes.
👉 How does it affect your application?
Every time you create a new function or update your code, a cold start takes place. This adds latency, because packages must be downloaded and the environment set up before the function runs. That latency hurts your application's performance, especially when it is public-facing.
👉 What can you do about it?
Cold starts are the most pressing issue in serverless architecture today, and there is no complete solution yet. But there are a few things you can take care of on your side, such as choosing an appropriate language.
- Cold start time varies by language: as the graph below shows, Java and C# take the longest during a cold start. The graph also shows that increasing your function's memory reduces cold start time.
- Another tip is to reduce the size of your deployment package, as it adds to the time required to set up the container. Lambda function code packages are limited to 50 MB compressed and 250 MB extracted in the runtime environment. Every time your container is set up, the dependencies are downloaded and the code unzipped, so reducing the overall size reduces the cold start time.
- Use warm containers as caches for connections, object initialisation, etc. This reduces the execution time spent on them, which in turn saves costs.
2. Concurrency
The first time you invoke your function, AWS Lambda creates an instance of it and runs its handler method to process the event. When the function returns a response, the instance stays alive to process additional events. If you invoke the function again while the first event is still being processed, Lambda creates another instance.
As more events come in, Lambda routes them to available instances and creates new instances as needed. Your function’s concurrency is the number of instances serving requests at a given time. For an initial burst of traffic, your function’s concurrency can reach an initial level of between 500 and 3000, which varies per region.
The regional concurrency limit starts at 1,000 and can be raised by submitting a request to AWS. Because the limit is region-wide, a single Lambda function can exhaust it, so it's wise to set the 'Reserved Concurrency' field when creating your function. This ensures that, at any point in time, a function with 'X' reserved concurrency always has 'X' instances at its disposal.
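Reserved concurrency can also be set after creation from the AWS CLI with `put-function-concurrency`; the function name `orders-api` and the value 100 below are illustrative:

```shell
# Reserve 100 concurrent executions for one function. The reservation is
# carved out of the regional pool, so other functions can never starve
# this one; it also caps this function at 100 concurrent instances.
aws lambda put-function-concurrency \
  --function-name orders-api \
  --reserved-concurrent-executions 100

# Verify the setting:
aws lambda get-function-concurrency --function-name orders-api
```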
3. Memory and Time
The Lambda free tier includes 1M free requests per month and 400,000 GB-seconds of compute time per month. The memory size you choose for your Lambda functions therefore determines how many seconds of execution the free tier covers.
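The relationship is simple arithmetic, sketched here in Python using the 400,000 GB-second figure above:

```python
# The compute free tier is a fixed budget of GB-seconds per month,
# so halving the memory setting doubles the free execution seconds.
FREE_GB_SECONDS = 400_000

def free_seconds_per_month(memory_mb: int) -> float:
    """Seconds of execution per month covered by the compute free tier."""
    return FREE_GB_SECONDS / (memory_mb / 1024)

# At 128 MB: 3,200,000 free seconds; at 1024 MB: 400,000 free seconds.
```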
Choosing an optimal memory-to-time ratio is important for controlling costs on AWS Lambda. A recommended best practice is to test your Lambda function's performance at increasing memory sizes and observe the execution time. Performance improves as memory increases, but past a certain threshold adding memory yields no further advantage. This is because Lambda allocates CPU in proportion to the memory you choose: once the function already has the CPU it needs, the graph flatlines, and further increases in memory are of no use.
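That tuning loop can be sketched as follows. The duration measurements are hypothetical, and the per-GB-second price is the on-demand x86 rate at the time of writing, used purely for illustration:

```python
# Hypothetical measurements: execution time (ms) at each memory size (MB).
# Note the flatline: going from 512 MB to 1024 MB barely helps, because
# the function already has the CPU it needs.
measurements = {128: 3100, 256: 1400, 512: 750, 1024: 740}

PRICE_PER_GB_SECOND = 0.0000166667  # illustrative on-demand rate (USD)

def cost_per_invocation(memory_mb: int, duration_ms: int) -> float:
    """Compute cost of one invocation, ignoring the per-request charge."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * PRICE_PER_GB_SECOND

# Pick the memory size with the lowest cost per invocation.
best = min(measurements, key=lambda mb: cost_per_invocation(mb, measurements[mb]))
```

With these numbers, 256 MB wins: 128 MB runs too long to be cheap, while 512 MB and above pay for memory (and proportional CPU) the function no longer uses.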