Hello, folks, hope you are doing fantastic. Welcome to the third post of this newsletter, where we'll have a little discussion about serverless, specifically the cold start problem, along with the noisy neighbor problem in multi-tenant cloud environments.
Before we get started, hello 👋 to the new subscribers who joined this week.
With that being said, let's kick off this edition of the newsletter.
Serverless: off the top of our heads
If I ask you, should we use serverless for our service architecture, what comes to your mind? Let's recall things off the top of our heads without shuffling through our notes.
Being aware that every technology or architectural style has trade-offs, one of the first pros that comes to mind is dramatically reduced operational overhead: cloud serverless platforms abstract away infrastructure management (server provisioning, configuration, scaling, maintenance, security updates and all that), letting us focus on code and ship our software faster.
In addition, we don't have to run servers all the time. Functions get triggered when an event occurs and we pay only for the events processed, which saves money.
Now, since our code doesn't run all the time in a serverless setup, there is a cold start time: the time the infrastructure needs to initialize resources and boot up an instance when a client request arrives or an event occurs.
This boot-up time hurts latency-sensitive use cases and is what devs weigh up when considering serverless for their service architectures.
Furthermore, with serverless, we are locked into a vendor and have less control over how our service is deployed.
Well, these are trade-offs, as I mentioned before. With serverless, we get reduced operational overhead and lower costs, but it also brings cold start times, less control over deployment, trickier state management and such.
If you wish to delve into serverless state management, check out a detailed system design case study that I've published on my other newsletter, where I've discussed serverless compute and storage at the edge with stateless and stateful functions.
Discussing things further, InfoQ recently published an article debunking the myths around the cold start problem, how it is often misunderstood and the strategies to mitigate it. Cold starts might not be that big of a challenge for every cloud use case after all. I'll give the gist here and link back to the article so you can read further.
The cold start problem
Not all serverless implementations have the same cold start time. Factors that influence it include the choice of runtime, configuration settings and whether the function is part of a virtual private cloud. While some cold starts may take a few seconds, others can be much quicker.
Cold starts might not be an issue for services that are not latency-sensitive and want to leverage the upsides of the serverless architecture.
A VPC (Virtual Private Cloud) is an isolated virtual section of a public cloud infrastructure. It gives businesses a private space within a public cloud with customizable setup and deployments.
With a VPC, businesses can stay on a public cloud while getting the flexibility, scalability and security they would otherwise look for in their private data centers.
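If you're curious what setting one up looks like programmatically, here's a tiny sketch using boto3, AWS's Python SDK. The CIDR ranges and region below are placeholder values I've picked for illustration, not recommendations.

```python
# A minimal sketch of creating an isolated VPC on AWS with boto3.
# The CIDR ranges and region are arbitrary placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create a private address space (the VPC itself).
vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")
vpc_id = vpc["Vpc"]["VpcId"]

# Carve out a subnet inside it for our deployments (e.g. functions attached to the VPC).
subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock="10.0.1.0/24")
print("VPC:", vpc_id, "Subnet:", subnet["Subnet"]["SubnetId"])
```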
While cold starts can introduce latency, their frequency and impact can be significantly mitigated with the right architecture and the solutions different platforms offer.
They do not occur all the time, only when a new instance is required. This is typically after a period of inactivity, when a newly deployed function serves its first request, or when scaling out to handle increased load. AWS Lambda, for instance, reuses function instances whenever possible.
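To make instance reuse concrete, here's a minimal sketch of what an AWS Lambda-style Python handler could look like; the function and its fields are hypothetical and only for illustration. Anything initialized at module level runs once per cold start, and warm invocations on the same instance reuse it.

```python
# Minimal sketch of an AWS Lambda-style handler illustrating instance reuse.
# Module-level code runs once per cold start; warm invocations skip it.
import time

_BOOTED_AT = time.time()          # set when the instance is initialized (cold start)
_invocation_count = 0             # survives across invocations on a warm instance

# Expensive setup (DB connections, config loading, ML model loading)
# belongs here so it is paid once per instance, not once per request.

def handler(event, context):
    global _invocation_count
    _invocation_count += 1
    cold = _invocation_count == 1  # first invocation on this instance was the cold one
    return {
        "cold_start": cold,
        "instance_age_seconds": round(time.time() - _BOOTED_AT, 2),
        "invocations_on_this_instance": _invocation_count,
    }
```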
Keeping the serverless functions warm with no-operation instructions
It's a common thought in developer circles: what if we keep our serverless functions warm with no-operation instructions to bypass the cold start problem? However, this is more of a misconception and is not practical.
A no-operation instruction is essentially an empty command sent to keep our serverless functions alive. These instructions go by other names as well, such as heartbeat instruction, keep-alive instruction or dummy instruction.
The idea is to send signals at regular intervals to keep the functions warm.
No-op instructions do not guarantee warm instances. The platform manages the warm state automatically and may terminate warm instances at any time to optimize resource allocation, so artificial pings interfere with that in unpredictable ways. They can also waste resources, cause throttling and so on, which is why cloud platforms don't recommend them.
Too many uncontrolled triggers could also eat into our serverless quota, driving up costs.
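For completeness, this is roughly what the keep-warm hack tends to look like: a scheduled rule pings the function with a dummy payload, and the handler short-circuits it. The "keep-warm-ping" marker below is a made-up convention, and, again, the platform can still recycle the instance whenever it wants, so treat this as an illustration rather than a recommendation.

```python
# Sketch of the (not recommended) keep-warm pattern: a scheduled rule invokes
# the function with a dummy "ping" payload so it returns immediately.
def handler(event, context):
    # Short-circuit the no-op ping before doing any real work.
    if event.get("source") == "keep-warm-ping":   # hypothetical marker field
        return {"status": "warm"}

    # ... real request handling goes here ...
    return {"status": "handled"}
```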
As an alternative, provisioned concurrency is a recommended approach where we can reserve a fixed number of warm instances to ensure faster response times, averting the cold start problem.
Serverless Provisioned Concurrency
Serverless provisioned concurrency is a feature offered by cloud platforms that enables us to configure a specific number of function instances to be kept warm and pre-initialized, ready to handle incoming requests. This helps to eliminate or significantly reduce the cold start issue.
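On AWS Lambda, for example, provisioned concurrency is configured on a published version or an alias. Here's a hedged boto3 sketch; the function name, alias and the count of 5 are placeholders, not recommendations.

```python
# Sketch: reserve 5 pre-initialized instances for a Lambda alias with boto3.
import boto3

lambda_client = boto3.client("lambda")

lambda_client.put_provisioned_concurrency_config(
    FunctionName="checkout-service",       # hypothetical function name
    Qualifier="prod",                      # a published version or alias
    ProvisionedConcurrentExecutions=5,     # number of warm instances to keep
)
```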
However, unlike purely on-demand instances that run only when an event triggers them, these pre-warmed instances cost money even while sitting idle, much like regular server instances.
For an optimum configuration, it's essential that we continuously monitor our serverless usage. Based on the monitoring results, we can strike a balance between invocation frequency, cost and service latency.
System observability allows us to gauge the impact of cold starts on overall service performance, which is essential for coming up with effective optimization strategies.
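As one way to quantify this on AWS, Lambda's REPORT log lines carry an @initDuration field for cold starts, which CloudWatch Logs Insights can aggregate. Here's a rough boto3 sketch; the log group name and the 24-hour window are placeholders.

```python
# Sketch: measure how often cold starts happen and how long init takes,
# using a CloudWatch Logs Insights query over a Lambda log group.
import time
import boto3

logs = boto3.client("logs")

query = """
filter @type = "REPORT"
| stats count(*) as invocations,
        sum(ispresent(@initDuration)) as cold_starts,
        avg(@initDuration) as avg_init_ms
"""

resp = logs.start_query(
    logGroupName="/aws/lambda/checkout-service",   # hypothetical log group
    startTime=int(time.time()) - 24 * 3600,        # last 24 hours
    endTime=int(time.time()),
    queryString=query,
)

# Logs Insights queries run asynchronously, so poll for the result.
while True:
    result = logs.get_query_results(queryId=resp["queryId"])
    if result["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)
print(result["results"])
```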
If you wish to delve into observability in distributed services, here is a detailed post on it I have written.
Here is the InfoQ article on cold starts again, if you wanna delve further.
The noisy neighbor problem
The noisy neighbor problem is a common problem in the cloud, where system resources are shared between multiple applications, also called tenants.
This multi-tenant deployment is the norm in public cloud platforms, which run on economies of scale, and some workloads may try to hog most of the resources (CPU, memory, disk I/O and network bandwidth), negatively affecting the infrastructure for other workloads running alongside them.
This can lead to resource contention, where other workloads experience degraded performance, increased latency, or even service disruptions.
Virtualization is a key enabler for a system to share its resources amongst multiple apps or services. In the illustration above, three applications share the resources of a bare metal server, each running on its respective virtual machine. The VMs (Virtual Machines) are hosted on a hypervisor over the host bare metal OS.
App B proves to be a noisy neighbor, hogging most of the resources and making it hard for the other apps to function. All VMs share the CPU, memory and disk I/O of the bare metal server, and the App B VM consumes a disproportionate amount of those resources, impacting the performance of the other VMs running on the same host.
Cloud providers employ multiple strategies to avert this issue, such as strict resource isolation with quotas and limits enforced, dynamic resource allocation with continuous monitoring and so on.
However, it is important for us as developers to be aware of this problem so we can plan for our workload's resource availability, scalability and performance. We should design applications that are resilient to fluctuations in resource availability and performance.
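What that resilience looks like in application code is largely the usual suspects: timeouts, retries with backoff, circuit breakers. Here's a small generic sketch of the retry-with-backoff part; call_dependency is a placeholder for any call whose latency or reliability may fluctuate.

```python
# Sketch: retry a flaky or slow dependency with exponential backoff and jitter,
# so transient slowdowns (e.g. from a noisy neighbor) don't cascade.
import random
import time

def call_with_retries(call_dependency, max_attempts=4, base_delay=0.2):
    """call_dependency is a placeholder callable that may raise on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call_dependency()
        except Exception:
            if attempt == max_attempts:
                raise                                  # give up, surface the error
            # Exponential backoff with jitter: 0.2s, 0.4s, 0.8s... plus noise.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.1)
            time.sleep(delay)
```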
By evaluating how different cloud providers manage resource isolation and mitigate the impact of noisy neighbors, we can pick the provider and deployment model that best align with our application's performance and scalability requirements.
Ideally, we should review the platform's documentation to learn about resource isolation mechanisms, resource allocation policies, and any performance guarantees or Service Level Agreements (SLAs) in place, and also evaluate how sensitive our application is to changes in resource availability or performance. This applies even to simple blog deployments on shared hosting.
That's a wrap, folks. I'll see you in the next post. If you found the content helpful, do share it with your network for more reach.
You can read the previous post here:
Your feedback is crucial to this newsletter. I can't stress this enough. Please do reply to this email with your thoughts and with anything in particular you'd like me to bring up in future posts. You can connect with me on LinkedIn & X as well.
Bye for now!
Have you played with Unikraft? They provide really snappy cold starts; have a look here: https://packagemain.tech/p/millisecond-scale-to-zero-with-unikernels