Backend Engineering Insights #1: PACELC theorem, create an ultra fast backend server with Nitro & more

Jan 29, 2024

Hello, citizens of the backend engineering realm. My name is Shivang and this is the first edition of the backend engineering insights newsletter that I've kickstarted to keep myself and you in the loop of the recent developments in the backend engineering space.

What will you find in this newsletter?

Ideally, you'll find short informational snippets on distributed systems, cloud, application development, shiny new products, tech trends, learning resources, and essentially everything that is part of the backend engineering ecosystem.

Being a part of this newsletter, you'll stay on top of the developments that happen in this space on an ongoing basis.

Why this newsletter?

I actively read to stay informed in the backend engineering space, so why not write down my learnings and findings in a newsletter for myself and the community? I will also focus on keeping the content concise for a fun, relaxed read.

If you are someone who looks forward to long-form content, you might want to check out my other newsletter called the Web Scale, where I write long-form content on distributed systems design and backend engineering in general.

PACELC Theorem: An Extension To The CAP Theorem

The PACELC theorem is an extension of the CAP theorem. It states that if a network partition (P) occurs in a distributed system, we have to choose between availability (A) and consistency (C), as per the CAP theorem; else (E), when the system is running normally without partitions, we have to choose between latency (L) and consistency (C).

For instance, in a globally distributed service deployed across cloud regions, we move compute and storage close to the end user to keep latency low. However, when each region accepts DB writes locally, strong consistency cannot be guaranteed globally without waiting on cross-region coordination: writes take some time to sync across regions.

Several databases, such as Cassandra, Riak, and Cosmos DB, typically run as PA/EL systems under this theorem: if a partition occurs, they give up consistency for availability, and under normal circumstances, they choose low latency over consistency.
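To make the latency vs. consistency choice concrete, here is a minimal sketch, assuming a three-replica setup with made-up round-trip times. The names (`Replica`, `write_latency_ms`, `reads_see_latest_write`) are hypothetical and do not come from any database's API; the sketch only models how long a write waits depending on how many acknowledgements it requires.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    region: str
    round_trip_ms: float  # assumed round trip from the coordinator, illustrative only


def write_latency_ms(replicas: list[Replica], acks_required: int) -> float:
    """Latency of a write that waits for `acks_required` acknowledgements.

    Requests are sent in parallel, so the write completes once the
    acks_required-th fastest replica has answered.
    """
    if not 1 <= acks_required <= len(replicas):
        raise ValueError("acks_required must be between 1 and the replica count")
    sorted_rtts = sorted(r.round_trip_ms for r in replicas)
    return sorted_rtts[acks_required - 1]


def reads_see_latest_write(n: int, w: int, r: int) -> bool:
    """Quorum overlap rule: every read set intersects every write set when R + W > N."""
    return r + w > n


replicas = [
    Replica("local", 2.0),
    Replica("same-continent", 40.0),
    Replica("other-continent", 150.0),
]

# "EL" choice: acknowledge as soon as the local replica has the write.
fast = write_latency_ms(replicas, acks_required=1)

# "EC" choice: wait for a majority before acknowledging.
safe = write_latency_ms(replicas, acks_required=2)
```

With these assumed numbers, the single-ack write returns after the local round trip, but with N=3, W=1, R=1 a read may miss it. Waiting for a majority (W=2, R=2) makes read and write sets overlap, at the cost of a cross-region round trip on every write.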

Nitro: An Open Source TypeScript Framework For An Ultra Fast Backend Server

Nitro is an open-source TypeScript framework that helps us build an ultra-fast backend server capable of running across different JavaScript runtimes such as Bun, Deno, and Node.js. From the same codebase, it can generate different output formats suited to different hosting providers such as AWS, Azure, Cloudflare Pages, Netlify, Vercel, and so on.

When deploying to production using CI/CD, the framework tries to automatically detect the provider environment and select the matching deployment preset without any additional configuration.

Building Pinterest's New Wide Column Database Using RocksDB

Pinterest consolidated its different key-value systems into a single unified service called KVStore, which acts as a client-facing abstraction. They also built a storage service: a wide-column, schemaless NoSQL database built on RocksDB.

Their engineering post goes into the details of how they built this massively scalable, highly available wide-column database, including a discussion on the data model, APIs, and other key features.

I recently wrote a newsletter post on how DoorDash integrates caching solutions such as Redis and Caffeine into their code behind a standardized interface, which gives them better control over, and observability of, the cache implementations in their system.

The post also discusses how we can put an abstraction layer between our code and third-party caching libraries. Tightly coupling third-party code with our own isn't a great idea; it makes the code messy and prevents us from moving away from a technology without significant refactoring when required.
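As a rough illustration of the idea (not DoorDash's actual interface), here is a minimal sketch of a cache abstraction. All names (`Cache`, `InMemoryCache`, `InstrumentedCache`, `UserService`) are hypothetical; a Redis-backed or Caffeine-backed class would implement the same two methods.

```python
from typing import Callable, Optional, Protocol


class Cache(Protocol):
    """Hypothetical interface the application codes against."""

    def get(self, key: str) -> Optional[str]: ...

    def set(self, key: str, value: str) -> None: ...


class InMemoryCache:
    """Simple dict-backed implementation; stands in for any third-party backend."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class InstrumentedCache:
    """Wraps any Cache and counts hits and misses for observability."""

    def __init__(self, inner: Cache) -> None:
        self._inner = inner
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        value = self._inner.get(key)
        if value is None:
            self.misses += 1
        else:
            self.hits += 1
        return value

    def set(self, key: str, value: str) -> None:
        self._inner.set(key, value)


class UserService:
    """Business code depends only on the Cache interface, never on a vendor client."""

    def __init__(self, cache: Cache, load_name: Callable[[int], str]) -> None:
        self._cache = cache
        self._load_name = load_name

    def get_name(self, user_id: int) -> str:
        key = f"user:{user_id}:name"
        cached = self._cache.get(key)
        if cached is not None:
            return cached
        name = self._load_name(user_id)  # fall back to the source of truth
        self._cache.set(key, name)
        return name
```

Swapping the backend then means writing one new class that satisfies `Cache`; `UserService` stays untouched, and cross-cutting concerns such as metrics live in a wrapper like `InstrumentedCache` rather than in every call site.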

SurrealDB - VART (Versioned Adaptive Radix Trie): A Persistent Data Structure For Snapshot Isolation

Database isolation levels define the degree to which one transaction can operate independently of others running concurrently. The SQL standard recognizes various isolation levels, each addressing specific anomalies such as dirty writes, dirty reads, lost updates, and other potential inconsistencies that can arise in a concurrent transactional environment.

This SurrealDB blog post delves into the intricacies of transaction isolation and discusses VART, a persistent data structure designed for snapshot isolation. It serves as an index within SurrealKV (their in-memory persistent key-value store written in Rust) and supports concurrency control via snapshot isolation. It's an interesting read.
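To give a feel for what reading from a snapshot means, here is a minimal sketch. It is not VART and not SurrealKV's API: it swaps the radix trie for a plain dictionary of version lists, and the names (`VersionedStore`, `Snapshot`) are hypothetical. It only covers the read side of snapshot isolation; write-conflict detection is left out.

```python
import bisect
from typing import Optional


class VersionedStore:
    """Keeps every committed version of each key instead of overwriting in place."""

    def __init__(self) -> None:
        # key -> (ascending commit versions, values written at those versions)
        self._history: dict[str, tuple[list[int], list[str]]] = {}
        self._latest = 0

    def commit(self, writes: dict[str, str]) -> int:
        """Apply a batch of writes atomically under a new commit version."""
        self._latest += 1
        for key, value in writes.items():
            versions, values = self._history.setdefault(key, ([], []))
            versions.append(self._latest)
            values.append(value)
        return self._latest

    def read_at(self, key: str, version: int) -> Optional[str]:
        """Return the newest value of `key` committed at or before `version`."""
        if key not in self._history:
            return None
        versions, values = self._history[key]
        index = bisect.bisect_right(versions, version)
        return values[index - 1] if index else None

    def snapshot(self) -> "Snapshot":
        """Pin the current commit version for a reader."""
        return Snapshot(self, self._latest)


class Snapshot:
    """A read-only view of the store as of one commit version."""

    def __init__(self, store: VersionedStore, version: int) -> None:
        self._store = store
        self.version = version

    def get(self, key: str) -> Optional[str]:
        return self._store.read_at(key, self.version)


store = VersionedStore()
store.commit({"a": "1"})
reader = store.snapshot()           # reader pins the state after the first commit
store.commit({"a": "2", "b": "x"})  # a later writer does not disturb the reader
```

Here `reader.get("a")` keeps returning the value from the first commit and `reader.get("b")` returns `None`, while a fresh snapshot sees the later writes. A persistent trie achieves the same effect by sharing unchanged nodes between versions rather than keeping per-key lists.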

Type Safety

What comes to your mind when you hear the terms Strongly Typed, Weakly Typed, Statically Typed and Dynamically Typed in the context of programming languages?

Can strongly typed and statically typed be used interchangeably? And does the same go for weakly typed and dynamically typed?

Statically Typed: In a statically typed language, the types of variables are checked at compile time (before the program is run).

Dynamically Typed: In a dynamically typed language, variable types are checked at runtime (while the program is executing).

When it comes to strongly typed and weakly typed, there is no universally agreed-upon technical definition. It's more about how strictly the language enforces its type rules and handles type conversions.

In a strongly typed language, the type of a variable is strictly enforced, and implicit type conversions are limited. Operations involving incompatible types result in type errors, and explicit type conversions are often required.

In a weakly typed language, the type of a variable can be implicitly changed during operations. Type conversions are often performed automatically, and the language may allow operations between different types without explicit type casting.
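The two axes are independent, and Python is a handy illustration: it is dynamically typed yet generally considered strongly typed. The short sketch below uses a hypothetical helper `add` to show both properties.

```python
def add(a, b):
    # No declared types: whether + is valid is decided when this line runs.
    return a + b


# Dynamically typed: the same name can be rebound to a value of another type.
value = 1
value = "one"

# Strongly typed: mixing str and int is rejected instead of silently converted.
try:
    add("1", 1)
    error = None
except TypeError as exc:
    error = type(exc).__name__

# An explicit conversion is required to make the operation legal.
total = add(int("1"), 1)
```

The failing call raises a `TypeError` at runtime rather than at compile time, and the explicit `int(...)` conversion gives `2`. By contrast, JavaScript evaluates `"1" + 1` to the string `"11"`, which is the kind of implicit conversion usually described as weak typing.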

Architecture Decisions In Neon (A Serverless Postgres Service) - Separating Storage & Compute

Neon is a managed serverless Postgres service with a modern cloud-native architecture that separates storage from compute.

The separation of compute and storage enabled them to:

Run multiple compute instances without having multiple copies of the data.
Perform a fast startup and shutdown of compute instances.
Provide instant recovery for your database.
Simplify operations, like backups and archiving, to be handled by the storage layer without affecting the application.
Scale CPU and I/O resources independently.

The detailed discussion on their service architecture is an interesting read.

Exercism Helps Us Get Good At Programming With 67+ Programming Languages (100% Free Forever)

Exercism is an independent, community-funded, not-for-profit organization that helps us learn coding with support for more than 67 programming languages.

We can either write code locally and submit it via their CLI or work in their browser editor. On top of that, we get automated analysis of our code in addition to human mentoring, all for free. Wait, what? ὸ.ό

That's all for now, folks. I'll see you in the next post. Pretty soon. If you found the content helpful, maybe exciting (I don't know), do share it with your network for more reach.

I'll see you around. Cheers!
And, Oh yes, you can find me on LinkedIn & X.
Bye Bye!
