Building your own Ledger Database

...or how to replace Amazon Quantum Ledger Database

Oskar Dudycz

Nov 11

Building your own database sounds like a moronic move. Yet, that’s the advice I gave recently on the Architecture Weekly Discord channel. Why did I do it?

One member asked for a recommendation on replacing Amazon QLDB. What’s Amazon QLDB? Apparently, it's the database type that Amazon is sunsetting next year. It was also the database that the question asker was using in most of his project's services.

Jokes aside, Amazon states that:

Amazon Quantum Ledger Database (Amazon QLDB) is a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority. You can use Amazon QLDB to track all application data changes, and maintain a complete and verifiable history of changes over time.

And suggests migrating it to Amazon Aurora PostgreSQL.

Which is an option, but I suggested something else. Before we go to my suggestion, let’s discuss what Ledger Databases actually are. We talked about them in previous editions: #169 - 4th March 2024 and #181 - 27th May 2024.

What Are Ledger Databases?

Ledger databases are specialized data stores designed to provide an immutable and verifiable record of transactions over time. They ensure that every change to the data is recorded and cannot be tampered with, making them ideal for applications where data integrity and auditability are critical. Many industries have such legal requirements.

Unlike traditional databases that allow data to be overwritten or deleted, ledger databases use an append-only model. Every transaction is added as a new entry, and previous entries remain untouched. This creates a complete history of all data changes, which is crucial for compliance, auditing, and tracing.

Under the hood, ledger databases combine familiar database technologies with cryptographic techniques to guarantee data integrity. Each transaction is often cryptographically hashed and linked to the previous one, forming a chain that makes tampering evident.
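To make the chaining idea concrete, here is a minimal sketch in Python. It is not how QLDB or any specific product implements it; the entry layout, the `GENESIS_HASH` placeholder, and the function names are assumptions made only for illustration. Each entry stores the hash of the previous one, so changing any old entry makes verification fail.

```python
import hashlib
import json

# Hypothetical placeholder used as "previous hash" for the very first entry.
GENESIS_HASH = "0" * 64


def entry_hash(payload: dict, prev_hash: str) -> str:
    """Hash the entry payload together with the previous entry's hash."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(ledger: list, payload: dict) -> dict:
    """Append-only write: link the new entry to the last one."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(payload, prev_hash),
    }
    ledger.append(entry)
    return entry


def verify(ledger: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["payload"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append(ledger, {"account": "A-1", "amount": 100})
append(ledger, {"account": "A-1", "amount": -30})
assert verify(ledger)

# Tampering with an old entry is detected on verification.
ledger[0]["payload"]["amount"] = 1_000_000
assert not verify(ledger)
```

Note that a plain hash chain only makes tampering evident; it does not prevent it. Someone able to rewrite the whole chain could recompute all hashes, which is why real systems also publish or anchor digests outside the database.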

Use cases for Ledger Databases

So, where do ledger databases fit into our architectural toolbox?

They fit business domains where you have to prove that no one has even the slightest option to mutate the operations log. In some of them, user access management is not enough. You’d better use a database natively built as append-only, one that doesn’t allow you to mutate or delete anything. For instance:

Financial Services: Banks, payment processors, and financial institutions need to maintain accurate and immutable records of transactions for compliance and auditing purposes. A ledger database ensures that every deposit, withdrawal, or transfer is permanently recorded.
Healthcare Records: Patient data must be accurate, secure, and compliant with regulations like HIPAA. Ledger databases help maintain a record of patient information and medical history, ensuring that any access or modification is immutably logged.
Government and Legal Records: Official documents like property titles, court records, and legislative changes require integrity and traceability. An immutable ledger provides a trustworthy record that can withstand legal scrutiny.
Supply Chain Management: Tracking the movement of goods from origin to destination requires transparency and trust among multiple parties. Ledger databases can record each step in the supply chain, providing a verifiable history that stakeholders can rely on.

Immutable log of data, ummmm. Do I smell Event Sourcing?

Ledger Databases vs. Event Stores and Event Sourcing

There's often confusion between ledger databases and event stores used in event sourcing. While both deal with immutable data and event histories, they serve different purposes.

Event Sourcing is a design pattern in which outcomes of the business logic are stored as events. Then, they’re used for making the next decisions. They’re stored in sequences called streams. That can sound a bit scary, but in other words, streams represent records in classical databases. Events are the state; there’s no other state. Thus, the name Event Sourcing.

In Event Sourcing, the flow looks as follows:

read all events from the stream,
interpret them to get the current state,
run business logic,
store the new event(s) in the same stream you’ve read from,
rinse and repeat.
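The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `event_store`, the bank-account events (`Deposited`, `Withdrawn`), and all function names are hypothetical, and a real event store would also guard the append with an optimistic concurrency check.

```python
from collections import defaultdict

# In-memory stand-in for an event store: key = stream id, value = its events.
event_store = defaultdict(list)


def evolve(state: dict, event: dict) -> dict:
    """Apply a single event to the current state."""
    if event["type"] == "Deposited":
        return {"balance": state["balance"] + event["amount"]}
    if event["type"] == "Withdrawn":
        return {"balance": state["balance"] - event["amount"]}
    return state


def current_state(stream_id: str) -> dict:
    """Steps 1 and 2: read all events and interpret them into a state."""
    state = {"balance": 0}
    for event in event_store[stream_id]:
        state = evolve(state, event)
    return state


def withdraw(state: dict, amount: int) -> list:
    """Step 3: business logic decides which new events should be stored."""
    if amount > state["balance"]:
        raise ValueError("insufficient funds")
    return [{"type": "Withdrawn", "amount": amount}]


def handle_withdraw(stream_id: str, amount: int) -> None:
    state = current_state(stream_id)
    new_events = withdraw(state, amount)
    # Step 4: append the results to the same stream we have read from.
    event_store[stream_id].extend(new_events)


event_store["account-1"].append({"type": "Deposited", "amount": 100})
handle_withdraw("account-1", 30)
assert current_state("account-1") == {"balance": 70}
```

The state is never stored on its own; it is rebuilt from events each time, and the decision to withdraw is made against that rebuilt state.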

So, Event Stores are essentially Key-Value databases: The key is the record id, and the value is the sequence of events. And they’re used for transactional operations focused on business logic. Read more in my articles:

Ledger databases, on the other hand, are specialized databases optimized for storing transactions, often with financial data semantics. They focus on accurately recording transactions, with an emphasis on data integrity and compliance, and often include cryptographic verification. So, they’re more like audit logs with much different data granularity than event stores.

With a ledger database, your main driver is to record and keep immutable data as a side effect of your business logic, not as your decision model.

Event stores are also general-purpose databases, while ledger databases are tailored for specific domains like finance. For example, financial ledger databases accurately record monetary transactions and balances and ensure consistency in financial operations. Going down the niche allows for domain-specific optimizations and limits the scope of usage.

Custom Ledger Databases

Those optimisation needs drove Uber and Stripe to build their own Ledger Databases; I’ve gone through them in linked editions.

Stripe Ledger

Stripe built its own ledger system focused on storing and tracking financial data. Stripe operates at a massive scale, processing billions of events daily. They had to standardize the representation of money movement. With various systems and partners involved, they required a unified way to represent transactions despite the complexity and imperfections of real-world financial data.

The ledger had to be an immutable source of truth that could be relied upon for internal operations and external audits. So, they also had to ensure data integrity and trustworthiness.

The real world is imperfect, and they had to embrace it. Banking and network partners can provide malformed reports or errors, and they needed a custom system to keep these imperfections manageable.

Stripe's Ledger is an immutable log of events that models internal data-producing systems with common patterns like:

State Machine Representation: Ledger encodes producer systems as state machines, modeling behavior as logical fund flows—the movement of balances between accounts.
Double-Entry Bookkeeping: They applied traditional accounting principles to validate money movement, ensuring that credits and debits balance out.
Data Quality Platform: On top of Ledger, they built a platform to unify detection of money movement issues and provide response tooling. This ensures proactive alerting and surfaces issues promptly.

As mentioned, they operate on an extreme scale. Ledger processes five billion events per day. To handle this scale, they optimized their systems for high throughput and low latency. By using double-entry bookkeeping and immutable logs, they ensured that every transaction is accounted for and that the system can detect discrepancies.
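The double-entry rule itself is simple to show. The sketch below is not Stripe's implementation; the `Posting` type, the sign convention, and the account names are assumptions made for illustration. A transaction is accepted only if its postings sum to zero, so money never appears or disappears inside the ledger.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    account: str
    # Amount in minor units (e.g. cents). Assumed convention for this sketch:
    # positive = debit, negative = credit.
    amount: int


def is_balanced(postings: list) -> bool:
    """Double-entry invariant: debits and credits cancel out."""
    return sum(p.amount for p in postings) == 0


def record(ledger: list, postings: list) -> None:
    """Append a transaction only if it balances; never modify old ones."""
    if not is_balanced(postings):
        raise ValueError("unbalanced transaction")
    ledger.append(tuple(postings))


def balances(ledger: list) -> dict:
    """Derive account balances by replaying every recorded posting."""
    totals = defaultdict(int)
    for transaction in ledger:
        for posting in transaction:
            totals[posting.account] += posting.amount
    return dict(totals)


ledger = []
record(ledger, [Posting("customer", -1000), Posting("merchant", 1000)])
assert balances(ledger) == {"customer": -1000, "merchant": 1000}
```

Because every accepted transaction sums to zero, the sum of all account balances is also zero at any point in time; a non-zero total is an immediate signal of a discrepancy.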

Uber's LedgerStore

Similarly to Stripe, Uber developed its own ledger system, LedgerStore. To better meet its needs, Uber moved from using DynamoDB to its custom solution.

At Uber's scale, DynamoDB became expensive. Managing petabytes of data and trillions of indexes was not cost-effective. DynamoDB's limitations led to issues like hot partitions and write throttling, requiring complex workarounds that increased system complexity.

Uber needed a system that ensured data integrity, handled massive data volumes efficiently, and provided better performance.

They also needed data integrity and correctness guarantees: individual records had to be immutable, and corrections had to be trackable. To achieve that, they came up with their own ledger database. They designed an indexing system capable of handling trillions of indexes, supporting various access patterns via different types of indexes (strongly consistent, eventually consistent, time-range indexes).

Uber migrated over a trillion ledger entries from DynamoDB to LedgerStore. They used Apache Spark for incremental backfill, processing data in chunks to avoid system overloads.

To ensure accuracy, Uber used shadow validation, comparing outputs from the old and new systems to detect discrepancies, achieving a 99.99% accuracy rate.
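Shadow validation boils down to asking both systems the same question and comparing the answers. The sketch below is a simplified illustration, not Uber's tooling; `shadow_validate`, `read_old`, and `read_new` are hypothetical names, and plain dictionaries stand in for the two databases.

```python
def shadow_validate(keys, read_old, read_new):
    """Read every key from both systems; return match rate and mismatches."""
    mismatches = []
    total = 0
    for key in keys:
        total += 1
        old_value = read_old(key)
        new_value = read_new(key)
        if old_value != new_value:
            mismatches.append((key, old_value, new_value))
    match_rate = 1.0 if total == 0 else (total - len(mismatches)) / total
    return match_rate, mismatches


# Dictionaries standing in for the old and the new store.
old_store = {1: "a", 2: "b", 3: "c", 4: "d"}
new_store = {1: "a", 2: "b", 3: "c", 4: "x"}

rate, diffs = shadow_validate(old_store.keys(), old_store.get, new_store.get)
assert rate == 0.75
assert diffs == [(4, "d", "x")]
```

Keeping the list of mismatches, not just the rate, is what makes the check actionable: each entry points at a specific record to investigate before switching traffic to the new system.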

They encountered performance issues due to poorly distributed index data and inefficient indexing methods. To address this, they optimized partition keys and revised their indexing approach to reduce unnecessary scans.

What was the Discord member use case?

The person who asked the question faced the challenge of migrating over 100 ledgers before Amazon QLDB was deprecated. The company operates in a regulated environment, providing imaging solutions for healthcare and pharmaceutical clients. ...
