These foundational papers describe production systems that have shaped both active research and the design of modern distributed databases.

Core Papers

The following papers are seminal works that have shaped how we build distributed storage and database systems today.

Dynamo

Amazon’s highly available, fault-tolerant key-value store

Bigtable

Google’s distributed storage system for structured data

Google File System

The foundational distributed file system that underpinned Google’s early infrastructure

Cassandra

A decentralized structured storage system heavily inspired by Dynamo, now open source

Amazon Dynamo

It is rare for a paper describing a production system to shape active research. Dynamo is one of those seminal distributed systems papers: it tackles the problem of building a highly available, fault-tolerant database with an elegant combination of techniques.
Key Contributions:
  • Consistent hashing for data partitioning
  • Vector clocks for versioning
  • Gossip-based membership protocol
  • Sloppy quorum and hinted handoff
Impact: Dynamo paved the way for systems like Cassandra and many other AP systems (available and partition-tolerant in the CAP theorem) using consistent hashing.
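Consistent hashing is the idea that recurs most often in Dynamo's descendants, so a small sketch may help. The class below is a toy ring with virtual nodes, not Dynamo's implementation: the name `HashRing`, the `vnodes` default, the node names, and the choice of SHA-256 are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (a 64-bit integer)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node claims several positions to even out the load.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        """Return the node owning the first position clockwise from the key."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(500)]
before = {k: ring.lookup(k) for k in keys}

ring.add("node-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The property worth noticing is that every key that changes owner moves to the newly added node; keys owned by the other nodes stay where they were, which is what keeps rebalancing cheap.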
Read Dynamo Paper
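The vector clocks listed among Dynamo's contributions can be sketched in a few functions. This is a minimal illustration under simple assumptions (clocks as plain dictionaries, no pruning of old entries); the function names and the node names `sx`, `sy`, `sz` are placeholders, not an API from the paper.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify the causal relationship between two clocks."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "after"
    if b_ge_a:
        return "before"
    return "concurrent"  # neither version supersedes the other


def merge(a, b):
    """Component-wise maximum, used after a client reconciles two versions."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


v1 = increment({}, "sx")    # first write, handled by sx
v2 = increment(v1, "sx")    # second write, also handled by sx
v3 = increment(v2, "sy")    # one client updates v2 through sy
v4 = increment(v2, "sz")    # another client updates v2 through sz

print(compare(v2, v3))      # v3 was derived from v2
print(compare(v3, v4))      # v3 and v4 diverged from v2
print(merge(v3, v4))        # clock covering both branches
```

When two versions compare as concurrent, the store cannot pick a winner on its own; both are returned to the reader, and the reconciled value is written back with a merged clock.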

Google’s Storage Systems

Bigtable

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.
Key Features:
  • Sparse, distributed, persistent multi-dimensional sorted map
  • Data indexed by row key, column key, and timestamp
  • Built on top of GFS
  • Automatic sharding of row ranges into tablets
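To make the "sparse, sorted map indexed by row key, column key, and timestamp" description concrete, here is a small in-memory sketch. It only models the shape of the data; the class `TinyTable`, its methods, and the example rows and columns are hypothetical and say nothing about how Bigtable stores data on disk or in tablets.

```python
import bisect


class TinyTable:
    """In-memory model of a (row, column, timestamp) -> value map."""

    def __init__(self):
        self._rows = []   # row keys kept in sorted order
        self._cells = {}  # (row, column) -> [(timestamp, value)], newest first

    def put(self, row, column, timestamp, value):
        idx = bisect.bisect_left(self._rows, row)
        if idx == len(self._rows) or self._rows[idx] != row:
            self._rows.insert(idx, row)
        versions = self._cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row, column, timestamp=None):
        """Newest value at or before `timestamp` (newest overall if None)."""
        for ts, value in self._cells.get((row, column), []):
            if timestamp is None or ts <= timestamp:
                return value
        return None  # sparse: absent cells simply do not exist

    def scan(self, start_row, end_row):
        """Row keys in [start_row, end_row), in sorted order."""
        lo = bisect.bisect_left(self._rows, start_row)
        hi = bisect.bisect_left(self._rows, end_row)
        return self._rows[lo:hi]


table = TinyTable()
table.put("com.cnn.www", "contents:", 3, "<html>v3</html>")
table.put("com.cnn.www", "contents:", 5, "<html>v5</html>")
table.put("org.apache.www", "contents:", 1, "<html>apache</html>")
table.put("com.example.www", "anchor:home", 2, "Example")

print(table.get("com.cnn.www", "contents:"))               # newest version
print(table.get("com.cnn.www", "contents:", timestamp=4))  # version as of t=4
print(table.scan("com.", "com/"))                          # rows with prefix "com."
```

Because rows are kept sorted, a contiguous range of row keys is the natural unit to split off and move, which is the intuition behind tablets.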
Read Bigtable Paper

Google File System

GFS is a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware.
Design Principles:
  • Component failures are the norm
  • Files are huge by traditional standards
  • Most files are mutated by appending new data
  • Co-designing applications and file system API benefits the overall system
Read GFS Paper

Open Source Systems

Cassandra

Cassandra is a decentralized structured storage system inspired heavily by Dynamo, combining Dynamo’s partitioning and replication techniques with Bigtable’s data model.
Key Characteristics:
  • Peer-to-peer architecture (no single point of failure)
  • Elastic scalability
  • Tunable consistency, chosen per operation
  • Column-family data model
  • Eventual consistency by default
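The trade-off behind tunable consistency is often summarized by the overlap rule R + W > N: if the replicas acknowledging a write and the replicas consulted by a read must share at least one member, the read can see the latest acknowledged write. The sketch below only does that arithmetic; the helper names are hypothetical, and it deliberately ignores failures, hinted handoff, and multi-datacenter consistency levels.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def reads_overlap_writes(n: int, w: int, r: int) -> bool:
    """True when any read set of size r must intersect any write set of size w."""
    return r + w > n


n = 3  # replication factor (assumed for this example)
levels = {"ONE": 1, "QUORUM": quorum(n), "ALL": n}

for write_level, w in levels.items():
    for read_level, r in levels.items():
        overlap = reads_overlap_writes(n, w, r)
        print(f"write={write_level:<6} read={read_level:<6} overlap={overlap}")
```

With three replicas, writing and reading at ONE is fast but may return stale data, while QUORUM for both guarantees overlap and still tolerates one unavailable replica.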
Read Cassandra Paper

CRUSH & Ceph

CRUSH (Controlled Replication Under Scalable Hashing) is a scalable pseudo-random data distribution function designed for distributed object-based storage systems.
Purpose:
  • Basis for the Ceph distributed storage system
  • Deterministic placement of replicated data
  • Minimal data migration when the cluster changes
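The two placement properties above, determinism and minimal migration, can be illustrated with a much simpler technique than CRUSH itself. The sketch below uses rendezvous (highest-random-weight) hashing as a stand-in; it is not the CRUSH algorithm, which additionally works over a weighted, hierarchical cluster map with placement rules. The function names and the `osd.N` device names are assumptions for the example.

```python
import hashlib


def _score(obj: str, device: str) -> int:
    """Pseudo-random but repeatable score for an (object, device) pair."""
    digest = hashlib.sha256(f"{obj}|{device}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def place(obj: str, devices, replicas: int = 3):
    """Choose the `replicas` devices with the highest scores for this object."""
    ranked = sorted(devices, key=lambda d: _score(obj, d), reverse=True)
    return ranked[:replicas]


devices = [f"osd.{i}" for i in range(6)]
objects = [f"object-{i}" for i in range(200)]

before = {o: place(o, devices) for o in objects}

# Simulate losing one device and recompute every placement.
remaining = [d for d in devices if d != "osd.3"]
after = {o: place(o, remaining) for o in objects}

changed = [o for o in objects if before[o] != after[o]]
print(f"{len(changed)} of {len(objects)} objects changed placement")
```

Any client holding the same device list computes the same placement without consulting a central directory, and only objects that had a replica on the removed device get a new location.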

Learning Path

1. Start with Dynamo
   Understand the principles of highly available key-value stores and consistent hashing
2. Study Google’s Systems
   Read both GFS and Bigtable to understand how Google built scalable storage infrastructure
3. Explore Cassandra
   See how Dynamo’s principles were combined with Bigtable’s data model in an open-source system
4. Deep Dive into Ceph
   Learn about modern distributed storage with CRUSH and RADOS

These papers are best read in conjunction with understanding the CAP theorem and the trade-offs between consistency, availability, and partition tolerance.
