Fundamental Papers

These papers are considered quintessential reading for anyone working with distributed systems. They establish foundational concepts that appear throughout distributed systems literature.

Essential Papers

The following papers represent the foundational knowledge every distributed systems engineer should understand.

Time, Clocks and Ordering of Events

Lamport’s quintessential distributed systems primer on logical clocks and event ordering

Session Guarantees for Weakly Consistent Replicated Data

A 1994 paper establishing standard vocabulary for eventually consistent systems

CAP Theorem

The fundamental theorem about trade-offs in distributed systems

FLP Impossibility Result

Proof that no deterministic protocol can guarantee consensus in an asynchronous system if even one process may fail

Lamport’s Time and Clocks

“Time, Clocks, and the Ordering of Events in a Distributed System” is Leslie Lamport’s seminal paper on time and ordering in distributed systems. It introduces logical clocks and the “happens-before” relationship, both of which are fundamental to reasoning about distributed computations.

Why This Paper Matters

This paper is considered the quintessential distributed systems primer. Nearly all of Lamport’s work is influential, but this particular paper is essential reading because it:

Introduces logical clocks as a way to order events without synchronized physical clocks
Defines the “happens-before” relationship (→), a partial order on the events of a distributed system
Provides the foundation for understanding causality in distributed systems
Establishes concepts that appear throughout distributed systems literature
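
The clock rules described above can be sketched in a few lines. This is a minimal illustration, not code from the paper: the `LamportClock` class and its method names are hypothetical. Each process increments a counter on every local event, and a receiver advances its counter past the timestamp carried by the message.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Logical clock for one process (hypothetical helper for illustration)."""
    time: int = 0

    def tick(self) -> int:
        # Increment the counter for each local event.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is itself an event; the message carries this timestamp.
        return self.tick()

    def receive(self, msg_time: int) -> int:
        # Advance past both the local clock and the message timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                    # local event on process A
stamp = a.send()            # A sends a message stamped with its clock
b_time = b.receive(stamp)   # B's clock jumps ahead of the send timestamp
```

If one event happens before another, its timestamp is smaller. The converse does not hold: a smaller timestamp alone does not prove that one event caused the other.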

Session Guarantees for Weak Consistency

“Session Guarantees for Weakly Consistent Replicated Data” is a 1994 paper that proposes a set of session guarantees for eventually consistent systems. It established much of the standard vocabulary used in distributed systems papers today.

Key Concepts

The paper introduces several important guarantees that are now standard terminology:

Monotonic Reads: If a process reads a value, subsequent reads will never return earlier values
Read Your Writes: A process will always see its own writes in subsequent reads
Writes Follow Reads: Writes are ordered after reads that causally precede them
Monotonic Writes: Writes from a single process are applied in the order they were made
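
One way to picture two of these guarantees is a client session that remembers the newest version it has seen and refuses to read from a replica that lags behind it. The sketch below is a simplification under stated assumptions: `Replica`, `Session`, and the single integer version counter are hypothetical, and real systems typically track richer per-server version information.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica: a version counter plus a key-value dict."""
    version: int = 0
    data: dict = field(default_factory=dict)

    def write(self, key, value) -> int:
        self.version += 1
        self.data[key] = value
        return self.version

    def sync_from(self, other: "Replica") -> None:
        # Stand-in for replication: copy the newer replica's state.
        if other.version > self.version:
            self.version = other.version
            self.data = dict(other.data)


@dataclass
class Session:
    """Tracks the oldest replica version this client may read from."""
    min_version: int = 0

    def write(self, replica: Replica, key, value) -> None:
        # Read Your Writes: remember the version our write produced.
        self.min_version = max(self.min_version, replica.write(key, value))

    def read(self, replica: Replica, key):
        # Refuse replicas older than anything this session has seen.
        if replica.version < self.min_version:
            raise RuntimeError("replica too stale for this session")
        # Monotonic Reads: later reads must not go back in time.
        self.min_version = max(self.min_version, replica.version)
        return replica.data.get(key)


primary, lagging = Replica(), Replica()
session = Session()
session.write(primary, "x", 1)
try:
    session.read(lagging, "x")
    stale_rejected = False
except RuntimeError:
    stale_rejected = True      # the lagging replica has not seen our write
lagging.sync_from(primary)
value = session.read(lagging, "x")  # succeeds once the replica catches up
```

The guarantees are enforced per session, so other clients can still read from the lagging replica.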

CAP Theorem

CAP Theorem | Plain English Explanation

The CAP theorem states that a distributed system cannot guarantee all three of the following properties at the same time:

Consistency: Every read sees the most recent write, so all nodes appear to hold the same data
Availability: Every request to a non-failing node receives a response
Partition Tolerance: The system continues to operate despite network partitions

Understanding CAP is essential before starting work on distributed systems. Network partitions cannot be ruled out in practice, so the real choice is between consistency and availability while a partition lasts, a trade-off that shapes many architectural decisions.
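
As a rough illustration of the trade-off, consider a toy replica that is cut off from its peers. The `Replica` class and the `"CP"` and `"AP"` mode labels below are hypothetical. The point is only that, during a partition, a replica must either refuse to answer or risk returning stale data.

```python
class Replica:
    """Toy replica that favours consistency ("CP") or availability ("AP")."""

    def __init__(self, mode: str):
        self.mode = mode           # "CP" or "AP" (illustrative labels)
        self.value = "v1"          # last value this replica knows about
        self.partitioned = False   # True while cut off from other replicas

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than risk a stale answer.
            raise TimeoutError("cannot confirm latest value during partition")
        # Availability first: answer with whatever is stored locally.
        return self.value


cp, ap = Replica("CP"), Replica("AP")
cp.partitioned = ap.partitioned = True

ap_result = ap.read()          # answers, but the value may be stale
try:
    cp.read()
    cp_failed = False
except TimeoutError:
    cp_failed = True           # stays consistent by giving up availability
```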

FLP Impossibility

Impossibility of Distributed Consensus with One Faulty Process | Easier Blog Post

The FLP Impossibility Result (named after Fischer, Lynch, and Paterson) proves that in an asynchronous distributed system, no deterministic protocol can guarantee that consensus is reached if even a single process can fail. This is a fundamental theoretical result that shapes practical distributed system design.

Practical Implications

While FLP proves that guaranteed consensus is impossible in purely asynchronous systems with failures, practical systems work around this by:

Adding timeouts (partial synchrony)
Using randomization
Accepting that consensus might not always be reached
Using algorithms like Paxos and Raft, which always preserve safety and make progress once the network behaves synchronously enough
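
The timeout workaround can be sketched as a heartbeat-based failure detector. This is an illustrative sketch rather than part of Paxos or Raft: `HeartbeatDetector` and its methods are hypothetical names, and time is passed in explicitly to keep the example deterministic.

```python
class HeartbeatDetector:
    """Suspects a node once no heartbeat has arrived within `timeout` seconds."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_seen = {}    # node name -> time of its last heartbeat

    def heartbeat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def suspected(self, node: str, now: float) -> bool:
        # Suspected, not known, to have failed: the node may only be slow.
        last = self.last_seen.get(node)
        return last is None or now - last > self.timeout


detector = HeartbeatDetector(timeout=5.0)
detector.heartbeat("n1", now=0.0)
early = detector.suspected("n1", now=3.0)      # within the timeout
late = detector.suspected("n1", now=6.0)       # timeout exceeded
detector.heartbeat("n1", now=7.0)              # the node was merely slow
recovered = detector.suspected("n1", now=8.0)  # suspicion is withdrawn
```

A timeout can be wrong, as the last two lines show. This is why such systems treat a timeout as a suspicion and keep their safety properties independent of it.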

Fallacies of Distributed Computing

Before diving deep into distributed systems, understand the common false assumptions known as the Fallacies of Distributed Computing. Expect everything to break. The fallacies remind us that:

The network is NOT reliable
Latency is NOT zero
Bandwidth is NOT infinite
The network is NOT secure
Topology does NOT stay constant
There is NOT one administrator
Transport cost is NOT zero
The network is NOT homogeneous
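
In code, taking the first two fallacies seriously usually means bounding every remote call and retrying transient failures. The helper below is a minimal sketch under stated assumptions: `call_with_retries` and `flaky` are hypothetical names, and the retry count and delays are arbitrary illustrative values.

```python
import random
import time


def call_with_retries(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Retry `operation` on connection or timeout errors with jittered backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff with jitter so clients do not retry in lockstep.
            sleep(random.uniform(0, base_delay * 2 ** attempt))


calls = {"n": 0}


def flaky():
    # Simulated remote call that fails twice before succeeding.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection dropped")
    return "ok"


result = call_with_retries(flaky, sleep=lambda _: None)  # skip real sleeping here
```

Retries are only safe when the operation is idempotent, because a request that appeared to fail may still have been applied on the remote side.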

Additional Resources

Distributed Systems Theory for the Distributed Engineer

A BFS (breadth-first search) approach to learning distributed systems. Many papers in this guide overlap with other sections.

An Introduction to Distributed Systems

@aphyr’s excellent introduction to distributed systems
