Awesome Distributed Systems

A (hopefully) curated list of awesome material on distributed systems, inspired by other awesome lists like awesome-python. Most links point to readings on architecture rather than to code.

Academic Papers

Seminal papers from pioneers like Lamport covering consensus, fault tolerance, and distributed storage

Books & Textbooks

From beginner-friendly to advanced texts on distributed systems principles and paradigms

Online Courses

University courses from MIT, CMU, ETH Zurich, and more covering distributed algorithms and systems

Consensus Algorithms

Deep dives into Paxos, Raft, Byzantine Fault Tolerance, and CRDTs

Real-World Systems

Papers on production systems like Dynamo, Bigtable, GFS, Kafka, and Cassandra

Testing & Verification

Frameworks like Jepsen for verifying distributed systems, plus distributed tracing with Dapper

Blogs & Articles

Industry insights from engineers at Amazon and Google and from other distributed systems experts

Community & Research

Academic conferences, journals, and curated reading lists from the community

Why Distributed Systems Matter

Distributed systems power modern internet infrastructure, providing the scalability, fault tolerance, and high availability that applications serving billions of users depend on. Understanding them is essential for building reliable, scalable software in today’s cloud-native world.
While nearly all of Lamport’s work deserves attention, this collection highlights the must-read papers, books, and resources that form the foundation of distributed systems knowledge.

What You’ll Find Here

This curated collection spans distributed systems theory and practice:
  • Foundational Theory: CAP theorem, FLP impossibility, consensus algorithms
  • Storage Systems: Distributed databases, file systems, and key-value stores
  • Messaging & Coordination: Message queues, logs, and coordination services
  • Fault Tolerance: Byzantine fault tolerance, replication, and failure detection
  • Testing & Verification: Frameworks for testing distributed systems at scale
  • Programming Models: Languages and frameworks designed for distributed computing
New to distributed systems? Start with the Getting Started guide to follow a structured learning path through the essential concepts.

Categories at a Glance

The resources are organized into:
  • Bootcamp: Essential readings to start your journey
  • Books: Comprehensive texts from free online books to university textbooks
  • Papers: Fundamental research papers organized by topic (storage, messaging, consensus, testing, programming models)
  • Videos: Conference talks and lecture series
  • Courses: University courses with lecture notes and assignments
  • Blogs: Industry perspectives and practical experiences
  • Research: Academic conferences and journals
  • Meta Lists: Other curated collections to explore
Whether you’re a software engineer looking to understand distributed systems architecture, a student studying the theory, or a practitioner building production systems, this collection provides pathways to deepen your knowledge.