CAP Theorem

Introduction

The CAP theorem is one of the most famous concepts in computer science, but also one of the most misunderstood. Proposed by Eric Brewer in 2000 and formally proven by Seth Gilbert and Nancy Lynch in 2002, it fundamentally shapes how we design distributed systems.
The CAP theorem is often referenced in database selection, but understanding its nuances is critical for making the right architectural decisions.

What is the CAP Theorem?

The CAP theorem states that a distributed system can’t provide more than two of these three guarantees simultaneously:

Consistency

All clients see the same data at the same time, no matter which node they connect to

Availability

Any client requesting data gets a response, even if some nodes are down

Partition Tolerance

The system continues to operate despite network partitions (communication breakdowns)

The Three Guarantees Explained

Consistency (C)

Definition: All clients see the same data at the same time, no matter which node they connect to. Consistency in CAP means that a read operation will return the most recent write. If you write a value to node A, then immediately read from node B, you’ll see that newly written value.
User 1 writes X=5 to Node A

    [Sync happens]

User 2 reads X from Node B → Gets 5 ✓
This is different from “Consistency” in ACID, which refers to maintaining database invariants and constraints.
Example: In a banking system, after you transfer $100 to another account, every subsequent query to any server should show the updated balances immediately.
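The banking example can be sketched as a toy synchronous-replication model (the `Node` and `Cluster` classes are illustrative, not any real database API): a write returns only after every replica has applied it, so a subsequent read from any node sees the new value.

```python
# Toy model of strong consistency via synchronous replication.
# Node and Cluster are hypothetical classes for illustration only.

class Node:
    def __init__(self):
        self.data = {}

class Cluster:
    def __init__(self, n):
        self.nodes = [Node() for _ in range(n)]

    def write(self, key, value):
        # Apply the write to every replica before acknowledging it.
        for node in self.nodes:
            node.data[key] = value

    def read(self, key, node_index):
        # Because writes are synchronous, any replica has the latest value.
        return self.nodes[node_index].data[key]

cluster = Cluster(3)
cluster.write("balance", 500)      # User 1 writes via any node
print(cluster.read("balance", 2))  # User 2 reads from Node C → 500
```

The cost of this guarantee is that the write blocks until all replicas respond, which is exactly the latency trade-off PACELC makes explicit later in this article.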

Availability (A)

Definition: Any client that requests data gets a response, even if some nodes are down. Availability means every request receives a non-error response, regardless of the state of any individual node. The system remains operational and responsive.
Node A: Responding ✓
Node B: Responding ✓
Node C: Down ✗

User request → Gets response ✓
Example: An e-commerce website remains accessible and returns product information even if some backend servers have failed.

Partition Tolerance (P)

Definition: The system continues to operate despite network partitions. A partition is a communication break between nodes—they can’t talk to each other. Partition tolerance means the system continues functioning even when nodes can’t communicate.
Network Partition:
┌─────────┐     ✗✗✗✗✗     ┌─────────┐
│ Node A  │  ←(broken)→  │ Node B  │
│ Node C  │               │ Node D  │
└─────────┘               └─────────┘
  Cluster 1               Cluster 2
  
System continues to operate ✓
Reality check: In distributed systems, network partitions are inevitable. It’s not a question of “if” but “when,” which makes partition tolerance effectively mandatory rather than optional.

The Trade-offs: Pick Two?

The traditional “pick two out of three” framing can be misleading. Let’s examine what each combination means:

CP: Consistency + Partition Tolerance

Trade-off: Sacrifice availability during network partitions. When a partition occurs, the system must reject requests to maintain consistency.
1. Network partition occurs
2. System detects the partition
3. Chooses consistency over availability
4. Rejects writes/reads to maintain consistency
5. Returns errors until the partition heals
Characteristics:
  • Strong consistency guarantees
  • May return errors during partitions
  • Prioritizes correctness over availability
Examples:
  • HBase: Blocks operations until partition heals
  • MongoDB (with majority write concern): May refuse writes
  • Redis (with strong consistency configuration)
  • Etcd, ZooKeeper: Prefer consistency for coordination
Use cases:
  • Financial systems where correctness is paramount
  • Inventory management systems
  • Coordination services (leader election, configuration)
  • Systems where stale data causes serious problems

AP: Availability + Partition Tolerance

Trade-off: Sacrifice consistency during network partitions. The system remains available and responsive but may return stale or divergent data.
1. Network partition occurs
2. System detects the partition
3. Chooses availability over consistency
4. Both sides of the partition continue serving requests
5. Data may diverge (eventual consistency)
6. Conflicts are resolved when the partition heals
Characteristics:
  • Always responsive
  • May return stale data
  • Eventually becomes consistent
Examples:
  • Cassandra: Always available, eventually consistent
  • DynamoDB: Configurable but can prioritize availability
  • Riak: Designed for high availability
  • CouchDB: Multi-master with conflict resolution
Use cases:
  • Social media feeds (stale tweets are acceptable)
  • Product catalogs (slight staleness is tolerable)
  • Chat applications (message delivery over ordering)
  • Shopping carts (temporary inconsistency is acceptable)
Companies don’t choose Cassandra for chat applications simply because it’s an AP system. There are many other characteristics that make Cassandra desirable for storing chat messages. The CAP classification is just one factor.

CA: Consistency + Availability

Trade-off: Not partition tolerant. This combination only works in non-distributed systems or when you can guarantee no network partitions (which is impossible in practice).
Examples:
  • Single-node databases: PostgreSQL, MySQL (on one server)
  • Traditional RDBMS without replication
In truly distributed systems, CA is not achievable. Network partitions are inevitable, making partition tolerance mandatory.

Common Misconceptions

Misconception 1: “Pick Two” is Complete

Myth: You simply pick two properties and you’re done.
Reality: The theorem is about behavior during network partitions, not overall system design.
The CAP theorem “prohibits only a tiny part of the design space: perfect availability and consistency in the presence of partitions, which are rare.” — Eric Brewer, “CAP Twelve Years Later: How the ‘Rules’ Have Changed”

Misconception 2: The Choice is Binary

Myth: Systems are either “consistent” or “available.”
Reality: Consistency and availability exist on a spectrum. Systems can be “mostly consistent” or “highly available.”
Modern databases offer tunable consistency:
# Cassandra Python driver: tune consistency per statement
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

session.execute(SimpleStatement(query, consistency_level=ConsistencyLevel.QUORUM))  # More consistent
session.execute(SimpleStatement(query, consistency_level=ConsistencyLevel.ONE))     # More available

Misconception 3: CAP is About 100%

The theorem is about perfect availability and perfect consistency. In practice, systems make trade-offs:
  • 99.9% availability (not 100%)
  • Eventually consistent (not always consistent)
  • Tunable consistency levels

Misconception 4: CAP is Sufficient for Database Selection

Myth: Justifying a database choice purely on the CAP theorem is sufficient.
Reality: Many other factors matter: performance characteristics, operational complexity, ecosystem, team expertise, query patterns, and more.
When choosing Cassandra for chat messages, consider:
  • Write-heavy workload (Cassandra excels here)
  • Time-series data model fits well
  • Horizontal scalability needs
  • Operational maturity and team skills
  • Not just “it’s AP”

PACELC Theorem: A More Nuanced View

The PACELC theorem extends CAP to be more realistic. PACELC stands for:
  • If Partition (P): Choose between Availability (A) and Consistency (C)
  • Else (E): Choose between Latency (L) and Consistency (C)
PACELC recognizes that even when there’s no partition, you must trade off latency against consistency.

The Latency vs Consistency Trade-off

When the network is healthy (no partition): Lower latency → Weaker consistency:
Write to Node A → Immediately return ✓
                  (Don't wait for replication)
                  = Fast but potentially stale reads
Higher latency → Stronger consistency:
Write to Node A → Wait for majority of nodes to acknowledge
                → Then return ✓
                  = Slower but consistent reads
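The latency cost of stronger consistency can be made concrete with simulated per-replica acknowledgment times (the millisecond figures below are made up for illustration): waiting for more acks means waiting for slower replicas.

```python
# Sketch: write latency under different acknowledgment policies.
# The replica delays are invented numbers, purely for illustration.

replica_ack_ms = [5, 40, 120]  # time for each of 3 replicas to ack a write

# Wait for one ack: latency is the fastest replica, but others lag behind.
latency_one = min(replica_ack_ms)

# Wait for a quorum (majority, 2 of 3): latency is the 2nd-fastest ack.
quorum = len(replica_ack_ms) // 2 + 1
latency_quorum = sorted(replica_ack_ms)[quorum - 1]

# Wait for all replicas: latency is the slowest ack, but no replica is stale.
latency_all = max(replica_ack_ms)

print(latency_one, latency_quorum, latency_all)  # 5 40 120
```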
PACELC classifications (following Daniel Abadi’s original analysis):
  • PA/EL: Cassandra, Riak, DynamoDB with default eventually consistent reads (available during partitions, low latency in normal operation)
  • PC/EC: HBase (consistent during partitions and in normal operation)
  • PA/EC: MongoDB (available during partitions, consistent in normal operation)

Real-World Systems

Most modern databases allow you to tune the trade-off:

DynamoDB

# Strongly consistent read (slower, consistent)
response = table.get_item(
    Key={'id': '123'},
    ConsistentRead=True  # PC behavior
)

# Eventually consistent read (faster, may be stale)
response = table.get_item(
    Key={'id': '123'},
    ConsistentRead=False  # PA behavior
)

Cassandra

from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

# Strong consistency (CP-like)
session.execute(SimpleStatement(query, consistency_level=ConsistencyLevel.ALL))     # All replicas must respond
session.execute(SimpleStatement(query, consistency_level=ConsistencyLevel.QUORUM))  # Majority must respond

# High availability (AP-like)
session.execute(SimpleStatement(query, consistency_level=ConsistencyLevel.ONE))     # Any replica responds
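The reason QUORUM on both reads and writes yields up-to-date reads is the standard quorum overlap rule (not Cassandra-specific arithmetic): with N replicas, a read set of size R is guaranteed to intersect a write set of size W whenever R + W > N, so at least one replica in every read quorum has the latest write.

```python
# Quorum arithmetic: read and write quorums must overlap.
N = 3            # total replicas
W = N // 2 + 1   # quorum write: 2 of 3 replicas must ack
R = N // 2 + 1   # quorum read: 2 of 3 replicas must respond

# Overlap condition: any R replicas share at least one member
# with any W replicas, so a quorum read always sees the latest quorum write.
print(R + W > N)  # True
```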

MongoDB

// Write concern: How many nodes must acknowledge write
db.collection.insertOne(
  { name: "Alice" },
  { writeConcern: { w: "majority" } }  // Wait for majority (CP)
);

db.collection.insertOne(
  { name: "Bob" },
  { writeConcern: { w: 1 } }  // Wait for one node (AP)
);

Design Implications

Detecting Partitions

Systems must detect when partitions occur:
1. Heartbeat monitoring: nodes send periodic heartbeat messages to each other
2. Timeout detection: if a heartbeat isn’t received within the timeout, assume a partition
3. Decision point: the system must choose between consistency and availability
4. Partition handling: execute the chosen strategy (reject requests, or accept them and let replicas diverge)
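The heartbeat-and-timeout steps above can be sketched as a small failure detector (the `FailureDetector` class and timeout value are illustrative, not taken from any real cluster framework):

```python
# Sketch of heartbeat-based partition detection. Each node tracks when it
# last heard from its peers; a peer silent longer than the timeout is
# suspected of being partitioned away.
import time

HEARTBEAT_TIMEOUT = 3.0  # seconds of silence before suspecting a partition

class FailureDetector:
    def __init__(self, peers):
        now = time.monotonic()
        self.last_seen = {peer: now for peer in peers}

    def on_heartbeat(self, peer):
        # Record that this peer is alive and reachable right now.
        self.last_seen[peer] = time.monotonic()

    def suspected(self):
        # Peers whose last heartbeat is older than the timeout.
        now = time.monotonic()
        return [p for p, t in self.last_seen.items()
                if now - t > HEARTBEAT_TIMEOUT]

detector = FailureDetector(["node-b", "node-c"])
detector.on_heartbeat("node-b")
# If "node-c" stays silent past the timeout, suspected() reports it,
# and the system must then choose: consistency or availability?
```

Note that a timeout cannot distinguish a crashed node from a slow network, which is precisely why the detecting side is forced into the CAP trade-off.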

Conflict Resolution (AP Systems)

When choosing availability over consistency, systems need conflict resolution.
Last Write Wins (LWW):
Node A: Set X=5 at timestamp 100
Node B: Set X=7 at timestamp 105

Resolution: X=7 (higher timestamp wins)
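A minimal LWW resolver is a pure function over (value, timestamp) pairs (the tuple representation here is an assumption for illustration, not any specific database’s format):

```python
# Last-Write-Wins: keep whichever version carries the higher timestamp.
# The (value, timestamp) tuple format is illustrative only.

def lww_resolve(a, b):
    # a and b are (value, timestamp) pairs from two divergent replicas.
    return a if a[1] >= b[1] else b

node_a = (5, 100)  # X=5 written at timestamp 100
node_b = (7, 105)  # X=7 written at timestamp 105
print(lww_resolve(node_a, node_b))  # (7, 105): higher timestamp wins
```

The simplicity comes at a price: the losing write is silently discarded, which is why LWW is only appropriate when occasional lost updates are acceptable.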
Vector Clocks:
  • Track the causality of updates
  • Detect true conflicts
  • The application resolves conflicts
CRDTs (Conflict-free Replicated Data Types):
  • Mathematically proven to converge
  • No conflicts by design
  • Examples: counters, sets, registers
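A grow-only counter (G-Counter) is the classic introductory CRDT, sketched below (a simplified implementation, not a production library): each node increments only its own slot, and merge takes the element-wise maximum, so replicas converge no matter the order in which they exchange state.

```python
# Sketch of a G-Counter CRDT: conflict-free by construction.

class GCounter:
    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node_id -> increments originated at that node

    def increment(self):
        # Each node only ever advances its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + 1

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent,
        # which is what guarantees convergence.
        for node, n in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), n)

    def value(self):
        return sum(self.counts.values())

a, b = GCounter("A"), GCounter("B")
a.increment(); a.increment()   # A records 2 increments during a partition
b.increment()                  # B records 1 on the other side
a.merge(b); b.merge(a)         # exchange state once the partition heals
print(a.value(), b.value())    # both replicas converge to 3
```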

Practical Guidelines

When to Choose CP (Consistency + Partition Tolerance)

Financial transactions where accuracy is critical
Inventory systems (can’t oversell products)
Coordination services (leader election, locking)
Systems where stale data causes real problems
Accept:
  • Possible downtime during partitions
  • Higher latency for writes
  • Lower availability

When to Choose AP (Availability + Partition Tolerance)

Social media feeds and content
Recommendation systems
Caching layers
Analytics and monitoring
Shopping carts (temporary inconsistency acceptable)
Accept:
  • Eventual consistency
  • Possible stale data
  • Need for conflict resolution
  • More complex application logic

Is the CAP Theorem Really Useful?

Yes, but with caveats:
CAP opens our minds to trade-off discussions in distributed systems. It forces us to think about failure modes.
CAP is only part of the story. We need to dig deeper when picking the right database, considering:
  • Latency requirements
  • Query patterns
  • Operational complexity
  • Team expertise
  • Performance characteristics
CAP reminds us that there are fundamental trade-offs in distributed systems. There’s no perfect solution.
Use CAP as a starting point for discussion, not as the sole criterion for database selection.

Conclusion

The CAP theorem teaches us:
  1. Partition tolerance is mandatory in distributed systems
  2. Trade-offs are inevitable between consistency and availability during partitions
  3. The “pick two” framing is oversimplified—real systems offer tunable trade-offs
  4. PACELC provides a more complete picture including latency considerations
  5. Database selection requires more than CAP—consider the full context
Modern databases increasingly allow you to tune consistency per operation. You don’t have to make a system-wide choice—you can optimize each use case individually.

Next Steps

ACID Properties

Understand transaction guarantees in databases

Database Replication

Learn how replication affects consistency

Database Sharding

Explore horizontal scaling strategies

Distributed Systems

Learn about distributed system patterns
