How Transactions Work
CockroachDB treats all SQL statements as transactions; even single statements run in what’s called “autocommit mode.” This ensures consistency is maintained at all times. When you execute multiple statements together, CockroachDB bundles them into a single all-or-nothing transaction.
Transaction Begins
When you start a transaction, CockroachDB assigns it a timestamp using hybrid-logical clocks (HLC). This timestamp tracks both physical time and logical ordering, ensuring transactions maintain a consistent view of your data.
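The hybrid-logical clock can be pictured as a pair of physical time and a logical counter. The sketch below is a simplified illustration, not CockroachDB’s actual implementation; it shows how timestamps stay strictly ordered even when the physical clock does not advance between reads.

```python
import time

class HLC:
    """Minimal sketch of a hybrid-logical clock (illustrative model only)."""
    def __init__(self):
        self.wall = 0      # last observed physical time (nanoseconds)
        self.logical = 0   # logical counter to order same-wall-time events

    def now(self):
        physical = time.time_ns()
        if physical > self.wall:
            self.wall, self.logical = physical, 0   # physical time advanced
        else:
            self.logical += 1                       # tie-break logically
        return (self.wall, self.logical)

clock = HLC()
t1, t2 = clock.now(), clock.now()
assert t1 < t2  # HLC timestamps are strictly increasing
```

Because timestamps are tuples, they compare lexicographically: physical time first, logical counter as the tie-breaker.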
Writes Create Intents
Write operations don’t immediately modify data. Instead, they create write intents: provisional values that act as both locks and uncommitted data. These intents point to a transaction record that tracks the transaction’s state (PENDING, STAGING, COMMITTED, or ABORTED).
Reads Check for Conflicts
When reading data, CockroachDB checks for write intents from other transactions. If it encounters an intent, it resolves the conflict by checking the intent’s transaction state and either waiting, pushing timestamps forward, or aborting one transaction.
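A reader’s decision on encountering an intent can be sketched as a dispatch on the owning transaction’s state. This is a deliberately simplified model of the resolution logic described above, not the actual resolution code:

```python
def resolve_intent(intent_state):
    """Sketch of a reader's choice when it finds a write intent
    (states match the transaction-record states described above)."""
    if intent_state == "COMMITTED":
        return "read intent value"      # owner committed: intent is real data
    if intent_state == "ABORTED":
        return "ignore intent"          # owner aborted: use the prior value
    return "wait or push or abort"      # PENDING/STAGING: owner still in flight

assert resolve_intent("ABORTED") == "ignore intent"
```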
Transaction Lifecycle
Here’s what happens when your transaction moves through CockroachDB:
Gateway Node Processing
Your SQL statement arrives at a gateway node, which:
- Parses and validates your SQL
- Creates a logical execution plan
- Converts SQL operations into key-value operations
- Manages transaction coordination
Any node can act as a gateway. CockroachDB’s symmetrical architecture means you can connect to any node and access any data while maintaining strong consistency.
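Conceptually, the SQL-to-key-value step maps each row onto keys prefixed by table and index identifiers. The sketch below uses a hypothetical, human-readable key format for illustration; CockroachDB’s real key encoding is binary and considerably more involved:

```python
def row_to_kv(table_id, index_id, pk, cols):
    """Hypothetical sketch of mapping a SQL row onto a key-value pair.
    Real CockroachDB keys are binary-encoded; this format is illustrative."""
    key = f"/Table/{table_id}/{index_id}/{pk}"
    return key, cols

# A row in a hypothetical "users" table (table id and values are made up)
key, value = row_to_kv(53, 1, 123, {"name": "Ada"})
assert key == "/Table/53/1/123"
```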
Distribution to Leaseholders
The gateway’s DistSender component routes operations to the appropriate leaseholder nodes. Each range has one leaseholder that:
- Serves all reads for that range
- Coordinates all writes to that range
- Acts as the Raft group leader for consensus
Consensus via Raft
Write operations go through the Raft consensus protocol:
- The leaseholder proposes the write to its Raft group
- A majority of replicas must acknowledge the write
- Once consensus is reached, the write is committed to the Raft log
- The write is applied to the storage engine
CockroachDB uses Leader leases to ensure the Raft leader and leaseholder are always co-located, reducing network round trips and improving performance.
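The majority requirement in the steps above reduces to a simple quorum check, sketched here in isolation (not a full Raft implementation):

```python
def quorum_reached(acks, replicas):
    """Sketch of Raft's majority rule: a write commits once more than
    half of the replicas have acknowledged it."""
    return acks > replicas // 2

assert quorum_reached(2, 3)        # 2 of 3 replicas: majority reached
assert not quorum_reached(2, 5)    # 2 of 5 replicas: not a majority
```

This is why a 3-replica range tolerates one node failure and a 5-replica range tolerates two.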
Isolation Levels
CockroachDB supports two isolation levels:
SERIALIZABLE (Default)
The strongest isolation level that prevents all concurrency anomalies:
- Guarantees: Complete isolation from concurrent transactions
- Read behavior: Transactions maintain a single read snapshot
- Write behavior: Uses read refreshing to handle timestamp conflicts
- Retry handling: May require client-side retry logic for conflicts
Use SERIALIZABLE when you need the strongest consistency guarantees and can handle occasional transaction retries.
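Client-side retry logic typically wraps the whole transaction in a loop and retries on a serialization failure, which CockroachDB surfaces as SQLSTATE 40001. A minimal self-contained sketch, with a stand-in exception class and a hypothetical run_txn helper in place of a real database driver:

```python
class SerializationFailure(Exception):
    """Stand-in for a retryable error (SQLSTATE 40001)."""

def run_txn(txn_fn, max_retries=3):
    """Hypothetical retry wrapper: re-run the whole transaction on conflict."""
    for _ in range(max_retries):
        try:
            return txn_fn()               # in real code: BEGIN ... COMMIT
        except SerializationFailure:
            continue                      # in real code: ROLLBACK, then retry
    raise SerializationFailure("retries exhausted")

attempts = []
def flaky_txn():
    attempts.append(1)
    if len(attempts) < 2:
        raise SerializationFailure()      # first attempt hits a conflict
    return "ok"

assert run_txn(flaky_txn) == "ok" and len(attempts) == 2
```

Production code would also add backoff between attempts and ensure the transaction body is safe to re-execute.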
READ COMMITTED
A weaker isolation level that avoids serialization errors:
- Guarantees: Prevents dirty reads and dirty writes
- Read behavior: Each statement gets a fresh read snapshot
- Write behavior: Allows read/write timestamp skew at commit time
- Retry handling: Automatic statement-level retries (no client-side logic needed)
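The difference in read-snapshot behavior between the two levels can be modeled with a toy versioned store. This is an illustrative simplification, not CockroachDB internals:

```python
# Committed value of x before and after a concurrent transaction commits
committed_history = [{"x": 1}, {"x": 2}]

def read_x(isolation, statement_index):
    """Toy model: which snapshot does the Nth statement of a txn read?"""
    if isolation == "SERIALIZABLE":
        return committed_history[0]["x"]            # one snapshot for the whole txn
    return committed_history[statement_index]["x"]  # fresh snapshot per statement

assert read_x("SERIALIZABLE", 1) == 1    # repeatable read within the txn
assert read_x("READ COMMITTED", 1) == 2  # later statement sees the new commit
```

The second assertion is exactly the non-repeatable read that READ COMMITTED permits.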
Use READ COMMITTED when you need higher concurrency and can tolerate non-repeatable reads.
Concurrency Control
CockroachDB manages concurrent access through several mechanisms:
Timestamp Cache
The timestamp cache tracks the highest timestamp for reads served by each leaseholder. When a write arrives, CockroachDB checks it against the timestamp cache:
- If the write’s timestamp is too old, it’s pushed forward to preserve serializability
- This ensures writes can never rewrite history
- Under SERIALIZABLE isolation, pushed timestamps may trigger read refreshing or transaction retries
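The push-forward rule can be sketched as a single comparison against the cache’s high-water mark. This is simplified; the real cache tracks read timestamps per key span rather than one global value:

```python
def apply_timestamp_cache(write_ts, read_high_water):
    """Sketch of the timestamp-cache check: a write at or below the
    highest served read timestamp is pushed forward, so it can never
    rewrite history a reader has already observed."""
    if write_ts <= read_high_water:
        return read_high_water + 1   # pushed past the latest served read
    return write_ts                  # no conflict: timestamp unchanged

assert apply_timestamp_cache(5, 10) == 11
assert apply_timestamp_cache(12, 10) == 12
```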
Lock Management
The concurrency manager uses latches and locks to sequence operations:
- Latches: Short-lived locks held during request evaluation
- Write intents: Replicated exclusive locks stored with the data
- Unreplicated locks: Fast in-memory locks for SERIALIZABLE transactions (unless durable locking is enabled)
Transaction Conflicts
When transactions conflict, CockroachDB resolves them through:
- Priority-based resolution: Explicit HIGH or LOW priority transactions win or lose automatically
- Expiration checking: Expired transactions are aborted
- Wait queue: Active transactions wait in the TxnWaitQueue for blocking transactions to complete
- Deadlock detection: One transaction is randomly aborted if circular dependencies are detected
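Deadlock detection amounts to finding a cycle in the wait-for graph of blocked transactions. A minimal sketch follows; CockroachDB uses a distributed detection algorithm, so this centralized version is only illustrative:

```python
def has_deadlock(waits_for):
    """Sketch: detect a cycle in a wait-for graph, where waits_for maps
    each transaction to the transactions it is blocked behind."""
    def reaches(start, node, seen):
        for nxt in waits_for.get(node, []):
            if nxt == start:
                return True                      # cycle back to the start
            if nxt not in seen and reaches(start, nxt, seen | {nxt}):
                return True
        return False
    return any(reaches(t, t, {t}) for t in waits_for)

assert has_deadlock({"t1": ["t2"], "t2": ["t1"]})   # mutual wait: deadlock
assert not has_deadlock({"t1": ["t2"], "t2": []})   # simple queueing: fine
```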
Performance Optimizations
Transaction Pipelining
Write intents are replicated in parallel rather than sequentially. The gateway doesn’t wait for each write to fully replicate before sending the next; it only waits at commit time. Impact: Dramatically reduces latency for multi-statement transactions by making consensus work approach O(1) instead of O(n).
Parallel Commits
CockroachDB commits transactions in one round of consensus instead of two:
- Mark the transaction as STAGING with its list of pending writes
- Wait for all writes to succeed
- Return success to the client (the transaction is now implicitly committed)
- Asynchronously update the transaction record to COMMITTED
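The key rule, that a STAGING record whose in-flight writes all succeeded already counts as committed, can be sketched as a small state transition. The record shape here is hypothetical and purely for illustration:

```python
def parallel_commit(all_writes_succeeded):
    """Sketch of the parallel-commit rule: once every in-flight write has
    succeeded, the client can be acknowledged immediately; flipping the
    record to COMMITTED happens off the latency-critical path."""
    record = {"status": "STAGING", "in_flight_writes": ["k1", "k2"]}
    if not all_writes_succeeded:
        return record, "cannot ack client yet"
    ack = "ack client now"               # implicit commit: one consensus round
    record["status"] = "COMMITTED"       # asynchronous cleanup step
    record["in_flight_writes"] = []
    return record, ack

record, ack = parallel_commit(True)
assert ack == "ack client now" and record["status"] == "COMMITTED"
```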
Read Refreshing
When a SERIALIZABLE transaction’s timestamp is pushed, CockroachDB attempts to refresh all previously read values at the new timestamp:
- If no values changed: Transaction commits at the pushed timestamp
- If values changed: Transaction must retry
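The refresh check can be modeled as verifying that no previously read key has a newer write. Here last_write_ts is a hypothetical map from key to the timestamp of its latest write, standing in for an MVCC scan:

```python
def can_refresh(read_keys, last_write_ts, old_ts):
    """Sketch of read refreshing: the pushed transaction may keep its
    reads only if none of the keys it read were rewritten after its
    original timestamp."""
    return all(last_write_ts.get(k, 0) <= old_ts for k in read_keys)

assert can_refresh({"a"}, {"a": 3}, old_ts=5)      # untouched: commit at pushed ts
assert not can_refresh({"a"}, {"a": 7}, old_ts=5)  # rewritten: must retry
```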
Automatic Retries
CockroachDB automatically retries certain transactions:
Individual Statements
Single statements (implicit transactions) are automatically retried as long as their results stay under 16 KiB (configurable via sql.defaults.results_buffer.size). Example: DELETE FROM users WHERE id = 123 retries automatically.
Batched Statements
Multiple statements sent as a single batch retry automatically if:
- They’re sent without returning intermediate results
- Combined results stay under the buffer size
- The client uses the PostgreSQL Extended Query protocol with Sync messages, or the Simple Query protocol with semicolon-separated statements
You can increase automatic retry opportunities by raising sql.defaults.results_buffer.size; however, larger buffers use more memory.
See Also
- Data Replication - How CockroachDB replicates data across nodes
- Multi-Region Deployments - Configuring transactions for geo-distributed clusters
- Resilience and Recovery - How transactions survive failures