Transaction Fundamentals
A transaction is a sequence of operations performed as a single logical unit of work. In YugabyteDB:
- All operations are transactions: Even single-row updates use the transaction infrastructure
- Cross-shard support: Transactions can span multiple tablets on different nodes
- ACID guarantees: Full support for atomicity, consistency, isolation, and durability
- Multiple isolation levels: Serializable, Snapshot (Repeatable Read), and Read Committed
In YSQL, when autocommit is enabled (the default), each statement executes as its own transaction unless wrapped in an explicit BEGIN/COMMIT block.
Hybrid Logical Clocks
YugabyteDB uses Hybrid Logical Clocks (HLC) to provide globally ordered timestamps without requiring atomic clocks like Google’s TrueTime.
HLC Structure
Each HLC is a tuple: (physical_time, logical_counter)
- Physical component: Initialized from node’s system clock (CLOCK_REALTIME)
- Logical component: Monotonically increasing counter for same physical time
How HLC Updates Work
- Node computes its local HLC as (current_time, 0)
- On RPC communication, nodes exchange HLC values
- Node with lower HLC updates to max(local_HLC, received_HLC) + (0, 1)
- Physical component only increases, logical resets to 0 when physical advances
Ordering Guarantees
- Events connected by causality get increasing hybrid timestamps
- “A happens before B on same server” → HLC(A) < HLC(B)
- “A sends RPC to server where B happens” → HLC(A) < HLC(B)
- HLCs are compared as tuples: physical time takes precedence
Clock Skew Handling
- No external clock synchronization required (but NTP recommended)
- Bounded by maximum clock skew between nodes (typically < 500ms)
- Higher skew increases transaction conflict probability
- Leader leases prevent split-brain despite clock drift
HLC in Action
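The update rule above can be sketched in a few lines. This is an illustrative model, not YugabyteDB code: timestamps are (physical, logical) tuples compared lexicographically, and the merge follows the "max of both clocks, then tick the counter" rule described in the steps.

```python
def merge_hlc(local_hlc, received_hlc, wall_clock):
    """Merge a remote HLC on RPC receipt, per the update rule above."""
    # Take the max of the local HLC, the received HLC, and the local wall clock...
    physical, logical = max(local_hlc, received_hlc, (wall_clock, 0))
    # ...then tick the logical counter so the result strictly exceeds both inputs
    return (physical, logical + 1)

# Causality: an event that happens after receiving an RPC gets a larger timestamp
a = (100, 2)                      # sender's HLC when the RPC leaves
b = merge_hlc((99, 7), a, 98)     # receiver's HLC after processing it
assert a < b                      # tuple comparison: physical time dominates
```

Note how the tuple comparison also captures the ordering guarantees: (100, 2) < (101, 0) because the physical component takes precedence, and (100, 2) < (100, 3) falls back to the logical counter.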
Provisional Records
Uncommitted transaction data is stored separately from committed data to maintain atomicity across shards.
IntentsDB vs RegularDB
Each tablet maintains two RocksDB instances:
- IntentsDB: Stores provisional (uncommitted) records from active transactions
- RegularDB: Stores committed data visible to all readers
This separation provides several benefits:
- Atomicity: Uncommitted data stays invisible until commit
- Easy cleanup: Abort transactions by deleting IntentsDB entries
- Efficient scanning: List all provisional records for a transaction
- Independent strategies: Different compaction/flush policies
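The separation can be illustrated with a toy model (not YugabyteDB code): provisional writes land in an intents map keyed by transaction, and only move to the regular store on commit, while abort is just a delete.

```python
class Tablet:
    """Toy model of the IntentsDB / RegularDB split on one tablet."""

    def __init__(self):
        self.intents_db = {}   # txn_id -> {key: value}, uncommitted
        self.regular_db = {}   # key -> value, committed and visible

    def write_intent(self, txn_id, key, value):
        # Provisional write: invisible to readers until commit
        self.intents_db.setdefault(txn_id, {})[key] = value

    def commit(self, txn_id):
        # Expose the transaction's writes to readers
        self.regular_db.update(self.intents_db.pop(txn_id, {}))

    def abort(self, txn_id):
        # Cleanup is just deleting the IntentsDB entries
        self.intents_db.pop(txn_id, None)

    def read(self, key):
        # Readers see only committed data
        return self.regular_db.get(key)
```

For example, `write_intent("tx1", "k", 1)` followed by `read("k")` returns `None` until `commit("tx1")` runs, which is exactly the atomicity property the list above describes.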
Provisional Record Types
1. Primary Provisional Records (Write Intents)
- Acts as a persistent lock on the key
- Contains the actual value being written
- Lock types: SI Write, Serializable Read/Write, Weak/Strong
2. Transaction Metadata Records
- Maps transaction to its status tablet
- Stores isolation level (Serializable, Snapshot, Read Committed)
- Random priority for conflict resolution (Fail-on-Conflict mode)
3. Reverse Index Records
- Enables finding all provisional records for a transaction
- Used during commit/abort cleanup
- Write ID suffix prevents key collisions
Lock Types and Conflict Resolution
| Lock Type | Used For | Conflicts With |
|---|---|---|
| StrongSIWrite | Column write (Snapshot Isolation) | Any write |
| WeakSIWrite | Row-level marker | Strong writes on same row |
| SerializableRead | Read in serializable txn | Any write |
| SerializableWrite | Write in serializable txn | Any read or write |
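The matrix can be read as a symmetric conflict predicate. The encoding below is a simplified sketch of the table, not YugabyteDB source; it ignores the finer weak/strong distinctions beyond what the table states.

```python
WRITES = {"StrongSIWrite", "WeakSIWrite", "SerializableWrite"}
STRONG_WRITES = {"StrongSIWrite", "SerializableWrite"}

def conflicts(a, b):
    """Do two lock types on the same key conflict? (Symmetric check.)"""
    def one_way(x, y):
        if x == "StrongSIWrite":
            return y in WRITES                         # conflicts with any write
        if x == "WeakSIWrite":
            return y in STRONG_WRITES                  # only strong writes conflict
        if x == "SerializableRead":
            return y in WRITES                         # conflicts with any write
        if x == "SerializableWrite":
            return y in WRITES or y == "SerializableRead"  # any read or write
        return False
    return one_way(a, b) or one_way(b, a)
```

Note that two `SerializableRead` locks never conflict (reads can share), and two `WeakSIWrite` row markers coexist, which is what lets many snapshot-isolation transactions touch different columns of the same row concurrently.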
Transaction Status Tracking
Transaction status is tracked in a distributed transaction status table.
Status Table Architecture
- Sharded table: Transaction IDs map to status tablet via hash
- In-memory: Data kept in memory, backed by Raft WAL
- Single-shard ACID: Status updates use single-shard transactions
- High availability: Replicated via Raft like any other tablet
Transaction Status Records
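A status record is essentially a small state machine: a transaction starts PENDING and transitions exactly once to COMMITTED (with a commit hybrid time) or ABORTED. The field names below are illustrative, not YugabyteDB's actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class TxnStatusRecord:
    """Hedged sketch of what a transaction status record tracks."""
    txn_id: str
    state: str = "PENDING"                              # PENDING | COMMITTED | ABORTED
    commit_hybrid_time: Optional[Tuple[int, int]] = None

    def commit(self, hybrid_time):
        assert self.state == "PENDING", "only pending transactions can commit"
        self.state = "COMMITTED"
        # All of the transaction's writes become visible at this one timestamp
        self.commit_hybrid_time = hybrid_time

    def abort(self):
        assert self.state == "PENDING", "only pending transactions can abort"
        self.state = "ABORTED"
```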
Commit Process
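The key idea of the commit path is that flipping the status record is the commit point: it is a single-shard, Raft-replicated write, after which the client can be acknowledged while intent cleanup proceeds asynchronously. The sketch below models tablets and records as plain dicts; all names are illustrative, not real APIs.

```python
def commit_transaction(status_records, participant_tablets, txn_id, commit_time):
    """Hedged sketch of the multi-shard commit path."""
    # 1. Single-shard ACID update on the status tablet: PENDING -> COMMITTED.
    #    Once this write is Raft-replicated, the transaction is durably committed.
    assert status_records[txn_id]["state"] == "PENDING"
    status_records[txn_id] = {"state": "COMMITTED", "commit_time": commit_time}

    # 2. The client can be acknowledged as soon as step 1 is durable.

    # 3. Asynchronously, each participant tablet moves the transaction's
    #    intents from IntentsDB into RegularDB, stamped with the commit time.
    for tablet in participant_tablets:
        intents = tablet["intents"].pop(txn_id, {})
        for key, value in intents.items():
            tablet["regular"][key] = (value, commit_time)
    return status_records[txn_id]
```

The important design choice this models: commit latency is one extra round trip to the status tablet, not a round trip per participant shard, because cleanup happens after the commit point.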
Multi-Version Concurrency Control (MVCC)
YugabyteDB uses MVCC to allow concurrent transactions without locking on reads.
How MVCC Works
Each transaction reads at a specific hybrid timestamp. MVCC ensures that the transaction sees a consistent snapshot, even as other transactions commit changes.
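A snapshot read reduces to a simple rule: among all versions of a key, return the newest one whose commit time is at or before the reader's hybrid timestamp. A toy version (assuming versions are kept sorted by commit time):

```python
def mvcc_read(versions, snapshot_ts):
    """versions: list of (commit_time, value), sorted ascending by commit_time.
    Return the newest value committed at or before snapshot_ts, else None."""
    visible = None
    for commit_time, value in versions:
        if commit_time <= snapshot_ts:
            visible = value      # still the newest version the snapshot can see
        else:
            break                # everything later committed after the snapshot
    return visible
```

Because a version committed after `snapshot_ts` is never returned, the transaction keeps seeing the same consistent snapshot even while concurrent writers commit new versions.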
Garbage Collection
Old versions are cleaned up when:
- No active transactions need them (based on oldest running transaction)
- History retention period expires
- Compaction runs on the tablet
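The retention rule can be sketched as follows (an illustrative model, not the actual compaction code): given a history cutoff derived from the oldest running transaction and the retention period, keep the newest version at or before the cutoff (a reader at the cutoff still needs it) plus everything newer.

```python
def gc_versions(versions, history_cutoff):
    """versions: list of (commit_time, value), sorted ascending.
    Drop versions that no reader at or after history_cutoff can ever see."""
    keep_from = 0
    for i, (commit_time, _value) in enumerate(versions):
        if commit_time <= history_cutoff:
            keep_from = i        # newest version still visible at the cutoff
        else:
            break
    return versions[keep_from:]
```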
Isolation Levels
YugabyteDB supports three isolation levels in YSQL:
Serializable
- Strongest guarantee: Appears as if transactions executed serially
- Read and write locks: Tracks both reads and writes
- Conflict detection: Aborts if conflicts detected
- Performance: Highest isolation, potential for more conflicts
Snapshot (Repeatable Read)
- Snapshot consistency: Reads see consistent snapshot at transaction start
- Write conflict detection: Aborts on write-write conflicts
- Default for YCQL: YCQL only supports this level
- Performance: Good balance of consistency and throughput
Read Committed
- Statement-level snapshots: Each statement sees latest committed data
- Lowest isolation: Allows non-repeatable reads and phantom reads
- High concurrency: Fewer conflicts, higher throughput
- PostgreSQL compatible: Default in PostgreSQL
Transaction Example
Failure Scenarios
Transaction Manager Failure
Tablet Leader Failure
Performance Characteristics
Single-Shard Transactions
- Fastest path: no distributed coordination
- Raft replication only (3-5ms typical latency)
- No provisional records needed in some cases
- Automatically detected and optimized
Multi-Shard Transactions
- Additional RTT to status tablet for commit
- Provisional records written to all shards
- Cleanup happens asynchronously
- Typical latency: 10-20ms (geo-distributed: 100-200ms)
Conflict Resolution
- Wait-on-Conflict (default): Wait for conflicting transaction
- Fail-on-Conflict: Abort lower priority transaction immediately
- Configurable per transaction or globally
- Affects throughput under high contention
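The two policies can be contrasted in a toy decision function (an assumed simplification: each transaction carries the random priority mentioned above, and the "requester" is the transaction that hit the conflict):

```python
def resolve_conflict(policy, requester_priority, holder_priority):
    """Toy model of the two conflict-resolution policies."""
    if policy == "Fail-on-Conflict":
        # Abort whichever transaction has the lower priority, immediately
        if requester_priority < holder_priority:
            return "abort_requester"
        return "abort_holder"
    # Wait-on-Conflict (default): the requester queues behind the holder
    return "wait"
```

Under high contention this is the trade-off the list describes: waiting preserves work but can stack up queues, while failing fast frees resources at the cost of retries.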
Best Practices
Keep Transactions Short
Minimize transaction duration to reduce conflicts and lock contention
Batch Operations
Group related operations in single transaction for atomicity
Use Appropriate Isolation
Choose lowest isolation level that meets consistency needs
Handle Conflicts
Implement retry logic for serialization errors and timeouts
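A generic retry loop for serialization failures might look like the sketch below. The exception class stands in for YSQL's serialization error (SQLSTATE 40001 in PostgreSQL-compatible drivers); the helper names are illustrative.

```python
import random
import time

class SerializationFailure(Exception):
    """Stand-in for a serialization error surfaced by the driver."""

def with_retries(txn_body, max_attempts=5, base_delay=0.01):
    """Run txn_body, retrying on serialization failures with jittered backoff.
    txn_body is any callable that raises SerializationFailure on conflict."""
    for attempt in range(max_attempts):
        try:
            return txn_body()
        except SerializationFailure:
            # Randomized exponential backoff reduces repeated collisions
            time.sleep(base_delay * (2 ** attempt) * random.random())
    raise RuntimeError(f"transaction failed after {max_attempts} attempts")
```

In real code, `txn_body` would open a fresh transaction each attempt (BEGIN ... COMMIT); retrying inside an already-aborted transaction does not work.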
Next Steps
Consistency Model
Learn about consistency guarantees and linearizability
Replication
Understand how Raft replication works
Data Model
Explore how data is stored in DocDB
Architecture
Review the overall system architecture

