Architecture
Read replicas extend the Raft consensus protocol with observer nodes.

Key Characteristics
- Observer Nodes: Don’t participate in Raft voting or consensus
- Async Replication: Changes stream asynchronously from primary cluster
- Timeline Consistency: Readers see a consistent snapshot at a point in time
- No Write Impact: Writes don’t wait for read replica acknowledgment
- Independent RF: Read replica clusters have their own replication factor (can be even numbers)
- Topology Awareness: Read replicas are aware of universe topology
Timeline Consistency vs. Eventual Consistency
Read replicas provide timeline consistency, which is strictly stronger than eventual consistency:

| Property | Timeline Consistency | Eventual Consistency |
|---|---|---|
| Read View | Consistent snapshot at specific timestamp | May observe out-of-order updates |
| Time Travel | Application’s view never moves backward | View can move backward and forward |
| Programmability | Predictable, easier to reason about | Complex application logic required |
| Guarantees | Reads at T see all writes < T | Reads eventually see all writes |
For example, if a read returns balance = 100, subsequent reads will never see balance = 80 from an earlier time. With eventual consistency, this regression is possible.
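The "view never moves backward" guarantee can be illustrated with a small client-side guard; this is a pure-Python sketch of the property, not YugabyteDB code, and all names are hypothetical:

```python
# Sketch: a guard that mirrors timeline consistency -- once a read is
# observed at snapshot timestamp T, later reads must come from a
# snapshot at T or later. Illustrative only; names are hypothetical.

class TimelineGuard:
    def __init__(self):
        self.last_seen_ts = 0  # highest snapshot timestamp observed so far

    def observe(self, snapshot_ts, value):
        """Accept a read only if its snapshot is not older than any prior read."""
        if snapshot_ts < self.last_seen_ts:
            raise RuntimeError(
                f"regression: snapshot {snapshot_ts} < {self.last_seen_ts}"
            )
        self.last_seen_ts = snapshot_ts
        return value

guard = TimelineGuard()
balance = guard.observe(100, 100)  # read at ts=100 sees balance=100
# guard.observe(90, 80) would raise: the view never moves backward
```

Under eventual consistency no such invariant holds, which is why the application logic column in the table above differs so sharply.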
Replication Factor
Every YugabyteDB universe has:

- One primary cluster: Participates in Raft consensus (typically RF=3, 5, or 7)
- Zero or more read replica clusters: Each with independent RF
Even Replication Factors
Read replica clusters can use even RFs since they don’t participate in consensus.

Write Handling on Read Replicas
Applications can send write requests to read replica nodes:

- Read replica node receives write request
- Internally forwards to primary cluster leader
- Primary cluster executes Raft consensus
- Write commits in primary cluster
- Change asynchronously replicates to read replicas
Trade-offs:

- ✅ Simplified application logic (single connection string)
- ✅ No need for application-level routing
- ❌ Higher write latency (cross-region forwarding + consensus)
- ❌ Suited only to read-heavy workloads
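The forwarding flow above can be modeled in a few lines; this is a toy simulation to show the ordering of commit and async replication, not YugabyteDB code:

```python
# Toy model of write forwarding: a replica node accepts a write,
# forwards it to the primary, and only later sees the change via
# async replication. Illustrative only.

class Primary:
    def __init__(self):
        self.data = {}
        self.log = []  # committed changes awaiting async replication

    def write(self, key, value):
        # stands in for Raft consensus + commit on the primary cluster
        self.data[key] = value
        self.log.append((key, value))

class ReadReplica:
    def __init__(self, primary):
        self.primary = primary
        self.data = {}  # asynchronously replicated copy

    def write(self, key, value):
        # the replica never commits locally; it forwards to the primary
        self.primary.write(key, value)

    def apply_pending(self):
        # async replication: pull committed changes from the primary
        for key, value in self.primary.log:
            self.data[key] = value

primary = Primary()
replica = ReadReplica(primary)
replica.write("balance", 100)        # forwarded; committed on primary
stale = replica.data.get("balance")  # None: change not yet replicated
replica.apply_pending()              # async stream catches up
fresh = replica.data["balance"]      # 100
```

The gap between `stale` and `fresh` is exactly the replication lag discussed later in this document.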
Schema Changes
DDL operations are transparently applied to read replicas.

Deployment Scenarios
Global Reads with Regional Writes
Use Case: E-commerce platform with US-based writes, global reads

Disaster Recovery
Use Case: DR site that serves reads during normal operation

Analytics and Reporting
Use Case: Offload analytical queries from production

Setup and Configuration
Creating Read Replica Cluster
Using yugabyted:

- Navigate to universe details
- Click “Add Read Replica”
- Configure:
  - Region/zones for read replica
  - Replication factor
  - Instance type
  - Node count
- Deploy
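The yugabyted-based flow can be sketched as shell commands. The `--read_replica` flag and `configure_read_replica` subcommand are assumptions based on recent yugabyted releases; verify them against `yugabyted --help` for your version, and replace the addresses with your own:

```shell
# Assumed yugabyted flow -- flags are placeholders; verify locally.
# Start a node in the read replica region, joining the existing universe:
yugabyted start \
  --advertise_address=10.0.2.1 \
  --join=10.0.1.1 \
  --read_replica

# Then create the read replica cluster with its own replication factor:
yugabyted configure_read_replica new --rf=1
```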
Configuring Connection Pools
Direct read traffic to read replicas.

Application Configuration Patterns
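The separate-connection-pools pattern can be sketched without driver dependencies. The DSNs and the routing rule below are illustrative placeholders; a real deployment would back each entry with an actual pool (e.g. via psycopg2 or a YugabyteDB smart driver):

```python
# Sketch: route statements to a write pool (primary cluster) or a
# read pool (read replica nodes). DSNs are placeholders.

POOLS = {
    "write": "postgresql://app@primary.us-east.example:5433/yugabyte",
    "read":  "postgresql://app@replica.eu-west.example:5433/yugabyte",
}

READ_PREFIXES = ("select", "show", "with")

def route(sql: str) -> str:
    """Pick a pool by statement type: reads go to replicas, writes to primary."""
    # Naive rule: a `WITH ... INSERT` CTE would be misrouted here;
    # production routers inspect the statement more carefully.
    head = sql.lstrip().split(None, 1)[0].lower()
    return "read" if head in READ_PREFIXES else "write"

assert route("SELECT * FROM orders") == "read"
assert route("INSERT INTO orders VALUES (1)") == "write"
```

Keeping routing at the pool layer means application code stays unaware of topology, while reads still land close to the user.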
Separate Connection Pools: dedicate one pool to the primary cluster for writes and a second to the read replica nodes for reads.

Monitoring Read Replicas
Replication Lag
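Replication lag can be polled from each tserver's metrics endpoint and checked against a threshold. The payload shape below is an assumption for illustration; adapt the parsing to the actual metrics output (tservers expose metrics over HTTP, commonly on port 9000):

```python
# Sketch: flag read replica tservers whose async replication lag
# exceeds a threshold. The samples dict shape is assumed; in practice
# you would populate it from each tserver's HTTP metrics endpoint.

LAG_METRIC = "async_replication_lag_micros"
THRESHOLD_MICROS = 10_000_000  # 10 s, matching the threshold table below

def lagging_servers(samples, threshold=THRESHOLD_MICROS):
    """samples: {tserver_address: {metric_name: value_in_micros}}"""
    return sorted(
        addr for addr, metrics in samples.items()
        if metrics.get(LAG_METRIC, 0) > threshold
    )

samples = {
    "10.0.2.1:9000": {LAG_METRIC: 1_200_000},   # 1.2 s: healthy
    "10.0.2.2:9000": {LAG_METRIC: 42_000_000},  # 42 s: alert
}
# lagging_servers(samples) -> ["10.0.2.2:9000"]
```

A checker like this can feed an alerting pipeline directly, complementing the Grafana dashboards mentioned below.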
Health Metrics
Key metrics to monitor:

| Metric | Description | Threshold |
|---|---|---|
| async_replication_lag_micros | Replication delay from primary | < 10s |
| async_replication_sent_lag_micros | Network propagation delay | < 1s |
| follower_lag_ms | Lag within read replica cluster | < 100ms |
| handler_latency_yb_tserver_TabletServerService_Read | Read latency | < 50ms |
Grafana Dashboard
Performance Considerations
Read Latency
Expected latencies:

- Same region: 5-20ms (network + query execution)
- Cross-region: 20-100ms (depends on geographic distance)
- Cross-continent: 100-300ms
Replication Lag
Factors affecting lag:

- Network bandwidth: Higher throughput reduces lag
- Write rate: Sustained writes can increase lag
- Tablet splits: Temporary lag spike during split operations
- Compactions: Background operations may affect lag
Scaling Reads
Increase read capacity by:

- Adding more read replica nodes in existing cluster
- Creating additional read replica clusters in new regions
- Increasing RF of read replica cluster
Failover and Promotion
Promoting Read Replica to Primary
Scenario: Primary region failure, promote DR read replica.

- Former read replica now participates in consensus
- Write latency determined by new primary region
- Can create new read replicas as needed
Best Practices
- Right-Size Read Replica Clusters:
  - Match RF to availability requirements
  - Use smaller RFs (1-2) for cost optimization
  - Consider read workload patterns
- Monitor Replication Lag:
  - Alert on lag > 10 seconds
  - Investigate sustained lag immediately
  - Correlate with write throughput
- Application Design:
  - Use separate connection pools for reads/writes
  - Leverage smart drivers for topology awareness
  - Handle stale reads gracefully in application logic
- Geographic Distribution:
  - Place read replicas close to user populations
  - Consider data residency requirements
  - Balance cost vs. latency for region selection
- Resource Allocation:
  - Read replicas can use different instance types
  - Optimize for read workload (more CPU, less storage IOPS)
  - Monitor and adjust based on actual usage
- Testing:
  - Regularly test failover procedures
  - Verify application behavior with stale reads
  - Load test read replica clusters independently
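The "handle stale reads gracefully" advice can be sketched as a staleness-bounded read with primary fallback. This is a pure-Python model with hypothetical names, not driver code:

```python
import time

# Sketch: serve from the replica only while its copy is fresh enough;
# otherwise fall back to the primary. Illustrative model only.

MAX_STALENESS_S = 1.0

def read(key, replica, primary, now=None):
    """Prefer the replica; fall back to the primary if the copy is too stale."""
    now = time.time() if now is None else now
    value, applied_at = replica.get(key, (None, 0.0))
    if value is not None and now - applied_at <= MAX_STALENESS_S:
        return value, "replica"
    return primary[key], "primary"

primary = {"balance": 120}
replica = {"balance": (100, time.time() - 5.0)}  # replicated 5 s ago
value, source = read("balance", replica, primary)
# -> (120, "primary"): the replica copy exceeded the staleness bound
```

Bounding staleness in the application keeps the latency benefit of local reads while capping how out-of-date a served value can be.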
Limitations
- Write Latency: Writes forwarded from read replicas incur cross-region penalty
- Replication Lag: Reads may be slightly stale (typically < 1 second)
- No Strong Consistency: Read replicas don’t provide read-your-writes for cross-region writes
- Schema Changes: DDL operations propagate asynchronously

