Overview
Pulsar’s geo-replication features:
- Asynchronous replication - Messages are replicated asynchronously across clusters
- Topic-level and namespace-level - Configure replication per topic or namespace
- Multi-directional - Support for bidirectional and multi-way replication
- Automatic failover - Clients can automatically switch to other clusters
- Selective replication - Replicate specific topics or namespaces
Architecture
Geo-replication works as follows:
- Producer publishes message to local cluster
- Message is stored in local BookKeeper
- Replicator processes read messages from local storage
- Messages are sent to remote clusters
- Remote clusters store messages in their BookKeeper
- Consumers in any cluster can read messages
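The flow above can be sketched as a toy model. This is not how Pulsar is implemented: `Cluster`, `add_remote`, and `run_replicators` are hypothetical names, a Python list stands in for BookKeeper, and a queue stands in for the replicator's cursor. It only illustrates that a message is stored locally first and reaches the remote cluster later.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for a Pulsar cluster: a name plus an append-only log."""
    name: str
    log: list = field(default_factory=list)       # stands in for BookKeeper
    outbound: dict = field(default_factory=dict)  # remote name -> pending messages

    def add_remote(self, remote: "Cluster") -> None:
        self.outbound[remote.name] = deque()

    def publish(self, payload: str) -> None:
        # The producer publishes locally and the message is stored locally.
        message = {"payload": payload, "origin": self.name}
        self.log.append(message)
        # The replicator sees the stored message for each remote cluster.
        for queue in self.outbound.values():
            queue.append(message)


def run_replicators(source: Cluster, clusters: dict) -> int:
    """Drain the outbound queues and store the messages in the remote clusters."""
    delivered = 0
    for remote_name, queue in source.outbound.items():
        while queue:
            clusters[remote_name].log.append(queue.popleft())
            delivered += 1
    return delivered


east, west = Cluster("cluster-east"), Cluster("cluster-west")
clusters = {c.name: c for c in (east, west)}
east.add_remote(west)

east.publish("order-1")
backlog_before = len(east.outbound["cluster-west"])  # stored locally, not yet remote
run_replicators(east, clusters)
print(backlog_before, [m["payload"] for m in west.log])
```

Between `publish` and `run_replicators` the message exists only in the local cluster, which is the window that replication backlog and lag metrics measure.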
Configuration
Cluster Setup
Each cluster must know about the other clusters in the replication group.
Register Clusters
Register each cluster in the global configuration store.
List Clusters
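Registering and then listing clusters might look like the sketch below, which builds `pulsar-admin` command lines and prints them instead of running them. The host names are placeholders, and the `--url` and `--broker-url` flags are assumed from the `pulsar-admin clusters create` command; verify them against the CLI version you have installed.

```python
import shlex


def register_cluster_cmd(name: str, web_url: str, broker_url: str) -> list:
    """Build the pulsar-admin argv that registers one cluster."""
    return [
        "pulsar-admin", "clusters", "create", name,
        "--url", web_url,            # HTTP admin endpoint of the cluster
        "--broker-url", broker_url,  # binary protocol endpoint of the cluster
    ]


# Hypothetical endpoints; replace with the real addresses of each cluster.
CLUSTERS = {
    "cluster-east": ("http://east.example.com:8080", "pulsar://east.example.com:6650"),
    "cluster-west": ("http://west.example.com:8080", "pulsar://west.example.com:6650"),
}

commands = [register_cluster_cmd(name, *urls) for name, urls in CLUSTERS.items()]
commands.append(["pulsar-admin", "clusters", "list"])  # confirm the registrations

for argv in commands:
    # To execute instead of print: subprocess.run(argv, check=True)
    print(shlex.join(argv))
```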
Broker Configuration
Configure replication settings in broker.conf:
- clusterName - Name of the local cluster. Must match the registered cluster name.
- replicationMetricsEnabled - Enable replication metrics collection.
- replicationConnectionsPerBroker - Maximum connections to open for each broker in a remote cluster. More connections improve throughput over high-latency links.
- replicationProducerQueueSize - Replicator producer queue size for outbound messages.
- replicationPolicyCheckDurationSeconds - Interval at which the replication policy is re-checked, to avoid replicator inconsistency due to a missing ZooKeeper watch. Set to 0 to disable.
Enable Replication for Namespace
Enable replication at the namespace level by setting the namespace's replication clusters, for example cluster-east and cluster-west. The setting applies to all topics in the namespace.
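As a sketch, the namespace-level setting can be expressed as a `pulsar-admin namespaces set-clusters` command. The tenant and namespace names are placeholders and the helper function is hypothetical; the command is printed rather than executed.

```python
import shlex


def set_namespace_clusters_cmd(namespace: str, clusters: list) -> list:
    """Build the pulsar-admin argv that sets a namespace's replication clusters."""
    if len(clusters) < 2:
        raise ValueError("geo-replication needs at least two clusters")
    return [
        "pulsar-admin", "namespaces", "set-clusters", namespace,
        "--clusters", ",".join(clusters),
    ]


# "my-tenant/my-namespace" is a placeholder namespace.
argv = set_namespace_clusters_cmd(
    "my-tenant/my-namespace", ["cluster-east", "cluster-west"]
)
print(shlex.join(argv))
```

The tenant that owns the namespace also has to be allowed to use both clusters, so the tenant's allowed-clusters list is worth checking before running the command.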
Enable Replication for Topic
Enable replication for a specific topic.
Replication Patterns
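Each pattern in this section can be read as a set of directed (source, destination) cluster pairs. The sketch below makes that explicit. The function names are illustrative and are not part of Pulsar, and how a given set of pairs maps onto namespace or topic settings depends on the deployment.

```python
from itertools import permutations


def active_active(a: str, b: str) -> set:
    """Both clusters replicate to each other."""
    return {(a, b), (b, a)}


def active_passive(primary: str, backup: str) -> set:
    """Only the primary replicates, toward the backup."""
    return {(primary, backup)}


def hub_and_spoke(hub: str, spokes: list) -> set:
    """The central cluster replicates to every regional cluster."""
    return {(hub, spoke) for spoke in spokes}


def full_mesh(clusters: list) -> set:
    """Every cluster replicates to every other cluster."""
    return set(permutations(clusters, 2))


regions = ["cluster-east", "cluster-west", "cluster-eu"]
for source, destination in sorted(full_mesh(regions)):
    print(f"{source} -> {destination}")
```

A full mesh of n clusters has n * (n - 1) directed links, so replicated traffic grows quickly as regions are added.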
Active-Active (Bidirectional)
Messages are replicated in both directions.
Active-Passive (Unidirectional)
Messages flow from the primary cluster to the backup cluster.
Multi-Region Hub and Spoke
A central cluster replicates to regional clusters.
Full Mesh
All clusters replicate to all other clusters.
Message Deduplication
Enable deduplication to prevent duplicate messages in replicated topics:
- brokerDeduplicationEnabled - Enable message deduplication at the broker level.
- brokerDeduplicationMaxNumberOfProducers - Maximum number of producers whose information is persisted for deduplication.
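To illustrate the idea behind deduplication, the toy model below keeps one high-water mark per producer and drops any message whose sequence ID is not higher than the last one accepted. `DedupFilter` is a hypothetical class for illustration only; the broker's real bookkeeping (persistence, snapshots, limits) is not modeled.

```python
class DedupFilter:
    """Toy sequence-ID deduplication: one high-water mark per producer."""

    def __init__(self) -> None:
        self.last_sequence_id = {}  # producer name -> highest sequence ID accepted

    def accept(self, producer: str, sequence_id: int) -> bool:
        last = self.last_sequence_id.get(producer, -1)
        if sequence_id <= last:
            return False  # already seen: a retry or a replayed message
        self.last_sequence_id[producer] = sequence_id
        return True


dedup = DedupFilter()
incoming = [
    ("producer-a", 0),
    ("producer-a", 1),
    ("producer-a", 1),  # retry of the previous message
    ("producer-b", 0),
    ("producer-a", 0),  # stale replay
]
stored = [message for message in incoming if dedup.accept(*message)]
print(stored)
```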
Monitoring Replication
Replication Stats
Get replication statistics for a topic. The replication entries include:
- replicationBacklog - Number of messages waiting to be replicated
- replicationDelayInSeconds - Replication lag
- inboundConnectedSince - When the inbound replicator connected
- outboundConnectedSince - When the outbound replicator connected
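A small sketch of reading these fields from a topic stats document follows. The JSON is a made-up excerpt, and its layout (a `replication` object keyed by remote cluster name) is an assumption to check against the actual output of `pulsar-admin topics stats` on your version.

```python
import json

# Illustrative excerpt of a topic stats document; the values are made up.
SAMPLE_STATS = """
{
  "replication": {
    "cluster-west": {
      "replicationBacklog": 1200,
      "replicationDelayInSeconds": 45
    },
    "cluster-eu": {
      "replicationBacklog": 0,
      "replicationDelayInSeconds": 0
    }
  }
}
"""


def replication_summary(stats: dict) -> dict:
    """Map each remote cluster to (backlog, delay in seconds)."""
    return {
        cluster: (
            entry.get("replicationBacklog", 0),
            entry.get("replicationDelayInSeconds", 0),
        )
        for cluster, entry in stats.get("replication", {}).items()
    }


summary = replication_summary(json.loads(SAMPLE_STATS))
for cluster, (backlog, delay) in sorted(summary.items()):
    print(f"{cluster}: backlog={backlog} delay={delay}s")
```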
Prometheus Metrics
Alert on Replication Lag
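One way to alert on lag is to compare a backlog metric against a threshold. The sketch below parses Prometheus text-format samples and returns the remote clusters that exceed the limit. The metric name `pulsar_replication_backlog`, the `remote_cluster` label, and the threshold of 10000 are assumptions; confirm the names against your brokers' metrics endpoint and pick a threshold that fits your traffic.

```python
import re

# Illustrative scrape output; metric and label names are assumptions.
SAMPLE_METRICS = '''
pulsar_replication_backlog{cluster="cluster-east",remote_cluster="cluster-west"} 15000
pulsar_replication_backlog{cluster="cluster-east",remote_cluster="cluster-eu"} 12
'''

LINE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def lagging_remotes(text: str, metric: str, threshold: float) -> list:
    """Return (remote_cluster, value) pairs whose sample exceeds the threshold."""
    alerts = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match or match.group("name") != metric:
            continue  # blank line, comment, or a different metric
        labels = dict(LABEL.findall(match.group("labels")))
        value = float(match.group("value"))
        if value > threshold:
            alerts.append((labels.get("remote_cluster", "unknown"), value))
    return alerts


alerts = lagging_remotes(SAMPLE_METRICS, "pulsar_replication_backlog", threshold=10000)
print(alerts)
```

In a real deployment this comparison would normally live in a Prometheus alerting rule with a hold duration, so that a brief spike does not page anyone.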
Client Configuration
Multi-Cluster Client
Clients can connect to multiple clusters for failover.
Automatic Failover
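The selection logic behind client failover can be sketched as choosing the first service URL whose cluster passes a health probe. This is a generic illustration, not the Pulsar client API: `pick_service_url`, the URLs, and the probe are hypothetical, and your client library may provide its own failover mechanism that should be preferred.

```python
from typing import Callable, Optional, Sequence


def pick_service_url(
    urls: Sequence[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first service URL whose cluster passes the health probe."""
    for url in urls:
        if is_healthy(url):
            return url
    return None  # no cluster is reachable


# Hypothetical service URLs, ordered by preference (primary first).
SERVICE_URLS = [
    "pulsar://east.example.com:6650",
    "pulsar://west.example.com:6650",
]

down = {"pulsar://east.example.com:6650"}  # pretend the primary is unreachable
chosen = pick_service_url(SERVICE_URLS, lambda url: url not in down)
print(chosen)
```

A real probe might attempt a TCP connection or call a health endpoint, and a production policy would usually add a delay before switching back to the primary to avoid flapping.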
Clients automatically fail over to available clusters when one becomes unavailable.
Selective Replication
Replicate Specific Topics
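A sketch of replicating only selected topics: the code builds one `pulsar-admin topics set-replication-clusters` command per critical topic and prints it. The topic names are placeholders, and the subcommand and its `--clusters` flag are assumed to be available, which depends on the Pulsar version and on topic-level policies being enabled.

```python
import shlex

# Hypothetical topic names; only these are replicated.
CRITICAL_TOPICS = [
    "persistent://my-tenant/my-namespace/orders",
    "persistent://my-tenant/my-namespace/payments",
]


def set_topic_clusters_cmd(topic: str, clusters: list) -> list:
    """Build the pulsar-admin argv that sets replication clusters for one topic."""
    return [
        "pulsar-admin", "topics", "set-replication-clusters",
        "--clusters", ",".join(clusters),
        topic,
    ]


topic_commands = [
    set_topic_clusters_cmd(topic, ["cluster-east", "cluster-west"])
    for topic in CRITICAL_TOPICS
]
for argv in topic_commands:
    # To execute instead of print: subprocess.run(argv, check=True)
    print(shlex.join(argv))
```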
Only enable replication for critical topics.
Message Properties
Filter replication based on message properties using topic policies and filters.
Troubleshooting
Check Replication Status
Common Issues
Replication Not Starting
- Verify cluster registration
- Check broker connectivity to remote cluster
- Verify namespace replication configuration
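The registration and namespace checks can be combined into one comparison. The sketch below assumes you have already collected the registered cluster names (for example from `pulsar-admin clusters list`) and the namespace's replication clusters (for example from `pulsar-admin namespaces get-clusters`) into Python sets. `diagnose_replication` is a hypothetical helper.

```python
def diagnose_replication(registered: set, namespace_clusters: set, local: str) -> list:
    """Return human-readable problems that would keep replication from starting."""
    problems = []
    if local not in namespace_clusters:
        problems.append(f"{local} is not in the namespace's replication clusters")
    for name in sorted(namespace_clusters - registered):
        problems.append(f"{name} is in the namespace policy but is not registered")
    if len(namespace_clusters) < 2:
        problems.append("namespace lists fewer than two clusters, so nothing replicates")
    return problems


problems = diagnose_replication(
    registered={"cluster-east"},
    namespace_clusters={"cluster-east", "cluster-west"},
    local="cluster-east",
)
for problem in problems:
    print(problem)
```

An empty result only rules out configuration mismatches; broker connectivity to the remote cluster still has to be checked separately.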
High Replication Lag
- Check network bandwidth between clusters
- Increase replicationConnectionsPerBroker
- Monitor BookKeeper read performance
- Check remote cluster capacity
Duplicate Messages
- Enable deduplication
- Use idempotent message processing in consumers
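Idempotent processing means that handling the same message twice has the same effect as handling it once. A minimal sketch, assuming each message carries a business-level identifier (here a hypothetical payment ID) that the consumer can use as a deduplication key:

```python
class IdempotentLedger:
    """Applies each payment at most once, keyed by a business-level ID."""

    def __init__(self) -> None:
        self.applied_ids = set()  # in production this would be durable storage
        self.balance = 0

    def apply(self, payment_id: str, amount: int) -> bool:
        if payment_id in self.applied_ids:
            return False  # duplicate delivery: ignore it
        self.applied_ids.add(payment_id)
        self.balance += amount
        return True


ledger = IdempotentLedger()
deliveries = [("pay-1", 50), ("pay-2", 20), ("pay-1", 50)]  # pay-1 arrives twice
results = [ledger.apply(payment_id, amount) for payment_id, amount in deliveries]
print(results, ledger.balance)
```

The duplicate delivery is acknowledged but changes nothing, so the consumer stays correct even if broker-side deduplication misses a message.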
Best Practices
- Use separate clusters - Deploy clusters in different failure domains
- Monitor replication lag - Set up alerts for high lag
- Plan for network latency - Configure appropriate timeouts
- Enable deduplication - Prevent duplicate messages in active-active scenarios
- Test failover - Regularly test cluster failover procedures
- Capacity planning - Ensure clusters can handle replicated traffic
- Security - Use TLS for cross-cluster replication
- Selective replication - Only replicate data that needs global distribution
- Regional routing - Use DNS or load balancers for regional client routing
- Backup strategy - Combine geo-replication with backup and restore procedures