The replication collector monitors PostgreSQL replication status, tracking replication lag and replica state.
Status
Default: Enabled
Metrics
pg_replication_lag_seconds
Type: Gauge
Description: Replication lag behind master in seconds
Labels: None
Values:
- 0: No lag (primary server or replica fully caught up)
- >0: Seconds behind the primary
pg_replication_is_replica
Type: Gauge
Description: Indicates if the server is a replica
Labels: None
Values:
- 1: Server is a replica (in recovery mode)
- 0: Server is a primary
pg_replication_last_replay_seconds
Type: Gauge
Description: Age of last WAL replay in seconds
Labels: None
SQL Query
How Lag is Calculated
- Primary server: Returns 0 (not in recovery)
- Replica fully caught up: Returns 0 when receive_lsn = replay_lsn
- Replica with lag: Returns time since last transaction replay
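The three cases above can be sketched as a small pure function. This is an illustrative Python sketch, not the collector's actual implementation: the function name replication_lag_seconds and its parameters are hypothetical, the inputs stand in for values read from the server, and clamping the result at zero is an assumption.

```python
from typing import Optional


def replication_lag_seconds(
    in_recovery: bool,
    receive_lsn: Optional[str],
    replay_lsn: Optional[str],
    seconds_since_last_replay: Optional[float],
) -> float:
    """Mirror the three documented cases for the lag metric.

    All names here are hypothetical; the inputs stand in for values that
    would be read from the server.
    """
    # Case 1: a primary is not in recovery, so it reports no lag.
    if not in_recovery:
        return 0.0
    # Case 2: everything received has already been replayed.
    if receive_lsn is not None and receive_lsn == replay_lsn:
        return 0.0
    # Case 3: report the time since the last replayed transaction.
    # Assumption: unknown or negative values are treated as zero.
    return max(0.0, seconds_since_last_replay or 0.0)
```

The second case is what keeps an idle but fully caught-up replica at 0: without it, the time since the last replayed transaction would keep growing whenever the primary receives no writes.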
PostgreSQL Versions
Supported: PostgreSQL 9.1+
Function Name Changes:
- PostgreSQL 10+: Uses pg_last_wal_* functions
- PostgreSQL 9.x: Uses pg_last_xlog_* functions (renamed in v10)
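One way to handle the rename is to pick the function names from the server's integer version (server_version_num, for example 90600 for 9.6 and 100000 for 10.0). The helper below is a hypothetical sketch of that idea and may not match how the collector selects its query.

```python
from typing import Dict


def replication_function_names(server_version_num: int) -> Dict[str, str]:
    """Return the LSN function names for a given server_version_num.

    The helper itself is hypothetical; the PostgreSQL function names are
    the ones used before and after the v10 rename.
    """
    if server_version_num >= 100000:
        # PostgreSQL 10 and later use the "wal" / "lsn" naming.
        return {
            "receive": "pg_last_wal_receive_lsn",
            "replay": "pg_last_wal_replay_lsn",
        }
    # PostgreSQL 9.x uses the older "xlog" / "location" naming.
    return {
        "receive": "pg_last_xlog_receive_location",
        "replay": "pg_last_xlog_replay_location",
    }
```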
Required Permissions
The monitoring user needs:
- Execute permission on replication functions (granted to PUBLIC by default)
Example Output
On Primary:
Use Cases
Monitoring Replication Health
Detecting Replication Issues
High lag can indicate:
- Network issues between primary and replica
- High write load on primary
- Slow storage on replica
- Long-running queries on replica blocking replay
- Replication slot not being consumed
Troubleshooting
Check Replication Status
On Primary:
Common Issues
- Replication slot full: Check pg_replication_slots
- Network latency: Use wal_sender_timeout and wal_receiver_timeout
- Conflicts: Check pg_stat_database_conflicts
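When checking pg_replication_slots, one thing worth looking for is a slot with no consumer attached, since an unconsumed slot keeps WAL from being removed on the primary. The sketch below assumes the rows come from a query such as SELECT slot_name, active FROM pg_replication_slots; the helper name inactive_slots and the sample rows are made up for illustration.

```python
from typing import Dict, Iterable, List


def inactive_slots(rows: Iterable[Dict[str, object]]) -> List[str]:
    """Return the names of slots whose `active` column is false.

    `rows` is assumed to hold results of a query such as:
        SELECT slot_name, active FROM pg_replication_slots;
    """
    return [str(row["slot_name"]) for row in rows if not row["active"]]


# Made-up sample rows for illustration only.
sample_rows = [
    {"slot_name": "replica_a", "active": True},
    {"slot_name": "replica_b", "active": False},
]
print(inactive_slots(sample_rows))
```

A slot reported here is not necessarily a problem (the consumer may be restarting), but one that stays inactive is a reasonable first suspect when WAL accumulates on the primary.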