Overview
Sentinel mode adds:
- Automatic failover: Sentinel elects a new primary when the current primary is unreachable
- Service discovery: Clients query Sentinel for the current primary endpoint
- Split-brain prevention: Quorum-based election prevents dual primaries
Configuration
Enable Sentinel mode by setting spec.mode: sentinel:
Sentinel Defaults
From api/v1/rediscluster_types.go:82-87:
- Port: 26379 (standard Sentinel port)
- Instances: 3 (fixed, cannot be changed)
- Quorum: 2 (majority of 3)
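The relationship between the instance count and the quorum can be checked with a short sketch. The constant and function names below are hypothetical and are not taken from rediscluster_types.go; only the values 26379, 3, and 2 come from the defaults listed above.

```go
package main

import "fmt"

// Values from the defaults listed above; the identifiers are illustrative.
const (
	sentinelPort      = 26379
	sentinelInstances = 3
	sentinelQuorum    = 2
)

// majority returns the smallest number of votes that is more than half of n.
func majority(n int) int {
	return n/2 + 1
}

// tolerableFailures returns how many Sentinels can be down while
// q out of n Sentinels can still agree.
func tolerableFailures(n, q int) int {
	return n - q
}

func main() {
	fmt.Println("port:", sentinelPort)
	fmt.Println("quorum is a majority:", majority(sentinelInstances) == sentinelQuorum)
	fmt.Println("tolerable Sentinel failures:", tolerableFailures(sentinelInstances, sentinelQuorum))
}
```

With 3 instances and a quorum of 2, a single Sentinel can be lost before failover decisions stall.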
How It Works
Sentinel Pod Lifecycle
- Creation: When mode: sentinel is set, the operator creates 3 Sentinel pods
  - Pod names: <cluster-name>-sentinel-0, <cluster-name>-sentinel-1, <cluster-name>-sentinel-2
  - Labels: redis.io/workload=sentinel, redis.io/cluster=<cluster-name>
- Monitoring: Each Sentinel pod monitors the data pods
  - Runs SENTINEL MONITOR <cluster-name> <primary-ip> 6379 2
  - Tracks primary and replicas
  - Performs health checks (ping every 1s)
- Failover: On primary failure:
  - Sentinels detect primary is unreachable (30s timeout)
  - Quorum vote elects new primary (2/3 Sentinels must agree)
  - Sentinel issues REPLICAOF NO ONE to new primary
  - Other replicas reconfigured to follow new primary
- Operator Sync: Operator queries Sentinel for current primary
  - Updates status.currentPrimary
  - Updates -leader service selector
  - Updates pod role labels
See internal/controller/cluster/sentinel.go:20-97 for implementation.
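The naming and labeling conventions from the lifecycle steps above can be expressed as small helpers. This is an illustrative sketch, not the operator's code: the function names are hypothetical, and only the name pattern, labels, and SENTINEL MONITOR arguments come from the text above.

```go
package main

import "fmt"

// sentinelPodNames returns <cluster>-sentinel-0 through <cluster>-sentinel-(n-1).
func sentinelPodNames(cluster string, n int) []string {
	names := make([]string, 0, n)
	for i := 0; i < n; i++ {
		names = append(names, fmt.Sprintf("%s-sentinel-%d", cluster, i))
	}
	return names
}

// sentinelSelector builds a label selector matching a cluster's Sentinel pods.
func sentinelSelector(cluster string) string {
	return fmt.Sprintf("redis.io/workload=sentinel,redis.io/cluster=%s", cluster)
}

// monitorCommand renders the SENTINEL MONITOR command for a given primary.
func monitorCommand(cluster, primaryIP string, port, quorum int) string {
	return fmt.Sprintf("SENTINEL MONITOR %s %s %d %d", cluster, primaryIP, port, quorum)
}

func main() {
	// "demo" and 10.0.0.7 are placeholder values for illustration.
	fmt.Println(sentinelPodNames("demo", 3))
	fmt.Println(sentinelSelector("demo"))
	fmt.Println(monitorCommand("demo", "10.0.0.7", 6379, 2))
}
```

The selector string can be passed to kubectl get pods -l to list a cluster's Sentinel pods.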
Primary Discovery
The operator queries Sentinel to discover the current primary. See sentinel.go:144-174.
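For readers who want to see what such a query looks like on the wire, here is a minimal sketch using only the Go standard library. It is not the operator's code from sentinel.go. It sends the standard SENTINEL get-master-addr-by-name command and parses the two-element reply (host, port). It assumes a Sentinel reachable without TLS or authentication, and all function names are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"strconv"
	"strings"
	"time"
)

// encodeCommand renders a command as a RESP array of bulk strings.
func encodeCommand(args ...string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "*%d\r\n", len(args))
	for _, a := range args {
		fmt.Fprintf(&b, "$%d\r\n%s\r\n", len(a), a)
	}
	return b.String()
}

// readLine reads one CRLF-terminated RESP line and strips the terminator.
func readLine(r *bufio.Reader) (string, error) {
	line, err := r.ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimRight(line, "\r\n"), nil
}

// readMasterAddr parses the reply: an array of two bulk strings (host, port).
func readMasterAddr(r *bufio.Reader) (string, error) {
	head, err := readLine(r)
	if err != nil {
		return "", err
	}
	if head != "*2" {
		// A null array (*-1) means Sentinel does not know this master name.
		return "", fmt.Errorf("unexpected reply header %q", head)
	}
	parts := make([]string, 0, 2)
	for i := 0; i < 2; i++ {
		size, err := readLine(r)
		if err != nil {
			return "", err
		}
		if !strings.HasPrefix(size, "$") {
			return "", fmt.Errorf("expected bulk string, got %q", size)
		}
		n, err := strconv.Atoi(size[1:])
		if err != nil || n < 0 {
			return "", fmt.Errorf("bad bulk length %q", size)
		}
		buf := make([]byte, n+2) // payload plus trailing CRLF
		if _, err := io.ReadFull(r, buf); err != nil {
			return "", err
		}
		parts = append(parts, string(buf[:n]))
	}
	return net.JoinHostPort(parts[0], parts[1]), nil
}

// parseMasterAddr is a convenience wrapper for replies held in a string.
func parseMasterAddr(reply string) (string, error) {
	return readMasterAddr(bufio.NewReader(strings.NewReader(reply)))
}

// queryPrimary asks one Sentinel for the primary address of masterName.
func queryPrimary(sentinelAddr, masterName string) (string, error) {
	conn, err := net.DialTimeout("tcp", sentinelAddr, 2*time.Second)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if err := conn.SetDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return "", err
	}
	cmd := encodeCommand("SENTINEL", "get-master-addr-by-name", masterName)
	if _, err := io.WriteString(conn, cmd); err != nil {
		return "", err
	}
	return readMasterAddr(bufio.NewReader(conn))
}

func main() {
	// Canned reply in the shape Sentinel returns; no network access needed.
	addr, err := parseMasterAddr("*2\r\n$8\r\n10.0.0.7\r\n$4\r\n6379\r\n")
	fmt.Println(addr, err)
}
```

In a live cluster, queryPrimary would be called with a Sentinel address on port 26379 and the cluster name as the master name, since that is the name passed to SENTINEL MONITOR.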
Reconciliation Flow
See sentinel.go:20-97 for reconcileSentinelMaster().
Client Configuration
Clients must use Sentinel-aware libraries.
Redis Sentinel Service
The operator creates a service for Sentinel pods:
Client Examples
Go (go-redis):
Failover Behavior
Automatic Failover
Scenario: Primary pod crashes
- Detection: Sentinels detect primary unreachable after 30s (3 failed pings)
- Vote: Sentinels vote to elect new primary (quorum=2/3)
- Promotion: Sentinel issues REPLICAOF NO ONE to elected replica
- Reconfiguration: Other replicas reconfigured to follow new primary
- Operator Sync: Operator queries Sentinel, updates status and services
Timeline:
- T+0s: Primary crashes
- T+30s: Sentinels detect failure
- T+35s: Quorum vote completes, new primary elected
- T+40s: Operator queries Sentinel, updates status
- T+45s: All replicas following new primary
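Because writes fail for roughly the length of this timeline, clients usually need to retry for longer than the failover takes. The sketch below shows one way to do that with the standard library only. retryUntil and failNTimes are hypothetical helpers, and the simulated operation stands in for a real write against the primary.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryUntil calls op until it succeeds or the window would be exceeded,
// sleeping interval between attempts. It returns the number of attempts made.
func retryUntil(window, interval time.Duration, op func() error) (int, error) {
	deadline := time.Now().Add(window)
	attempts := 0
	for {
		attempts++
		err := op()
		if err == nil {
			return attempts, nil
		}
		// Stop if another sleep would carry us past the deadline.
		if time.Now().Add(interval).After(deadline) {
			return attempts, fmt.Errorf("gave up after %d attempts: %w", attempts, err)
		}
		time.Sleep(interval)
	}
}

// failNTimes returns an operation that fails n times and then succeeds,
// standing in for writes issued while the primary is down.
func failNTimes(n int) func() error {
	return func() error {
		if n > 0 {
			n--
			return errors.New("primary unreachable")
		}
		return nil
	}
}

func main() {
	// A 60s window leaves headroom over the ~45s timeline above; the short
	// interval here only keeps the demo fast.
	attempts, err := retryUntil(60*time.Second, 10*time.Millisecond, failNTimes(2))
	fmt.Println(attempts, err)
}
```

A real client would typically combine a retry window like this with a Sentinel-aware library so that each retry reconnects to whichever pod Sentinel currently reports as primary.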
Manual Failover
Trigger failover via Sentinel by running SENTINEL FAILOVER <cluster-name> against any Sentinel pod.
Recovery After Network Partition
Scenario: Primary is network-isolated but still running
- Sentinels cannot reach primary → elect new primary
- Old primary continues accepting writes (split-brain risk)
- When network heals:
  - Sentinel detects old primary
  - Issues REPLICAOF <new-primary-ip> 6379
  - Old primary demoted to replica
  - Writes accepted by the old primary during the partition are lost when it resyncs from the new primary
Enable primaryIsolation to prevent split-brain:
Status Fields
- status.sentinelReadyInstances: Count of Sentinel pods passing readiness probes.
- status.currentPrimary: Pod name of the current primary as reported by Sentinel.
Advanced Configuration
Custom Sentinel Config
Sentinel configuration is managed by the operator. The following parameters are set:
Sentinel with TLS
If spec.tlsSecret is set, Sentinel pods are configured with TLS:
Sentinel with ACLs
Sentinel does not support ACLs in Redis 7.2. Use spec.authSecret for password-based auth:
Best Practices
Use Sentinel for production HA
Standalone mode requires manual intervention during primary failure. Sentinel mode provides automatic recovery. Comparison:

| Feature | Standalone | Sentinel |
|---|---|---|
| Automatic failover | ❌ | ✅ |
| Manual failover | Via annotation | Via Sentinel |
| Client discovery | Static service | Sentinel query |
| Downtime on failure | Minutes | ~30-45 seconds |
Deploy 3 or 5 Sentinels (odd number)
The current version deploys exactly 3 Sentinels (hardcoded). For higher availability, you could run multiple RedisCluster resources with Sentinel in different clusters and use cross-cluster replication.
Set minSyncReplicas to prevent data loss
Monitor Sentinel health
Check Sentinel status and alert when status.sentinelReadyInstances < 2 (lost quorum).
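The lost-quorum condition can be evaluated from the resource's JSON, for example as produced by kubectl get with -o json. This is a hedged sketch: the Go type and function names are hypothetical, and the JSON shape assumes only the status.sentinelReadyInstances and status.currentPrimary paths named on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterStatus holds the two status fields this page describes.
type clusterStatus struct {
	SentinelReadyInstances int    `json:"sentinelReadyInstances"`
	CurrentPrimary         string `json:"currentPrimary"`
}

// clusterObject models only the part of the resource needed here.
type clusterObject struct {
	Status clusterStatus `json:"status"`
}

// quorumLost reports whether fewer Sentinels are ready than the quorum requires.
func quorumLost(raw []byte, quorum int) (bool, error) {
	var obj clusterObject
	if err := json.Unmarshal(raw, &obj); err != nil {
		return false, err
	}
	return obj.Status.SentinelReadyInstances < quorum, nil
}

func main() {
	// Placeholder document; the pod name is illustrative.
	raw := []byte(`{"status":{"sentinelReadyInstances":1,"currentPrimary":"demo-0"}}`)
	lost, err := quorumLost(raw, 2)
	fmt.Println(lost, err)
}
```

A check like this could back an alert rule or a readiness gate in deployment tooling.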
Use pod anti-affinity for Sentinel pods
Ensure Sentinel pods are on different nodes:
Troubleshooting
Sentinel pods not starting
Symptom:
Failover not triggering
Symptom: Primary pod deleted but Sentinel doesn’t elect a new primary.
Debug:
- Lost quorum: < 2 Sentinels running
- Network partition: Sentinels can’t reach each other
- Wrong master name: Sentinel monitoring different name
Sentinel reports wrong primary
Symptom: status.currentPrimary doesn’t match the actual primary.
Debug:
Split-brain after network partition
Symptom: Two pods both think they’re primary.
Prevention: Enable primaryIsolation: