Operator Upgrades
Operator upgrades use a controlled rolling deployment with leader election for a zero-downtime control plane handoff.
What Happens During an Operator Upgrade
New controller pod starts
Kubernetes rolls out a new operator Deployment pod with the updated image.
Leader election handoff
Leader election ensures only one active controller at a time. The new pod acquires the lease after the old pod terminates.
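The handoff can be pictured with a small lease model. The sketch below is a simplified, in-memory illustration in Python, not the operator's actual implementation: the `Lease` class, the pod names, and the 15-second duration are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """Hypothetical in-memory stand-in for a leader-election lease."""
    duration: float = 15.0           # assumed lease duration, in seconds
    holder: Optional[str] = None
    renewed_at: float = 0.0

    def try_acquire(self, candidate: str, now: float) -> bool:
        """Grant the lease if it is free, expired, or already held by candidate."""
        expired = now - self.renewed_at >= self.duration
        if self.holder is None or expired or self.holder == candidate:
            self.holder = candidate
            self.renewed_at = now
            return True
        return False

    def release(self, candidate: str) -> None:
        """Give up the lease on shutdown so a successor does not wait for expiry."""
        if self.holder == candidate:
            self.holder = None


lease = Lease()
assert lease.try_acquire("operator-old", now=0.0)        # old pod is the leader
assert not lease.try_acquire("operator-new", now=5.0)    # new pod has to wait
lease.release("operator-old")                            # old pod terminates
assert lease.try_acquire("operator-new", now=6.0)        # new pod takes over
```

At no point in this model do two holders exist at once, which is the property that keeps two controllers from reconciling the same resources concurrently.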
Reconciliation continues
The new controller resumes reconciliation of all RedisCluster resources without interruption.
Pod hash recalculation
The reconciler computes a redis.io/spec-hash annotation for each data pod based on:
- Redis image
- Redis container resources
- Operator init container image (OPERATOR_IMAGE_NAME)
- Projected secret references
- Redis config from .spec.redis
Data pods are not restarted unless their spec hash changes. A pure operator code upgrade that leaves these inputs unchanged does not restart Redis instances.
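To illustrate why an unchanged spec leaves data pods alone, here is a minimal Python sketch of a deterministic hash over the inputs listed above. The hashing scheme (SHA-256 over sorted JSON), the function names, and the payload keys are assumptions for illustration; the operator's real encoding may differ.

```python
import hashlib
import json


def spec_hash(redis_image, resources, init_image, secret_refs, redis_config):
    """Hash only the inputs that affect a data pod (key names are illustrative)."""
    payload = {
        "redisImage": redis_image,
        "resources": resources,
        "initImage": init_image,            # value of OPERATOR_IMAGE_NAME
        "secretRefs": sorted(secret_refs),  # order of references must not matter
        "redisConfig": redis_config,        # contents of .spec.redis
    }
    # sort_keys makes the serialization, and therefore the hash, deterministic.
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def needs_restart(pod_annotations, desired_hash):
    """A pod is restarted only when its recorded hash differs from the desired one."""
    return pod_annotations.get("redis.io/spec-hash") != desired_hash


# Example values are placeholders, not real image references.
desired = spec_hash("redis:7.2.5", {"cpu": "1"}, "operator:1.0", ["tls"], {"maxmemory": "1gb"})
assert not needs_restart({"redis.io/spec-hash": desired}, desired)
```

Because the hash depends only on these inputs, reconciling the same spec with new controller code produces the same value and the pod is left running.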
API Versioning and Compatibility
Current State
- The CRD API currently serves redis.io/v1
- No active multi-version CRD setup
- No conversion webhook in the current release
Compatibility Policy
The operator follows these principles to avoid data loss and API breakage:
- Additive changes: New fields are always +optional, and new enum values are added only where safe
- Deprecation window: Fields are marked deprecated for at least one full release before removal
- Backward compatibility: Reconcile logic remains compatible with deprecated fields during the deprecation period
- Conversion webhooks: Only introduced when a new CRD version (e.g., v2) is added alongside v1
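One way to picture the backward-compatibility principle is a lookup that prefers the current field and falls back to a deprecated one while the deprecation window is open. This is a hedged Python sketch: `resolve_field` is a hypothetical helper, and the deprecated `image` field in the example is invented for illustration only.

```python
def resolve_field(spec, current, deprecated, default=None):
    """Prefer the current field, then the deprecated one, then the default."""
    if current in spec:
        return spec[current]
    if deprecated in spec:
        # Still honored during the deprecation window.
        return spec[deprecated]
    return default


# "image" is a made-up deprecated name used only for this example.
old_style = {"image": "redis:7.2.0"}
new_style = {"imageName": "redis:7.2.5"}
assert resolve_field(old_style, "imageName", "image") == "redis:7.2.0"
assert resolve_field(new_style, "imageName", "image") == "redis:7.2.5"
```

Since new fields are optional and old ones stay readable, a resource written against an earlier release continues to reconcile without edits.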
Helm Upgrade Procedure
Pre-Upgrade Checklist
Performing the Upgrade
Update CRDs (if changed)
If the new version includes CRD changes, apply them before upgrading the release, either directly from the CRD manifests or from the Helm chart.
Upgrade Helm release
Use --reuse-values to preserve your existing configuration. Override specific values with --set as needed.
Post-Upgrade Validation
Verify cluster health
- Phase: Healthy
- Ready instances match desired instances
- Current primary is set
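The three checks above can be expressed as a small validation function. This is a sketch under stated assumptions: the status is a plain dict, and the key names `phase`, `readyInstances`, and `currentPrimary` are guesses at the status layout, not confirmed field names.

```python
def check_cluster_health(status, desired_instances):
    """Run the three post-upgrade checks and return a list of problems.

    `status` is a plain dict whose key names are assumptions for this sketch.
    An empty list means every check passed.
    """
    problems = []
    phase = status.get("phase")
    if phase != "Healthy":
        problems.append(f"phase is {phase!r}, expected 'Healthy'")
    ready = status.get("readyInstances", 0)
    if ready != desired_instances:
        problems.append(f"{ready} of {desired_instances} instances ready")
    if not status.get("currentPrimary"):
        problems.append("current primary is not set")
    return problems


# Placeholder status for a three-instance cluster.
status = {"phase": "Healthy", "readyInstances": 3, "currentPrimary": "cache-0"}
assert check_cluster_health(status, desired_instances=3) == []
```

Returning the list of failed checks, rather than a single boolean, makes it easier to see which condition is holding up the validation.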
Check for pod restarts
Redis Version Upgrades
Upgrade Redis itself by changing the imageName field in your RedisCluster spec.
Minor Version Upgrade (e.g., 7.2.0 → 7.2.5)
The operator performs the rolling update in this order:
- Calculate new spec hash
- Update replicas first (highest ordinal to lowest)
- Switch over primary to an updated replica
- Update old primary pod
- Return cluster to Healthy phase
Minor version upgrades are typically safe and require no downtime if you have replicas.
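The ordering described above (replicas from highest ordinal to lowest, old primary last) can be sketched as follows. The pod naming scheme with a trailing `-<ordinal>` and the function name `update_order` are assumptions for illustration.

```python
import re


def update_order(pods, primary):
    """Return pods in update order: replicas by descending ordinal, primary last."""
    def ordinal(name):
        match = re.search(r"-(\d+)$", name)
        if match is None:
            raise ValueError(f"pod name {name!r} has no trailing ordinal")
        return int(match.group(1))

    if primary not in pods:
        raise ValueError(f"primary {primary!r} is not in the pod list")
    replicas = sorted((p for p in pods if p != primary), key=ordinal, reverse=True)
    # The switchover to an updated replica happens before the last entry is updated.
    return replicas + [primary]


assert update_order(["cache-0", "cache-1", "cache-2"], primary="cache-1") == [
    "cache-2",
    "cache-0",
    "cache-1",
]
```

Updating the primary last means a replica already running the new version is available to take over, which is why replicas are needed to avoid downtime.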
Major Version Upgrade (e.g., 7.2 → 7.4)
Review Redis release notes
Check Redis changelog for breaking changes, deprecated commands, and new features.
Test in staging
Clone your cluster spec (for example, into staging-cluster.yaml), change the name and image, and deploy it to a staging namespace.
Verify application compatibility.
Supervised vs. Unsupervised Primary Updates
Control how primary updates are handled during rolling upgrades:
Unsupervised (Default)
Primary is automatically updated after all replicas.
Use this mode when:
- You trust the operator to handle failover automatically
- Downtime tolerance is low (switchover is quick)
- You monitor via alerts but don’t need manual approval
Supervised
Primary update waits for manual approval:
- Operator updates all replicas
- Cluster enters WaitingForUser phase
- You review cluster health and approve
- Operator performs primary switchover and update
- Cluster returns to Healthy
Use this mode when:
- You want manual control over primary updates
- You are coordinating with maintenance windows
- You want extra caution for mission-critical clusters
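The difference between the two modes comes down to a single gate before the primary is touched. The Python sketch below models that decision. The returned action strings and the `approved` flag are hypothetical, and the strategy value `unsupervised` is assumed from the mode's name.

```python
def next_action(strategy, replicas_updated, approved=False):
    """Decide the next rolling-update step for the given primary update strategy."""
    if strategy not in ("unsupervised", "supervised"):
        raise ValueError(f"unknown primaryUpdateStrategy: {strategy!r}")
    if not replicas_updated:
        return "update-replicas"
    if strategy == "supervised" and not approved:
        # The cluster stays in the WaitingForUser phase until someone approves.
        return "wait-for-user"
    return "switchover-and-update-primary"


assert next_action("unsupervised", replicas_updated=True) == "switchover-and-update-primary"
assert next_action("supervised", replicas_updated=True) == "wait-for-user"
assert next_action("supervised", replicas_updated=True, approved=True) == "switchover-and-update-primary"
```

In both modes the replicas are updated first; only the handling of the primary differs.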
Rollback Procedures
Rollback Operator
If the new operator version has issues, roll the operator back to the previous release.
Rollback Redis Version
Troubleshooting Upgrades
Operator Pod CrashLooping
Cause: CRD schema mismatch or invalid webhook configuration.
Solution:
Cluster Stuck in Updating Phase
Cause: Pod update blocked by PDB or scheduling constraints.
Solution:
Primary Not Updating After Replicas
Cause: primaryUpdateStrategy: supervised is set and the operator is awaiting approval.
Solution:
Check the cluster conditions and look for the PrimaryUpdateWaiting condition. If it is present, approve the update.
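Checking for the waiting condition can be sketched as a lookup over the cluster's condition list. This Python example assumes the conditions follow the common Kubernetes shape of dicts with `type` and `status` keys, where `status` is the string "True" when the condition holds. That shape is an assumption about this operator, and both function names are hypothetical.

```python
def find_condition(conditions, condition_type):
    """Return the first condition with the given type, or None if absent."""
    for condition in conditions:
        if condition.get("type") == condition_type:
            return condition
    return None


def awaiting_approval(conditions):
    """True when a PrimaryUpdateWaiting condition is present and active."""
    condition = find_condition(conditions, "PrimaryUpdateWaiting")
    return condition is not None and condition.get("status") == "True"


# Placeholder condition list, as it might be read from the resource status.
conditions = [
    {"type": "Ready", "status": "True"},
    {"type": "PrimaryUpdateWaiting", "status": "True"},
]
assert awaiting_approval(conditions)
```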
Best Practices
- Always create backups before major upgrades
- Test upgrades in staging environment first
- Upgrade operator and Redis versions separately
- Review changelogs for breaking changes
- Monitor clusters for 24 hours post-upgrade
- Use supervised primary updates for critical production clusters
- Schedule upgrades during maintenance windows
- Keep operator and Redis versions reasonably current (within 2-3 minor versions)