Maintenance windows allow you to suppress alerts during planned maintenance, deployments, or testing periods. This prevents alert fatigue and false alarms when you expect CronJobs to fail or miss schedules.

How Maintenance Windows Work

Maintenance windows are defined using cron expressions and durations. During a window:
  • Alerts are suppressed: No notifications are sent to alert channels
  • Monitoring continues: Guardian still tracks executions and metrics
  • Status is visible: The dashboard shows when a window is active
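
The three behaviours above can be sketched as a small decision function. This is a minimal illustration rather than Guardian's implementation: the Window class, the handle_failure function, and the half-open interval (start inclusive, end exclusive) are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Window:
    """Hypothetical in-memory form of one resolved window occurrence."""
    name: str
    start: datetime            # timezone-aware start of this occurrence
    duration: timedelta
    suppress_alerts: bool = True

    def is_active(self, now: datetime) -> bool:
        # Assumption: active from start (inclusive) to start + duration (exclusive).
        return self.start <= now < self.start + self.duration


def handle_failure(now: datetime, windows: list[Window]) -> dict:
    """Always record the failure; notify only if no active window suppresses alerts."""
    active = [w for w in windows if w.is_active(now)]
    suppressed = any(w.suppress_alerts for w in active)
    return {
        "recorded": True,                            # monitoring continues
        "notify": not suppressed,                    # alerts are suppressed
        "active_windows": [w.name for w in active],  # status is visible
    }


weekly = Window(
    name="weekly-maintenance",
    start=datetime(2024, 1, 21, 2, 0, tzinfo=timezone.utc),
    duration=timedelta(hours=4),
)
during = handle_failure(datetime(2024, 1, 21, 3, 0, tzinfo=timezone.utc), [weekly])
after = handle_failure(datetime(2024, 1, 21, 7, 0, tzinfo=timezone.utc), [weekly])
```

In this sketch, during["notify"] is False while the failure is still recorded, and after["notify"] is True because the window has ended.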

Basic Maintenance Window

Define a weekly maintenance window every Sunday at 2 AM for 4 hours:
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: my-monitor
  namespace: production
spec:
  maintenanceWindows:
    - name: weekly-maintenance
      schedule: "0 2 * * 0"  # Every Sunday at 2 AM
      duration: 4h
      timezone: America/New_York
      suppressAlerts: true
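The schedule field uses standard five-field cron syntax and marks the start of each window; duration sets how long the window stays open. The schedule is interpreted in the window's timezone, which defaults to UTC.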
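
To make the duration values concrete, here is a minimal sketch of turning strings such as 4h or 30m into a time span. The parse_duration helper is hypothetical, and the grammar it accepts (whole numbers followed by h, m, or s, optionally chained as in 1h30m) is an assumption; this page does not state which duration formats Guardian accepts.

```python
import re
from datetime import timedelta

_UNITS = {"h": "hours", "m": "minutes", "s": "seconds"}
_PART = re.compile(r"(\d+)([hms])")


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '4h', '30m', or '1h30m' into a timedelta."""
    total = timedelta()
    pos = 0
    while pos < len(text):
        match = _PART.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        value, unit = match.groups()
        total += timedelta(**{_UNITS[unit]: int(value)})
        pos = match.end()
    if pos == 0:
        raise ValueError("empty duration")
    return total


print(parse_duration("4h"))   # 4:00:00
print(parse_duration("30m"))  # 0:30:00
```

A window that starts at 02:00 with duration: 4h therefore covers 02:00 up to 06:00 in the window's timezone.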

Multiple Maintenance Windows

You can define multiple windows for different maintenance scenarios:
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: production-monitor
  namespace: production
spec:
  maintenanceWindows:
    # Weekly database maintenance
    - name: database-maintenance
      schedule: "0 2 * * 0"  # Sunday 2 AM
      duration: 4h
      timezone: UTC
      suppressAlerts: true
    
    # Monthly platform updates
    - name: platform-updates
      schedule: "0 0 1 * *"  # First day of month, midnight
      duration: 6h
      timezone: UTC
      suppressAlerts: true
    
    # Daily backup window
    - name: backup-window
      schedule: "0 3 * * *"  # Every day at 3 AM
      duration: 1h
      timezone: America/Los_Angeles
      suppressAlerts: true

Timezone Support

Maintenance windows support timezone-aware scheduling:
maintenanceWindows:
  - name: us-east-maintenance
    schedule: "0 2 * * 0"
    duration: 4h
    timezone: America/New_York  # Eastern Time
  
  - name: europe-maintenance
    schedule: "0 2 * * 6"
    duration: 4h
    timezone: Europe/London  # British Time
  
  - name: asia-maintenance
    schedule: "0 2 * * 5"
    duration: 4h
    timezone: Asia/Tokyo  # Japan Time
  1. Choose your timezone: Use IANA timezone names (e.g., America/New_York, Europe/London, Asia/Tokyo).
  2. Define the schedule in local time: The cron expression is interpreted in the specified timezone, not UTC.
  3. Verify the window times: Check the CronJobMonitor status to see when the next window is scheduled.
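
When cross-checking a window against UTC timestamps, it can help to convert the local start time yourself. The sketch below uses Python's standard zoneinfo module; the window_start_utc helper is hypothetical and is not part of Guardian. It also shows that a fixed local time maps to different UTC times across daylight saving changes.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs the system tz database or the tzdata package


def window_start_utc(local_start: datetime, tz_name: str) -> datetime:
    """Interpret a naive wall-clock time in tz_name and convert it to UTC."""
    return local_start.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


# Sunday 2 AM in New York: winter (EST, UTC-5) versus summer (EDT, UTC-4)
winter = window_start_utc(datetime(2024, 1, 21, 2, 0), "America/New_York")
summer = window_start_utc(datetime(2024, 7, 21, 2, 0), "America/New_York")
print(winter.isoformat())  # 2024-01-21T07:00:00+00:00
print(summer.isoformat())  # 2024-07-21T06:00:00+00:00

# Friday 2 AM in Tokyo falls on Thursday evening in UTC
tokyo = window_start_utc(datetime(2024, 1, 19, 2, 0), "Asia/Tokyo")
print(tokyo.isoformat())   # 2024-01-18T17:00:00+00:00
```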

Cron Schedule Examples

Description                        Cron Expression
Every Sunday at 2 AM               0 2 * * 0
Every day at midnight              0 0 * * *
First day of month at 3 AM         0 3 1 * *
Every weekday at 1 AM              0 1 * * 1-5
Every Saturday at 4 AM             0 4 * * 6
Twice a week (Wed, Sun) at 2 AM    0 2 * * 0,3
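
To see how these expressions select start times, the following sketch checks whether a given moment matches a five-field expression. It is a deliberately small, hypothetical matcher and not the parser Guardian uses: it handles only *, single values, ranges, and comma lists, and it ignores step values (*/5), month or weekday names, 7 as Sunday, and the cron rule that ORs day-of-month with day-of-week when both are restricted.

```python
from datetime import datetime


def _expand(field: str, low: int, high: int) -> set[int]:
    """Expand one cron field ('*', '5', '1-5', '0,3') into the set of allowed values."""
    values: set[int] = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(low, high + 1))
        elif "-" in part:
            start, end = part.split("-")
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(part))
    return values


def matches(expr: str, moment: datetime) -> bool:
    """Return True if moment (in the window's local time) matches the expression."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = (moment.weekday() + 1) % 7  # Python: Monday=0; cron: Sunday=0
    return (
        moment.minute in _expand(minute, 0, 59)
        and moment.hour in _expand(hour, 0, 23)
        and moment.day in _expand(dom, 1, 31)
        and moment.month in _expand(month, 1, 12)
        and cron_dow in _expand(dow, 0, 6)
    )


print(matches("0 2 * * 0", datetime(2024, 1, 21, 2, 0)))    # True  (a Sunday)
print(matches("0 1 * * 1-5", datetime(2024, 1, 20, 1, 0)))  # False (a Saturday)
```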

Conditional Alert Suppression

You can optionally disable alert suppression while keeping monitoring active:
maintenanceWindows:
  - name: observation-window
    schedule: "0 2 * * 0"
    duration: 4h
    suppressAlerts: false  # Monitor but still send alerts
Setting suppressAlerts: false means alerts are still sent during the window. This is useful for observation periods in which you want the window tracked and visible without silencing notifications.

Real-World Examples

Weekly Database Maintenance

apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: database-backups
  namespace: databases
spec:
  selector:
    matchLabels:
      type: backup
  
  maintenanceWindows:
    - name: weekly-db-maintenance
      schedule: "0 2 * * 0"  # Sunday 2 AM
      duration: 4h
      timezone: America/New_York
      suppressAlerts: true
  
  deadManSwitch:
    enabled: true
    maxTimeSinceLastSuccess: 25h
  
  alerting:
    channelRefs:
      - name: pagerduty-dba

Deployment Windows

apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: api-cronjobs
  namespace: production
spec:
  maintenanceWindows:
    # Weekly deployment window
    - name: weekly-deploy
      schedule: "0 20 * * 2"  # Tuesday 8 PM
      duration: 2h
      timezone: UTC
      suppressAlerts: true
    
    # Emergency hotfix window
    - name: emergency-window
      schedule: "0 22 * * *"  # Daily at 10 PM
      duration: 30m
      timezone: UTC
      suppressAlerts: true

Multi-Region Maintenance

apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: global-monitor
  namespace: production
spec:
  selector:
    allNamespaces: true
  
  maintenanceWindows:
    # US region maintenance (Sunday 2 AM Eastern Time)
    - name: us-maintenance
      schedule: "0 2 * * 0"
      duration: 4h
      timezone: America/New_York
      suppressAlerts: true
    
    # EU region maintenance (Saturday 2 AM UK time)
    - name: eu-maintenance
      schedule: "0 2 * * 6"
      duration: 4h
      timezone: Europe/London
      suppressAlerts: true
    
    # APAC region maintenance (Friday 2 AM JST)
    - name: apac-maintenance
      schedule: "0 2 * * 5"
      duration: 4h
      timezone: Asia/Tokyo
      suppressAlerts: true

Checking Active Maintenance Windows

Via kubectl

kubectl describe cronjobmonitor my-monitor -n production
Look for the maintenance windows section in the status:
Status:
  Phase: Active
  Maintenance Windows:
    - Name: weekly-maintenance
      Next Start: 2024-01-21T02:00:00Z
      Duration: 4h
      Active: false

Via Dashboard

The web dashboard shows:
  • Active windows: Highlighted in yellow when a window is in progress
  • Next scheduled window: Countdown timer to next window
  • Window history: Past windows and their duration

Via API

curl http://localhost:8080/api/v1/monitors/production/my-monitor
Response includes maintenance window configuration and status.

Impact on Monitoring

What Happens During a Window

Alerts Suppressed

No notifications sent to alert channels (Slack, PagerDuty, email).

Monitoring Active

Guardian continues tracking job executions and updating metrics.

Status Updated

Dashboard and API show the window is active.

History Retained

All executions are recorded for post-maintenance analysis.

What Happens After a Window

  • Alert suppression ends: Normal alerting resumes immediately
  • Pending alerts sent: If issues persist after the window, alerts are sent
  • SLA calculations continue: Success rates and durations include maintenance period executions

Interaction with Other Features

Dead-Man’s Switch

Maintenance windows pause dead-man’s switch timers:
deadManSwitch:
  enabled: true
  maxTimeSinceLastSuccess: 25h

maintenanceWindows:
  - name: weekly-maintenance
    schedule: "0 2 * * 0"
    duration: 4h
    suppressAlerts: true
During the window, the dead-man’s switch won’t trigger even if jobs don’t run.
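
One way to picture a paused timer is to subtract the time spent inside maintenance windows from the time since the last success. The sketch below is an interpretation of the behaviour described above, not Guardian's documented algorithm; the overlap and dead_man_triggered helpers are hypothetical, and the windows are assumed not to overlap each other.

```python
from datetime import datetime, timedelta, timezone


def overlap(a_start: datetime, a_end: datetime, b_start: datetime, b_end: datetime) -> timedelta:
    """Length of the intersection of two intervals (zero if they do not intersect)."""
    return max(min(a_end, b_end) - max(a_start, b_start), timedelta(0))


def dead_man_triggered(last_success, now, max_age, windows) -> bool:
    """windows: list of (start, end) pairs; time inside them does not count."""
    paused = sum(
        (overlap(last_success, now, start, end) for start, end in windows),
        timedelta(0),
    )
    return (now - last_success) - paused > max_age


utc = timezone.utc
last_ok = datetime(2024, 1, 20, 3, 0, tzinfo=utc)   # Saturday 03:00
window = (datetime(2024, 1, 21, 2, 0, tzinfo=utc),  # Sunday 02:00 to 06:00
          datetime(2024, 1, 21, 6, 0, tzinfo=utc))
limit = timedelta(hours=25)

# 26h of wall-clock time, 3h of it paused: 23h counted
print(dead_man_triggered(last_ok, datetime(2024, 1, 21, 5, 0, tzinfo=utc), limit, [window]))  # False
# 30h of wall-clock time, 4h of it paused: 26h counted
print(dead_man_triggered(last_ok, datetime(2024, 1, 21, 9, 0, tzinfo=utc), limit, [window]))  # True
```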

SLA Tracking

SLA calculations include maintenance window executions:
sla:
  enabled: true
  minSuccessRate: 95
  windowDays: 7

maintenanceWindows:
  - name: weekly-maintenance
    schedule: "0 2 * * 0"
    duration: 4h
    suppressAlerts: true
Failed jobs during maintenance count toward the success rate but don’t trigger alerts.
If you want to exclude maintenance windows from SLA calculations, consider using suspendedHandling.pauseMonitoring or temporarily suspending CronJobs during maintenance.
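
The effect on the success rate can be shown with a little arithmetic. The sketch below is illustrative only: the Execution class, the success_rate helper, and the sample week are hypothetical, and the include_maintenance flag exists purely to contrast the two ways of counting.

```python
from dataclasses import dataclass


@dataclass
class Execution:
    succeeded: bool
    in_maintenance: bool  # True if the run happened inside a maintenance window


def success_rate(executions: list[Execution], include_maintenance: bool = True) -> float:
    """Return the success percentage, optionally ignoring runs inside windows."""
    counted = [e for e in executions if include_maintenance or not e.in_maintenance]
    if not counted:
        return 100.0  # assumption: no data is treated as meeting the SLA
    return 100.0 * sum(e.succeeded for e in counted) / len(counted)


# Hypothetical week: 37 successes, 1 failure outside a window, 2 failures inside one.
week = (
    [Execution(True, False)] * 37
    + [Execution(False, False)]
    + [Execution(False, True)] * 2
)
print(success_rate(week))                                       # 92.5
print(round(success_rate(week, include_maintenance=False), 2))  # 97.37
```

With the maintenance failures counted, this sample week falls below minSuccessRate: 95; without them it would pass.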

Suspended CronJobs

Maintenance windows work alongside suspended CronJob handling:
suspendedHandling:
  pauseMonitoring: true  # Pause monitoring when CronJob is suspended
  alertIfSuspendedFor: 168h  # Alert if suspended for 7 days

maintenanceWindows:
  - name: weekly-maintenance
    schedule: "0 2 * * 0"
    duration: 4h
    suppressAlerts: true
If a CronJob is suspended during a maintenance window, monitoring is paused (if pauseMonitoring: true).

Troubleshooting

Window Not Activating

  1. Verify the cron expression: Test your cron expression using crontab.guru or a similar tool. Do not pipe the expression into crontab - to check it; that command installs its input as your user crontab and does not validate a schedule.
  2. Check the timezone: Ensure the timezone is correct. Guardian logs the next scheduled window at startup.
kubectl logs -n cronjob-guardian deployment/cronjob-guardian-controller | grep "maintenance window"
  3. Verify monitor status: Check that the monitor is in the Active phase:
kubectl get cronjobmonitor my-monitor -n production -o jsonpath='{.status.phase}'

Alerts Still Sent During Window

Confirm that suppressAlerts is not set to false on the window. The field defaults to true, so alerts are sent during a window only when suppression has been explicitly disabled.
Check your configuration:
kubectl get cronjobmonitor my-monitor -n production -o yaml | grep -A 5 maintenanceWindows

Wrong Timezone

If windows activate at the wrong time:
  1. Verify the timezone name is correct (use IANA format)
  2. Check Guardian’s timezone configuration
  3. Confirm the cron expression is in local time, not UTC
# List available timezones
timedatectl list-timezones | grep America
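
A timezone name can also be checked programmatically before it goes into a manifest. The sketch below uses Python's standard zoneinfo module; the is_valid_timezone helper is hypothetical, and it validates against the tz database available to Python, which may differ slightly from the one Guardian uses.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError  # Python 3.9+


def is_valid_timezone(name: str) -> bool:
    """Return True if name resolves to a zone in the local IANA tz database."""
    try:
        ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        # ZoneInfoNotFoundError: unknown key; ValueError: malformed key
        return False
    return True


for candidate in ("America/New_York", "America/NewYork", "Eastern"):
    print(candidate, is_valid_timezone(candidate))
```

Of the three candidates, only America/New_York is an IANA name; the misspelled and abbreviated forms are rejected.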

Best Practices

Use Descriptive Names

Name windows clearly: weekly-db-maintenance, not window-1.

Coordinate Across Teams

Share maintenance schedules with all teams to avoid conflicts.

Don't Overuse

Too many windows reduce monitoring effectiveness. Only suppress during actual maintenance.

Document Windows

Add comments explaining why each window exists and what maintenance occurs.

Next Steps

Dashboard

View active maintenance windows in the UI

Troubleshooting

Common issues and solutions
