
Overview

The most common way to use CronJob Guardian is to create a CronJobMonitor that watches specific CronJobs based on labels or names. This page shows simple, practical examples to get you started.

Monitor CronJobs by Label

The most flexible approach is to use label selectors. This allows you to add monitoring to any CronJob by simply adding a label.
monitors/basic.yaml
# Basic CronJobMonitor
# Monitors CronJobs matching specific labels in the same namespace
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: critical-jobs
  namespace: production
spec:
  selector:
    matchLabels:
      tier: critical
  deadManSwitch:
    enabled: true
    maxTimeSinceLastSuccess: 25h
  sla:
    enabled: true
    minSuccessRate: 99
    windowDays: 7
  alerting:
    channelRefs:
      - name: slack-ops

What This Does

  • Monitors all CronJobs in the production namespace with label tier: critical
  • Triggers an alert if a monitored CronJob has had no successful run in the last 25 hours (dead-man’s switch)
  • Tracks success rate and alerts if it drops below 99% over 7 days
  • Sends alerts to the slack-ops AlertChannel

Setup Instructions

1. Label your CronJobs

Add the tier: critical label to CronJobs you want to monitor:
kubectl label cronjob my-backup tier=critical -n production
Or add it directly in your CronJob YAML:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-backup
  namespace: production
  labels:
    tier: critical  # This makes it monitored
2. Create an AlertChannel

Before creating the monitor, you need at least one AlertChannel; the example above references one named slack-ops. See the Slack Alerts guide.
3. Apply the monitor

kubectl apply -f monitors/basic.yaml
4. Verify it's working

kubectl get cronjobmonitor critical-jobs -n production
kubectl describe cronjobmonitor critical-jobs -n production
The status should show how many CronJobs are being monitored and a Ready condition:
Status:
  Monitored Jobs: 3
  Conditions:
    - Type: Ready
      Status: True
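
For reference, a complete version of the labeled CronJob from step 1 might look like the sketch below. The schedule, container image, and command are placeholders assumed for illustration, not taken from a real workload. With a daily schedule like this one, the monitor's 25h threshold amounts to the 24-hour interval plus a one-hour buffer.

```yaml
# Illustrative CronJob; schedule, image, and command are assumptions
apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-backup
  namespace: production
  labels:
    tier: critical            # matched by the critical-jobs monitor
spec:
  schedule: "0 2 * * *"       # daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: busybox:1.36          # placeholder image
              command: ["sh", "-c", "echo running backup"]
```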

Monitor All CronJobs in a Namespace

If you want to monitor every CronJob in a namespace without adding labels, use an empty selector.
monitors/all-in-namespace.yaml
# Monitor all CronJobs in the monitor's namespace
# Empty selector matches all CronJobs
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: all-jobs
  namespace: production
spec:
  # Empty selector = all CronJobs in this namespace
  selector: {}
  deadManSwitch:
    enabled: true
    maxTimeSinceLastSuccess: 25h
  alerting:
    channelRefs:
      - name: slack-ops

What This Does

  • Monitors every CronJob in the production namespace
  • No labels required on your CronJobs
  • Simpler configuration for smaller teams or less critical environments

When to Use This

  • Small namespaces with only critical jobs
  • Dev/staging environments where you want blanket monitoring
  • When every job in the namespace should be monitored the same way
Warning: This monitors ALL CronJobs in the namespace. Test jobs and jobs that are expected to fail will trigger alerts. Use label selectors for more control.
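
One way to keep near-blanket coverage while leaving room for exceptions is an opt-out selector. The sketch below reuses the monitoring: disabled label from the Advanced Label Selectors section. The monitor name is hypothetical, and the sketch assumes Guardian applies standard Kubernetes selector semantics, where NotIn also matches CronJobs that have no monitoring label at all.

```yaml
# Sketch: monitor everything in the namespace except opted-out CronJobs
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: all-jobs-except-opted-out   # hypothetical name
  namespace: production
spec:
  selector:
    matchExpressions:
      - key: monitoring
        operator: NotIn
        values: [disabled]
  deadManSwitch:
    enabled: true
    maxTimeSinceLastSuccess: 25h
  alerting:
    channelRefs:
      - name: slack-ops
```

Under that assumption, a test job could be excluded by labeling it monitoring=disabled with kubectl label, in the same way the tier label is added in the first example.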

Monitor Specific CronJobs by Name

For precise control, specify exact CronJob names.
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: financial-reports
  namespace: finance
spec:
  selector:
    matchNames:
      - daily-revenue-report
      - weekly-summary
  deadManSwitch:
    enabled: true
    maxTimeSinceLastSuccess: 25h
  alerting:
    channelRefs:
      - name: slack-finance

What This Does

  • Monitors only the two named CronJobs: daily-revenue-report and weekly-summary
  • Other CronJobs in the namespace are ignored
  • Useful when you have a small set of critical jobs
Note: matchNames only works when monitoring a single namespace. For multi-namespace monitoring, use label selectors instead.

Advanced Label Selectors

For more complex filtering, use matchExpressions with operators like In, NotIn, Exists, and DoesNotExist.
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: high-tier-jobs
  namespace: production
spec:
  selector:
    matchExpressions:
      - key: tier
        operator: In
        values: [critical, high]
      - key: monitoring
        operator: NotIn
        values: [disabled]
  deadManSwitch:
    enabled: true
    maxTimeSinceLastSuccess: 25h
  alerting:
    channelRefs:
      - name: slack-ops

What This Does

  • Monitors CronJobs with tier: critical OR tier: high
  • Excludes CronJobs labeled monitoring: disabled
  • Gives you fine-grained control over what gets monitored
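
To make the selector logic concrete, here are metadata fragments for a few hypothetical CronJobs and how the selector above would treat them. The names are invented for illustration, and the third case assumes standard Kubernetes semantics, where NotIn also matches objects that lack the key.

```yaml
# Matched: tier is in [critical, high] and monitoring is not "disabled"
metadata:
  name: nightly-export        # hypothetical
  labels:
    tier: high
    monitoring: enabled
---
# Not matched: tier qualifies, but monitoring: disabled opts it out
metadata:
  name: flaky-sync            # hypothetical
  labels:
    tier: critical
    monitoring: disabled
---
# Matched: no monitoring label at all, which NotIn also accepts
metadata:
  name: db-backup             # hypothetical
  labels:
    tier: critical
---
# Not matched: tier is outside [critical, high]
metadata:
  name: cache-warmup          # hypothetical
  labels:
    tier: low
```

The two expressions are ANDed, so a CronJob must satisfy both to be monitored. With kubectl, the equivalent set-based selector is -l 'tier in (critical,high),monitoring notin (disabled)', which can be used with kubectl get cronjobs to preview the matches.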

Adjusting Dead-Man’s Switch Timing

Set maxTimeSinceLastSuccess to your CronJob’s schedule interval plus a buffer. For an hourly job:
deadManSwitch:
  enabled: true
  maxTimeSinceLastSuccess: 2h  # 1h schedule + 1h buffer
Tip: Use autoFromSchedule when a monitor covers CronJobs with different schedules. Guardian then calculates the expected interval from each CronJob’s cron expression.
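
When setting the threshold by hand, the same interval-plus-buffer rule carries over to other schedules. The fragments below are a sketch: the cron expressions are example schedules, and the one-hour buffer is an assumption that should be widened for jobs whose runtime is long or variable.

```yaml
# Daily job, e.g. schedule "0 2 * * *": 24h interval + 1h buffer
deadManSwitch:
  enabled: true
  maxTimeSinceLastSuccess: 25h
---
# Every 6 hours, e.g. schedule "0 */6 * * *": 6h interval + 1h buffer
deadManSwitch:
  enabled: true
  maxTimeSinceLastSuccess: 7h
```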

Tuning SLA Thresholds

Adjust success rate requirements based on your job’s criticality:
sla:
  enabled: true
  minSuccessRate: 99.9  # Must succeed nearly every time
  windowDays: 7
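
How strict a threshold is depends on how many runs fall inside the window. The comments below work through the arithmetic, assuming the success rate is computed as successful runs divided by total runs in the window. That formula is an assumption about Guardian's calculation, not something this page states.

```yaml
sla:
  enabled: true
  minSuccessRate: 99
  windowDays: 7
# Hourly CronJob: 24 x 7 = 168 runs in the window
#   1 failure  -> 167/168 = 99.4%  (passes 99, breaches 99.9)
#   2 failures -> 166/168 = 98.8%  (breaches 99)
# Daily CronJob: 7 runs in the window
#   1 failure  -> 6/7 = 85.7%      (breaches both 99 and 99.9)
```

For infrequent jobs a single failure dominates the rate, so a lower minSuccessRate or a longer windowDays may be a better fit.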

Next Steps

  • Namespace Monitoring: Monitor CronJobs across multiple namespaces
  • Advanced Features: SLA tracking, regression detection, and maintenance windows
  • Slack Alerts: Set up Slack notifications for your monitors
  • Alert Channels: Learn about all alert channel types
