CronJob Guardian can monitor CronJobs at several scopes, from a single namespace to the whole cluster, and can narrow the selection by label or by name.

Monitoring Strategies

You can monitor CronJobs in several ways:
  • Single namespace: Monitor all or selected CronJobs in one namespace
  • Multiple namespaces: Monitor specific namespaces by listing them
  • Namespace selector: Dynamically discover namespaces by labels
  • Cluster-wide: Monitor all CronJobs across all namespaces
  • Label selector: Filter CronJobs by labels

Basic Namespace Monitoring

The simplest way to monitor CronJobs is within a single namespace.
1. Create a CronJobMonitor in your namespace

Deploy a monitor in the same namespace as your CronJobs. An empty selector monitors all CronJobs in that namespace; the example below instead uses matchLabels to watch only CronJobs labeled tier: critical.
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: critical-jobs
  namespace: production
spec:
  selector:
    matchLabels:
      tier: critical
  deadManSwitch:
    enabled: true
    maxTimeSinceLastSuccess: 25h
  sla:
    enabled: true
    minSuccessRate: 99
    windowDays: 7
  alerting:
    channelRefs:
      - name: slack-ops
2. Apply the monitor

kubectl apply -f monitor.yaml
3. Verify the monitor is active

kubectl get cronjobmonitor -n production
Check the status to see discovered CronJobs:
kubectl describe cronjobmonitor critical-jobs -n production

Label-Based Monitoring

Monitor CronJobs that match specific labels using matchLabels or matchExpressions.

Using matchLabels

apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: critical-jobs
  namespace: production
spec:
  selector:
    matchLabels:
      tier: critical

Using matchExpressions

For more advanced filtering:
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: high-priority-jobs
  namespace: production
spec:
  selector:
    matchExpressions:
      - key: tier
        operator: In
        values: [critical, high]
      - key: backup
        operator: Exists
Supported operators: In, NotIn, Exists, DoesNotExist
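
As a sketch of the two operators not shown above, the following selector excludes CronJobs by label. It assumes the operators behave like standard Kubernetes label selectors; the monitor name and the label keys and values (tier: experimental, monitoring-disabled) are illustrative only.

```yaml
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: non-experimental-jobs   # hypothetical name
  namespace: production
spec:
  selector:
    matchExpressions:
      # Skip CronJobs whose tier is "experimental" (assumed label value)
      - key: tier
        operator: NotIn
        values: [experimental]
      # Skip CronJobs that carry an opt-out label (assumed label key)
      - key: monitoring-disabled
        operator: DoesNotExist
```

Under standard Kubernetes selector semantics, all expressions must match, and NotIn also matches CronJobs that do not have the key at all. Verify this against your installation if the distinction matters.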

Monitoring Specific CronJobs by Name

You can explicitly list CronJob names to monitor (only valid for single-namespace monitoring):
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: backup-jobs
  namespace: databases
spec:
  selector:
    matchNames:
      - daily-backup
      - weekly-report
      - monthly-archive

Cluster-Wide Monitoring

Monitor all CronJobs across all namespaces (except globally ignored ones).
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: cluster-wide-monitor
  namespace: cronjob-guardian
spec:
  selector:
    # Watch all namespaces
    allNamespaces: true
    # Optionally filter by labels
    matchLabels:
      tier: critical
  deadManSwitch:
    enabled: true
    maxTimeSinceLastSuccess: 25h
  alerting:
    channelRefs:
      - name: pagerduty-critical
        severities: [critical]
      - name: slack-ops
        severities: [critical, warning]
Cluster-wide monitoring requires appropriate RBAC permissions. The operator’s service account must have cluster-wide read access to CronJobs.
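
If you manage RBAC yourself, cluster-wide read access to CronJobs might look like the sketch below. The ClusterRole name and the service account name cronjob-guardian are assumptions, not values taken from the operator's chart; check what your installation already creates before applying anything, since it may ship equivalent rules.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cronjob-guardian-reader   # hypothetical name
rules:
  # Read-only access to CronJobs in every namespace
  - apiGroups: ["batch"]
    resources: ["cronjobs"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cronjob-guardian-reader   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cronjob-guardian-reader
subjects:
  - kind: ServiceAccount
    name: cronjob-guardian        # assumed service account name
    namespace: cronjob-guardian   # namespace used by the examples on this page
```

This covers only the CronJob read access mentioned above. The operator probably needs further permissions (for example on Jobs and Namespaces), which are outside the scope of this sketch.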

Namespace Selector Monitoring

Dynamically discover and monitor namespaces that match specific labels.
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: production-jobs
  namespace: cronjob-guardian
spec:
  selector:
    # Select namespaces by their labels
    namespaceSelector:
      matchLabels:
        environment: production
    # Optionally filter CronJobs within matching namespaces
    matchLabels:
      monitored: "true"
  sla:
    enabled: true
    minSuccessRate: 95
    windowDays: 7
  alerting:
    channelRefs:
      - name: slack-ops
1. Label your namespaces

kubectl label namespace prod-app environment=production
kubectl label namespace prod-api environment=production
2. Create the monitor with namespace selector

The monitor will automatically discover all namespaces labeled environment=production and watch CronJobs within them.
3. Verify discovered namespaces

kubectl describe cronjobmonitor production-jobs -n cronjob-guardian
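
If your namespaces are managed declaratively rather than with kubectl label, the same label can be set in the Namespace manifest. A minimal sketch for one of the namespaces labeled above:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: prod-app
  labels:
    environment: production   # matched by spec.selector.namespaceSelector
```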

Multi-Namespace Monitoring

Explicitly list multiple namespaces to monitor:
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: multi-namespace-monitor
  namespace: cronjob-guardian
spec:
  selector:
    namespaces:
      - production
      - staging
      - qa
    matchLabels:
      monitored: "true"

Real-World Example: Database Backups

Here’s a complete example monitoring critical database backup jobs:
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: database-backups
  namespace: databases
spec:
  selector:
    matchLabels:
      type: backup
  deadManSwitch:
    enabled: true
    maxTimeSinceLastSuccess: 25h  # Daily backups with 1h buffer
  sla:
    enabled: true
    minSuccessRate: 100  # Backups must never fail
    maxDuration: 1h      # Alert if backup takes too long
  alerting:
    channelRefs:
      - name: pagerduty-dba
        severities: [critical]
    severityOverrides:
      jobFailed: critical
      deadManTriggered: critical
    # Custom fix suggestion for backup failures
    suggestedFixPatterns:
      - name: disk-full
        match:
          logPattern: "No space left on device|disk full"
        suggestion: "Backup storage is full. Check PVC usage: kubectl get pvc -n {{.Namespace}}"
        priority: 150

Monitoring Best Practices

Start Small

Begin with namespace-scoped monitors before moving to cluster-wide monitoring.

Use Labels

Label your CronJobs consistently (e.g., tier: critical, type: backup) for easier monitoring.
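
For example, a CronJob labeled as follows would match the tier: critical and type: backup selectors used on this page. This is a sketch: the CronJob name, schedule, image, and command are placeholders, and it assumes the monitor matches labels on the CronJob's own metadata rather than on its pod template.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup          # placeholder name
  namespace: databases
  labels:
    tier: critical            # matched by matchLabels tier: critical
    type: backup              # matched by matchLabels type: backup
spec:
  schedule: "0 2 * * *"       # placeholder: every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: busybox:1.36   # placeholder image
              command: ["sh", "-c", "echo running backup"]
```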

Avoid Overlap

Ensure monitors don’t overlap unnecessarily. If multiple monitors watch the same CronJob, you’ll get duplicate alerts.
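
One way to keep two monitors in the same namespace from overlapping is to give them complementary selectors. The sketch below assumes standard Kubernetes label selector semantics; the second monitor's name is hypothetical.

```yaml
# Watches only CronJobs labeled tier: critical
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: critical-jobs
  namespace: production
spec:
  selector:
    matchExpressions:
      - key: tier
        operator: In
        values: [critical]
---
# Watches every other CronJob in the namespace
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: other-jobs   # hypothetical name
  namespace: production
spec:
  selector:
    matchExpressions:
      - key: tier
        operator: NotIn
        values: [critical]
```

Because the two expressions are complements of each other, no CronJob should satisfy both selectors.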

Monitor the Monitor

Run kubectl get cronjobmonitor regularly to verify that monitors are in the Active phase.

Checking Monitor Status

View all monitors and their status:
kubectl get cronjobmonitor -A
Example output:
NAMESPACE           NAME                    CRONJOBS   HEALTHY   WARNING   CRITICAL   ALERTS   AGE
production          critical-jobs           5          4         1         0          1        2d
cronjob-guardian    cluster-wide-monitor    42         39        2         1          3        5d
databases           database-backups        3          3         0         0          0        10d
View detailed status for a specific monitor:
kubectl describe cronjobmonitor critical-jobs -n production

Ignored Namespaces

By default, these namespaces are ignored (configured in config.yaml):
ignored-namespaces:
  - kube-system
  - kube-public
  - kube-node-lease
To override this globally, update the operator configuration:
# values.yaml for Helm chart
config:
  ignoredNamespaces:
    - kube-system
    - kube-public

Next Steps

Configure Alerts

Set up alert channels and routing

SLA Configuration

Configure success rate and duration tracking
