This guide will get you monitoring CronJobs in under 5 minutes.

Prerequisites

  • Kubernetes cluster (1.26+)
  • kubectl configured
  • helm (optional, for Helm installation)

Step 1: Install CronJob Guardian

1

Install using Helm

The fastest way to install CronJob Guardian is with Helm:
helm install cronjob-guardian oci://ghcr.io/illeniumstudios/charts/cronjob-guardian \
  --namespace cronjob-guardian \
  --create-namespace
This installs the operator with default settings:
  • SQLite storage with 1Gi persistent volume
  • Dashboard enabled on port 8080
  • Prometheus metrics enabled
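To review those defaults before you override anything, you can ask Helm to print the chart's values. The sketch below is a small convenience wrapper, not part of CronJob Guardian: the function name `show_guardian_defaults` is hypothetical, and it assumes a Helm version with OCI registry support.

```shell
# Chart reference copied from the install command above.
CHART="oci://ghcr.io/illeniumstudios/charts/cronjob-guardian"

# Hypothetical helper: print the chart's default values.yaml.
# `helm show values` only reads the chart; it does not install or change anything.
show_guardian_defaults() {
  helm show values "$CHART"
}
```

Run `show_guardian_defaults | less` to page through the output.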
2

Verify installation

Check that the operator is running:
kubectl get pods -n cronjob-guardian
You should see:
NAME                                READY   STATUS    RESTARTS   AGE
cronjob-guardian-7d9f8c5b6d-x4k2m   1/1     Running   0          30s
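If you are scripting the install, polling `kubectl get pods` by hand is awkward. A minimal sketch that blocks until every pod in the namespace reports Ready might look like this. The function name `wait_for_guardian` and the 120s default timeout are illustrative choices, not values from the project.

```shell
# Hypothetical helper: wait until all pods in the cronjob-guardian
# namespace are Ready, or give up after the timeout (default 120s).
wait_for_guardian() {
  local timeout="${1:-120s}"
  kubectl wait --for=condition=Ready pods --all \
    --namespace cronjob-guardian \
    --timeout="$timeout"
}
```

Call `wait_for_guardian` right after `helm install`, or pass a different timeout such as `wait_for_guardian 300s` on slower clusters.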

Step 2: Set Up Alerts (Optional)

Skip this step if you just want to use the dashboard. You can configure alerts later.
1

Create a Slack webhook secret

First, create a Slack incoming webhook for your workspace, then create a secret:
kubectl create secret generic slack-webhook \
  --namespace cronjob-guardian \
  --from-literal=url=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
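Before wiring the secret into an AlertChannel, it can help to confirm that the stored URL is intact and that Slack accepts it. The sketch below reads the `url` key back out of the secret and posts a plain-text message to it. The function names and the message text are hypothetical; the JSON payload follows the standard Slack incoming-webhook format.

```shell
# Hypothetical helper: decode the webhook URL stored in the secret.
read_webhook_url() {
  kubectl get secret slack-webhook \
    --namespace cronjob-guardian \
    --output jsonpath='{.data.url}' | base64 --decode
}

# Hypothetical helper: post a test message to that webhook.
# --fail makes curl exit non-zero if Slack rejects the request.
send_test_message() {
  curl --silent --show-error --fail \
    --request POST \
    --header 'Content-Type: application/json' \
    --data '{"text": "Test message from the CronJob Guardian quick start"}' \
    "$(read_webhook_url)"
}
```

Running `send_test_message` should make the message appear in the channel attached to your webhook. Note that `read_webhook_url` prints the secret value to your terminal, so avoid running it in shared sessions.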
2

Create an AlertChannel

Create a file named slack-channel.yaml:
slack-channel.yaml
apiVersion: guardian.illenium.net/v1alpha1
kind: AlertChannel
metadata:
  name: slack-alerts
  namespace: cronjob-guardian
spec:
  type: slack
  slack:
    webhookSecretRef:
      name: slack-webhook
      namespace: cronjob-guardian
      key: url
    defaultChannel: "#alerts"
  rateLimiting:
    maxAlertsPerHour: 100
    burstLimit: 10
Apply it:
kubectl apply -f slack-channel.yaml
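To confirm the channel was accepted, you can list the resource types registered under the operator's API group and read the object back. This is a sketch under one assumption: that the CRD can be addressed by the lowercase kind name `alertchannel`, which is the usual default but is not confirmed here. The function name is hypothetical.

```shell
# Hypothetical helper: show what the operator registered and how the
# AlertChannel is stored by the API server.
check_alert_channel() {
  # Lists every resource type in the guardian.illenium.net API group.
  kubectl api-resources --api-group=guardian.illenium.net &&
  # Prints the stored object, including any status the operator has set.
  kubectl get alertchannel slack-alerts \
    --namespace cronjob-guardian \
    --output yaml
}
```

If the second command reports an unknown resource type, use the exact name shown in the `api-resources` output instead.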

Step 3: Create a Monitor

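Now let’s create a monitor to watch CronJobs in your cluster. The example below references the slack-alerts channel from Step 2, so if you skipped that step, leave out the alerting block for now.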
Monitor all CronJobs in a specific namespace:
monitor-namespace.yaml
apiVersion: guardian.illenium.net/v1alpha1
kind: CronJobMonitor
metadata:
  name: production-monitor
  namespace: production  # namespace to monitor
spec:
  selector: {}  # empty selector = all CronJobs in this namespace
  deadManSwitch:
    enabled: true
    autoFromSchedule:
      enabled: true  # automatically calculate threshold from schedule
  alerting:
    channelRefs:
      - name: slack-alerts
Apply it:
kubectl apply -f monitor-namespace.yaml
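The monitor lives in the production namespace, so the apply fails if that namespace does not exist yet. A minimal, idempotent way to create it first is sketched below. The function name `ensure_namespace` is hypothetical.

```shell
# Hypothetical helper: create a namespace only if it is missing.
# `--dry-run=client --output yaml` renders the manifest locally, and
# `kubectl apply` then creates it or leaves an existing one unchanged.
ensure_namespace() {
  local ns="$1"
  kubectl create namespace "$ns" --dry-run=client --output yaml |
    kubectl apply --filename -
}
```

Run `ensure_namespace production`, then re-apply monitor-namespace.yaml. Afterwards, `kubectl get cronjobmonitor production-monitor -n production` should show the monitor, assuming the CRD is addressable by its lowercase kind name.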

Step 4: Access the Dashboard

The dashboard provides a visual interface for viewing metrics and execution history.
1

Port forward to the dashboard

kubectl port-forward -n cronjob-guardian svc/cronjob-guardian 8080:8080
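The command above occupies your terminal until you stop it. If you would rather keep working in the same shell, a sketch like the following runs the port forward in the background and waits until the dashboard answers. The function name, the `FORWARD_PID` variable, and the ten-second retry budget are all illustrative.

```shell
# Hypothetical helper: start the port forward in the background, then
# poll http://localhost:8080/ once per second for up to ten attempts.
start_dashboard_forward() {
  kubectl port-forward --namespace cronjob-guardian \
    svc/cronjob-guardian 8080:8080 >/dev/null 2>&1 &
  FORWARD_PID=$!   # remember the background process so it can be stopped later

  local attempt
  for attempt in 1 2 3 4 5 6 7 8 9 10; do
    if curl --silent --output /dev/null http://localhost:8080/; then
      echo "dashboard reachable"
      return 0
    fi
    sleep 1
  done
  echo "dashboard not reachable" >&2
  return 1
}
```

When you are done, stop the forward with `kill "$FORWARD_PID"`.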
2

Open in your browser

Navigate to http://localhost:8080. You’ll see:
  • List of all monitored CronJobs
  • Success rates and execution counts
  • Recent execution history
  • Duration charts and heatmaps
For production use, configure an Ingress or LoadBalancer. See Installation for details.

Step 5: Test It Out

Let’s create a test CronJob to see Guardian in action.
1

Create a test CronJob

test-cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: test-job
  namespace: production
  labels:
    tier: critical
spec:
  schedule: "*/5 * * * *"  # Every 5 minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: test
              image: busybox:latest
              command: ["/bin/sh", "-c", "echo 'Hello from CronJob Guardian'"]
          restartPolicy: Never
Apply it:
kubectl apply -f test-cronjob.yaml
2

Wait for execution

The CronJob will run within 5 minutes. Watch for it:
kubectl get jobs -n production -w
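If you don't want to wait for the next scheduled run, kubectl can create a Job from the CronJob's template immediately. The sketch below does that, waits for completion, and prints the container output. The job name `test-job-manual` and the function name are hypothetical, and whether Guardian records manually created Jobs the same way as scheduled ones is an assumption this guide does not confirm.

```shell
# Hypothetical helper: run the test CronJob once, right now.
run_test_job_now() {
  local job_name="test-job-manual"   # illustrative name, pick your own

  # Create a one-off Job from the CronJob's jobTemplate.
  kubectl create job "$job_name" --from=cronjob/test-job --namespace production &&
  # Block until the Job reports the Complete condition (or 60s pass).
  kubectl wait --for=condition=complete "job/$job_name" \
    --namespace production --timeout=60s &&
  # Print the echo output from the container.
  kubectl logs "job/$job_name" --namespace production
}
```

Delete the one-off Job afterwards with `kubectl delete job test-job-manual -n production`.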
3

Check the dashboard

Refresh the dashboard at http://localhost:8080 and you’ll see:
  • The new CronJob appears in the list
  • Execution history shows the successful run
  • Duration metrics start populating

What You’ve Accomplished

  • Installed CronJob Guardian operator
  • Configured Slack alerts (optional)
  • Created monitors for your CronJobs
  • Accessed the dashboard
  • Tested with a sample CronJob

What Happens Next?

CronJob Guardian is now:
  1. Watching all CronJobs matching your monitor selectors
  2. Recording execution history and calculating metrics
  3. Alerting when jobs fail or don’t run on schedule
  4. Tracking SLA compliance and performance trends

Trigger a Test Alert

Want to see how alerts work? Create a CronJob that fails:
failing-job.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: failing-job
  namespace: production
spec:
  schedule: "*/2 * * * *"  # Every 2 minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: fail
              image: busybox:latest
              command: ["/bin/sh", "-c", "exit 1"]  # Always fails
          restartPolicy: Never
Apply it:
kubectl apply -f failing-job.yaml
Within a few minutes, you’ll receive a Slack alert (if configured) with:
  • Failure reason
  • Pod logs
  • Kubernetes events
  • Link to dashboard
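While you wait for the alert, you can look at the failure from the Kubernetes side, and once you are finished you will probably want to remove both sample CronJobs so they stop running. A minimal sketch, with hypothetical function names:

```shell
# Hypothetical helper: list Jobs and recent events in the namespace,
# with the newest events last.
inspect_failing_job() {
  kubectl get jobs --namespace production &&
  kubectl get events --namespace production --sort-by=.lastTimestamp
}

# Hypothetical helper: remove the sample CronJobs from this guide.
# --ignore-not-found keeps the command from failing if one is already gone.
cleanup_test_jobs() {
  kubectl delete cronjob test-job failing-job \
    --namespace production --ignore-not-found
}
```

Deleting a CronJob also removes the Jobs and Pods it created, since kubectl cascades deletion to dependents by default.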

Next Steps

Installation Guide

Learn about all installation options and configuration

Configure Monitors

Deep dive into monitor configuration and selectors

Set Up SLA Tracking

Configure success rates and duration thresholds

More Examples

See real-world configuration examples

Getting Help

If you run into issues:
