Overview
The CronJob Guardian Helm chart provides declarative installation and configuration for Kubernetes clusters.
Chart Repository: https://illeniumstudios.github.io/cronjob-guardian
Installation:
helm repo add cronjob-guardian https://illeniumstudios.github.io/cronjob-guardian
helm repo update
helm install cronjob-guardian cronjob-guardian/cronjob-guardian
General Configuration
replicaCount
Number of operator replicas. For high availability, use multiple replicas with leaderElection.enabled=true.
Override the chart name in resource names.
Override the full resource name prefix.
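As a minimal sketch of the high-availability note above, the fragment below runs more than one replica with leader election turned on. The key names replicaCount and leaderElection.enabled come from the examples later on this page; the replica count of 2 is only illustrative.

```yaml
# values.yaml (illustrative fragment, not the chart defaults)
replicaCount: 2      # more than one replica for availability
leaderElection:
  enabled: true      # described on this page as required for multiple replicas
```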
Image Configuration
Image registry. If set, it is prepended to the repository. Useful for private registries. Example: "my-registry.example.com"
image.repository
string
default:"ghcr.io/illeniumstudios/cronjob-guardian"
Image repository.
image.pullPolicy
string
default:"IfNotPresent"
Image pull policy. Valid values: Always, IfNotPresent, Never
image.tag
Image tag. Defaults to the chart’s appVersion if not specified.
Image pull secrets for private registries.
imagePullSecrets:
  - name: my-registry-secret
ServiceAccount & RBAC
serviceAccount.create
Create a ServiceAccount for the operator.
Automatically mount ServiceAccount token.
serviceAccount.annotations
ServiceAccount annotations. Useful for cloud provider IAM integration.
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/guardian
ServiceAccount name. Auto-generated if not set.
Create ClusterRole and ClusterRoleBinding. Required for the operator to function.
Pod Configuration
Annotations to add to operator pods.
podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8443"
Additional labels for operator pods.
Pod-level security context.
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 65532
  fsGroup: 65532
Container-level security context.
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true
terminationGracePeriodSeconds
Pod termination grace period in seconds.
Resource requests and limits.
Default:
resources:
  limits:
    cpu: 500m
    memory: 256Mi
  requests:
    cpu: 10m
    memory: 64Mi
Liveness probe configuration.
Default:
livenessProbe:
  initialDelaySeconds: 15
  periodSeconds: 20
  timeoutSeconds: 1
  failureThreshold: 3
Readiness probe configuration.
Default:
readinessProbe:
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 1
  failureThreshold: 3
Additional environment variables.
extraEnv:
  - name: GUARDIAN_LOG_LEVEL
    value: debug
  - name: GUARDIAN_STORAGE_POSTGRES_PASSWORD
    valueFrom:
      secretKeyRef:
        name: postgres-credentials
        key: password
Additional volume mounts.
extraVolumeMounts:
  - name: custom-config
    mountPath: /etc/guardian/custom
    readOnly: true
Additional volumes.
extraVolumes:
  - name: custom-config
    configMap:
      name: guardian-custom-config
Scheduling
Node selector for pod scheduling.
nodeSelector:
  kubernetes.io/os: linux
  node-role.kubernetes.io/control-plane: ""
Tolerations for pod scheduling.
tolerations:
  - key: node-role.kubernetes.io/control-plane
    effect: NoSchedule
Affinity rules for pod scheduling.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: cronjob-guardian
          topologyKey: kubernetes.io/hostname
Operator Configuration
Values under config map directly to operator configuration options. See Operator Configuration Reference for detailed descriptions.
config.logLevel
Log level. Valid values: debug, info, warn, error
Scheduler Configuration
config.scheduler.deadManSwitchInterval
Dead-man’s switch check interval.
config.scheduler.slaRecalculationInterval
SLA recalculation interval.
config.scheduler.pruneInterval
History prune interval.
config.scheduler.startupGracePeriod
Grace period after startup before sending alerts.
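The four scheduler settings above can be set together under config.scheduler. This is a sketch only: the values are copied from the Enterprise Configuration example on this page and are not claimed to be the chart defaults.

```yaml
# values.yaml (illustrative fragment)
config:
  scheduler:
    deadManSwitchInterval: 30s      # how often the dead-man's switch check runs
    slaRecalculationInterval: 2m    # how often SLA figures are recalculated
    pruneInterval: 30m              # how often old history is pruned
    startupGracePeriod: 1m          # no alerts are sent during this window after startup
```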
History Retention Configuration
config.historyRetention.defaultDays
Default retention period in days.
config.historyRetention.maxDays
Maximum retention period in days.
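A short sketch of the retention settings, using the values from the High Availability example on this page. Keeping defaultDays at or below maxDays is an assumption about how the two limits interact, not something this page states.

```yaml
# values.yaml (illustrative fragment)
config:
  historyRetention:
    defaultDays: 60    # default retention period in days
    maxDays: 180       # maximum retention period in days
```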
Rate Limits Configuration
config.rateLimits.maxAlertsPerMinute
Maximum alerts per minute across all channels.
config.rateLimits.burstLimit
Maximum burst of alerts allowed.
config.rateLimits.defaultSuppressDuplicatesFor
Default duration to suppress duplicate alerts.
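The rate-limit settings can be combined as in the sketch below. The numbers are taken from the Enterprise Configuration example on this page and are meant as a starting point, not a recommendation.

```yaml
# values.yaml (illustrative fragment)
config:
  rateLimits:
    maxAlertsPerMinute: 200              # cap across all channels
    burstLimit: 50                       # maximum burst of alerts allowed
    defaultSuppressDuplicatesFor: 30m    # duplicate alerts are suppressed for this long
```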
Storage Configuration
config.storage.type
Storage backend type. Valid values: sqlite, postgres, mysql
SQLite Storage
config.storage.sqlite.path
string
default:"/data/guardian.db"
Path to SQLite database file. Requires persistence to be enabled.
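Because the SQLite file requires persistence, the storage and persistence settings are usually changed together. A minimal sketch, with the volume size borrowed from the Basic Installation example on this page:

```yaml
# values.yaml (illustrative fragment)
config:
  storage:
    type: sqlite
    sqlite:
      path: /data/guardian.db    # documented default path
persistence:
  enabled: true                  # required so the database file survives pod restarts
  size: 5Gi                      # size used in the Basic Installation example
```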
PostgreSQL Storage
config.storage.postgres.host
PostgreSQL host.
config.storage.postgres.port
PostgreSQL port.
config.storage.postgres.database
PostgreSQL database name.
config.storage.postgres.username
PostgreSQL username.
config.storage.postgres.password
PostgreSQL password. Ignored if existingSecret is set. Using existingSecret is recommended instead.
config.storage.postgres.existingSecret
Name of existing Secret containing PostgreSQL password.
config.storage.postgres.existingSecretKey
Key in existing Secret containing password.
config.storage.postgres.sslMode
PostgreSQL SSL mode. Valid values: disable, require, verify-ca, verify-full
config.storage.postgres.pool.maxIdleConns
Maximum idle connections.
config.storage.postgres.pool.maxOpenConns
Maximum open connections.
config.storage.postgres.pool.connMaxLifetime
Maximum connection lifetime.
config.storage.postgres.pool.connMaxIdleTime
Maximum connection idle time.
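The PostgreSQL settings above fit together roughly as follows. This sketch reuses hostnames, the Secret name, and pool sizes from the examples on this page; it assumes a Secret named postgres-credentials with a key named password already exists in the release namespace.

```yaml
# values.yaml (illustrative fragment)
config:
  storage:
    type: postgres
    postgres:
      host: postgres.database.svc.cluster.local
      port: 5432
      database: cronjob_guardian
      username: guardian
      existingSecret: postgres-credentials   # assumed to exist already
      existingSecretKey: password            # key inside that Secret
      sslMode: require
      pool:
        maxIdleConns: 20
        maxOpenConns: 200
        connMaxLifetime: 30m
        connMaxIdleTime: 5m
```

Leaving config.storage.postgres.password unset keeps the credential out of the values file, since it is ignored whenever existingSecret is set.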
MySQL Storage
MySQL configuration follows the same structure as PostgreSQL:
config.storage.mysql.host
config.storage.mysql.port (default: 3306)
config.storage.mysql.database
config.storage.mysql.username
config.storage.mysql.password
config.storage.mysql.existingSecret
config.storage.mysql.existingSecretKey
config.storage.mysql.pool.* (same pool settings as PostgreSQL)
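A sketch of a MySQL configuration built from the keys listed above. The host, database, username, and Secret name are hypothetical placeholders, and the pool key names are assumed to match the PostgreSQL ones as this section states.

```yaml
# values.yaml (illustrative fragment; all values are placeholders)
config:
  storage:
    type: mysql
    mysql:
      host: mysql.database.svc.cluster.local   # hypothetical host
      port: 3306                               # documented default
      database: cronjob_guardian               # hypothetical database name
      username: guardian                       # hypothetical user
      existingSecret: mysql-credentials        # hypothetical Secret name
      existingSecretKey: password
      pool:
        maxIdleConns: 20
        maxOpenConns: 200
```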
Storage Features
config.storage.logStorageEnabled
Enable storing job logs in database.
config.storage.eventStorageEnabled
Enable storing Kubernetes events in database.
config.storage.maxLogSizeKB
Maximum log size to store per execution (KB).
config.storage.logRetentionDays
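Number of days to retain stored logs. A value of 0 falls back to the default history retention (history-retention.default-days in the operator configuration, set through config.historyRetention.defaultDays).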
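The storage feature flags can be combined as in this sketch. The log size limit comes from the High Availability example on this page; the other values are illustrative.

```yaml
# values.yaml (illustrative fragment)
config:
  storage:
    logStorageEnabled: true      # store job logs in the database
    eventStorageEnabled: true    # store Kubernetes events in the database
    maxLogSizeKB: 200            # per-execution log size cap, in KB
    logRetentionDays: 0          # 0 falls back to the default history retention
```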
Persistence
Required for SQLite storage backend.
persistence.enabled
Enable persistence for SQLite database.
Storage class name. Use "-" for default storage class, or specify a class name.
PVC access modes.
Default:
accessModes:
  - ReadWriteOnce
PVC selector for binding to specific PVs.
persistence:
  selector:
    matchLabels:
      app: cronjob-guardian
UI & Ingress
ui.enabled
Enable UI server (serves both web UI and REST API).
UI Service Configuration
ui.service.type
string
default:"ClusterIP"
Service type. Valid values: ClusterIP, NodePort, LoadBalancer
NodePort (only used if type=NodePort).
Service annotations.
ui:
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: nlb
Ingress Configuration
ui.ingress.className
Ingress class name. Example: "nginx", "traefik"
Ingress annotations.
ui:
  ingress:
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
      nginx.ingress.kubernetes.io/rewrite-target: /
Ingress hosts.
Default:
hosts:
  - host: cronjob-guardian.local
    paths:
      - path: /
        pathType: Prefix
Ingress TLS configuration.
ui:
  ingress:
    tls:
      - secretName: cronjob-guardian-tls
        hosts:
          - cronjob-guardian.example.com
OpenShift Route Configuration
Route hostname. Leave empty for auto-generation.
TLS termination type. Valid values: edge, passthrough, reencrypt
ui.route.tls.insecureEdgeTerminationPolicy
Insecure edge termination policy. Valid values: Allow, Redirect, None
Metrics & Monitoring
Enable Prometheus metrics endpoint.
Enable HTTPS for metrics.
Path to TLS certificate directory.
TLS certificate file name.
Health probes bind address.
ServiceMonitor Configuration
serviceMonitor.enabled
Enable Prometheus Operator ServiceMonitor.
ServiceMonitor labels for selector matching.
serviceMonitor:
  labels:
    release: prometheus
serviceMonitor.scrapeTimeout
Scrape timeout.
serviceMonitor.metricRelabelings
Metric relabelings.
serviceMonitor:
  metricRelabelings:
    - sourceLabels: [__name__]
      regex: 'go_.*'
      action: drop
serviceMonitor.relabelings
Relabelings.
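A combined ServiceMonitor sketch using the keys documented in this section and in the examples on this page. The release label value depends on how your Prometheus Operator selects ServiceMonitors, so treat it as a placeholder. Prometheus requires the scrape timeout to be no longer than the scrape interval.

```yaml
# values.yaml (illustrative fragment)
serviceMonitor:
  enabled: true
  labels:
    release: prometheus          # placeholder; must match your Prometheus selector
  interval: 30s
  scrapeTimeout: 10s             # must not exceed the interval
  metricRelabelings:
    - sourceLabels: [__name__]
      regex: 'go_.*'
      action: drop               # drops Go runtime metrics before ingestion
```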
High Availability
leaderElection.enabled
Enable leader election. Required for multiple replicas.
leaderElection.leaseDuration
Leader lease duration.
leaderElection.renewDeadline
Leader renew deadline.
leaderElection.retryPeriod
Leader retry period.
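The leader election timings are usually tuned as a group. The sketch below uses the values from the Enterprise Configuration example on this page. It keeps leaseDuration longer than renewDeadline and renewDeadline longer than retryPeriod, which is what the Kubernetes client-go leader election library requires; whether this operator relies on that library is an assumption.

```yaml
# values.yaml (illustrative fragment)
replicaCount: 3
leaderElection:
  enabled: true
  leaseDuration: 30s    # how long a lease is valid before others may take over
  renewDeadline: 20s    # leader must renew within this window; shorter than leaseDuration
  retryPeriod: 5s       # wait between acquire/renew attempts; shorter than renewDeadline
```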
Webhook Configuration
Path to TLS certificate directory for webhook server.
TLS certificate file name.
Enable HTTP/2 for webhook server.
Complete Examples
Basic Installation
# values.yaml
replicaCount: 1

config:
  logLevel: info
  storage:
    type: sqlite

persistence:
  enabled: true
  size: 5Gi

ui:
  enabled: true
  ingress:
    enabled: true
    className: nginx
    hosts:
      - host: guardian.example.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: guardian-tls
        hosts:
          - guardian.example.com
High Availability with PostgreSQL
# values.yaml
replicaCount: 3

config:
  logLevel: info
  storage:
    type: postgres
    postgres:
      host: postgres.database.svc.cluster.local
      port: 5432
      database: cronjob_guardian
      username: guardian
      existingSecret: postgres-credentials
      existingSecretKey: password
      sslMode: require
      pool:
        maxIdleConns: 20
        maxOpenConns: 200
    logStorageEnabled: true
    eventStorageEnabled: true
    maxLogSizeKB: 200
  historyRetention:
    defaultDays: 60
    maxDays: 180
  rateLimits:
    maxAlertsPerMinute: 100
    burstLimit: 20

leaderElection:
  enabled: true

persistence:
  enabled: false  # Not needed with PostgreSQL

resources:
  limits:
    cpu: 1000m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: cronjob-guardian
          topologyKey: kubernetes.io/hostname

serviceMonitor:
  enabled: true
  labels:
    release: prometheus
  interval: 30s

ui:
  enabled: true
  service:
    type: ClusterIP
  ingress:
    enabled: true
    className: nginx
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
    hosts:
      - host: guardian.example.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: guardian-tls
        hosts:
          - guardian.example.com
Enterprise Configuration
# values.yaml - Enterprise deployment
replicaCount: 5

image:
  repository: ghcr.io/illeniumstudios/cronjob-guardian
  pullPolicy: IfNotPresent
  tag: "1.0.0"

imagePullSecrets:
  - name: registry-credentials

serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/guardian

podSecurityContext:
  runAsNonRoot: true
  runAsUser: 65532
  fsGroup: 65532

securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true

config:
  logLevel: info
  scheduler:
    deadManSwitchInterval: 30s
    slaRecalculationInterval: 2m
    pruneInterval: 30m
    startupGracePeriod: 1m
  storage:
    type: postgres
    postgres:
      host: postgres-ha.database.svc.cluster.local
      port: 5432
      database: guardian_prod
      username: guardian
      existingSecret: postgres-credentials
      sslMode: verify-full
      pool:
        maxIdleConns: 50
        maxOpenConns: 500
        connMaxLifetime: 30m
        connMaxIdleTime: 5m
    logStorageEnabled: true
    eventStorageEnabled: true
    maxLogSizeKB: 500
    logRetentionDays: 90
  historyRetention:
    defaultDays: 90
    maxDays: 365
  rateLimits:
    maxAlertsPerMinute: 200
    burstLimit: 50
    defaultSuppressDuplicatesFor: 30m

leaderElection:
  enabled: true
  leaseDuration: 30s
  renewDeadline: 20s
  retryPeriod: 5s

persistence:
  enabled: false

resources:
  limits:
    cpu: 2000m
    memory: 1Gi
  requests:
    cpu: 200m
    memory: 256Mi

nodeSelector:
  kubernetes.io/os: linux
  node-role.kubernetes.io/worker: ""

tolerations:
  - key: dedicated
    operator: Equal
    value: monitoring
    effect: NoSchedule

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: cronjob-guardian
        topologyKey: kubernetes.io/hostname

serviceMonitor:
  enabled: true
  labels:
    release: prometheus-operator
  interval: 15s
  scrapeTimeout: 10s

ui:
  enabled: true
  service:
    type: ClusterIP
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-internal: "true"
  ingress:
    enabled: true
    className: nginx
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
      nginx.ingress.kubernetes.io/auth-type: basic
      nginx.ingress.kubernetes.io/auth-secret: guardian-basic-auth
    hosts:
      - host: guardian.prod.example.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: guardian-prod-tls
        hosts:
          - guardian.prod.example.com