## Overview
Kubernetes provides an excellent platform for running Cadence in production, offering automated deployment, scaling, and management of containerized applications. This guide covers best practices for deploying Cadence on Kubernetes.
## Architecture Overview

A production Cadence deployment on Kubernetes typically includes:

- **StatefulSets**: For Cadence services requiring stable network identities
- **Deployments**: For stateless Cadence services
- **Services**: For service discovery and load balancing
- **ConfigMaps**: For configuration management
- **Secrets**: For sensitive credentials
- **PersistentVolumes**: For database persistence (if running databases in-cluster)
## Prerequisites

- Kubernetes cluster (1.20+)
- kubectl configured to access your cluster
- Database (Cassandra, MySQL, or PostgreSQL), managed or self-hosted
- Optional: Helm 3.x for package management
- Optional: Elasticsearch/OpenSearch for advanced visibility
## Deployment Strategy

### Service Separation

Deploy each Cadence service type separately for independent scaling:

- **Frontend**: User-facing API endpoints
- **History**: Workflow execution engines
- **Matching**: Task list management
- **Worker**: System workflows and replication
### Namespace Design

Organize resources using Kubernetes namespaces:

```bash
kubectl create namespace cadence
kubectl create namespace cadence-system  # For monitoring, operators
```
## Configuration with ConfigMaps

Store Cadence configuration in a ConfigMap:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cadence-config
  namespace: cadence
data:
  config.yaml: |
    log:
      stdout: true
      level: info
    persistence:
      defaultStore: cass-default
      visibilityStore: cass-visibility
      numHistoryShards: 4096
      datastores:
        cass-default:
          nosql:
            pluginName: "cassandra"
            hosts: "cassandra.cadence.svc.cluster.local"
            keyspace: "cadence"
            consistency: LOCAL_QUORUM
        cass-visibility:
          nosql:
            pluginName: "cassandra"
            hosts: "cassandra.cadence.svc.cluster.local"
            keyspace: "cadence_visibility"
    ringpop:
      name: cadence
      bootstrapMode: dns
      bootstrapHosts:
        - "cadence-frontend-headless.cadence.svc.cluster.local:7933"
        - "cadence-history-headless.cadence.svc.cluster.local:7934"
        - "cadence-matching-headless.cadence.svc.cluster.local:7935"
      maxJoinDuration: 30s
    services:
      frontend:
        rpc:
          port: 7933
          grpcPort: 7833
          bindOnLocalHost: false
        metrics:
          prometheus:
            timerType: "histogram"
            listenAddress: "0.0.0.0:8000"
      history:
        rpc:
          port: 7934
          grpcPort: 7834
          bindOnLocalHost: false
        metrics:
          prometheus:
            timerType: "histogram"
            listenAddress: "0.0.0.0:8001"
      matching:
        rpc:
          port: 7935
          grpcPort: 7835
          bindOnLocalHost: false
        metrics:
          prometheus:
            timerType: "histogram"
            listenAddress: "0.0.0.0:8002"
      worker:
        rpc:
          port: 7939
          bindOnLocalHost: false
        metrics:
          prometheus:
            timerType: "histogram"
            listenAddress: "0.0.0.0:8003"
```
Apply the ConfigMap:

```bash
kubectl apply -f configmap.yaml
```
## Secrets Management

Store sensitive credentials in Kubernetes Secrets:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cadence-secrets
  namespace: cadence
type: Opaque
stringData:
  cassandra-password: "your-secure-password"
  mysql-password: "your-secure-password"
  postgres-password: "your-secure-password"
```

Apply the Secret:

```bash
kubectl apply -f secrets.yaml
```
Never commit secrets to version control. Use tools like sealed-secrets, external-secrets, or cloud-native secret managers (AWS Secrets Manager, GCP Secret Manager, Azure Key Vault).
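For example, with the external-secrets operator installed, the `cadence-secrets` Secret can be synced from a cloud secret manager rather than stored in manifests. This is a sketch, assuming a `ClusterSecretStore` named `aws-secrets-manager` exists and using a hypothetical remote path `prod/cadence/cassandra`:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: cadence-secrets
  namespace: cadence
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager    # assumed store; configure per your provider
  target:
    name: cadence-secrets        # the Kubernetes Secret to create and keep in sync
  data:
    - secretKey: cassandra-password
      remoteRef:
        key: prod/cadence/cassandra   # hypothetical secret path
        property: password
```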
## Frontend Deployment

Deploy the Frontend service with a Deployment and its Services:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cadence-frontend
  namespace: cadence
  labels:
    app: cadence-frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cadence-frontend
  template:
    metadata:
      labels:
        app: cadence-frontend
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8000"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: cadence-frontend
          image: ubercadence/server:1.2.7
          ports:
            - containerPort: 7933
              name: tchannel
              protocol: TCP
            - containerPort: 7833
              name: grpc
              protocol: TCP
            - containerPort: 8000
              name: metrics
              protocol: TCP
          env:
            - name: SERVICES
              value: "frontend"
            - name: LOG_LEVEL
              value: "info"
            - name: CASSANDRA_SEEDS
              value: "cassandra.cadence.svc.cluster.local"
            - name: CASSANDRA_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: cadence-secrets
                  key: cassandra-password
            - name: NUM_HISTORY_SHARDS
              value: "4096"
            - name: PROMETHEUS_ENDPOINT_0
              value: "0.0.0.0:8000"
          volumeMounts:
            - name: config
              mountPath: /etc/cadence/config
              readOnly: true
          livenessProbe:
            httpGet:
              path: /metrics
              port: 8000
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /metrics
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 2000m
              memory: 4Gi
      volumes:
        - name: config
          configMap:
            name: cadence-config
---
apiVersion: v1
kind: Service
metadata:
  name: cadence-frontend
  namespace: cadence
  labels:
    app: cadence-frontend
spec:
  type: ClusterIP
  ports:
    - port: 7933
      targetPort: 7933
      protocol: TCP
      name: tchannel
    - port: 7833
      targetPort: 7833
      protocol: TCP
      name: grpc
    - port: 8000
      targetPort: 8000
      protocol: TCP
      name: metrics   # exposed so a ServiceMonitor can scrape this Service
  selector:
    app: cadence-frontend
---
apiVersion: v1
kind: Service
metadata:
  name: cadence-frontend-headless
  namespace: cadence
  labels:
    app: cadence-frontend
spec:
  clusterIP: None
  ports:
    - port: 7933
      name: tchannel
  selector:
    app: cadence-frontend
```
## History Deployment

Deploy the History service as a StatefulSet for stable network identities:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cadence-history
  namespace: cadence
spec:
  serviceName: cadence-history-headless
  replicas: 6
  selector:
    matchLabels:
      app: cadence-history
  template:
    metadata:
      labels:
        app: cadence-history
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8001"
    spec:
      containers:
        - name: cadence-history
          image: ubercadence/server:1.2.7
          ports:
            - containerPort: 7934
              name: tchannel
            - containerPort: 7834
              name: grpc
            - containerPort: 8001
              name: metrics
          env:
            - name: SERVICES
              value: "history"
            - name: LOG_LEVEL
              value: "info"
            - name: CASSANDRA_SEEDS
              value: "cassandra.cadence.svc.cluster.local"
            - name: NUM_HISTORY_SHARDS
              value: "4096"
            - name: PROMETHEUS_ENDPOINT_2
              value: "0.0.0.0:8001"
          volumeMounts:
            - name: config
              mountPath: /etc/cadence/config
          resources:
            requests:
              cpu: 1000m
              memory: 2Gi
            limits:
              cpu: 4000m
              memory: 8Gi
      volumes:
        - name: config
          configMap:
            name: cadence-config
---
apiVersion: v1
kind: Service
metadata:
  name: cadence-history-headless
  namespace: cadence
spec:
  clusterIP: None
  ports:
    - port: 7934
      name: tchannel
    - port: 7834
      name: grpc
    - port: 8001
      name: metrics   # exposed so a ServiceMonitor can scrape this Service
  selector:
    app: cadence-history
```
## Matching Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cadence-matching
  namespace: cadence
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cadence-matching
  template:
    metadata:
      labels:
        app: cadence-matching
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8002"
    spec:
      containers:
        - name: cadence-matching
          image: ubercadence/server:1.2.7
          ports:
            - containerPort: 7935
              name: tchannel
            - containerPort: 7835
              name: grpc
            - containerPort: 8002
              name: metrics
          env:
            - name: SERVICES
              value: "matching"
            - name: LOG_LEVEL
              value: "info"
            - name: PROMETHEUS_ENDPOINT_1
              value: "0.0.0.0:8002"
          volumeMounts:
            - name: config
              mountPath: /etc/cadence/config
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 2000m
              memory: 4Gi
      volumes:
        - name: config
          configMap:
            name: cadence-config
---
apiVersion: v1
kind: Service
metadata:
  name: cadence-matching-headless
  namespace: cadence
spec:
  clusterIP: None
  ports:
    - port: 7935
      name: tchannel
  selector:
    app: cadence-matching
```
## Worker Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cadence-worker
  namespace: cadence
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cadence-worker
  template:
    metadata:
      labels:
        app: cadence-worker
    spec:
      containers:
        - name: cadence-worker
          image: ubercadence/server:1.2.7
          ports:
            - containerPort: 7939
              name: tchannel
            - containerPort: 8003
              name: metrics
          env:
            - name: SERVICES
              value: "worker"
            - name: LOG_LEVEL
              value: "info"
            - name: PROMETHEUS_ENDPOINT_3
              value: "0.0.0.0:8003"
          volumeMounts:
            - name: config
              mountPath: /etc/cadence/config
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
            limits:
              cpu: 1000m
              memory: 2Gi
      volumes:
        - name: config
          configMap:
            name: cadence-config
```
## Ingress Configuration

Expose the Frontend service via Ingress:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: cadence-frontend-ingress
  namespace: cadence
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - cadence.example.com
      secretName: cadence-tls
  rules:
    - host: cadence.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: cadence-frontend
                port:
                  number: 7833
```
## Horizontal Pod Autoscaling

Autoscale the stateless services based on CPU/memory:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cadence-frontend-hpa
  namespace: cadence
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cadence-frontend
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cadence-matching-hpa
  namespace: cadence
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cadence-matching
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
The History service runs as a StatefulSet and should not be autoscaled this way: changing the number of History pods alters ring membership and triggers shard rebalancing, so scale it deliberately and coordinate with shard distribution.
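When you do resize History, scaling in small steps and staging rollouts pod-by-pod keeps restarts controlled. A sketch of the relevant StatefulSet fragment (the replica count and partition value are illustrative):

```yaml
# Fragment for the cadence-history StatefulSet spec
spec:
  replicas: 8                 # scale in small steps and let shards settle
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 7            # only ordinals >= 7 update; lower it stepwise
```

Lowering `partition` one step at a time lets each History pod restart and reacquire shards before the next one is touched.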
## Monitoring with Prometheus

Deploy ServiceMonitors for the Prometheus Operator:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cadence-metrics
  namespace: cadence
  labels:
    app: cadence
spec:
  selector:
    matchLabels:
      app: cadence-frontend
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cadence-history-metrics
  namespace: cadence
spec:
  selector:
    matchLabels:
      app: cadence-history
  endpoints:
    - port: metrics
      interval: 30s
```
## Production Best Practices

### Use pod anti-affinity

Spread replicas across nodes:

```yaml
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - cadence-frontend
          topologyKey: kubernetes.io/hostname
```
### Configure resource quotas

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: cadence-quota
  namespace: cadence
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
```
### Enable pod disruption budgets

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cadence-frontend-pdb
  namespace: cadence
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: cadence-frontend
```
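The same protection is worth applying to the History StatefulSet; a sketch matching the 6-replica StatefulSet above:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cadence-history-pdb
  namespace: cadence
spec:
  maxUnavailable: 1   # evict one History pod at a time to limit shard movement
  selector:
    matchLabels:
      app: cadence-history
```

Using `maxUnavailable: 1` rather than `minAvailable` keeps node drains moving one History pod at a time, which bounds the amount of shard rebalancing in flight.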
### Use network policies

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cadence-network-policy
  namespace: cadence
spec:
  podSelector:
    matchLabels:
      app: cadence-frontend
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
      ports:
        - protocol: TCP
          port: 7833
  egress:
    # Listing Egress in policyTypes denies all outbound traffic unless
    # explicitly allowed, so permit DNS, Cassandra, and peer Cadence RPC.
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    - ports:
        - protocol: TCP
          port: 9042
    - to:
        - podSelector: {}
      ports:
        - protocol: TCP
          port: 7933
        - protocol: TCP
          port: 7934
        - protocol: TCP
          port: 7935
```
## Deployment Workflow

### 1. Deploy the database (if self-hosted)

Use operators like:

- K8ssandra for Cassandra
- MySQL Operator
- PostgreSQL Operator (Zalando, Crunchy)
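As an illustration, a minimal K8ssandra cluster for Cadence might look like the following. This is a sketch against the `k8ssandra.io/v1alpha1` API; the Cassandra version and storage class are assumptions, and field names should be checked against the K8ssandra documentation:

```yaml
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: cadence-cassandra
  namespace: cadence
spec:
  cassandra:
    serverVersion: "4.0.7"             # illustrative Cassandra version
    datacenters:
      - metadata:
          name: dc1
        size: 3                        # three nodes for LOCAL_QUORUM writes
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: standard # assumed storage class
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 100Gi
```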
### 2. Initialize database schemas

Run schema initialization as a Job:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: cadence-schema-init
  namespace: cadence
spec:
  template:
    spec:
      containers:
        - name: schema-init
          image: ubercadence/server:1.2.7
          command:
            - /bin/sh
            - -c
            - |
              cadence-cassandra-tool \
                --ep cassandra.cadence.svc.cluster.local \
                create --keyspace cadence --rf 3
              cadence-cassandra-tool \
                --ep cassandra.cadence.svc.cluster.local \
                create --keyspace cadence_visibility --rf 3
              cadence-cassandra-tool \
                --ep cassandra.cadence.svc.cluster.local \
                --keyspace cadence \
                setup-schema --version 0.0
              cadence-cassandra-tool \
                --ep cassandra.cadence.svc.cluster.local \
                --keyspace cadence_visibility \
                setup-schema --version 0.0
              # Apply the versioned schema migrations; the schema directories
              # below are the ones shipped in the ubercadence/server image.
              cadence-cassandra-tool \
                --ep cassandra.cadence.svc.cluster.local \
                --keyspace cadence \
                update-schema --schema-dir /etc/cadence/schema/cassandra/cadence/versioned
              cadence-cassandra-tool \
                --ep cassandra.cadence.svc.cluster.local \
                --keyspace cadence_visibility \
                update-schema --schema-dir /etc/cadence/schema/cassandra/visibility/versioned
      restartPolicy: OnFailure
```
### 3. Deploy ConfigMap and Secrets

```bash
kubectl apply -f configmap.yaml
kubectl apply -f secrets.yaml
```

### 4. Deploy Cadence services

```bash
kubectl apply -f frontend-deployment.yaml
kubectl apply -f history-statefulset.yaml
kubectl apply -f matching-deployment.yaml
kubectl apply -f worker-deployment.yaml
```

### 5. Configure monitoring

```bash
kubectl apply -f servicemonitor.yaml
kubectl apply -f hpa.yaml
```

### 6. Verify the deployment

```bash
kubectl get pods -n cadence
kubectl logs -n cadence -l app=cadence-frontend
```
## Troubleshooting

### Pods Not Starting

```bash
# Check pod status
kubectl get pods -n cadence

# Describe pod for events
kubectl describe pod <pod-name> -n cadence

# Check logs
kubectl logs <pod-name> -n cadence

# Check previous logs if the pod crashed
kubectl logs <pod-name> -n cadence --previous
```
### Service Discovery Issues

```bash
# Test DNS resolution
kubectl run -it --rm debug --image=busybox --restart=Never -- \
  nslookup cadence-frontend.cadence.svc.cluster.local

# Check service endpoints
kubectl get endpoints -n cadence
```
### Database Connectivity

```bash
# Test from a pod
kubectl exec -it <pod-name> -n cadence -- \
  telnet cassandra.cadence.svc.cluster.local 9042
```
## Helm Deployment

While there's no official Helm chart, community charts are available. A basic values.yaml structure:

```yaml
image:
  repository: ubercadence/server
  tag: 1.2.7
  pullPolicy: IfNotPresent

frontend:
  replicaCount: 3
  resources:
    requests:
      cpu: 500m
      memory: 1Gi

history:
  replicaCount: 6
  resources:
    requests:
      cpu: 1000m
      memory: 2Gi

matching:
  replicaCount: 3
  resources:
    requests:
      cpu: 500m
      memory: 1Gi

worker:
  replicaCount: 2

cassandra:
  enabled: false
  external:
    hosts:
      - cassandra.cadence.svc.cluster.local
```
## Next Steps

- **Configuration**: Fine-tune your Cadence configuration
- **Server Setup**: Learn about Cadence architecture