This guide covers deploying Temporal Server in a multi-node cluster configuration for high availability, fault tolerance, and horizontal scalability.
Cluster Architecture
A production Temporal cluster typically consists of:
- Multiple Service Instances: Each Temporal service (Frontend, History, Matching, Worker) runs multiple instances across different nodes
- Membership Protocol: Services use the Ringpop gossip protocol for cluster membership and peer discovery
- Consistent Hashing: History shards are distributed across History service instances using consistent hashing
- Shared Database: All instances connect to the same persistence and visibility databases
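Conceptually, each workflow execution is pinned to a single history shard by hashing its namespace and workflow ID. The sketch below illustrates the idea; Temporal uses its own hash function internally, so `shard_for` here is a stand-in, not the server's actual implementation:

```python
import hashlib

NUM_HISTORY_SHARDS = 512  # must match persistence.numHistoryShards

def shard_for(namespace_id: str, workflow_id: str) -> int:
    """Map a workflow execution to a shard (illustrative stand-in hash)."""
    key = f"{namespace_id}:{workflow_id}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % NUM_HISTORY_SHARDS

# The mapping is deterministic: the same workflow always lands on the same
# shard, so exactly one History instance owns its mutable state at a time.
s1 = shard_for("default", "order-12345")
s2 = shard_for("default", "order-12345")
assert s1 == s2 and 0 <= s1 < NUM_HISTORY_SHARDS
```

Because the shard is derived purely from the workflow's identity, any Frontend instance can route a request to the correct History instance without coordination.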
Service Roles and Scaling
Frontend Service
Stateless - Horizontal Scaling
The Frontend service is completely stateless and can be scaled horizontally.

Scaling Guidelines:
- Start with 2-3 instances for HA
- Add instances based on request rate
- Each instance handles ~10k requests/sec
- Use load balancer for traffic distribution
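Using the rough ~10k requests/sec per-instance figure above, sizing reduces to simple arithmetic. The target rate and headroom below are made-up example numbers, not recommendations:

```python
import math

PER_INSTANCE_RPS = 10_000   # rough per-Frontend throughput from the guideline above
target_rps = 25_000         # example expected peak load (hypothetical)
headroom = 1                # one spare instance for failover

instances = max(2, math.ceil(target_rps / PER_INSTANCE_RPS)) + headroom
print(instances)  # 4 -> three instances for load plus one for failover headroom
```

The `max(2, ...)` floor reflects the HA guideline above: never run fewer than two Frontend instances even at low request rates.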
```yaml
services:
  frontend:
    rpc:
      grpcPort: 7233
      membershipPort: 6933
      bindOnIP: "0.0.0.0"
      httpPort: 7243
```
History Service
Stateful - Shard-Based Scaling
The History service manages workflow execution state using consistent hashing of history shards.

Scaling Guidelines:
- Minimum 3 instances for HA
- Shards automatically redistribute on instance changes
- Each instance handles multiple shards
- More instances = better shard distribution
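The redistribution behavior can be illustrated with rendezvous (highest-random-weight) hashing, one common way to build a shard-to-instance mapping. This is a conceptual sketch, not Temporal's exact ring implementation:

```python
import hashlib

def owner(shard: int, instances: list[str]) -> str:
    """Assign a shard to the instance with the highest hash score for it."""
    def score(inst: str) -> int:
        key = f"{inst}:{shard}".encode()
        return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return max(instances, key=score)

shards = range(512)
before = {s: owner(s, ["history-1", "history-2", "history-3"]) for s in shards}
after = {s: owner(s, ["history-1", "history-2", "history-3", "history-4"])
         for s in shards}

moved = sum(1 for s in shards if before[s] != after[s])
# Only the shards claimed by the new instance move (roughly 1/4 of 512);
# every other shard keeps its existing owner.
assert 0 < moved < 512 // 2
```

This is why adding a History instance is cheap: only a bounded fraction of shards change owners, rather than the whole cluster reshuffling.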
```yaml
persistence:
  numHistoryShards: 512  # More shards = better distribution

services:
  history:
    rpc:
      grpcPort: 7234
      membershipPort: 6934
      bindOnIP: "0.0.0.0"
```
Shard Distribution Example:
512 shards, 3 History instances:
- Instance 1: ~170 shards
- Instance 2: ~171 shards
- Instance 3: ~171 shards
512 shards, 5 History instances:
- Each instance: ~102 shards
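The per-instance counts above are just the shard count divided as evenly as possible; a quick check:

```python
def distribute(num_shards: int, num_instances: int) -> list[int]:
    """Split shards as evenly as possible across instances."""
    base, extra = divmod(num_shards, num_instances)
    return [base + (1 if i < extra else 0) for i in range(num_instances)]

print(distribute(512, 3))  # [171, 171, 170] -- two instances carry one extra shard
print(distribute(512, 5))  # [103, 103, 102, 102, 102] -- ~102 each
```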
Matching Service
Semi-Stateful - Queue-Based
The Matching service routes tasks to workers and maintains task queue state.

Scaling Guidelines:
- Start with 2-3 instances
- Scale based on task queue throughput
- Task queues are partitioned across instances
```yaml
services:
  matching:
    rpc:
      grpcPort: 7235
      membershipPort: 6935
      bindOnIP: "0.0.0.0"
```
Worker Service
Stateless - Background Operations
The Worker service executes internal system workflows like archival.

Scaling Guidelines:
- Usually 1-2 instances sufficient
- Scale if system workflow backlog occurs
```yaml
services:
  worker:
    rpc:
      grpcPort: 7239
      membershipPort: 6939
      bindOnIP: "0.0.0.0"
```
Cluster Membership
Temporal services use Ringpop for cluster membership and service discovery.
Membership Configuration
```yaml
global:
  membership:
    maxJoinDuration: 30s
    broadcastAddress: ""  # Auto-detected if empty
```
How Membership Works
1. Service Startup - Each service instance binds to its membership port and announces itself
2. Peer Discovery - Instances discover each other through the shared database (`cluster_membership` table)
3. Gossip Protocol - Ringpop gossip maintains a consistent view of cluster membership
4. Health Checking - Services continuously monitor peer health and detect failures
5. Shard Rebalancing - When topology changes, History shards automatically redistribute
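The startup and peer-discovery mechanism can be sketched as a heartbeat table: each instance periodically upserts a row, and peers treat any row with a recent heartbeat as alive. This is a simplified model of the `cluster_membership` bootstrap, not the real table schema:

```python
import time

membership: dict[str, float] = {}   # host:port -> last heartbeat timestamp
TTL = 20.0                          # seconds before a silent peer is presumed dead

def heartbeat(addr: str, now: float) -> None:
    """An instance announces itself by upserting its membership row."""
    membership[addr] = now

def alive_peers(now: float) -> set[str]:
    """Peers discover each other by reading rows with recent heartbeats."""
    return {addr for addr, ts in membership.items() if now - ts < TTL}

t = time.time()
heartbeat("10.0.1.10:6934", t)
heartbeat("10.0.1.11:6934", t)
heartbeat("10.0.1.12:6934", t - 60)  # stale: this instance stopped heartbeating

assert alive_peers(t) == {"10.0.1.10:6934", "10.0.1.11:6934"}
```

Once the initial peer set is known from the database, the ongoing health checking and failure detection happens over the Ringpop gossip ports rather than through the database.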
Broadcast Address
The broadcast address is how services advertise themselves to peers:
Auto-Detection

```yaml
global:
  membership:
    broadcastAddress: ""  # Auto-detect from hostname
```

Best for:
- Simple deployments
- Kubernetes with pod DNS
- Docker with proper networking

Explicit IP

```yaml
global:
  membership:
    broadcastAddress: "10.0.1.42"
```

Best for:
- Multiple network interfaces
- NAT environments
- Complex network topologies

Environment Variable

```yaml
global:
  membership:
    broadcastAddress: "{{ env \"POD_IP\" }}"
```

Best for:
- Kubernetes deployments
- Dynamic IP assignment
Multi-Node Deployment Examples
Docker Compose Cluster
```yaml
services:
  # Frontend instances
  frontend-1:
    image: temporalio/server:latest
    container_name: temporal-frontend-1
    environment:
      - SERVICES=frontend
      - BIND_ON_IP=0.0.0.0
      - TEMPORAL_BROADCAST_ADDRESS=frontend-1
      - NUM_HISTORY_SHARDS=128
      - DB=postgres12
      - POSTGRES_SEEDS=postgres
      - POSTGRES_USER=temporal
      - POSTGRES_PWD=temporal
    networks:
      - temporal-network
    depends_on:
      - postgres

  frontend-2:
    image: temporalio/server:latest
    container_name: temporal-frontend-2
    environment:
      - SERVICES=frontend
      - BIND_ON_IP=0.0.0.0
      - TEMPORAL_BROADCAST_ADDRESS=frontend-2
      - NUM_HISTORY_SHARDS=128
      - DB=postgres12
      - POSTGRES_SEEDS=postgres
      - POSTGRES_USER=temporal
      - POSTGRES_PWD=temporal
    networks:
      - temporal-network
    depends_on:
      - postgres

  # History instances
  history-1:
    image: temporalio/server:latest
    container_name: temporal-history-1
    environment:
      - SERVICES=history
      - BIND_ON_IP=0.0.0.0
      - TEMPORAL_BROADCAST_ADDRESS=history-1
      - NUM_HISTORY_SHARDS=128
      - DB=postgres12
      - POSTGRES_SEEDS=postgres
      - POSTGRES_USER=temporal
      - POSTGRES_PWD=temporal
    networks:
      - temporal-network
    depends_on:
      - postgres

  history-2:
    image: temporalio/server:latest
    container_name: temporal-history-2
    environment:
      - SERVICES=history
      - BIND_ON_IP=0.0.0.0
      - TEMPORAL_BROADCAST_ADDRESS=history-2
      - NUM_HISTORY_SHARDS=128
      - DB=postgres12
      - POSTGRES_SEEDS=postgres
      - POSTGRES_USER=temporal
      - POSTGRES_PWD=temporal
    networks:
      - temporal-network
    depends_on:
      - postgres

  history-3:
    image: temporalio/server:latest
    container_name: temporal-history-3
    environment:
      - SERVICES=history
      - BIND_ON_IP=0.0.0.0
      - TEMPORAL_BROADCAST_ADDRESS=history-3
      - NUM_HISTORY_SHARDS=128
      - DB=postgres12
      - POSTGRES_SEEDS=postgres
      - POSTGRES_USER=temporal
      - POSTGRES_PWD=temporal
    networks:
      - temporal-network
    depends_on:
      - postgres

  # Matching instances
  matching-1:
    image: temporalio/server:latest
    container_name: temporal-matching-1
    environment:
      - SERVICES=matching
      - BIND_ON_IP=0.0.0.0
      - TEMPORAL_BROADCAST_ADDRESS=matching-1
      - NUM_HISTORY_SHARDS=128
      - DB=postgres12
      - POSTGRES_SEEDS=postgres
      - POSTGRES_USER=temporal
      - POSTGRES_PWD=temporal
    networks:
      - temporal-network
    depends_on:
      - postgres

  matching-2:
    image: temporalio/server:latest
    container_name: temporal-matching-2
    environment:
      - SERVICES=matching
      - BIND_ON_IP=0.0.0.0
      - TEMPORAL_BROADCAST_ADDRESS=matching-2
      - NUM_HISTORY_SHARDS=128
      - DB=postgres12
      - POSTGRES_SEEDS=postgres
      - POSTGRES_USER=temporal
      - POSTGRES_PWD=temporal
    networks:
      - temporal-network
    depends_on:
      - postgres

  # Worker instance
  worker:
    image: temporalio/server:latest
    container_name: temporal-worker
    environment:
      - SERVICES=worker
      - BIND_ON_IP=0.0.0.0
      - TEMPORAL_BROADCAST_ADDRESS=worker
      - NUM_HISTORY_SHARDS=128
      - DB=postgres12
      - POSTGRES_SEEDS=postgres
      - POSTGRES_USER=temporal
      - POSTGRES_PWD=temporal
    networks:
      - temporal-network
    depends_on:
      - postgres

  # Load balancer for Frontend
  nginx:
    image: nginx:alpine
    container_name: temporal-nginx
    ports:
      - "7233:7233"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    networks:
      - temporal-network
    depends_on:
      - frontend-1
      - frontend-2

  postgres:
    image: postgres:13
    container_name: temporal-postgres
    environment:
      POSTGRES_USER: temporal
      POSTGRES_PASSWORD: temporal
    networks:
      - temporal-network
    volumes:
      - postgres-data:/var/lib/postgresql/data

networks:
  temporal-network:
    driver: bridge

volumes:
  postgres-data:
```
Kubernetes Deployment
For Kubernetes, use the official Helm charts:
```shell
# Add Temporal Helm repository
helm repo add temporalio https://go.temporal.io/helm-charts
helm repo update

# Create namespace
kubectl create namespace temporal

# Install with custom values
helm install temporal temporalio/temporal \
  --namespace temporal \
  --values values.yaml
```
```yaml
server:
  replicaCount: 1
  config:
    persistence:
      default:
        sql:
          driver: "postgres"
          host: postgres.temporal.svc.cluster.local
          port: 5432
          database: temporal
          user: temporal
          password: temporal
          maxConns: 20
          maxIdleConns: 20
      visibility:
        sql:
          driver: "postgres"
          host: postgres.temporal.svc.cluster.local
          port: 5432
          database: temporal_visibility
          user: temporal
          password: temporal
          maxConns: 10
          maxIdleConns: 10
    numHistoryShards: 512
  frontend:
    replicaCount: 3
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: 2000m
        memory: 2Gi
    service:
      type: LoadBalancer
      port: 7233
  history:
    replicaCount: 5
    resources:
      requests:
        cpu: 1000m
        memory: 2Gi
      limits:
        cpu: 4000m
        memory: 4Gi
  matching:
    replicaCount: 3
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: 2000m
        memory: 2Gi
  worker:
    replicaCount: 1
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: 2000m
        memory: 2Gi

prometheus:
  enabled: true

grafana:
  enabled: true
```
Multi-Cluster Replication
For geo-distributed deployments, Temporal supports multi-cluster replication.
Cluster Configuration
Cluster A configuration:

```yaml
persistence:
  numHistoryShards: 512

clusterMetadata:
  enableGlobalNamespace: true
  failoverVersionIncrement: 100
  masterClusterName: "cluster-a"
  currentClusterName: "cluster-a"
  clusterInformation:
    cluster-a:
      enabled: true
      initialFailoverVersion: 1
      rpcName: "frontend"
      rpcAddress: "cluster-a.example.com:7233"
      httpAddress: "cluster-a.example.com:7243"
    cluster-b:
      enabled: true
      initialFailoverVersion: 2
      rpcName: "frontend"
      rpcAddress: "cluster-b.example.com:7233"
      httpAddress: "cluster-b.example.com:7243"

dcRedirectionPolicy:
  policy: "all-apis-forwarding"
```

Cluster B configuration (identical except for `currentClusterName`):

```yaml
persistence:
  numHistoryShards: 512

clusterMetadata:
  enableGlobalNamespace: true
  failoverVersionIncrement: 100
  masterClusterName: "cluster-a"
  currentClusterName: "cluster-b"
  clusterInformation:
    cluster-a:
      enabled: true
      initialFailoverVersion: 1
      rpcName: "frontend"
      rpcAddress: "cluster-a.example.com:7233"
      httpAddress: "cluster-a.example.com:7243"
    cluster-b:
      enabled: true
      initialFailoverVersion: 2
      rpcName: "frontend"
      rpcAddress: "cluster-b.example.com:7233"
      httpAddress: "cluster-b.example.com:7243"

dcRedirectionPolicy:
  policy: "all-apis-forwarding"
```
Registering Remote Clusters
After both clusters are running:
```shell
# From Cluster A, register Cluster B
tctl --address cluster-a.example.com:7233 \
  admin cluster upsert-remote-cluster \
  --frontend_address "cluster-b.example.com:7233"

# From Cluster B, register Cluster A
tctl --address cluster-b.example.com:7233 \
  admin cluster upsert-remote-cluster \
  --frontend_address "cluster-a.example.com:7233"
```
Global Namespaces
Create a namespace that replicates across clusters:
```shell
tctl --address cluster-a.example.com:7233 \
  namespace register \
  --global \
  --namespace my-global-namespace \
  --clusters cluster-a cluster-b \
  --active_cluster cluster-a
```
Failover
Manual failover to switch active cluster:
```shell
tctl --address cluster-a.example.com:7233 \
  namespace update \
  --namespace my-global-namespace \
  --active_cluster cluster-b
```
Load Balancing
Frontend Load Balancing
Nginx

```nginx
stream {
    upstream temporal_grpc {
        least_conn;
        server frontend-1.temporal.svc.cluster.local:7233 max_fails=3 fail_timeout=30s;
        server frontend-2.temporal.svc.cluster.local:7233 max_fails=3 fail_timeout=30s;
        server frontend-3.temporal.svc.cluster.local:7233 max_fails=3 fail_timeout=30s;
    }

    server {
        listen 7233;
        proxy_pass temporal_grpc;
        proxy_connect_timeout 5s;
        proxy_timeout 300s;
    }
}
```

HAProxy

```
defaults
    mode tcp
    timeout connect 5s
    timeout client 300s
    timeout server 300s

frontend temporal_grpc
    bind *:7233
    default_backend temporal_backend

backend temporal_backend
    balance leastconn
    option tcp-check
    server frontend-1 frontend-1:7233 check
    server frontend-2 frontend-2:7233 check
    server frontend-3 frontend-3:7233 check
```

Kubernetes Service

```yaml
apiVersion: v1
kind: Service
metadata:
  name: temporal-frontend
  namespace: temporal
spec:
  type: LoadBalancer
  sessionAffinity: None
  selector:
    app: temporal-frontend
  ports:
    - name: grpc
      protocol: TCP
      port: 7233
      targetPort: 7233
    - name: http
      protocol: TCP
      port: 7243
      targetPort: 7243
```
High Availability Checklist

- Run at least 2 Frontend instances behind a load balancer
- Run at least 3 History instances
- Run at least 2 Matching instances
- Use a highly available database for persistence and visibility
- Keep NTP synchronized across all nodes
- Verify broadcast addresses and membership ports are reachable between nodes
Troubleshooting Cluster Issues
Shard ownership conflicts
Symptoms: History service logs show shard ownership conflicts

Causes:
- Split-brain scenarios
- Network partitions
- Clock skew between nodes

Solutions:
- Ensure NTP is synchronized across all nodes
- Check network connectivity between services
- Verify database connectivity
- Review the `maxJoinDuration` setting
Uneven shard distribution
Symptoms: Some History instances have many more shards than others

Solutions:
- Restart History instances one at a time
- Wait for membership stabilization (30-60s)
- Check broadcast addresses are correct
- Verify all instances can reach the database
Service discovery failures
Symptoms: Services cannot find each other

Solutions:
- Check membership ports are accessible
- Verify broadcast addresses resolve correctly
- Check firewall rules
- Ensure database connectivity for membership table
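A quick way to verify that membership and gRPC ports are reachable from a given node is a plain TCP connect test. This is a generic diagnostic sketch; substitute your own hosts and ports:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False

# Example: probe the membership ports used in this guide.
for host, port in [("history-1", 6934), ("matching-1", 6935)]:
    status = "reachable" if port_open(host, port) else "UNREACHABLE"
    print(f"{host}:{port} {status}")
```

If a port shows as unreachable, check firewall rules and broadcast address resolution before digging into Temporal itself.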
Load balancer issues

Symptoms: Connection failures, timeouts, uneven load

Solutions:
- Configure health checks properly
- Use least-connections algorithm
- Set appropriate timeouts (5s connect, 300s idle)
- Monitor backend health status
Shard Count Guidelines
Small Scale

```yaml
# < 100 workflows/second
numHistoryShards: 128
history:
  replicaCount: 3
# Result: ~43 shards per instance
```

Medium Scale

```yaml
# 100-1000 workflows/second
numHistoryShards: 512
history:
  replicaCount: 5
# Result: ~102 shards per instance
```

Large Scale

```yaml
# > 1000 workflows/second
numHistoryShards: 4096
history:
  replicaCount: 10
# Result: ~410 shards per instance
```
Warning: The history shard count cannot be changed after the cluster is first deployed. Choose it carefully based on expected scale.
Resource Recommendations
| Component | Small | Medium | Large |
|---|---|---|---|
| Frontend | 2 CPU, 4 GB RAM | 4 CPU, 8 GB RAM | 8 CPU, 16 GB RAM |
| History | 4 CPU, 8 GB RAM | 8 CPU, 16 GB RAM | 16 CPU, 32 GB RAM |
| Matching | 2 CPU, 4 GB RAM | 4 CPU, 8 GB RAM | 8 CPU, 16 GB RAM |
| Worker | 2 CPU, 4 GB RAM | 2 CPU, 4 GB RAM | 4 CPU, 8 GB RAM |
These are starting points. Monitor actual usage and adjust based on workload patterns.