Multi-cluster allocation enables you to allocate GameServers across multiple Kubernetes clusters, providing:
Geographic distribution - Place players closer to game servers
Capacity expansion - Allocate from multiple clusters when one is at capacity
High availability - Failover to other clusters if one becomes unavailable
Cloud provider diversity - Spread workload across multiple providers
Architecture Overview
Multi-cluster allocation uses:
GameServerAllocationPolicy - Defines target clusters and priorities
Allocation endpoint - gRPC service for remote allocation
Client certificates - Mutual TLS authentication between clusters
Allocation always starts in the local cluster. If no GameServers are available, the allocator tries remote clusters based on configured policies.
Setup Prerequisites
Multiple Kubernetes clusters
At least 2 Kubernetes clusters with Agones installed:
# Cluster A (primary)
kubectl config use-context cluster-a
helm install agones agones/agones -n agones-system --create-namespace
# Cluster B (secondary)
kubectl config use-context cluster-b
helm install agones agones/agones -n agones-system --create-namespace
Network connectivity
Ensure clusters can communicate:
Allocator service must be externally accessible (LoadBalancer or Ingress)
Firewall rules allow port 443 between clusters
DNS resolution between cluster endpoints
TLS certificates
Generate certificates for mutual TLS authentication (see below).
Certificate Setup
Multi-cluster allocation requires mutual TLS authentication.
Generate Certificates
Using OpenSSL
#!/bin/bash
# Generate CA
openssl req -x509 -newkey rsa:4096 -nodes \
  -keyout ca.key -out ca.crt \
  -days 3650 -subj "/CN=agones-allocator-ca"
# Generate client certificate for cluster A
openssl req -newkey rsa:4096 -nodes \
  -keyout cluster-a-client.key -out cluster-a-client.csr \
  -subj "/CN=cluster-a-client"
openssl x509 -req -in cluster-a-client.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out cluster-a-client.crt -days 3650
# Generate client certificate for cluster B
openssl req -newkey rsa:4096 -nodes \
  -keyout cluster-b-client.key -out cluster-b-client.csr \
  -subj "/CN=cluster-b-client"
openssl x509 -req -in cluster-b-client.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out cluster-b-client.crt -days 3650
Using cert-manager
# Install cert-manager first
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
# Create CA issuer
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: agones-ca-issuer
spec:
  ca:
    secretName: agones-ca-secret
---
# Create client certificate
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cluster-a-client
  namespace: default
spec:
  secretName: cluster-a-client-secret
  commonName: cluster-a-client
  issuerRef:
    name: agones-ca-issuer
    kind: ClusterIssuer
  usages:
  - client auth
Create Kubernetes Secrets
# In Cluster A - create secret for accessing Cluster B
kubectl create secret tls cluster-b-client-secret \
--cert=cluster-a-client.crt \
--key=cluster-a-client.key \
--dry-run=client -o yaml | kubectl apply -f -
# In Cluster B - create secret for accessing Cluster A
kubectl create secret tls cluster-a-client-secret \
--cert=cluster-b-client.crt \
--key=cluster-b-client.key \
--dry-run=client -o yaml | kubectl apply -f -
Expose Allocator Service
Make the allocator service accessible from other clusters:
# Get allocator service in each cluster
kubectl get service agones-allocator -n agones-system
# Patch to LoadBalancer type
kubectl patch service agones-allocator -n agones-system \
-p '{"spec":{"type":"LoadBalancer"}}'
# Get external IP
kubectl get service agones-allocator -n agones-system \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}'
Alternatively, expose the allocator through an Ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: agones-allocator-ingress
  namespace: agones-system
  annotations:
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "GRPCS"
spec:
  ingressClassName: nginx
  rules:
  - host: allocator.cluster-b.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: agones-allocator
            port:
              number: 443
  tls:
  - hosts:
    - allocator.cluster-b.example.com
Create GameServerAllocationPolicy resources to define remote clusters.
Basic Policy Example
apiVersion: multicluster.agones.dev/v1
kind: GameServerAllocationPolicy
metadata:
  name: cluster-b-policy
  namespace: default
spec:
  priority: 1
  weight: 100
  connectionInfo:
    clusterName: cluster-b
    allocationEndpoints:
    - allocator.cluster-b.example.com:443
    secretName: cluster-b-client-secret
    namespace: default
    serverCa: LS0tLS1CRUdJTi... # Base64 encoded CA certificate
Understanding policy fields
priority - Lower values tried first (0 = highest priority). Policies with same priority use weighted random selection
weight - Relative weight for random selection within same priority
clusterName - Logical name for the cluster
allocationEndpoints - List of allocator endpoints (tries each until success)
secretName - Kubernetes secret containing client certificate and key
namespace - Namespace to allocate GameServers from in remote cluster
serverCa - Base64-encoded CA certificate to verify remote server
Get Server CA Certificate
# From Cluster B, extract allocator TLS certificate
kubectl get secret allocator-tls -n agones-system \
-o jsonpath='{.data.tls\.crt}' | base64 -d > allocator-tls.crt
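# Inspect the certificate to identify its issuer (openssl x509 reads only the first certificate in the file)
openssl x509 -in allocator-tls.crt -noout -subject -issuer
# Base64 encode the issuing CA certificate for the policy (here ca.crt must be the CA that signed allocator-tls.crt)
cat ca.crt | base64 -w 0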
Apply Policy
# Apply in Cluster A to enable allocation from Cluster B
kubectl apply -f cluster-b-policy.yaml
# Verify policy
kubectl get gameserverallocationpolicy
kubectl describe gameserverallocationpolicy cluster-b-policy
Allocation Priority and Fallback
Priority-Based Allocation
Clusters are tried in priority order:
# Cluster B - priority 1 (tried first after local)
apiVersion: multicluster.agones.dev/v1
kind: GameServerAllocationPolicy
metadata:
  name: cluster-b-primary
spec:
  priority: 1
  weight: 100
  connectionInfo:
    clusterName: cluster-b
    allocationEndpoints:
    - allocator.cluster-b.example.com:443
    secretName: cluster-b-secret
    namespace: default
---
# Cluster C - priority 2 (tried second)
apiVersion: multicluster.agones.dev/v1
kind: GameServerAllocationPolicy
metadata:
  name: cluster-c-backup
spec:
  priority: 2
  weight: 100
  connectionInfo:
    clusterName: cluster-c
    allocationEndpoints:
    - allocator.cluster-c.example.com:443
    secretName: cluster-c-secret
    namespace: default
Weighted Distribution
Use weights to distribute load across clusters with same priority:
# US-East cluster - 70% of traffic
apiVersion: multicluster.agones.dev/v1
kind: GameServerAllocationPolicy
metadata:
  name: us-east
spec:
  priority: 1
  weight: 70 # 70% of allocations
  connectionInfo:
    clusterName: us-east-cluster
    allocationEndpoints:
    - allocator.us-east.example.com:443
    secretName: us-east-secret
    namespace: default
---
# US-West cluster - 30% of traffic
apiVersion: multicluster.agones.dev/v1
kind: GameServerAllocationPolicy
metadata:
  name: us-west
spec:
  priority: 1
  weight: 30 # 30% of allocations
  connectionInfo:
    clusterName: us-west-cluster
    allocationEndpoints:
    - allocator.us-west.example.com:443
    secretName: us-west-secret
    namespace: default
Weights are relative. In this example, US-East gets 70/(70+30) = 70% of allocation attempts to priority 1 clusters.
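The weighted selection described above can be sketched as follows. This is an illustration of weighted random choice in general, not the Agones allocator's actual code. The `clusterWeight` type and `pickWeighted` function are hypothetical names, and the random source is passed in so the behavior can be checked deterministically.

```go
package main

import (
	"fmt"
	"math/rand"
)

// clusterWeight pairs a cluster name with its policy weight (hypothetical type).
type clusterWeight struct {
	Name   string
	Weight int
}

// pickWeighted returns a cluster name with probability Weight/total.
// r must return an integer in [0, n); rand.Intn satisfies this contract.
func pickWeighted(clusters []clusterWeight, r func(n int) int) (string, bool) {
	total := 0
	for _, c := range clusters {
		if c.Weight > 0 {
			total += c.Weight
		}
	}
	if total == 0 {
		return "", false // nothing selectable
	}
	n := r(total)
	for _, c := range clusters {
		if c.Weight <= 0 {
			continue
		}
		if n < c.Weight {
			return c.Name, true
		}
		n -= c.Weight
	}
	return "", false // not reached when r honors its contract
}

func main() {
	clusters := []clusterWeight{{"us-east-cluster", 70}, {"us-west-cluster", 30}}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		name, _ := pickWeighted(clusters, rand.Intn)
		counts[name]++
	}
	// Counts should land near a 70/30 split; exact values vary per run.
	fmt.Println(counts)
}
```

Because only the ratio matters, weights of 7 and 3 would produce the same distribution as 70 and 30.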
Making Allocation Requests
Allocation requests use the same GameServerAllocation resource as single-cluster allocation. In Agones, multi-cluster behavior is opted into per request through spec.multiClusterSetting:
apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
metadata:
  generateName: multi-cluster-allocation-
spec:
  # Opt in to multi-cluster allocation for this request
  multiClusterSetting:
    enabled: true
  # Local cluster tried first, then policies by priority
  required:
    matchLabels:
      game: my-game
      mode: battle-royale
  # Preferred criteria (optional)
  preferred:
  - matchLabels:
      agones.dev/fleet: preferred-fleet
  # Metadata to set on allocated GameServer
  metadata:
    labels:
      allocated-by: matchmaker
    annotations:
      player-id: "12345"
The allocation process:
Try local cluster first
If no GameServers available locally, try remote clusters by priority
Within same priority, select cluster using weighted random
Return first successfully allocated GameServer
Geographic-Based Allocation
Implement latency-based allocation by managing policies dynamically:
// Pseudo-code for geographic allocation. Location, GetNearestClusters, and
// client.Allocate are assumed to be application-specific helpers.
func AllocateNearestCluster(playerLocation Location) (*allocationv1.GameServerAllocation, error) {
	// Determine nearest clusters, closest first
	nearestClusters := GetNearestClusters(playerLocation)
	if len(nearestClusters) == 0 {
		return nil, errors.New("no clusters available")
	}
	// Create allocation request
	allocation := &allocationv1.GameServerAllocation{
		Spec: allocationv1.GameServerAllocationSpec{
			Required: metav1.LabelSelector{
				MatchLabels: map[string]string{
					"game":   "my-game",
					"region": nearestClusters[0].Region,
				},
			},
		},
	}
	// Submit to nearest cluster
	return client.Allocate(nearestClusters[0].Name, allocation)
}
Monitoring Multi-Cluster Allocation
Allocation Metrics by Cluster
# Allocation attempts by cluster (via fleet_name label)
sum(rate(agones_gameserver_allocations_duration_seconds_count[5m])) by (fleet_name)
# Success rate by cluster
sum(rate(agones_gameserver_allocations_duration_seconds_count{status="Allocated"}[5m])) by (fleet_name) /
sum(rate(agones_gameserver_allocations_duration_seconds_count[5m])) by (fleet_name)
# Remote allocation latency
histogram_quantile(0.99,
sum(rate(agones_gameserver_allocations_duration_seconds_bucket[5m])) by (le, fleet_name)
)
Health Checks
# Test local allocator health (port-forward blocks, so run it in the background or in a second terminal)
kubectl port-forward -n agones-system svc/agones-allocator 8443:443 &
curl -k https://localhost:8443/healthz
# Test remote connectivity from Cluster A to Cluster B
kubectl run -it --rm test-client --image=curlimages/curl --restart=Never \
-- curl -k https://allocator.cluster-b.example.com/healthz
Troubleshooting
Certificate Issues
# Check certificate validity
openssl x509 -in cluster-a-client.crt -text -noout | grep -A 2 Validity
# Verify certificate matches key
openssl x509 -noout -modulus -in cluster-a-client.crt | openssl md5
openssl rsa -noout -modulus -in cluster-a-client.key | openssl md5
# Test TLS connection
openssl s_client -connect allocator.cluster-b.example.com:443 \
-cert cluster-a-client.crt -key cluster-a-client.key \
-CAfile ca.crt
Verify serverCa in policy matches actual server certificate:
# Get the certificate presented by the endpoint (the server's leaf certificate, which may differ from its CA)
echo | openssl s_client -connect allocator.cluster-b.example.com:443 2> /dev/null \
| openssl x509 -outform PEM > server-ca.pem
# Base64 encode
cat server-ca.pem | base64 -w 0
# Update policy with correct serverCa
kubectl edit gameserverallocationpolicy cluster-b-policy
Allocation Failures
# Check policy status
kubectl get gameserverallocationpolicy -o yaml
# View allocator logs in remote cluster
kubectl logs -n agones-system -l app=agones,component=allocator --tail=100
# Check for allocation errors
kubectl get events --all-namespaces | grep -i allocation
# Test allocation to specific cluster
# generateName requires kubectl create; kubectl apply rejects it
kubectl create -f - << EOF
apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
metadata:
  generateName: test-allocation-
spec:
  required:
    matchLabels:
      agones.dev/fleet: test-fleet
EOF
Network Connectivity
# Test DNS resolution
nslookup allocator.cluster-b.example.com
# Test network connectivity
telnet allocator.cluster-b.example.com 443
# Check firewall rules (GKE example)
gcloud compute firewall-rules list | grep 443
# Verify LoadBalancer is provisioned
kubectl get service agones-allocator -n agones-system
Best Practices
Use Priority for Fallback
Set the primary cluster to priority 0 and fallback clusters to higher priority values. This ensures local allocation is always tried first.
Secure Certificates
Use short-lived certificates (90 days)
Rotate certificates before expiration
Store private keys in secret management system
Use cert-manager for automated rotation
Monitor Remote Allocation
Track allocation latency and success rate per cluster to identify issues early.
Test Failover
Regularly test cluster failover by:
Scaling down Fleets in primary cluster
Simulating network partitions
Verifying allocation falls back correctly
Certificate rotation strategy
# Automate certificate rotation with cert-manager
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cluster-a-client
spec:
  secretName: cluster-a-client-secret
  renewBefore: 720h # Renew 30 days before expiration
  issuerRef:
    name: agones-ca-issuer
    kind: ClusterIssuer
Multi-region deployment pattern
Deploy one cluster per major region (US-East, US-West, EU, Asia)
Use equal weights for clusters in same region
Give cross-region fallback clusters a higher priority value so they are tried after in-region clusters
Monitor cross-region allocation latency
Next Steps
Monitoring - Set up monitoring for multi-cluster metrics
Best Practices - Production deployment recommendations