Multi-cluster allocation enables you to allocate GameServers across multiple Kubernetes clusters, providing:
  • Geographic distribution - Place players closer to game servers
  • Capacity expansion - Allocate from multiple clusters when one is at capacity
  • High availability - Failover to other clusters if one becomes unavailable
  • Cloud provider diversity - Spread workload across multiple providers

Architecture Overview

Multi-cluster allocation uses:
  • GameServerAllocationPolicy - Defines target clusters and priorities
  • Allocation endpoint - gRPC service for remote allocation
  • Client certificates - Mutual TLS authentication between clusters
Allocation always starts in the local cluster. If no GameServers are available, the allocator tries remote clusters based on configured policies.

Setup Prerequisites

1. Multiple Kubernetes clusters

At least 2 Kubernetes clusters with Agones installed:
# Cluster A (primary)
kubectl config use-context cluster-a
helm install agones agones/agones -n agones-system --create-namespace

# Cluster B (secondary)
kubectl config use-context cluster-b
helm install agones agones/agones -n agones-system --create-namespace

2. Network connectivity

Ensure the clusters can reach each other:
  • The allocator service must be externally accessible (LoadBalancer or Ingress)
  • Firewall rules must allow TCP port 443 between clusters
  • Each cluster must be able to resolve the other clusters' allocator endpoints via DNS

3. TLS certificates

Generate certificates for mutual TLS authentication (see below).

Certificate Setup

Multi-cluster allocation requires mutual TLS authentication.

Generate Certificates

#!/bin/bash
# Generate CA
openssl req -x509 -newkey rsa:4096 -nodes \
  -keyout ca.key -out ca.crt \
  -days 3650 -subj "/CN=agones-allocator-ca"

# Generate client certificate for cluster A
openssl req -newkey rsa:4096 -nodes \
  -keyout cluster-a-client.key -out cluster-a-client.csr \
  -subj "/CN=cluster-a-client"

openssl x509 -req -in cluster-a-client.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out cluster-a-client.crt -days 3650

# Generate client certificate for cluster B
openssl req -newkey rsa:4096 -nodes \
  -keyout cluster-b-client.key -out cluster-b-client.csr \
  -subj "/CN=cluster-b-client"

openssl x509 -req -in cluster-b-client.csr \
  -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out cluster-b-client.crt -days 3650
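
The two client-certificate steps above repeat per cluster, so they can be scripted when more clusters are added. The following is a minimal sketch, assuming the same file naming as the script above (`ca.crt`, `ca.key`, `<cluster>-client.*`); the helper name `client_cert_commands` is hypothetical. It only builds the `openssl` argument lists and does not run them.

```python
import shlex
from typing import List


def client_cert_commands(cluster: str, days: int = 3650) -> List[List[str]]:
    """Return the two openssl invocations that create a CA-signed client certificate."""
    prefix = f"{cluster}-client"
    # Step 1: new private key plus certificate signing request
    request = [
        "openssl", "req", "-newkey", "rsa:4096", "-nodes",
        "-keyout", f"{prefix}.key", "-out", f"{prefix}.csr",
        "-subj", f"/CN={prefix}",
    ]
    # Step 2: sign the request with the shared CA
    sign = [
        "openssl", "x509", "-req", "-in", f"{prefix}.csr",
        "-CA", "ca.crt", "-CAkey", "ca.key", "-CAcreateserial",
        "-out", f"{prefix}.crt", "-days", str(days),
    ]
    return [request, sign]


if __name__ == "__main__":
    # Print shell-ready commands for three example clusters
    for cluster in ("cluster-a", "cluster-b", "cluster-c"):
        for argv in client_cert_commands(cluster):
            print(shlex.join(argv))
```

Each argument list can be passed to `subprocess.run(argv, check=True)` if the commands should be executed rather than printed.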

Create Kubernetes Secrets

# In Cluster A - store Cluster A's client certificate, used to access Cluster B
kubectl config use-context cluster-a
kubectl create secret tls cluster-b-client-secret \
  --cert=cluster-a-client.crt \
  --key=cluster-a-client.key \
  --namespace=default \
  --dry-run=client -o yaml | kubectl apply -f -

# In Cluster B - store Cluster B's client certificate, used to access Cluster A
kubectl config use-context cluster-b
kubectl create secret tls cluster-a-client-secret \
  --cert=cluster-b-client.crt \
  --key=cluster-b-client.key \
  --namespace=default \
  --dry-run=client -o yaml | kubectl apply -f -

Expose Allocator Service

Make the allocator service accessible from other clusters:
# Get allocator service in each cluster
kubectl get service agones-allocator -n agones-system

# Patch to LoadBalancer type
kubectl patch service agones-allocator -n agones-system \
  -p '{"spec":{"type":"LoadBalancer"}}'

# Get external IP (some providers, such as AWS, report .hostname instead of .ip)
kubectl get service agones-allocator -n agones-system \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

Configure Allocation Policies

Create GameServerAllocationPolicy resources to define remote clusters.

Basic Policy Example

cluster-b-policy.yaml
apiVersion: multicluster.agones.dev/v1
kind: GameServerAllocationPolicy
metadata:
  name: cluster-b-policy
  namespace: default
spec:
  priority: 1
  weight: 100
  connectionInfo:
    clusterName: cluster-b
    allocationEndpoints:
      - allocator.cluster-b.example.com:443
    secretName: cluster-b-client-secret
    namespace: default
    serverCa: LS0tLS1CRUdJTi...  # Base64 encoded CA certificate
  • priority - Lower values tried first (0 = highest priority). Policies with same priority use weighted random selection
  • weight - Relative weight for random selection within same priority
  • clusterName - Logical name for the cluster
  • allocationEndpoints - List of allocator endpoints (tries each until success)
  • secretName - Kubernetes secret containing client certificate and key
  • namespace - Namespace to allocate GameServers from in remote cluster
  • serverCa - Base64-encoded CA certificate to verify remote server
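
To make the field list concrete, here is a minimal sketch that assembles the same policy as a Python dictionary and Base64-encodes the CA for `serverCa`. The function name `build_policy` and its parameters are assumptions for illustration; the field names are the ones shown in the example policy. The result can be dumped as JSON, which `kubectl apply -f` also accepts.

```python
import base64
import json
from typing import Iterable


def build_policy(name: str, cluster_name: str, endpoints: Iterable[str],
                 secret_name: str, ca_pem: bytes, priority: int = 1,
                 weight: int = 100, policy_namespace: str = "default",
                 remote_namespace: str = "default") -> dict:
    """Build a GameServerAllocationPolicy manifest as a plain dictionary."""
    return {
        "apiVersion": "multicluster.agones.dev/v1",
        "kind": "GameServerAllocationPolicy",
        "metadata": {"name": name, "namespace": policy_namespace},
        "spec": {
            "priority": priority,
            "weight": weight,
            "connectionInfo": {
                "clusterName": cluster_name,
                "allocationEndpoints": list(endpoints),
                "secretName": secret_name,
                # Namespace to allocate from in the remote cluster
                "namespace": remote_namespace,
                # PEM bytes of the remote CA, Base64-encoded as one line
                "serverCa": base64.b64encode(ca_pem).decode("ascii"),
            },
        },
    }


if __name__ == "__main__":
    placeholder_ca = b"-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n"
    policy = build_policy(
        "cluster-b-policy", "cluster-b",
        ["allocator.cluster-b.example.com:443"],
        "cluster-b-client-secret", placeholder_ca,
    )
    print(json.dumps(policy, indent=2))
```

The placeholder CA in the usage block is not a real certificate; substitute the PEM file contents of the remote cluster's CA.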

Get Server CA Certificate

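# From Cluster B, extract allocator TLS certificate
kubectl get secret allocator-tls -n agones-system \
  -o jsonpath='{.data.tls\.crt}' | base64 -d > allocator-tls.crt

# Inspect subject and issuer; in a certificate chain, the CA is usually the last certificate
openssl x509 -in allocator-tls.crt -noout -subject -issuer

# Save the CA certificate as server-ca.crt, then Base64 encode it for the policy.
# If the allocator certificate is self-signed (subject equals issuer), server-ca.crt
# is a copy of allocator-tls.crt. Do not reuse ca.crt from the client certificate step.
base64 -w 0 < server-ca.crt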
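
Because the CA is usually the last certificate in a chain, it helps to split a PEM bundle into its individual certificates before encoding one of them. This is a minimal sketch using only the standard library; the function names are hypothetical, and it does no cryptographic validation, so confirm with `openssl x509 -noout -subject -issuer` that the selected block really is the CA.

```python
import re
from typing import List

# Matches one PEM certificate block, including its BEGIN/END markers
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem_chain(bundle: str) -> List[str]:
    """Return every certificate in the bundle, in file order."""
    return [match.group(0) + "\n" for match in PEM_BLOCK.finditer(bundle)]


def last_certificate(bundle: str) -> str:
    """Return the final certificate in the bundle (typically the CA)."""
    certs = split_pem_chain(bundle)
    if not certs:
        raise ValueError("no PEM certificates found")
    return certs[-1]
```

For example, `last_certificate(open("allocator-tls.crt").read())` returns the text to write to a CA file before Base64 encoding it.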

Apply Policy

# Apply in Cluster A to enable allocation from Cluster B
kubectl apply -f cluster-b-policy.yaml

# Verify policy
kubectl get gameserverallocationpolicy
kubectl describe gameserverallocationpolicy cluster-b-policy

Allocation Priority and Fallback

Priority-Based Allocation

Clusters are tried in priority order:
# Cluster B - priority 1 (tried first after local)
apiVersion: multicluster.agones.dev/v1
kind: GameServerAllocationPolicy
metadata:
  name: cluster-b-primary
spec:
  priority: 1
  weight: 100
  connectionInfo:
    clusterName: cluster-b
    allocationEndpoints:
      - allocator.cluster-b.example.com:443
    secretName: cluster-b-client-secret
    namespace: default
---
# Cluster C - priority 2 (tried second)
apiVersion: multicluster.agones.dev/v1
kind: GameServerAllocationPolicy
metadata:
  name: cluster-c-backup
spec:
  priority: 2
  weight: 100
  connectionInfo:
    clusterName: cluster-c
    allocationEndpoints:
      - allocator.cluster-c.example.com:443
    secretName: cluster-c-secret
    namespace: default

Weighted Distribution

Use weights to distribute load across clusters with same priority:
# US-East cluster - 70% of traffic
apiVersion: multicluster.agones.dev/v1
kind: GameServerAllocationPolicy
metadata:
  name: us-east
spec:
  priority: 1
  weight: 70  # 70% of allocations
  connectionInfo:
    clusterName: us-east-cluster
    allocationEndpoints:
      - allocator.us-east.example.com:443
    secretName: us-east-secret
    namespace: default
---
# US-West cluster - 30% of traffic
apiVersion: multicluster.agones.dev/v1
kind: GameServerAllocationPolicy
metadata:
  name: us-west
spec:
  priority: 1
  weight: 30  # 30% of allocations
  connectionInfo:
    clusterName: us-west-cluster
    allocationEndpoints:
      - allocator.us-west.example.com:443
    secretName: us-west-secret
    namespace: default
Weights are relative. In this example, US-East gets 70/(70+30) = 70% of allocation attempts to priority 1 clusters.
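
The arithmetic generalizes to any number of clusters at the same priority. The sketch below computes the expected share per cluster and checks it with a seeded simulation; the function names are hypothetical, and `random.choices` stands in for whatever weighted selection the allocator performs.

```python
import random
from collections import Counter
from typing import Dict


def weight_shares(weights: Dict[str, int]) -> Dict[str, float]:
    """Expected fraction of selections per cluster: weight / sum of weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {name: weight / total for name, weight in weights.items()}


def simulate(weights: Dict[str, int], attempts: int, seed: int = 0) -> Counter:
    """Draw `attempts` weighted random selections and count them per cluster."""
    rng = random.Random(seed)
    names = list(weights)
    picks = rng.choices(names, weights=[weights[n] for n in names], k=attempts)
    return Counter(picks)


if __name__ == "__main__":
    weights = {"us-east": 70, "us-west": 30}
    print(weight_shares(weights))
    print(simulate(weights, attempts=10_000))
```

Weights of 7 and 3 would give the same shares as 70 and 30, since only the ratio matters.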

Making Allocation Requests

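Allocation requests use the standard GameServerAllocation resource. Set multiClusterSetting.enabled so that the allocator applies the configured policies to the request:
apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
metadata:
  generateName: multi-cluster-allocation-
spec:
  # Apply GameServerAllocationPolicy resources to this request
  multiClusterSetting:
    enabled: true
  # Local cluster tried first, then policies by priority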
  required:
    matchLabels:
      game: my-game
      mode: battle-royale
  # Preferred criteria (optional)
  preferred:
    - matchLabels:
        agones.dev/fleet: preferred-fleet
  # Metadata to set on allocated GameServer
  metadata:
    labels:
      allocated-by: matchmaker
    annotations:
      player-id: "12345"
The allocation process:
  1. Try local cluster first
  2. If no GameServers available locally, try remote clusters by priority
  3. Within same priority, select cluster using weighted random
  4. Return first successfully allocated GameServer
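
The four steps above can be modeled in a few lines. This is a simplified sketch of the process as described here, not the Agones implementation: policies are plain dictionaries, and `try_allocate` is a hypothetical callback that returns a GameServer name or `None` for a given cluster. Weights are assumed to be positive.

```python
import random
from typing import Callable, Dict, List, Optional


def attempt_order(policies: List[dict], rng: random.Random) -> List[str]:
    """Order remote clusters: ascending priority, weighted random within a priority."""
    by_priority: Dict[int, List[dict]] = {}
    for policy in policies:
        by_priority.setdefault(policy["priority"], []).append(policy)

    order: List[str] = []
    for priority in sorted(by_priority):
        group = list(by_priority[priority])
        # Repeatedly draw one cluster by weight, without replacement
        while group:
            pick = rng.choices(group, weights=[p["weight"] for p in group], k=1)[0]
            group.remove(pick)
            order.append(pick["cluster"])
    return order


def allocate(policies: List[dict],
             try_allocate: Callable[[str], Optional[str]],
             rng: Optional[random.Random] = None) -> Optional[str]:
    """Try the local cluster, then remote clusters in order; return the first success."""
    rng = rng or random.Random()
    for cluster in ["local"] + attempt_order(policies, rng):
        game_server = try_allocate(cluster)
        if game_server is not None:
            return game_server
    return None
```

Passing a seeded `random.Random` makes the order reproducible, which is useful when testing fallback behavior.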

Geographic-Based Allocation

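Implement location-aware allocation by sending each request to the allocator of the cluster nearest to the player:
// Pseudo-code for geographic allocation.
// Location, GetNearestClusters, and client.Allocate are hypothetical helpers,
// not part of the Agones API.
import (
    "errors"

    allocationv1 "agones.dev/agones/pkg/apis/allocation/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func AllocateNearestCluster(playerLocation Location) (*allocationv1.GameServerAllocation, error) {
    // Determine nearest clusters, closest first
    nearestClusters := GetNearestClusters(playerLocation)
    if len(nearestClusters) == 0 {
        return nil, errors.New("no cluster found for player location")
    }
    nearest := nearestClusters[0]

    // Create allocation request for GameServers labeled with the nearest region
    allocation := &allocationv1.GameServerAllocation{
        Spec: allocationv1.GameServerAllocationSpec{
            Required: allocationv1.GameServerSelector{
                LabelSelector: metav1.LabelSelector{
                    MatchLabels: map[string]string{
                        "game":   "my-game",
                        "region": nearest.Region,
                    },
                },
            },
        },
    }

    // Submit to nearest cluster (client.Allocate is assumed to return the result and an error)
    return client.Allocate(nearest.Name, allocation)
}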
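
One way to determine the nearest cluster is great-circle distance between the player and each cluster's region. The sketch below uses the haversine formula; the cluster coordinates in the usage example are illustrative assumptions, and geographic distance is only a rough proxy for network latency.

```python
import math
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: Coordinate, b: Coordinate) -> float:
    """Great-circle distance between two coordinates in kilometers."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def nearest_clusters(player: Coordinate,
                     clusters: Dict[str, Coordinate]) -> List[str]:
    """Return cluster names sorted from nearest to farthest."""
    return sorted(clusters, key=lambda name: haversine_km(player, clusters[name]))


if __name__ == "__main__":
    # Approximate, illustrative region coordinates
    clusters = {
        "us-east": (39.0, -77.5),
        "us-west": (45.6, -121.2),
        "eu": (53.3, -6.3),
    }
    print(nearest_clusters((40.7, -74.0), clusters))
```

The first entry of the returned list is the cluster to try first; the rest provide a natural fallback order.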

Monitoring Multi-Cluster Allocation

Allocation Metrics by Cluster

# Allocation attempts per second, grouped by the fleet_name label
sum(rate(agones_gameserver_allocations_duration_seconds_count[5m])) by (fleet_name)

# Success rate by cluster
sum(rate(agones_gameserver_allocations_duration_seconds_count{status="Allocated"}[5m])) by (fleet_name) /
sum(rate(agones_gameserver_allocations_duration_seconds_count[5m])) by (fleet_name)

# Remote allocation latency
histogram_quantile(0.99,
  sum(rate(agones_gameserver_allocations_duration_seconds_bucket[5m])) by (le, fleet_name)
)

Health Checks

# Test local allocator health (port-forward blocks, so run curl in a second terminal)
kubectl port-forward -n agones-system svc/agones-allocator 8443:443
curl -k https://localhost:8443/healthz

# Test remote connectivity from Cluster A to Cluster B
kubectl run -it --rm test-client --image=curlimages/curl --restart=Never \
  -- curl -k https://allocator.cluster-b.example.com/healthz

Troubleshooting

Certificate Issues

# Check certificate validity
openssl x509 -in cluster-a-client.crt -text -noout | grep -A 2 Validity

# Verify certificate matches key
openssl x509 -noout -modulus -in cluster-a-client.crt | openssl md5
openssl rsa -noout -modulus -in cluster-a-client.key | openssl md5

# Test TLS connection
openssl s_client -connect allocator.cluster-b.example.com:443 \
  -cert cluster-a-client.crt -key cluster-a-client.key \
  -CAfile ca.crt
Verify serverCa in policy matches actual server certificate:
# Get server CA from endpoint
echo | openssl s_client -connect allocator.cluster-b.example.com:443 2>/dev/null \
  | openssl x509 -outform PEM > server-ca.pem

# Base64 encode
cat server-ca.pem | base64 -w 0

# Update policy with correct serverCa
kubectl edit gameserverallocationpolicy cluster-b-policy

Allocation Failures

# Check policy status
kubectl get gameserverallocationpolicy -o yaml

# View allocator logs in remote cluster
kubectl logs -n agones-system -l app=agones,component=allocator --tail=100

# Check for allocation errors
kubectl get events --all-namespaces | grep -i allocation

# Test allocation (kubectl create is required; generateName cannot be used with kubectl apply)
kubectl create -f - <<EOF
apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
metadata:
  generateName: test-allocation-
spec:
  required:
    matchLabels:
      agones.dev/fleet: test-fleet
EOF

Network Connectivity

# Test DNS resolution
nslookup allocator.cluster-b.example.com

# Test network connectivity
telnet allocator.cluster-b.example.com 443

# Check firewall rules (GKE example)
gcloud compute firewall-rules list | grep 443

# Verify LoadBalancer is provisioned
kubectl get service agones-allocator -n agones-system
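
Where `telnet` is not installed, the same TCP reachability check can be done with a few lines of standard-library Python. This is a minimal sketch; `can_connect` is a hypothetical helper, and it only confirms that the port accepts TCP connections, not that TLS or the allocator itself is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts
        return False


if __name__ == "__main__":
    endpoint = "allocator.cluster-b.example.com"
    print(endpoint, "reachable:", can_connect(endpoint, 443))
```

A `False` result for an endpoint that resolves in DNS usually points to a firewall rule or a LoadBalancer that is not yet provisioned.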

Best Practices

Use Priority for Fallback

Set the primary cluster to priority 0 and give fallback clusters larger priority values, since lower values are tried first. Local allocation is still attempted before any remote cluster.

Secure Certificates

  • Use short-lived certificates (90 days)
  • Rotate certificates before expiration
  • Store private keys in secret management system
  • Use cert-manager for automated rotation

Monitor Remote Allocation

Track allocation latency and success rate per cluster to identify issues early.

Test Failover

Regularly test cluster failover by:
  • Scaling down Fleets in primary cluster
  • Simulating network partitions
  • Verifying allocation falls back correctly

# Automate certificate rotation with cert-manager
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cluster-a-client
spec:
  secretName: cluster-a-client-secret
  commonName: cluster-a-client
  duration: 2160h    # 90-day certificate lifetime
  renewBefore: 720h  # Renew 30 days before expiration
  usages:
    - client auth
  issuerRef:
    name: agones-ca-issuer
    kind: ClusterIssuer

For geographic distribution:
  • Deploy one cluster per major region (US-East, US-West, EU, Asia)
  • Use equal weights for clusters in same region
  • Set lower priority for cross-region fallback
  • Monitor cross-region allocation latency

Next Steps

  • Monitoring - Set up monitoring for multi-cluster metrics
  • Best Practices - Production deployment recommendations
