Overview
This guide covers performance optimization techniques for running Agones at scale, including controller tuning, resource management, and cluster optimization.
API Server QPS Tuning
The Agones controller can be configured to adjust its rate of requests to the Kubernetes API server:
helm install agones agones/agones \
--set agones.controller.apiServerQPS=400 \
--set agones.controller.apiServerQPSBurst=500
Default values are QPS=400 and Burst=500. Increase these for larger clusters with thousands of game servers.
From the allocator source code (cmd/allocator/main.go:99-100):
viper.SetDefault(apiServerSustainedQPSFlag, 400)
viper.SetDefault(apiServerBurstQPSFlag, 500)
Worker Queue Configuration
Agones uses multiple specialized worker queues for different operations:
# Controller deployment configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agones-controller
spec:
  template:
    spec:
      containers:
        - name: agones-controller
          env:
            # Number of workers for general operations
            - name: NUM_WORKERS
              value: "100"
            # Separate workers for creation operations
            - name: CREATION_WORKERS
              value: "50"
            # Separate workers for deletion operations
            - name: DELETION_WORKERS
              value: "50"
Increasing workers improves parallelism but also increases API server load. Balance based on your cluster capacity.
Allocation Batch Processing
The allocator batches allocation requests to improve throughput:
helm install agones agones/agones \
--set agones.allocator.allocationBatchWaitTime=500ms
From cmd/allocator/main.go:110:
viper.SetDefault(allocationBatchWaitTime, 500*time.Millisecond)
Lower values decrease latency but reduce batching efficiency. Higher values increase throughput but add latency.
Resource Optimization
Controller Resources
Optimize controller resource allocation based on cluster size:
# For small clusters (< 100 game servers)
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
# For medium clusters (100-1000 game servers)
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 1000m
    memory: 1Gi
# For large clusters (1000+ game servers)
resources:
  requests:
    cpu: 1000m
    memory: 1Gi
  limits:
    cpu: 2000m
    memory: 2Gi
Helm configuration:
helm install agones agones/agones \
--set agones.controller.resources.requests.cpu=1000m \
--set agones.controller.resources.requests.memory=1Gi \
--set agones.controller.resources.limits.cpu=2000m \
--set agones.controller.resources.limits.memory=2Gi
Sidecar Resource Tuning
The SDK sidecar runs alongside every game server. Optimize its resources:
helm install agones agones/agones \
--set agones.sdkServer.sidecar.resources.requests.cpu=50m \
--set agones.sdkServer.sidecar.resources.requests.memory=64Mi \
--set agones.sdkServer.sidecar.resources.limits.cpu=100m \
--set agones.sdkServer.sidecar.resources.limits.memory=128Mi
For minimal overhead:
sidecar:
  resources:
    requests:
      cpu: 30m     # Minimum viable
      memory: 32Mi # Minimum viable
    limits:
      cpu: 50m
      memory: 64Mi
SDK Rate Limiting
Limit SDK request rate to prevent sidecar overload:
helm install agones agones/agones \
--set agones.sdkServer.sidecar.requestsRateLimit=100
This sets a limit of 100 requests per second per sidecar.
Port Range Configuration
From pkg/portallocator/portallocator.go:64-84, the port allocator manages dynamic port assignment:
func New(portRanges map[string]PortRange,
	kubeInformerFactory informers.SharedInformerFactory,
	agonesInformerFactory externalversions.SharedInformerFactory) Interface {
	return newAllocator(portRanges, kubeInformerFactory, agonesInformerFactory)
}

type PortRange struct {
	MinPort int32
	MaxPort int32
}
Optimize port ranges for your workload:
# Default range (1001 ports)
helm install agones agones/agones \
--set agones.gameservers.minPort=7000 \
--set agones.gameservers.maxPort=8000
# Large deployment (10001 ports)
helm install agones agones/agones \
--set agones.gameservers.minPort=7000 \
--set agones.gameservers.maxPort=17000
Each node can support hundreds of game servers with the right port range. Since the range is inclusive, calculate: (MaxPort - MinPort + 1) / PortsPerGameServer = Max GameServers per Node.
Static Port Policy
Use Static port policy to skip dynamic allocation:
apiVersion: agones.dev/v1
kind: GameServer
spec:
  ports:
    - name: default
      portPolicy: Static # No dynamic allocation overhead
      hostPort: 7654
      containerPort: 7654
Benefits:
- No port allocator overhead
- Predictable port numbers
- Faster GameServer creation
Drawbacks:
- Manual port management
- Port conflicts possible
- Less flexible scaling
Pod Network Optimization
Use host networking for maximum performance:
apiVersion: agones.dev/v1
kind: GameServer
spec:
  template:
    spec:
      hostNetwork: true # Bypass pod network overlay
      dnsPolicy: ClusterFirstWithHostNet
Host networking limits one GameServer per port per node and has security implications. Use cautiously.
Bypass kube-proxy
For latency-sensitive workloads, use PortPolicy None to bypass kube-proxy:
apiVersion: agones.dev/v1
kind: GameServer
spec:
  ports:
    - name: game
      portPolicy: None # No hostPort; clients use the containerPort directly
      containerPort: 7654
      protocol: UDP
Clients connect directly to the pod IP and containerPort, with no host port mapping in between. This requires pod IPs that are routable from game clients.
Allocation Strategy
Choose the right scheduling strategy for your use case:
apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
spec:
  scheduling: Packed # Bin-packing for cloud (default)
From pkg/apis/scheduling.go:18-30:
const (
	// Packed scheduling strategy will prioritise allocating GameServers
	// on Nodes with the most Allocated, and then Ready GameServers
	// to bin pack as many Allocated GameServers on a single node.
	// This is most useful for dynamic Kubernetes clusters - such as on Cloud Providers.
	Packed SchedulingStrategy = "Packed"

	// Distributed scheduling strategy will prioritise allocating GameServers
	// on Nodes with the least Allocated, and then Ready GameServers
	// to distribute Allocated GameServers across many nodes.
	// This is most useful for statically sized Kubernetes clusters - such as on physical hardware.
	Distributed SchedulingStrategy = "Distributed"
)
Packed (Cloud environments):
- Maximizes node utilization
- Enables aggressive scale-down
- Reduces infrastructure costs
Distributed (On-premises/bare metal):
- Spreads load across all nodes
- Better fault tolerance
- More consistent performance
Allocation Caching
The allocator maintains a local cache of Ready game servers so allocations do not have to list game servers on every request. For multi-cluster allocation, also tune the remote allocation timeouts:
helm install agones agones/agones \
--set agones.allocator.remoteAllocationTimeout=10s \
--set agones.allocator.totalRemoteAllocationTimeout=30s
From cmd/allocator/main.go:107-108:
viper.SetDefault(remoteAllocationTimeoutFlag, 10*time.Second)
viper.SetDefault(totalRemoteAllocationTimeoutFlag, 30*time.Second)
Buffer Size Optimization
Maintain a buffer of Ready game servers:
apiVersion: agones.dev/v1
kind: Fleet
metadata:
  name: game-fleet
spec:
  replicas: 100
  # Aim for roughly 20% Ready headroom:
  # 80 Allocated + 20 Ready = 100 total
With autoscaling:
apiVersion: autoscaling.agones.dev/v1
kind: FleetAutoscaler
metadata:
  name: game-fleet-autoscaler
spec:
  fleetName: game-fleet
  policy:
    type: Buffer
    buffer:
      bufferSize: 20    # Keep 20 Ready servers
      minReplicas: 10   # Never scale below 10
      maxReplicas: 1000 # Never scale above 1000
Rolling Update Strategy
Optimize Fleet updates:
apiVersion: agones.dev/v1
kind: Fleet
spec:
  replicas: 100
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%       # Create 25 new before deleting old
      maxUnavailable: 25% # Allow 25 to be unavailable during update
For zero-downtime updates:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 100%     # Double capacity during rollout
    maxUnavailable: 0% # Never reduce capacity
Metrics and Monitoring
Enable Prometheus Metrics
helm install agones agones/agones \
--set agones.metrics.prometheusEnabled=true \
--set agones.metrics.prometheusServiceDiscovery=true
Key metrics to monitor:
# Controller queue depth
workqueue_depth{name="gameservers"}
# Allocation latency
agones_gameserver_allocations_duration_seconds
# GameServer state distribution
agones_gameservers_count{type="Ready"}
agones_gameservers_count{type="Allocated"}
# Fleet desired vs current replicas
agones_fleets_replicas_count
agones_gameservers_count{fleet_name="my-fleet"}
# Node utilization
agones_nodes_count
agones_gameservers_node_count
Enable pprof for the controller:
env:
  - name: ENABLE_PPROF
    value: "true"
Access profiling endpoints:
# CPU profile
kubectl port-forward -n agones-system deploy/agones-controller 6060:6060
curl http://localhost:6060/debug/pprof/profile > cpu.prof
go tool pprof cpu.prof
# Memory profile
curl http://localhost:6060/debug/pprof/heap > mem.prof
go tool pprof mem.prof
# Goroutine profile
curl http://localhost:6060/debug/pprof/goroutine > goroutine.prof
Cluster-Level Optimization
Node Configuration
Optimize nodes for game server workloads:
# Node labels for game server placement
kubectl label nodes <node-name> \
agones.dev/gameserver=true \
node.kubernetes.io/instance-type=c5.2xlarge
Use taints to dedicate nodes:
kubectl taint nodes <node-name> \
agones.dev/gameserver=true:NoSchedule
Then configure GameServers with tolerations:
spec:
  template:
    spec:
      tolerations:
        - key: agones.dev/gameserver
          operator: Equal
          value: "true"
          effect: NoSchedule
Cluster Autoscaling
Configure cluster autoscaler for game server nodes:
# GKE example
gcloud container node-pools create game-servers \
--cluster=my-cluster \
--enable-autoscaling \
--min-nodes=3 \
--max-nodes=100 \
--machine-type=c2-standard-4 \
--node-labels=agones.dev/gameserver=true
Set appropriate scale-down delay:
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler
data:
  scale-down-delay-after-add: "10m"
  scale-down-unneeded-time: "10m"
etcd Tuning
For large Agones deployments, tune etcd on the Kubernetes control plane:
# Increase etcd quota (Kubernetes control plane)
--quota-backend-bytes=8589934592 # 8GB (default is 2GB)
# Enable etcd metrics
--metrics=extensive
Monitor etcd health:
ETCDCTL_API=3 etcdctl endpoint status --cluster
ETCDCTL_API=3 etcdctl endpoint health --cluster
Load Testing Allocations
#!/bin/bash
# Stress test: fire 1000 allocation requests in parallel
for i in {1..1000}; do
  kubectl create -f - <<EOF &
apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
metadata:
  generateName: load-test-
spec:
  selectors:
    - matchLabels:
        agones.dev/fleet: game-fleet
EOF
done
wait
Measure Allocation Latency
time kubectl create -f gameserverallocation.yaml
Fleet Scale Testing
# Scale to 1000 game servers
kubectl scale fleet game-fleet --replicas=1000
# Measure time to Ready
watch kubectl get fleet game-fleet
Best Practices
- Right-size Controller: Set appropriate CPU/memory based on cluster size.
- Tune API QPS: Increase QPS limits for large clusters (>1000 game servers).
- Optimize Sidecar: Minimize sidecar resources while maintaining stability.
- Choose Strategy: Use Packed for cloud, Distributed for on-premises.
- Buffer Sizing: Maintain an adequate Ready buffer for instant allocations.
- Monitor Metrics: Set up Prometheus and alert on queue depth and allocation latency.
- Cluster Autoscaling: Configure node autoscaling with appropriate delays.
- Load Test: Test allocation throughput before production launch.