FleetAutoscalers automatically adjust the number of GameServer replicas in a Fleet based on various policies. This enables efficient resource utilization while maintaining availability for player demand.

FleetAutoscaler Resource

A FleetAutoscaler watches a Fleet and adjusts its replicas field according to the configured policy: Source: pkg/apis/autoscaling/v1/fleetautoscaler.go:39-67
apiVersion: autoscaling.agones.dev/v1
kind: FleetAutoscaler
metadata:
  name: fleet-autoscaler-example
spec:
  fleetName: fleet-example  # Target Fleet
  policy:
    type: Buffer              # Policy type
    buffer:
      bufferSize: 5
      minReplicas: 10
      maxReplicas: 50
  sync:
    type: FixedInterval
    fixedInterval:
      seconds: 30             # Check every 30 seconds
FleetAutoscalers operate independently of Kubernetes HorizontalPodAutoscaler. They scale based on game-specific metrics rather than CPU/memory.

Autoscaler Policies

FleetAutoscaler supports multiple policy types for different scaling needs: Source: fleetautoscaler.go:105-157

Buffer Policy

Maintain a buffer of Ready GameServers: Source: fleetautoscaler.go:122-125, fleetautoscaler.go:159-181
policy:
  type: Buffer
  buffer:
    bufferSize: 5      # Keep 5 Ready GameServers
    minReplicas: 10    # Never go below 10
    maxReplicas: 50    # Never go above 50
Calculation:
desired = allocated + reserved + bufferSize
clamped = max(minReplicas, min(desired, maxReplicas))
Example:
  • Allocated: 15, Reserved: 0
  • BufferSize: 5
  • Desired: 15 + 0 + 5 = 20 replicas
How it works:
  1. Count Allocated GameServers: sum GameServers in the Allocated state (and Reserved, which cannot be scaled down).
  2. Calculate Buffer: determine how many Ready GameServers are needed based on bufferSize.
  3. Compute Desired Replicas: desired = allocated + reserved + buffer.
  4. Apply Limits: clamp to the [minReplicas, maxReplicas] range.
  5. Update Fleet: set fleet.spec.replicas = desired.
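The steps above boil down to one small function. This is an illustrative sketch of the Buffer math, not the Agones source:

```python
def buffer_replicas(allocated, reserved, buffer_size, min_replicas, max_replicas):
    """Sketch of the Buffer policy calculation (illustrative, not Agones source)."""
    # Steps 1-3: everything in use plus the desired Ready buffer
    desired = allocated + reserved + buffer_size
    # Step 4: clamp to the configured [minReplicas, maxReplicas] range
    return max(min_replicas, min(desired, max_replicas))
```

With the example values (15 allocated, buffer of 5, bounds [10, 50]) this yields 20; push allocations to 60 and the result is capped at 50.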
Best for: Maintaining consistent availability with predictable buffer

Webhook Policy

Delegate scaling decisions to an external webhook: Source: fleetautoscaler.go:126-128, fleetautoscaler.go:183-185
policy:
  type: Webhook
  webhook:
    service:
      name: autoscaler-webhook-service
      namespace: default
      path: /scale
    # OR use URL for external endpoint
    # url: https://autoscaler.example.com/scale
    caBundle: LS0tLS...  # Optional: CA cert for HTTPS
Request payload (fleetautoscaler.go:338-356):
{
  "uid": "abc-123",
  "name": "fleet-example",
  "namespace": "default",
  "status": {
    "replicas": 20,
    "readyReplicas": 8,
    "allocatedReplicas": 12,
    "reservedReplicas": 0
  },
  "labels": {
    "version": "v1.0"
  },
  "annotations": {}
}
Expected response (fleetautoscaler.go:357-366):
{
  "uid": "abc-123",
  "scale": true,
  "replicas": 25
}
Response fields:
  • uid: Must match request UID
  • scale: true to scale, false to skip
  • replicas: Desired replica count
Validation (fleetautoscaler.go:413-440):
  • Must specify either service or url, not both
  • caBundle must be valid PEM-encoded certificate (if provided)
  • URL must be valid (if using URL)
Best for: Custom scaling logic based on external metrics (matchmaking queue, player count, business hours)
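A webhook endpoint can be sketched with Python's stdlib http.server. The payload shape follows the request/response examples above; the port, handler name, and the buffer-style decision logic inside decide() are illustrative assumptions:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def decide(review):
    """Compute a scaling decision from the webhook request payload shown above."""
    status = review["status"]
    # Illustrative policy: keep 5 spare replicas on top of allocated, clamped to [10, 50]
    desired = max(10, min(status["allocatedReplicas"] + 5, 50))
    return {
        "uid": review["uid"],                    # must echo the request UID
        "scale": desired != status["replicas"],  # skip if already at the target
        "replicas": desired,
    }

class ScaleHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        response = json.dumps(decide(json.loads(body))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)

# To serve: HTTPServer(("", 8080), ScaleHandler).serve_forever()
```

For the sample request above (20 replicas, 12 allocated), decide() would return 17 replicas with scale set to true.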

Counter Policy (Beta)

Scale based on aggregated Counter capacity across the Fleet: Source: fleetautoscaler.go:130-133, fleetautoscaler.go:187-206
policy:
  type: Counter
  counter:
    key: rooms              # Counter name
    bufferSize: 10          # Available capacity buffer
    minCapacity: 100        # Min aggregate capacity
    maxCapacity: 1000       # Max aggregate capacity
How it works:
  1. Aggregate Counter capacity across all GameServers in Fleet:
    totalCapacity = sum(gs.status.counters["rooms"].capacity)
    totalCount = sum(gs.status.counters["rooms"].count)
    available = totalCapacity - totalCount
    
  2. Calculate desired capacity:
    desired = totalCount + bufferSize
    clamped = max(minCapacity, min(desired, maxCapacity))
    
  3. Determine replicas needed:
    avgCapacity = totalCapacity / currentReplicas
    newReplicas = ceil(clamped / avgCapacity)
    
Validation (fleetautoscaler.go:477-516):
  • minCapacity must be < maxCapacity
  • bufferSize must be > 0
  • If percentage buffer: minCapacity must be > 0
  • maxCapacity must be >= bufferSize
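The three-step capacity math above can be sketched as follows (illustrative, not the Agones source; it assumes capacity is evenly distributed across replicas):

```python
import math

def counter_replicas(total_capacity, total_count, buffer_size,
                     min_capacity, max_capacity, current_replicas):
    """Sketch of the Counter policy calculation (illustrative)."""
    # Step 2: desired aggregate capacity is in-use count plus a free buffer,
    # clamped to [minCapacity, maxCapacity]
    desired = total_count + buffer_size
    clamped = max(min_capacity, min(desired, max_capacity))
    # Step 3: translate capacity back into replicas via average per-replica capacity
    avg_capacity = total_capacity / current_replicas
    return math.ceil(clamped / avg_capacity)
```

For example, 20 replicas each with 50 "rooms" capacity (1000 total), 850 rooms in use, and a bufferSize of 10 yields a desired capacity of 860 and therefore ceil(860 / 50) = 18 replicas.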
Feature Gate: CountsAndLists
Best for: Scaling based on rooms, sessions, or custom capacity metrics

List Policy (Beta)

Scale based on aggregated List capacity: Source: fleetautoscaler.go:134-137, fleetautoscaler.go:208-227
policy:
  type: List
  list:
    key: players            # List name
    bufferSize: 50          # Available capacity buffer
    minCapacity: 200        # Min aggregate capacity
    maxCapacity: 2000       # Max aggregate capacity
How it works: Similar to Counter policy, but uses List capacity:
  1. Aggregate List capacity:
    totalCapacity = sum(gs.status.lists["players"].capacity)
    totalCount = sum(len(gs.status.lists["players"].values))
    available = totalCapacity - totalCount
    
  2. Calculate desired capacity (same as Counter)
  3. Determine replicas needed (same as Counter)
Validation (fleetautoscaler.go:518-553):
  • Same rules as Counter policy
Feature Gate: CountsAndLists
Best for: Scaling based on player lists, session participants, or slot-based capacity

Schedule Policy (Beta)

Apply different policies at different times: Source: fleetautoscaler.go:139-143, fleetautoscaler.go:229-270
policy:
  type: Schedule
  schedule:
    between:
      start: "2026-03-10T08:00:00Z"  # RFC3339 format
      end: "2026-03-10T20:00:00Z"
    activePeriod:
      timezone: "America/Los_Angeles"
      startCron: "0 8 * * *"          # Every day at 8 AM
      duration: "12h"                 # Active for 12 hours
    policy:
      type: Buffer
      buffer:
        bufferSize: 10
        minReplicas: 50   # Higher capacity during active hours
        maxReplicas: 100
Fields:
between: defines the overall time window when this policy can be active.
  • start: when the policy becomes eligible to activate (RFC3339 format)
  • end: when the policy stops applying (RFC3339 format)
  • If start is empty or in the past, the policy is immediately eligible
  • If end is empty, the policy never expires
activePeriod: defines the recurring activation schedule using cron syntax.
  • timezone: Timezone for cron evaluation (e.g., "America/Los_Angeles", "UTC")
  • startCron: When to activate (UNIX cron syntax)
  • duration: How long to stay active (e.g., "12h", "90m")
Cron format: minute hour day month weekday
Examples:
  • "0 8 * * *": Every day at 8:00 AM
  • "0 */4 * * *": Every 4 hours
  • "0 8 * * 1-5": Weekdays at 8:00 AM
Validation (fleetautoscaler.go:556-600):
  • end must be after current time
  • end must be after start
  • startCron must be valid cron syntax
  • Cannot use CRON_TZ or TZ in cron (use timezone field)
  • duration must be valid Go duration format
  • timezone must be valid (checked against IANA database)
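To make the timezone and duration semantics concrete, here is a sketch of an active-period check. It is illustrative only: real cron evaluation is replaced by a fixed daily start hour (so "0 8 * * *" with a 12h duration becomes start_hour=8, duration=timedelta(hours=12)):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def in_active_period(now_utc, tz_name, start_hour, duration):
    """Check whether now_utc falls inside a daily active period (illustrative)."""
    # Evaluate in the schedule's timezone, as the timezone field requires
    local = now_utc.astimezone(ZoneInfo(tz_name))
    start = local.replace(hour=start_hour, minute=0, second=0, microsecond=0)
    if local < start:
        start -= timedelta(days=1)   # the period may have started yesterday
    return start <= local < start + duration
```

For an 8 AM Pacific start with a 12-hour duration, 18:00 UTC on 2026-03-10 (11:00 PDT) is inside the window, while 04:00 UTC (21:00 PDT the previous day) is not.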
Feature Gate: ScheduledAutoscaler
Best for: Predictable load patterns (business hours, weekend events, maintenance windows)

Chain Policy (Beta)

Combine multiple policies with fallback logic: Source: fleetautoscaler.go:145-148, fleetautoscaler.go:272-282
policy:
  type: Chain
  chain:
    - id: peak-hours
      type: Schedule
      schedule:
        activePeriod:
          startCron: "0 16 * * *"  # 4 PM daily
          duration: "4h"
        policy:
          type: Buffer
          buffer:
            bufferSize: 20
            minReplicas: 50
            maxReplicas: 200
    
    - id: off-peak
      type: Buffer
      buffer:
        bufferSize: 5
        minReplicas: 10
        maxReplicas: 50
How it works:
  1. Evaluate in Order: policies are evaluated in array order (top to bottom).
  2. First Match Wins: the first policy that is active determines scaling.
  3. Fallback: if no earlier policy is active, the last policy is used.
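The first-match-wins evaluation can be sketched as a simple loop. The entry shape and is_active callable are illustrative stand-ins for the controller's schedule checks:

```python
def evaluate_chain(entries, now):
    """Return the first chain entry active at `now` (illustrative sketch).

    Each entry is a dict like {"id": ..., "is_active": callable, "policy": ...}.
    A plain (non-Schedule) policy is always active, so placing one last
    makes it the fallback.
    """
    for entry in entries:            # evaluate in array order
        if entry["is_active"](now):  # first match wins
            return entry
    return entries[-1]               # fallback: last policy applies
```

With the peak-hours/off-peak chain above, an evaluation during the 4 PM window picks peak-hours; any other time falls through to off-peak.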
Use case: Different scaling during peak/off-peak hours, maintenance windows, or special events
Validation (fleetautoscaler.go:602-629):
  • All id fields must be unique
  • Each entry must have a valid policy
  • Nested chain policies are allowed
Feature Gate: ScheduledAutoscaler
Best for: Complex scheduling scenarios with multiple time-based policies

Wasm Policy (Alpha)

Run WebAssembly code for custom scaling logic: Source: fleetautoscaler.go:149-152, fleetautoscaler.go:284-304
policy:
  type: Wasm
  wasm:
    function: scale              # Exported function name
    config:
      min_replicas: "10"
      max_replicas: "50"
      custom_metric: "player_count"
    from:
      url:
        service:
          name: wasm-module-server
          namespace: default
          path: /autoscaler.wasm
    hash: "sha256:abc123..."     # Optional: verify integrity
Wasm module interface: Your Wasm module should export a function that:
  • Receives Fleet status and config as input
  • Returns desired replica count
Validation (fleetautoscaler.go:631-649):
  • Must specify from.url
  • URL must be valid
  • function defaults to "scale" if not specified
Feature Gate: WasmAutoscaler
Best for: Highly custom scaling logic without deploying external services

Sync Configuration

Control how often autoscaling runs: Source: fleetautoscaler.go:109-120, fleetautoscaler.go:306-310
spec:
  sync:
    type: FixedInterval
    fixedInterval:
      seconds: 30  # Check every 30 seconds
Default (fleetautoscaler.go:156, fleetautoscaler.go:663-674):
  • Type: FixedInterval
  • Seconds: 30
Validation (fleetautoscaler.go:652-661):
  • seconds must be > 0
Lower intervals provide faster response to load changes but increase API server load. For most use cases, 30-60 seconds is appropriate.

Autoscaler Status

Monitor autoscaler decisions and state: Source: fleetautoscaler.go:312-336
status:
  currentReplicas: 25        # Current Fleet size
  desiredReplicas: 30        # Calculated desired size
  lastScaleTime: "2026-03-10T10:15:30Z"
  ableToScale: true          # Can access Fleet
  scalingLimited: false      # Hit min/max limit
  lastAppliedPolicy: Buffer  # Which policy is active
Fields:
  • currentReplicas: Latest known Fleet replica count
  • desiredReplicas: What autoscaler calculated
  • lastScaleTime: When Fleet was last scaled
  • ableToScale: Whether autoscaler can read/update Fleet
  • scalingLimited: Whether scaling was clamped by min/max
  • lastAppliedPolicy: Active policy type (for Chain policy)
Query autoscaler status:
# Check if scaling is working
kubectl get fleetautoscaler <name> -o jsonpath='{.status.ableToScale}'

# See desired vs current
kubectl get fleetautoscaler <name> -o jsonpath='{.status.desiredReplicas} / {.status.currentReplicas}'

# Check if limited by min/max
kubectl get fleetautoscaler <name> -o jsonpath='{.status.scalingLimited}'

Working with FleetAutoscalers

Create a FleetAutoscaler

kubectl apply -f fleetautoscaler.yaml
kubectl get fleetautoscaler
kubectl get fas  # Short name

Monitor Autoscaling

# Watch autoscaler status
kubectl get fas <name> -w

# See scaling events
kubectl describe fas <name>

# Check controller logs
kubectl logs -n agones-system deploy/agones-controller -c agones-controller | grep autoscal

Update Autoscaler Policy

# Change buffer size
kubectl patch fas <name> --type=merge -p '
{
  "spec": {
    "policy": {
      "buffer": {
        "bufferSize": 10
      }
    }
  }
}'

# Change min/max replicas
kubectl patch fas <name> --type=merge -p '
{
  "spec": {
    "policy": {
      "buffer": {
        "minReplicas": 20,
        "maxReplicas": 100
      }
    }
  }
}'

Delete FleetAutoscaler

kubectl delete fas <name>
Deleting a FleetAutoscaler does not change the Fleet’s replica count. The Fleet will remain at its current size until manually scaled.

Common Patterns

Gradual Scale-Up, Fast Scale-Down

# Use webhook to implement custom logic
policy:
  type: Webhook
  webhook:
    url: https://autoscaler.example.com/scale
Webhook logic:
def scale(request):
    status = request['status']
    allocated = status['allocatedReplicas']
    ready = status['readyReplicas']
    current = status['replicas']

    # Fast scale down: remove excess Ready GameServers quickly
    if ready > 10:
        return {'uid': request['uid'], 'scale': True, 'replicas': allocated + 5}

    # Gradual scale up: add slowly to avoid over-provisioning
    if ready < 5:
        return {'uid': request['uid'], 'scale': True, 'replicas': current + 2}

    # No change
    return {'uid': request['uid'], 'scale': False, 'replicas': current}

Business Hours Scaling

policy:
  type: Chain
  chain:
    - id: business-hours
      type: Schedule
      schedule:
        between:
          start: "2026-01-01T00:00:00Z"
          end: "2026-12-31T23:59:59Z"
        activePeriod:
          timezone: "America/New_York"
          startCron: "0 9 * * 1-5"  # Weekdays at 9 AM
          duration: "9h"             # Until 6 PM
        policy:
          type: Buffer
          buffer:
            bufferSize: 20
            minReplicas: 50
            maxReplicas: 200
    
    - id: after-hours
      type: Buffer
      buffer:
        bufferSize: 5
        minReplicas: 5
        maxReplicas: 20

Multi-Metric Scaling

Use webhook to combine multiple metrics:
def scale(request):
    # Combine allocation rate + queue depth + time of day
    status = request['status']
    allocation_rate = calculate_rate(status)     # illustrative external helper
    queue_depth = get_matchmaking_queue_depth()  # illustrative external helper
    hour = datetime.now().hour

    # Peak hours (6 PM - 10 PM)
    if 18 <= hour <= 22:
        target_buffer = 20
    else:
        target_buffer = 10

    # Adjust for matchmaking queue backlog
    if queue_depth > 50:
        target_buffer += 10

    desired = status['allocatedReplicas'] + target_buffer
    clamped = max(10, min(desired, 100))         # clamp to [10, 100]
    return {'uid': request['uid'], 'scale': True, 'replicas': clamped}

Best Practices

Set appropriate min/max bounds

Protect against runaway scaling and ensure minimum availability.

Start with Buffer policy

Simple and effective for most use cases before moving to complex policies.

Monitor scaling metrics

Track allocation rate, ready count, and scaling events to tune policies.

Test policies under load

Simulate traffic patterns to verify autoscaler behavior before production.

Troubleshooting

Autoscaler not scaling Fleet

# Check ableToScale status
kubectl get fas <name> -o jsonpath='{.status.ableToScale}'

# Verify Fleet exists
kubectl get fleet <fleet-name>

# Check RBAC permissions
kubectl auth can-i update fleets --as=system:serviceaccount:agones-system:agones-controller

# View autoscaler events
kubectl describe fas <name>

Scaling too aggressively

# Increase sync interval
kubectl patch fas <name> --type=merge -p '{"spec":{"sync":{"fixedInterval":{"seconds":60}}}}'

# Adjust min/max bounds
kubectl patch fas <name> --type=merge -p '{"spec":{"policy":{"buffer":{"maxReplicas":50}}}}'

# Review buffer size
kubectl get fas <name> -o jsonpath='{.spec.policy.buffer.bufferSize}'

Webhook policy not working

# Test webhook manually
curl -X POST https://webhook-url/scale \
  -H "Content-Type: application/json" \
  -d @test-request.json

# Check webhook service exists
kubectl get svc <webhook-service-name>

# View webhook logs
kubectl logs deploy/<webhook-deployment>

# Verify CA bundle (if using HTTPS)
kubectl get fas <name> -o jsonpath='{.spec.policy.webhook.caBundle}' | base64 -d | openssl x509 -text

Counter/List policy not scaling

# Verify Counter/List exists on GameServers
kubectl get gs -l agones.dev/fleet=<fleet> -o jsonpath='{.items[0].status.counters}'
kubectl get gs -l agones.dev/fleet=<fleet> -o jsonpath='{.items[0].status.lists}'

# Check aggregated values in Fleet status
kubectl get fleet <fleet> -o jsonpath='{.status.counters}'
kubectl get fleet <fleet> -o jsonpath='{.status.lists}'

# Verify feature gate is enabled
kubectl get pod -n agones-system -l app=agones-controller -o yaml | grep CountsAndLists

Next Steps

Webhook Autoscaler Guide

Build a custom webhook autoscaler

Metrics and Monitoring

Monitor autoscaler performance

Counter/List Features

Deep dive into Counters and Lists

Cost Optimization

Use autoscaling to reduce infrastructure costs
