FleetAutoscalers automatically adjust the number of GameServer replicas in a Fleet based on various policies. This enables efficient resource utilization while maintaining availability for player demand.

FleetAutoscaler Resource

A FleetAutoscaler watches a Fleet and adjusts its replicas field according to the configured policy: Source: pkg/apis/autoscaling/v1/fleetautoscaler.go:39-67
apiVersion: autoscaling.agones.dev/v1
kind: FleetAutoscaler
metadata:
  name: fleet-autoscaler-example
spec:
  fleetName: fleet-example  # Target Fleet
  policy:
    type: Buffer              # Policy type
    buffer:
      bufferSize: 5
      minReplicas: 10
      maxReplicas: 50
  sync:
    type: FixedInterval
    fixedInterval:
      seconds: 30             # Check every 30 seconds
FleetAutoscalers operate independently of Kubernetes HorizontalPodAutoscaler. They scale based on game-specific metrics rather than CPU/memory.

Autoscaler Policies

FleetAutoscaler supports multiple policy types for different scaling needs: Source: fleetautoscaler.go:105-157

Buffer Policy

Maintain a buffer of Ready GameServers: Source: fleetautoscaler.go:122-125, fleetautoscaler.go:159-181
policy:
  type: Buffer
  buffer:
    bufferSize: 5      # Keep 5 Ready GameServers
    minReplicas: 10    # Never go below 10
    maxReplicas: 50    # Never go above 50
Calculation:
desired = allocated + reserved + bufferSize
clamped = max(minReplicas, min(desired, maxReplicas))
Example:
  • Allocated: 15, Reserved: 0
  • BufferSize: 5
  • Desired: 15 + 0 + 5 = 20 replicas
How it works:
  1. Count Allocated GameServers: sum GameServers in the Allocated state (and Reserved, which cannot be scaled down).
  2. Calculate Buffer: determine how many Ready GameServers are needed based on bufferSize.
  3. Compute Desired Replicas: desired = allocated + reserved + buffer.
  4. Apply Limits: clamp to the [minReplicas, maxReplicas] range.
  5. Update Fleet: set fleet.spec.replicas = desired.
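The steps above boil down to one small function. This is an illustrative sketch of the Buffer math, not the Agones source:

```python
def buffer_replicas(allocated, reserved, buffer_size, min_replicas, max_replicas):
    """Sketch of the Buffer policy calculation (illustrative, not Agones source)."""
    # Steps 1-3: everything in use plus the desired Ready buffer
    desired = allocated + reserved + buffer_size
    # Step 4: clamp to the configured [minReplicas, maxReplicas] range
    return max(min_replicas, min(desired, max_replicas))
```

With the example values (15 allocated, buffer of 5, bounds [10, 50]) this yields 20; push allocations to 60 and the result is capped at 50.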
Best for: Maintaining consistent availability with predictable buffer

Webhook Policy

Delegate scaling decisions to an external webhook: Source: fleetautoscaler.go:126-128, fleetautoscaler.go:183-185
policy:
  type: Webhook
  webhook:
    service:
      name: autoscaler-webhook-service
      namespace: default
      path: /scale
    # OR use URL for external endpoint
    # url: https://autoscaler.example.com/scale
    caBundle: LS0tLS...  # Optional: CA cert for HTTPS
Request payload (fleetautoscaler.go:338-356):
{
  "uid": "abc-123",
  "name": "fleet-example",
  "namespace": "default",
  "status": {
    "replicas": 20,
    "readyReplicas": 8,
    "allocatedReplicas": 12,
    "reservedReplicas": 0
  },
  "labels": {
    "version": "v1.0"
  },
  "annotations": {}
}
Expected response (fleetautoscaler.go:357-366):
{
  "uid": "abc-123",
  "scale": true,
  "replicas": 25
}
Response fields:
  • uid: Must match request UID
  • scale: true to scale, false to skip
  • replicas: Desired replica count
Validation (fleetautoscaler.go:413-440):
  • Must specify either service or url, not both
  • caBundle must be valid PEM-encoded certificate (if provided)
  • URL must be valid (if using URL)
Best for: Custom scaling logic based on external metrics (matchmaking queue, player count, business hours)
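A webhook endpoint can be sketched with Python's stdlib http.server. The payload shape follows the request/response examples above; the port, handler name, and the buffer-style decision logic inside decide() are illustrative assumptions:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def decide(review):
    """Compute a scaling decision from the webhook request payload shown above."""
    status = review["status"]
    # Illustrative policy: keep 5 spare replicas on top of allocated, clamped to [10, 50]
    desired = max(10, min(status["allocatedReplicas"] + 5, 50))
    return {
        "uid": review["uid"],                    # must echo the request UID
        "scale": desired != status["replicas"],  # skip if already at the target
        "replicas": desired,
    }

class ScaleHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        response = json.dumps(decide(json.loads(body))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)

# To serve: HTTPServer(("", 8080), ScaleHandler).serve_forever()
```

For the sample request above (20 replicas, 12 allocated), decide() would return 17 replicas with scale set to true.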

Counter Policy (Beta)

Scale based on aggregated Counter capacity across the Fleet: Source: fleetautoscaler.go:130-133, fleetautoscaler.go:187-206
policy:
  type: Counter
  counter:
    key: rooms              # Counter name
    bufferSize: 10          # Available capacity buffer
    minCapacity: 100        # Min aggregate capacity
    maxCapacity: 1000       # Max aggregate capacity
How it works:
  1. Aggregate Counter capacity across all GameServers in Fleet:
    totalCapacity = sum(gs.status.counters["rooms"].capacity)
    totalCount = sum(gs.status.counters["rooms"].count)
    available = totalCapacity - totalCount
    
  2. Calculate desired capacity:
    desired = totalCount + bufferSize
    clamped = max(minCapacity, min(desired, maxCapacity))
    
  3. Determine replicas needed:
    avgCapacity = totalCapacity / currentReplicas
    newReplicas = ceil(clamped / avgCapacity)
    
Validation (fleetautoscaler.go:477-516):
  • minCapacity must be < maxCapacity
  • bufferSize must be > 0
  • If percentage buffer: minCapacity must be > 0
  • maxCapacity must be >= bufferSize
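The three-step capacity math above can be sketched as follows (illustrative, not the Agones source; it assumes capacity is evenly distributed across replicas):

```python
import math

def counter_replicas(total_capacity, total_count, buffer_size,
                     min_capacity, max_capacity, current_replicas):
    """Sketch of the Counter policy calculation (illustrative)."""
    # Step 2: desired aggregate capacity is in-use count plus a free buffer,
    # clamped to [minCapacity, maxCapacity]
    desired = total_count + buffer_size
    clamped = max(min_capacity, min(desired, max_capacity))
    # Step 3: translate capacity back into replicas via average per-replica capacity
    avg_capacity = total_capacity / current_replicas
    return math.ceil(clamped / avg_capacity)
```

For example, 20 replicas each with 50 "rooms" capacity (1000 total), 850 rooms in use, and a bufferSize of 10 yields a desired capacity of 860 and therefore ceil(860 / 50) = 18 replicas.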
Feature Gate: CountsAndLists
Best for: Scaling based on rooms, sessions, or custom capacity metrics

List Policy (Beta)

Scale based on aggregated List capacity: Source: fleetautoscaler.go:134-137, fleetautoscaler.go:208-227
policy:
  type: List
  list:
    key: players            # List name
    bufferSize: 50          # Available capacity buffer
    minCapacity: 200        # Min aggregate capacity
    maxCapacity: 2000       # Max aggregate capacity
How it works: Similar to Counter policy, but uses List capacity:
  1. Aggregate List capacity:
    totalCapacity = sum(gs.status.lists["players"].capacity)
    totalCount = sum(len(gs.status.lists["players"].values))
    available = totalCapacity - totalCount
    
  2. Calculate desired capacity (same as Counter)
  3. Determine replicas needed (same as Counter)
Validation (fleetautoscaler.go:518-553):
  • Same rules as Counter policy
Feature Gate: CountsAndLists
Best for: Scaling based on player lists, session participants, or slot-based capacity

Schedule Policy (Beta)

Apply different policies at different times: Source: fleetautoscaler.go:139-143, fleetautoscaler.go:229-270
policy:
  type: Schedule
  schedule:
    between:
      start: "2026-03-10T08:00:00Z"  # RFC3339 format
      end: "2026-03-10T20:00:00Z"
    activePeriod:
      timezone: "America/Los_Angeles"
      startCron: "0 8 * * *"          # Every day at 8 AM
      duration: "12h"                 # Active for 12 hours
    policy:
      type: Buffer
      buffer:
        bufferSize: 10
        minReplicas: 50   # Higher capacity during active hours
        maxReplicas: 100
Fields:
between: defines the overall time window when this policy can be active.
  • start: when the policy becomes eligible to activate (RFC3339 format)
  • end: when the policy stops applying (RFC3339 format)
  • If start is empty or in the past, the policy is immediately eligible
  • If end is empty, the policy never expires
activePeriod: defines the recurring activation schedule using cron syntax.
  • timezone: Timezone for cron evaluation (e.g., "America/Los_Angeles", "UTC")
  • startCron: When to activate (UNIX cron syntax)
  • duration: How long to stay active (e.g., "12h", "90m")
Cron format: minute hour day month weekday
Examples:
  • "0 8 * * *": Every day at 8:00 AM
  • "0 */4 * * *": Every 4 hours
  • "0 8 * * 1-5": Weekdays at 8:00 AM
Validation (fleetautoscaler.go:556-600):
  • end must be after current time
  • end must be after start
  • startCron must be valid cron syntax
  • Cannot use CRON_TZ or TZ in cron (use timezone field)
  • duration must be valid Go duration format
  • timezone must be valid (checked against IANA database)
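To make the timezone and duration semantics concrete, here is a sketch of an active-period check. It is illustrative only: real cron evaluation is replaced by a fixed daily start hour (so "0 8 * * *" with a 12h duration becomes start_hour=8, duration=timedelta(hours=12)):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def in_active_period(now_utc, tz_name, start_hour, duration):
    """Check whether now_utc falls inside a daily active period (illustrative)."""
    # Evaluate in the schedule's timezone, as the timezone field requires
    local = now_utc.astimezone(ZoneInfo(tz_name))
    start = local.replace(hour=start_hour, minute=0, second=0, microsecond=0)
    if local < start:
        start -= timedelta(days=1)   # the period may have started yesterday
    return start <= local < start + duration
```

For an 8 AM Pacific start with a 12-hour duration, 18:00 UTC on 2026-03-10 (11:00 PDT) is inside the window, while 04:00 UTC (21:00 PDT the previous day) is not.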
Feature Gate: ScheduledAutoscaler
Best for: Predictable load patterns (business hours, weekend events, maintenance windows)

Chain Policy (Beta)

Combine multiple policies with fallback logic: Source: fleetautoscaler.go:145-148, fleetautoscaler.go:272-282
policy:
  type: Chain
  chain:
    - id: peak-hours
      type: Schedule
      schedule:
        activePeriod:
          startCron: "0 16 * * *"  # 4 PM daily
          duration: "4h"
        policy:
          type: Buffer
          buffer:
            bufferSize: 20
            minReplicas: 50
            maxReplicas: 200
    
    - id: off-peak
      type: Buffer
      buffer:
        bufferSize: 5
        minReplicas: 10
        maxReplicas: 50
How it works:
  1. Evaluate in Order: policies are evaluated in array order (top to bottom).
  2. First Match Wins: the first policy that is active determines scaling.
  3. Fallback: if no earlier policy is active, the last policy is used.
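The first-match-wins evaluation can be sketched as a simple loop. The entry shape and is_active callable are illustrative stand-ins for the controller's schedule checks:

```python
def evaluate_chain(entries, now):
    """Return the first chain entry active at `now` (illustrative sketch).

    Each entry is a dict like {"id": ..., "is_active": callable, "policy": ...}.
    A plain (non-Schedule) policy is always active, so placing one last
    makes it the fallback.
    """
    for entry in entries:            # evaluate in array order
        if entry["is_active"](now):  # first match wins
            return entry
    return entries[-1]               # fallback: last policy applies
```

With the peak-hours/off-peak chain above, an evaluation during the 4 PM window picks peak-hours; any other time falls through to off-peak.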
Use case: Different scaling during peak/off-peak hours, maintenance windows, or special events
Validation (fleetautoscaler.go:602-629):
  • All id fields must be unique
  • Each entry must have a valid policy
  • Nested chain policies are allowed
Feature Gate: ScheduledAutoscaler
Best for: Complex scheduling scenarios with multiple time-based policies

Wasm Policy (Alpha)

Run WebAssembly code for custom scaling logic: Source: fleetautoscaler.go:149-152, fleetautoscaler.go:284-304
policy:
  type: Wasm
  wasm:
    function: scale              # Exported function name
    config:
      min_replicas: "10"
      max_replicas: "50"
      custom_metric: "player_count"
    from:
      url:
        service:
          name: wasm-module-server
          namespace: default
          path: /autoscaler.wasm
    hash: "sha256:abc123..."     # Optional: verify integrity
Wasm module interface: Your Wasm module should export a function that:
  • Receives Fleet status and config as input
  • Returns desired replica count
Validation (fleetautoscaler.go:631-649):
  • Must specify from.url
  • URL must be valid
  • function defaults to "scale" if not specified
Feature Gate: WasmAutoscaler
Best for: Highly custom scaling logic without deploying external services

Sync Configuration

Control how often autoscaling runs: Source: fleetautoscaler.go:109-120, fleetautoscaler.go:306-310
spec:
  sync:
    type: FixedInterval
    fixedInterval:
      seconds: 30  # Check every 30 seconds
Default (fleetautoscaler.go:156, fleetautoscaler.go:663-674):
  • Type: FixedInterval
  • Seconds: 30
Validation (fleetautoscaler.go:652-661):
  • seconds must be > 0
Lower intervals provide faster response to load changes but increase API server load. For most use cases, 30-60 seconds is appropriate.

Autoscaler Status

Monitor autoscaler decisions and state: Source: fleetautoscaler.go:312-336
status:
  currentReplicas: 25        # Current Fleet size
  desiredReplicas: 30        # Calculated desired size
  lastScaleTime: "2026-03-10T10:15:30Z"
  ableToScale: true          # Can access Fleet
  scalingLimited: false      # Hit min/max limit
  lastAppliedPolicy: Buffer  # Which policy is active
Fields:
  • currentReplicas: Latest known Fleet replica count
  • desiredReplicas: What autoscaler calculated
  • lastScaleTime: When Fleet was last scaled
  • ableToScale: Whether autoscaler can read/update Fleet
  • scalingLimited: Whether scaling was clamped by min/max
  • lastAppliedPolicy: Active policy type (for Chain policy)
Query autoscaler status:
# Check if scaling is working
kubectl get fleetautoscaler <name> -o jsonpath='{.status.ableToScale}'

# See desired vs current
kubectl get fleetautoscaler <name> -o jsonpath='{.status.desiredReplicas} / {.status.currentReplicas}'

# Check if limited by min/max
kubectl get fleetautoscaler <name> -o jsonpath='{.status.scalingLimited}'

Working with FleetAutoscalers

Create a FleetAutoscaler

kubectl apply -f fleetautoscaler.yaml
kubectl get fleetautoscaler
kubectl get fas  # Short name

Monitor Autoscaling

# Watch autoscaler status
kubectl get fas <name> -w

# See scaling events
kubectl describe fas <name>

# Check controller logs
kubectl logs -n agones-system deploy/agones-controller -c agones-controller | grep autoscal

Update Autoscaler Policy

# Change buffer size
kubectl patch fas <name> --type=merge -p '
{
  "spec": {
    "policy": {
      "buffer": {
        "bufferSize": 10
      }
    }
  }
}'

# Change min/max replicas
kubectl patch fas <name> --type=merge -p '
{
  "spec": {
    "policy": {
      "buffer": {
        "minReplicas": 20,
        "maxReplicas": 100
      }
    }
  }
}'

Delete FleetAutoscaler

kubectl delete fas <name>
Deleting a FleetAutoscaler does not change the Fleet’s replica count. The Fleet will remain at its current size until manually scaled.

Common Patterns

Gradual Scale-Up, Fast Scale-Down

# Use webhook to implement custom logic
policy:
  type: Webhook
  webhook:
    url: https://autoscaler.example.com/scale
Webhook logic:
def scale(request):
    status = request['status']
    allocated = status['allocatedReplicas']
    ready = status['readyReplicas']
    current = status['replicas']

    # Fast scale down: remove excess Ready GameServers quickly
    if ready > 10:
        return {'uid': request['uid'], 'scale': True, 'replicas': allocated + 5}

    # Gradual scale up: add slowly to avoid over-provisioning
    if ready < 5:
        return {'uid': request['uid'], 'scale': True, 'replicas': current + 2}

    # No change
    return {'uid': request['uid'], 'scale': False, 'replicas': current}

Business Hours Scaling

policy:
  type: Chain
  chain:
    - id: business-hours
      type: Schedule
      schedule:
        between:
          start: "2026-01-01T00:00:00Z"
          end: "2026-12-31T23:59:59Z"
        activePeriod:
          timezone: "America/New_York"
          startCron: "0 9 * * 1-5"  # Weekdays at 9 AM
          duration: "9h"             # Until 6 PM
        policy:
          type: Buffer
          buffer:
            bufferSize: 20
            minReplicas: 50
            maxReplicas: 200
    
    - id: after-hours
      type: Buffer
      buffer:
        bufferSize: 5
        minReplicas: 5
        maxReplicas: 20

Multi-Metric Scaling

Use webhook to combine multiple metrics:
def scale(request):
    # Combine allocation rate + queue depth + time of day
    status = request['status']
    allocation_rate = calculate_rate(status)     # illustrative external helper
    queue_depth = get_matchmaking_queue_depth()  # illustrative external helper
    hour = datetime.now().hour

    # Peak hours (6 PM - 10 PM)
    if 18 <= hour <= 22:
        target_buffer = 20
    else:
        target_buffer = 10

    # Adjust for matchmaking queue backlog
    if queue_depth > 50:
        target_buffer += 10

    desired = status['allocatedReplicas'] + target_buffer
    clamped = max(10, min(desired, 100))         # clamp to [10, 100]
    return {'uid': request['uid'], 'scale': True, 'replicas': clamped}

Best Practices

Set appropriate min/max bounds

Protect against runaway scaling and ensure minimum availability.

Start with Buffer policy

Simple and effective for most use cases before moving to complex policies.

Monitor scaling metrics

Track allocation rate, ready count, and scaling events to tune policies.

Test policies under load

Simulate traffic patterns to verify autoscaler behavior before production.

Troubleshooting

Autoscaler not scaling Fleet

# Check ableToScale status
kubectl get fas <name> -o jsonpath='{.status.ableToScale}'

# Verify Fleet exists
kubectl get fleet <fleet-name>

# Check RBAC permissions
kubectl auth can-i update fleets --as=system:serviceaccount:agones-system:agones-controller

# View autoscaler events
kubectl describe fas <name>

Scaling too aggressively

# Increase sync interval
kubectl patch fas <name> --type=merge -p '{"spec":{"sync":{"fixedInterval":{"seconds":60}}}}'

# Adjust min/max bounds
kubectl patch fas <name> --type=merge -p '{"spec":{"policy":{"buffer":{"maxReplicas":50}}}}'

# Review buffer size
kubectl get fas <name> -o jsonpath='{.spec.policy.buffer.bufferSize}'

Webhook policy not working

# Test webhook manually
curl -X POST https://webhook-url/scale \
  -H "Content-Type: application/json" \
  -d @test-request.json

# Check webhook service exists
kubectl get svc <webhook-service-name>

# View webhook logs
kubectl logs deploy/<webhook-deployment>

# Verify CA bundle (if using HTTPS)
kubectl get fas <name> -o jsonpath='{.spec.policy.webhook.caBundle}' | base64 -d | openssl x509 -text

Counter/List policy not scaling

# Verify Counter/List exists on GameServers
kubectl get gs -l agones.dev/fleet=<fleet> -o jsonpath='{.items[0].status.counters}'
kubectl get gs -l agones.dev/fleet=<fleet> -o jsonpath='{.items[0].status.lists}'

# Check aggregated values in Fleet status
kubectl get fleet <fleet> -o jsonpath='{.status.counters}'
kubectl get fleet <fleet> -o jsonpath='{.status.lists}'

# Verify feature gate is enabled
kubectl get pod -n agones-system -l app=agones-controller -o yaml | grep CountsAndLists

Next Steps

Webhook Autoscaler Guide

Build a custom webhook autoscaler

Metrics and Monitoring

Monitor autoscaler performance

Counter/List Features

Deep dive into Counters and Lists

Cost Optimization

Use autoscaling to reduce infrastructure costs
