Overview

Agones provides two scheduling strategies that determine how game servers are distributed across your Kubernetes cluster: Packed and Distributed. The choice of strategy significantly impacts resource utilization, costs, and fault tolerance.

Scheduling Strategies Defined

From the Agones source code (pkg/apis/scheduling.go:18-30):
const (
    // Packed scheduling strategy will prioritise allocating GameServers
    // on Nodes with the most Allocated, and then Ready GameServers
    // to bin pack as many Allocated GameServers on a single node.
    // This is most useful for dynamic Kubernetes clusters - such as on Cloud Providers.
    // In future versions, this will also impact Fleet scale down, and Pod Scheduling.
    Packed SchedulingStrategy = "Packed"

    // Distributed scheduling strategy will prioritise allocating GameServers
    // on Nodes with the least Allocated, and then Ready GameServers
    // to distribute Allocated GameServers across many nodes.
    // This is most useful for statically sized Kubernetes clusters - such as on physical hardware.
    // In future versions, this will also impact Fleet scale down, and Pod Scheduling.
    Distributed SchedulingStrategy = "Distributed"
)

Packed Strategy

How It Works

Packed strategy prioritizes bin-packing game servers onto the fewest possible nodes:
  1. Allocates from nodes with the most already-allocated game servers
  2. Fills nodes completely before moving to the next node
  3. Enables aggressive cluster scale-down
  4. Minimizes the number of active nodes

Implementation Details

From pkg/gameserverallocations/find.go:44-50:
switch gsa.Spec.Scheduling {
case apis.Packed:
    loop = func(list []*agonesv1.GameServer, f func(i int, gs *agonesv1.GameServer)) {
        for i, gs := range list {
            f(i, gs)
        }
    }
Packed scheduling iterates through the game server list in order, which has already been sorted to prioritize fuller nodes.
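The pre-sorting that makes the plain in-order loop work can be sketched as follows. This is a simplified illustration, not the actual Agones sort (which also considers Ready counts and other tie-breakers); the `GameServer` struct and `allocatedPerNode` map are stand-ins for the real types:

```go
package main

import (
	"fmt"
	"sort"
)

// GameServer is a simplified stand-in for agonesv1.GameServer,
// reduced to the fields this sketch needs.
type GameServer struct {
	Name string
	Node string
}

// sortPacked orders Ready game servers so that servers on nodes with
// the most Allocated game servers come first. A plain linear scan over
// the result then bin-packs new allocations onto the fullest nodes.
func sortPacked(ready []GameServer, allocatedPerNode map[string]int) []GameServer {
	sorted := append([]GameServer(nil), ready...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return allocatedPerNode[sorted[i].Node] > allocatedPerNode[sorted[j].Node]
	})
	return sorted
}

func main() {
	ready := []GameServer{
		{Name: "gs-a", Node: "node-3"}, // node-3 has 0 allocated
		{Name: "gs-b", Node: "node-1"}, // node-1 has 3 allocated
		{Name: "gs-c", Node: "node-2"}, // node-2 has 2 allocated
	}
	allocated := map[string]int{"node-1": 3, "node-2": 2, "node-3": 0}

	for _, gs := range sortPacked(ready, allocated) {
		fmt.Println(gs.Name, gs.Node)
	}
	// The first entry is the Ready server on the fullest node, so the
	// in-order allocation loop lands on node-1 first.
}
```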

Use Cases

Best for:
  • Cloud environments (GKE, EKS, AKS)
  • Auto-scaling clusters
  • Cost optimization
  • Development/testing environments
Example configuration:
apiVersion: agones.dev/v1
kind: Fleet
metadata:
  name: cloud-fleet
spec:
  replicas: 100
  scheduling: Packed  # Default
  template:
    spec:
      ports:
      - name: default
        containerPort: 7654
        portPolicy: Dynamic
      template:
        spec:
          containers:
          - name: game-server
            image: my-game:latest

Allocation Example

apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
metadata:
  name: packed-allocation
spec:
  scheduling: Packed
  selectors:
  - matchLabels:
      agones.dev/fleet: cloud-fleet

Visual Representation

Before Allocation:
Node 1: [GS1] [GS2] [GS3] [ ]   # 3 allocated
Node 2: [GS4] [GS5] [ ] [ ]     # 2 allocated
Node 3: [ ] [ ] [ ] [ ]         # 0 allocated

After Packed Allocation:
Node 1: [GS1] [GS2] [GS3] [NEW] # Fills Node 1 first
Node 2: [GS4] [GS5] [ ] [ ]     # Unchanged
Node 3: [ ] [ ] [ ] [ ]         # Unchanged

Benefit: Node 3 can be scaled down

Performance Characteristics

  • Node Utilization: Very high (80-95%)
  • Scale-down Efficiency: Excellent
  • Fault Impact: Higher (more game servers per node)
  • Network Locality: Better (more servers co-located)
  • Allocation Speed: Fast (linear scan)

Distributed Strategy

How It Works

Distributed strategy prioritizes spreading game servers evenly across all nodes:
  1. Allocates from nodes with the least already-allocated game servers
  2. Distributes load evenly across the entire cluster
  3. Maximizes fault tolerance
  4. Optimizes for consistent performance

Implementation Details

From pkg/gameserverallocations/find.go:51-78:
case apis.Distributed:
    // randomised looping - make a list of indices, and then randomise them
    // as we don't want to change the order of the gameserver slice
    if !runtime.FeatureEnabled(runtime.FeatureCountsAndLists) || len(gsa.Spec.Priorities) == 0 {
        l := len(list)
        indices := make([]int, l)
        for i := 0; i < l; i++ {
            indices[i] = i
        }
        rand.Shuffle(l, func(i, j int) {
            indices[i], indices[j] = indices[j], indices[i]
        })

        loop = func(list []*agonesv1.GameServer, f func(i int, gs *agonesv1.GameServer)) {
            for _, i := range indices {
                f(i, list[i])
            }
        }
    }
Distributed scheduling uses randomization to spread allocations, or priority-based ordering when custom priorities are defined.

Use Cases

Best for:
  • On-premises clusters
  • Bare metal infrastructure
  • Fixed-size clusters
  • High-availability requirements
  • Latency-sensitive workloads
Example configuration:
apiVersion: agones.dev/v1
kind: Fleet
metadata:
  name: onprem-fleet
spec:
  replicas: 100
  scheduling: Distributed
  template:
    spec:
      ports:
      - name: default
        containerPort: 7654
        portPolicy: Dynamic
      template:
        spec:
          containers:
          - name: game-server
            image: my-game:latest

Allocation Example

apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
metadata:
  name: distributed-allocation
spec:
  scheduling: Distributed
  selectors:
  - matchLabels:
      agones.dev/fleet: onprem-fleet

Visual Representation

Before Allocation:
Node 1: [GS1] [GS2] [GS3] [ ]   # 3 allocated
Node 2: [GS4] [GS5] [ ] [ ]     # 2 allocated
Node 3: [ ] [ ] [ ] [ ]         # 0 allocated

After Distributed Allocation:
Node 1: [GS1] [GS2] [GS3] [ ]   # Unchanged
Node 2: [GS4] [GS5] [ ] [ ]     # Unchanged
Node 3: [NEW] [ ] [ ] [ ]       # Fills Node 3 (least utilized)

Benefit: Load is evenly distributed
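The node selection the diagram illustrates, picking the node with the fewest Allocated game servers, can be sketched as below. Note this captures only the selection goal; the actual Distributed code path (shown earlier) also randomises among candidate Ready servers:

```go
package main

import "fmt"

// leastAllocatedNode returns the node with the fewest Allocated game
// servers, which is where the diagram above places the NEW allocation.
// Ties break deterministically by node name for reproducibility.
func leastAllocatedNode(allocatedPerNode map[string]int) string {
	best, bestCount := "", -1
	for node, count := range allocatedPerNode {
		if bestCount == -1 || count < bestCount || (count == bestCount && node < best) {
			best, bestCount = node, count
		}
	}
	return best
}

func main() {
	allocated := map[string]int{"node-1": 3, "node-2": 2, "node-3": 0}
	fmt.Println(leastAllocatedNode(allocated)) // prints "node-3"
}
```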

Performance Characteristics

  • Node Utilization: Moderate (40-60%)
  • Scale-down Efficiency: Poor (all nodes partially utilized)
  • Fault Impact: Lower (fewer game servers per node)
  • Network Locality: Worse (servers spread out)
  • Allocation Speed: Moderate (randomized scan)

Comparison Table

Aspect              | Packed                | Distributed
--------------------|-----------------------|-----------------------
Primary Goal        | Minimize costs        | Maximize availability
Infrastructure      | Cloud / Auto-scaling  | On-premises / Fixed
Node Utilization    | 80-95%                | 40-60%
Fault Tolerance     | Lower                 | Higher
Scale Down          | Aggressive            | Conservative
Cost                | Lower                 | Higher
Node Failure Impact | Many servers affected | Few servers affected
Best For            | Dev, Test, Cloud      | Production, Bare Metal

Fleet-Level Configuration

Set the default scheduling strategy for a Fleet:
apiVersion: agones.dev/v1
kind: Fleet
metadata:
  name: my-fleet
spec:
  replicas: 100
  scheduling: Packed  # or Distributed
  template:
    # GameServer template

Allocation-Level Override

Override the Fleet’s default at allocation time:
apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
spec:
  # Override Fleet default
  scheduling: Distributed
  selectors:
  - matchLabels:
      agones.dev/fleet: my-fleet
The scheduling strategy in GameServerAllocation takes precedence over the Fleet’s default.
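This precedence rule can be sketched as a small helper. It is an illustration of the behaviour described above, not the actual Agones resolution code:

```go
package main

import "fmt"

// SchedulingStrategy mirrors the string type in pkg/apis/scheduling.go.
type SchedulingStrategy string

const (
	Packed      SchedulingStrategy = "Packed"
	Distributed SchedulingStrategy = "Distributed"
)

// effectiveStrategy resolves which strategy an allocation uses: the
// GameServerAllocation's value wins when set; otherwise the Fleet's
// default applies.
func effectiveStrategy(fleetDefault, allocation SchedulingStrategy) SchedulingStrategy {
	if allocation != "" {
		return allocation
	}
	return fleetDefault
}

func main() {
	fmt.Println(effectiveStrategy(Packed, Distributed)) // allocation override wins
	fmt.Println(effectiveStrategy(Packed, ""))          // falls back to Fleet default
}
```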

Advanced: Priority-Based Scheduling

With the CountsAndLists feature, you can define custom priorities:
apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
spec:
  scheduling: Distributed
  selectors:
  - matchLabels:
      agones.dev/fleet: my-fleet
  # Custom priorities for fine-grained control
  priorities:
  - type: Counter
    key: rooms
    order: Ascending  # Prefer servers with fewer rooms
  - type: List
    key: players
    order: Descending  # Then prefer servers with more players
From pkg/gameserverallocations/find.go:69-77:
// For FeatureCountsAndLists we do not do randomized looping -- instead choose the game
// server based on the list of Priorities. (The order in which the game servers were sorted
// in ListSortedGameServersPriorities.)
loop = func(list []*agonesv1.GameServer, f func(i int, gs *agonesv1.GameServer)) {
    for i, gs := range list {
        f(i, gs)
    }
}
Priorities override the random distribution in Distributed mode.
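The idea behind priority-ordered selection can be sketched as a multi-key sort feeding the plain in-order loop shown above. This is a simplified model: it reduces Counters and Lists to plain numeric values, whereas the real ListSortedGameServersPriorities handles the full Counter/List capacity semantics:

```go
package main

import (
	"fmt"
	"sort"
)

// Priority describes one sort key and a direction, loosely modelled
// on the priorities block above.
type Priority struct {
	Key       string
	Ascending bool
}

// GameServer is a simplified stand-in holding only counter values.
type GameServer struct {
	Name     string
	Counters map[string]int64
}

// sortByPriorities orders game servers by each priority in turn, so an
// in-order scan of the result picks the best match first.
func sortByPriorities(list []GameServer, priorities []Priority) []GameServer {
	sorted := append([]GameServer(nil), list...)
	sort.SliceStable(sorted, func(i, j int) bool {
		for _, p := range priorities {
			a, b := sorted[i].Counters[p.Key], sorted[j].Counters[p.Key]
			if a == b {
				continue // tied on this key; try the next priority
			}
			if p.Ascending {
				return a < b
			}
			return a > b
		}
		return false
	})
	return sorted
}

func main() {
	list := []GameServer{
		{Name: "gs-a", Counters: map[string]int64{"rooms": 4, "players": 10}},
		{Name: "gs-b", Counters: map[string]int64{"rooms": 1, "players": 2}},
		{Name: "gs-c", Counters: map[string]int64{"rooms": 1, "players": 8}},
	}
	priorities := []Priority{
		{Key: "rooms", Ascending: true},    // fewer rooms first
		{Key: "players", Ascending: false}, // then more players first
	}
	for _, gs := range sortByPriorities(list, priorities) {
		fmt.Println(gs.Name)
	}
	// gs-c (1 room, 8 players), then gs-b (1 room, 2 players), then gs-a
}
```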

Strategy Selection Guide

1. Assess Infrastructure
   • Cloud with autoscaling? → Packed
   • On-premises or fixed cluster? → Distributed

2. Evaluate Cost Sensitivity
   • Cost optimization critical? → Packed
   • Availability more important? → Distributed

3. Consider Fault Tolerance
   • Can tolerate node failures affecting many servers? → Packed
   • Need minimal impact per failure? → Distributed

4. Test and Measure
   • Run load tests with both strategies
   • Measure costs, availability, and performance
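The first two decision steps can be condensed into a rough rule of thumb. The inputs and the rule itself are illustrative guidance, not an Agones API:

```go
package main

import "fmt"

// chooseStrategy encodes the decision guide above as a rule of thumb:
// autoscaling cloud clusters and cost-sensitive workloads lean Packed;
// everything else (fixed-size, availability-first) leans Distributed.
func chooseStrategy(autoscalingCloud, costCritical bool) string {
	if autoscalingCloud || costCritical {
		return "Packed"
	}
	return "Distributed"
}

func main() {
	fmt.Println(chooseStrategy(true, false))  // cloud with autoscaling → Packed
	fmt.Println(chooseStrategy(false, false)) // fixed on-prem cluster → Distributed
}
```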

Best Practices

Use Packed for Cloud Environments

# GKE/EKS/AKS recommended configuration
apiVersion: agones.dev/v1
kind: Fleet
spec:
  scheduling: Packed
  # Enable cluster autoscaler to scale nodes down

Use Distributed for On-Premises

# Bare metal recommended configuration
apiVersion: agones.dev/v1
kind: Fleet
spec:
  scheduling: Distributed
  # Spread load across all available nodes

Combine with Node Affinity

apiVersion: agones.dev/v1
kind: Fleet
spec:
  scheduling: Packed
  template:
    spec:
      template:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: agones.dev/gameserver
                    operator: In
                    values:
                    - "true"

Monitor Strategy Effectiveness

# Prometheus queries

# Average game servers per node (agones_gameservers_node_count is a histogram)
sum(agones_gameservers_node_count_sum) / sum(agones_gameservers_node_count_count)

# 95th percentile of game servers per node
histogram_quantile(0.95,
  sum(rate(agones_gameservers_node_count_bucket[5m])) by (le)
)

# Allocation latency by strategy
avg(allocation_duration_seconds) by (scheduling_strategy)

Changing Strategies

Changing a Fleet’s scheduling strategy does not immediately redistribute existing game servers. It only affects new allocations.
To redistribute:
# Update Fleet
kubectl patch fleet my-fleet -p '{"spec":{"scheduling":"Distributed"}}'

# Trigger recreation if needed (Ready servers are replaced;
# Allocated servers persist until they shut down)
kubectl scale fleet my-fleet --replicas=0
kubectl scale fleet my-fleet --replicas=100