Agones exports metrics through OpenCensus to Prometheus and Stackdriver. Agones-specific metrics are prefixed with agones_ and carry labels for filtering by namespace, fleet, and other dimensions.

GameServer Metrics

Metrics tracking individual GameServer instances and their lifecycle.

agones_gameservers_count

Type: Gauge
Description: Current number of GameServers by state
Labels:
  • type - GameServer state: Creating, Starting, Scheduled, RequestReady, Ready, Shutdown, Error, Unhealthy, Reserved, Allocated
  • fleet_name - Name of the parent Fleet (or none for standalone GameServers)
  • namespace - Kubernetes namespace
Example queries:
# Total ready GameServers
agones_gameservers_count{type="Ready"}

# Allocated GameServers in a specific fleet
agones_gameservers_count{type="Allocated", fleet_name="my-fleet", namespace="default"}

# Total GameServers across all states
sum(agones_gameservers_count)

agones_gameservers_total

Type: Counter
Description: Total count of GameServer state transitions (incremented each time a GameServer changes state)
Labels:
  • type - State transitioned to
  • fleet_name - Parent Fleet name
  • namespace - Kubernetes namespace
Example queries:
# Rate of GameServers entering Ready state
rate(agones_gameservers_total{type="Ready"}[5m])

# Error rate per fleet (by must be attached to an aggregation operator)
sum by (fleet_name) (rate(agones_gameservers_total{type="Error"}[5m]))
Unlike agones_gameservers_count (gauge), agones_gameservers_total is a counter that only increases, making it ideal for calculating rates of state transitions.

agones_gameserver_state_duration

Type: Histogram
Description: Time (in seconds) that GameServers spend in each state
Labels:
  • type - GameServer state
  • fleet_name - Parent Fleet name
  • namespace - Kubernetes namespace
Buckets: 0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384 seconds
Example queries:
# Average time in Ready state
sum(rate(agones_gameserver_state_duration_sum{type="Ready"}[5m])) /
sum(rate(agones_gameserver_state_duration_count{type="Ready"}[5m]))

# 95th percentile time in Starting state
histogram_quantile(0.95, 
  sum(rate(agones_gameserver_state_duration_bucket{type="Starting"}[5m])) by (le)
)

agones_gameserver_player_connected_total

Type: Gauge
Description: Current number of players connected (requires PlayerTracking feature)
Labels:
  • fleet_name - Parent Fleet name
  • name - GameServer name
  • namespace - Kubernetes namespace

agones_gameserver_player_capacity_total

Type: Gauge
Description: Available player capacity (max - current) per GameServer
Labels:
  • fleet_name - Parent Fleet name
  • name - GameServer name
  • namespace - Kubernetes namespace
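
The two player gauges can be combined to estimate how full each fleet is. The queries below are a sketch that uses only the metric and label names listed above. It assumes the capacity gauge reports remaining slots (max - current), as the description states, so that connected plus capacity recovers the configured maximum.

```promql
# Players currently connected, per fleet
sum by (fleet_name, namespace) (agones_gameserver_player_connected_total)

# Fraction of player slots in use, per fleet
# (connected + remaining capacity = configured maximum, under the assumption above)
sum by (fleet_name, namespace) (agones_gameserver_player_connected_total)
/
(
  sum by (fleet_name, namespace) (agones_gameserver_player_connected_total)
  + sum by (fleet_name, namespace) (agones_gameserver_player_capacity_total)
)
```

A fleet with no configured player capacity yields NaN for the second query, because both sides of the division are zero.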

Fleet Metrics

Metrics for Fleet resources and replica management.

agones_fleets_replicas_count

Type: Gauge
Description: Number of replicas in a Fleet by type
Labels:
  • name - Fleet name
  • type - Replica type: total, allocated, ready, desired, reserved
  • namespace - Kubernetes namespace
Example queries:
# Total replicas vs desired
agones_fleets_replicas_count{name="my-fleet", type="total"}
agones_fleets_replicas_count{name="my-fleet", type="desired"}

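# Allocation percentage
# (ignoring(type) is required: the two sides differ in their type label)
(agones_fleets_replicas_count{type="allocated"}
 / ignoring(type)
 agones_fleets_replicas_count{type="total"}) * 100

# Ready replicas minus allocated replicas
agones_fleets_replicas_count{type="ready"}
- ignoring(type)
agones_fleets_replicas_count{type="allocated"}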

agones_fleet_rollout_percent

Type: Gauge
Description: Current progress of Fleet rollout
Labels:
  • name - Fleet name
  • type - current_replicas or desired_replicas
  • namespace - Kubernetes namespace
Example queries:
# Rollout completion percentage
(sum(agones_fleet_rollout_percent{type="current_replicas"}) /
 sum(agones_fleet_rollout_percent{type="desired_replicas"})) * 100

agones_fleet_counters

Type: Gauge
Description: Aggregated Counter values across GameServers in a Fleet (CountersAndLists feature)
Labels:
  • name - Fleet name
  • namespace - Kubernetes namespace
  • type - allocated_count, allocated_capacity, total_count, total_capacity
  • counter - Counter name
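
As a possible starting point, the ratio of two type values gives the utilization of a Counter across allocated GameServers. The counter name players is only a placeholder, and ignoring(type) is needed because the two sides differ in their type label.

```promql
# Share of allocated Counter capacity currently in use, per fleet
# ("players" is a placeholder counter name)
agones_fleet_counters{counter="players", type="allocated_count"}
/ ignoring(type)
agones_fleet_counters{counter="players", type="allocated_capacity"}
```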

agones_fleet_lists

Type: Gauge
Description: Aggregated List values across GameServers in a Fleet (CountersAndLists feature)
Labels:
  • name - Fleet name
  • namespace - Kubernetes namespace
  • type - allocated_count, allocated_capacity, total_count, total_capacity
  • list - List name
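
A sketch of how the List gauges might be used to track remaining room across a fleet. The list name sessions is a placeholder, and the query assumes total_count and total_capacity are reported in the same units.

```promql
# Remaining List slots per fleet ("sessions" is a placeholder list name)
agones_fleet_lists{list="sessions", type="total_capacity"}
- ignoring(type)
agones_fleet_lists{list="sessions", type="total_count"}
```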

Allocation Metrics

Metrics for GameServer allocation requests and performance.

agones_gameserver_allocations_duration_seconds

Type: Histogram
Description: Time (in seconds) to complete allocation requests
Labels:
  • fleet_name - Target Fleet name
  • status - Allocation result: Allocated, Contention, NoGameServers
  • namespace - Kubernetes namespace (optional)
Buckets: Standard exponential buckets
Example queries:
# Average allocation latency
sum(rate(agones_gameserver_allocations_duration_seconds_sum[5m])) /
sum(rate(agones_gameserver_allocations_duration_seconds_count[5m]))

# 99th percentile latency
histogram_quantile(0.99,
  sum(rate(agones_gameserver_allocations_duration_seconds_bucket[5m])) by (le)
)

# Allocation success rate
sum(rate(agones_gameserver_allocations_duration_seconds_count{status="Allocated"}[5m])) /
sum(rate(agones_gameserver_allocations_duration_seconds_count[5m]))

# Rate of unsuccessful allocations, broken down by status
sum by (status) (rate(agones_gameserver_allocations_duration_seconds_count{status!="Allocated"}[5m]))
High allocation latency (>3 seconds at 99th percentile) may indicate insufficient ready GameServers or resource constraints.
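
The 3-second guideline can be written as a threshold expression, for example as the condition of an alerting rule. This sketch reuses the 99th percentile query from this section. The 5m window is a choice, not a requirement.

```promql
# Returns a value only while 99th percentile allocation latency exceeds 3 seconds
histogram_quantile(0.99,
  sum by (le) (rate(agones_gameserver_allocations_duration_seconds_bucket[5m]))
) > 3
```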

FleetAutoscaler Metrics

Metrics for FleetAutoscaler resources and scaling behavior.

agones_fleet_autoscalers_current_replicas_count

Type: Gauge
Description: Current number of replicas as seen by the autoscaler
Labels:
  • name - FleetAutoscaler name
  • fleet_name - Target Fleet name
  • namespace - Kubernetes namespace

agones_fleet_autoscalers_desired_replicas_count

Type: Gauge
Description: Desired number of replicas calculated by the autoscaler
Labels:
  • name - FleetAutoscaler name
  • fleet_name - Target Fleet name
  • namespace - Kubernetes namespace

agones_fleet_autoscalers_able_to_scale

Type: Gauge
Description: Whether the autoscaler can access the Fleet (1 = true, 0 = false)
Labels:
  • name - FleetAutoscaler name
  • fleet_name - Target Fleet name
  • namespace - Kubernetes namespace

agones_fleet_autoscalers_limited

Type: Gauge
Description: Whether the autoscaler is limited by min/max constraints (1 = true, 0 = false)
Labels:
  • name - FleetAutoscaler name
  • fleet_name - Target Fleet name
  • namespace - Kubernetes namespace
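
Because both status gauges take the values 1 and 0, they can be filtered or averaged directly. The following is a sketch. The 1h window is arbitrary, and the average is taken over scraped samples, so it only approximates wall-clock time.

```promql
# Autoscalers that currently cannot access their Fleet
agones_fleet_autoscalers_able_to_scale == 0

# Fraction of the last hour each autoscaler spent limited by min/max
avg_over_time(agones_fleet_autoscalers_limited[1h])
```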

agones_fleet_autoscalers_buffer_limits

Type: Gauge
Description: Configured min/max replica limits
Labels:
  • name - FleetAutoscaler name
  • type - min or max
  • fleet_name - Target Fleet name
  • namespace - Kubernetes namespace
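
One way to see how close an autoscaler is to its ceiling is to subtract the desired replica count from the configured maximum. This sketch assumes both metrics share the name, fleet_name, and namespace labels listed in this section, so that ignoring(type) is enough to match them one to one.

```promql
# Replicas left before each autoscaler reaches its configured maximum
agones_fleet_autoscalers_buffer_limits{type="max"}
- ignoring(type)
agones_fleet_autoscalers_desired_replicas_count
```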

agones_fleet_autoscalers_buffer_size

Type: Gauge
Description: Configured buffer size
Labels:
  • name - FleetAutoscaler name
  • type - count (absolute) or percentage
  • fleet_name - Target Fleet name
  • namespace - Kubernetes namespace
Example queries:
# Autoscaler scaling gap
agones_fleet_autoscalers_desired_replicas_count -
agones_fleet_autoscalers_current_replicas_count

# Number of times the limited flag changed in the last hour
changes(agones_fleet_autoscalers_limited[1h])

Node Metrics

Metrics for Kubernetes node utilization.

agones_nodes_count

Type: Gauge
Description: Number of nodes in the cluster
Labels:
  • empty - true (no GameServers) or false (has GameServers)
Example queries:
# Total nodes available for GameServers
sum(agones_nodes_count)

# Node utilization percentage
# (both sides are aggregated so their label sets match)
(sum(agones_nodes_count{empty="false"}) / sum(agones_nodes_count)) * 100

agones_gameservers_node_count

Type: Histogram
Description: Distribution of GameServers per node
Buckets: 0.00001, 1.00001, 2.00001, …, 120.00001
Example queries:
# Maximum GameServers on any node
histogram_quantile(1.0, sum(rate(agones_gameservers_node_count_bucket[1m])) by (le))

# Average GameServers per node
# (_sum and _count are counters, so use rate() rather than delta())
sum(rate(agones_gameservers_node_count_sum[1m])) /
sum(rate(agones_gameservers_node_count_count[1m]))

# 90th percentile node density
histogram_quantile(0.90, sum(rate(agones_gameservers_node_count_bucket[1m])) by (le))

System Metrics

Standard Prometheus metrics for Agones controller processes.

Process Metrics

  • process_cpu_seconds_total - Total user and system CPU time
  • process_resident_memory_bytes - Resident memory size
  • process_open_fds - Number of open file descriptors
  • process_max_fds - Maximum allowed file descriptors

Go Runtime Metrics

  • go_goroutines - Number of goroutines
  • go_threads - Number of OS threads
  • go_memstats_alloc_bytes - Bytes allocated and in use
  • go_memstats_heap_objects - Number of allocated objects
  • go_gc_duration_seconds - GC pause duration

HTTP Metrics

  • promhttp_metric_handler_requests_total - Total scrape requests
  • promhttp_metric_handler_requests_in_flight - Current scrape requests
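
A few example queries over these system metrics are sketched below. They are not restricted to Agones processes as written. In practice a label selector for the Agones scrape targets would be added, and its label names depend on the Prometheus scrape configuration.

```promql
# CPU cores consumed by each scraped process
rate(process_cpu_seconds_total[5m])

# Fraction of the file descriptor limit in use
process_open_fds / process_max_fds

# Scrape requests per second handled by the metrics endpoint
rate(promhttp_metric_handler_requests_total[5m])
```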

Metric Retention

Agones metrics are reported with the following characteristics:

Reporting Period

  • Prometheus: 15 seconds (default)
  • Stackdriver: 60 seconds (minimum)

Cardinality

  • Fleet-level metrics: ~10-20 per Fleet
  • GameServer-level: ~5-10 per GameServer
  • Node-level: ~2-5 per node
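
The figures above are estimates. The actual series count on a given Prometheus server can be measured by matching on the metric name, as in this sketch that relies on the agones_ prefix:

```promql
# Total number of active Agones series
count({__name__=~"agones_.+"})

# Series per metric name, largest first
sort_desc(count by (__name__) ({__name__=~"agones_.+"}))
```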

Metric Labels

Common labels across Agones metrics:
| Label | Description | Example Values |
| --- | --- | --- |
| name | Resource name | my-fleet, my-autoscaler |
| fleet_name | Parent Fleet name | game-servers, none |
| namespace | Kubernetes namespace | default, production |
| type | State or category | Ready, Allocated, allocated |
| status | Operation result | Allocated, NoGameServers |
| counter | Counter name | players, rooms |
| list | List name | sessions, maps |

Querying Best Practices

Always use rate() or increase() when querying counter metrics:
# Good
rate(agones_gameservers_total[5m])

# Bad - raw counter value is not useful
agones_gameservers_total
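
Where rate() gives a per-second value, increase() gives the total growth over the window, which is often easier to read for infrequent events. A sketch using the same counter:

```promql
# GameServers that entered the Error state over the past hour, per fleet
sum by (fleet_name) (increase(agones_gameservers_total{type="Error"}[1h]))
```

Note that increase() extrapolates to the edges of the window, so the result can be fractional.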
Match the range window to the use case:
  • Use 5m for real-time monitoring
  • Use 1h for trending
  • Use 1d+ for capacity planning
# Real-time allocation rate
rate(agones_gameserver_allocations_duration_seconds_count[5m])

# Daily allocation trend
rate(agones_gameserver_allocations_duration_seconds_count[1d])
Use aggregation operators to reduce cardinality:
# Total ready GameServers across all fleets
sum(agones_gameservers_count{type="Ready"})

# Ready GameServers per fleet
sum by (fleet_name) (agones_gameservers_count{type="Ready"})

# Ready GameServers per namespace
sum by (namespace) (agones_gameservers_count{type="Ready"})

Next Steps

Monitoring Setup

Configure Prometheus and Grafana

Troubleshooting

Debug issues using metrics
