agones_ and include labels for filtering by namespace, fleet, and other dimensions.
GameServer Metrics
Metrics tracking individual GameServer instances and their lifecycle.
agones_gameservers_count
Type: Gauge
Description: Current number of GameServers by state
Labels:
- type: GameServer state, one of Creating, Starting, Scheduled, RequestReady, Ready, Shutdown, Error, Unhealthy, Reserved, Allocated
- fleet_name: Name of the parent Fleet (or none for standalone GameServers)
- namespace: Kubernetes namespace
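As a rough illustration of reading this gauge, the queries below sum GameServers per Fleet for one state and flag unhealthy ones. They use only the metric and label names listed above; the choice of states to filter on is an assumption for the example.

```promql
# Ready GameServers per namespace and Fleet
sum by (namespace, fleet_name) (agones_gameservers_count{type="Ready"})

# Fleets that currently have GameServers in Error or Unhealthy
sum by (namespace, fleet_name) (agones_gameservers_count{type=~"Error|Unhealthy"}) > 0
```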
agones_gameservers_total
Type: Counter
Description: Total count of GameServer state transitions (incremented each time a GameServer changes state)
Labels:
- type: State transitioned to
- fleet_name: Parent Fleet name
- namespace: Kubernetes namespace
Unlike agones_gameservers_count (gauge), agones_gameservers_total is a counter that only increases, making it ideal for calculating rates of state transitions.
agones_gameserver_state_duration
Type: Histogram
Description: Time (in seconds) that GameServers spend in each state
Labels:
- type: GameServer state
- fleet_name: Parent Fleet name
- namespace: Kubernetes namespace
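A histogram like this is usually queried through histogram_quantile(). The sketch below assumes the metric is exposed in the standard Prometheus histogram form, with a _bucket series carrying an le label; the 0.95 quantile and 5m window are arbitrary choices.

```promql
# Approximate 95th percentile of time spent in each state (seconds)
histogram_quantile(
  0.95,
  sum by (le, type) (rate(agones_gameserver_state_duration_bucket[5m]))
)
```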
agones_gameserver_player_connected_total
Type: Gauge
Description: Current number of players connected (requires PlayerTracking feature)
Labels:
- fleet_name: Parent Fleet name
- name: GameServer name
- namespace: Kubernetes namespace
agones_gameserver_player_capacity_total
Type: Gauge
Description: Available player capacity (max - current) per GameServer
Labels:
- fleet_name: Parent Fleet name
- name: GameServer name
- namespace: Kubernetes namespace
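Because the capacity gauge reports remaining capacity (max - current), the maximum can be recovered as connected plus capacity. A possible per-Fleet utilization query, assuming both gauges are populated for the same GameServers:

```promql
# Fraction of player slots in use per Fleet (0 to 1)
sum by (namespace, fleet_name) (agones_gameserver_player_connected_total)
/
(
  sum by (namespace, fleet_name) (agones_gameserver_player_connected_total)
  + sum by (namespace, fleet_name) (agones_gameserver_player_capacity_total)
)
```

The result is NaN for Fleets where both sums are zero.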
Fleet Metrics
Metrics for Fleet resources and replica management.
agones_fleets_replicas_count
Type: Gauge
Description: Number of replicas in a Fleet by type
Labels:
- name: Fleet name
- type: Replica type, one of total, allocated, ready, desired, reserved
- namespace: Kubernetes namespace
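Since every replica type is a separate series distinguished only by the type label, ratios between types need that label ignored during matching. A sketch of an allocation ratio per Fleet:

```promql
# Share of each Fleet's replicas that are allocated (0 to 1)
agones_fleets_replicas_count{type="allocated"}
  / ignoring (type)
agones_fleets_replicas_count{type="total"}
```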
agones_fleet_rollout_percent
Type: Gauge
Description: Current progress of Fleet rollout
Labels:
- name: Fleet name
- type: current_replicas or desired_replicas
- namespace: Kubernetes namespace
agones_fleet_counters
Type: Gauge
Description: Aggregated Counter values across GameServers in a Fleet (CountersAndLists feature)
Labels:
- name: Fleet name
- namespace: Kubernetes namespace
- type: allocated_count, allocated_capacity, total_count, total_capacity
- counter: Counter name
agones_fleet_lists
Type: Gauge
Description: Aggregated List values across GameServers in a Fleet (CountersAndLists feature)
Labels:
- name: Fleet name
- namespace: Kubernetes namespace
- type: allocated_count, allocated_capacity, total_count, total_capacity
- list: List name
Allocation Metrics
Metrics for GameServer allocation requests and performance.
agones_gameserver_allocations_duration_seconds
Type: Histogram
Description: Time (in seconds) to complete allocation requests
Labels:
- fleet_name: Target Fleet name
- status: Allocation result, one of Allocated, Contention, NoGameServers
- namespace: Kubernetes namespace (optional)
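Two common readings of an allocation histogram are tail latency and the share of requests that did not end in Allocated. The queries below are a sketch that assumes the standard Prometheus _bucket and _count series exist for this metric; the quantile and window are illustrative.

```promql
# Approximate 99th percentile allocation latency (seconds)
histogram_quantile(
  0.99,
  sum by (le) (rate(agones_gameserver_allocations_duration_seconds_bucket[5m]))
)

# Fraction of allocation requests that did not result in Allocated
sum(rate(agones_gameserver_allocations_duration_seconds_count{status!="Allocated"}[5m]))
/
sum(rate(agones_gameserver_allocations_duration_seconds_count[5m]))
```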
FleetAutoscaler Metrics
Metrics for FleetAutoscaler resources and scaling behavior.
agones_fleet_autoscalers_current_replicas_count
Type: Gauge
Description: Current number of replicas as seen by the autoscaler
Labels:
- name: FleetAutoscaler name
- fleet_name: Target Fleet name
- namespace: Kubernetes namespace
agones_fleet_autoscalers_desired_replicas_count
Type: Gauge
Description: Desired number of replicas calculated by the autoscaler
Labels:
- name: FleetAutoscaler name
- fleet_name: Target Fleet name
- namespace: Kubernetes namespace
agones_fleet_autoscalers_able_to_scale
Type: Gauge
Description: Whether the autoscaler can access the Fleet (1 = true, 0 = false)
Labels:
- name: FleetAutoscaler name
- fleet_name: Target Fleet name
- namespace: Kubernetes namespace
agones_fleet_autoscalers_limited
Type: Gauge
Description: Whether the autoscaler is limited by min/max constraints (1 = true, 0 = false)
Labels:
- name: FleetAutoscaler name
- fleet_name: Target Fleet name
- namespace: Kubernetes namespace
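The four gauges above share the same label set, so they can be compared directly. The following expressions are one way to surface autoscalers that need attention; they are illustrative rather than recommended alert rules.

```promql
# Gap between desired and current replicas per autoscaler
agones_fleet_autoscalers_desired_replicas_count
  - agones_fleet_autoscalers_current_replicas_count

# Autoscalers that cannot access their Fleet
agones_fleet_autoscalers_able_to_scale == 0

# Autoscalers currently capped by their min/max constraints
agones_fleet_autoscalers_limited == 1
```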
agones_fleet_autoscalers_buffer_limits
Type: Gauge
Description: Configured min/max replica limits
Labels:
- name: FleetAutoscaler name
- type: min or max
- fleet_name: Target Fleet name
- namespace: Kubernetes namespace
agones_fleet_autoscalers_buffer_size
Type: Gauge
Description: Configured buffer size
Labels:
- name: FleetAutoscaler name
- type: count (absolute) or percentage
- fleet_name: Target Fleet name
- namespace: Kubernetes namespace
Node Metrics
Metrics for Kubernetes node utilization.
agones_nodes_count
Type: Gauge
Description: Number of nodes in the cluster
Labels:
- empty: true (no GameServers) or false (has GameServers)
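With the empty label splitting nodes into two series, the share of idle nodes can be estimated as below. This is a sketch using only the metric and label shown above.

```promql
# Fraction of nodes that are running no GameServers (0 to 1)
sum(agones_nodes_count{empty="true"}) / sum(agones_nodes_count)
```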
agones_gameservers_node_count
Type: Histogram
Description: Distribution of GameServers per node
Buckets: 0.00001, 1.00001, 2.00001, …, 120.00001
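Possible queries against this histogram are sketched below. They assume the standard Prometheus _bucket, _sum, and _count series are exposed for the metric; the quantile and window are arbitrary.

```promql
# Approximate 90th percentile of GameServers per node
histogram_quantile(
  0.9,
  sum by (le) (rate(agones_gameservers_node_count_bucket[5m]))
)

# Average GameServers per node over the last 5 minutes
sum(rate(agones_gameservers_node_count_sum[5m]))
/
sum(rate(agones_gameservers_node_count_count[5m]))
```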
System Metrics
Standard Prometheus metrics for Agones controller processes.
Process Metrics
- process_cpu_seconds_total: Total user and system CPU time
- process_resident_memory_bytes: Resident memory size
- process_open_fds: Number of open file descriptors
- process_max_fds: Maximum allowed file descriptors
Go Runtime Metrics
- go_goroutines: Number of goroutines
- go_threads: Number of OS threads
- go_memstats_alloc_bytes: Bytes allocated and in use
- go_memstats_heap_objects: Number of allocated objects
- go_gc_duration_seconds: GC pause duration
HTTP Metrics
- promhttp_metric_handler_requests_total: Total scrape requests
- promhttp_metric_handler_requests_in_flight: Current scrape requests
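These process-level metrics are not specific to Agones, so in practice they would be filtered by whatever job or pod labels your scrape configuration attaches. The unfiltered sketches below show the general shape; add a label matcher suited to your setup.

```promql
# CPU cores consumed, averaged over 5 minutes
rate(process_cpu_seconds_total[5m])

# File descriptor usage as a fraction of the limit
process_open_fds / process_max_fds

# Scrape requests per second handled by the metrics endpoint
sum(rate(promhttp_metric_handler_requests_total[5m]))
```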
Metric Retention
Agones metrics are reported with the following characteristics:
Reporting Period
- Prometheus: 15 seconds (default)
- Stackdriver: 60 seconds (minimum)
Cardinality
- Fleet-level metrics: ~10-20 per Fleet
- GameServer-level: ~5-10 per GameServer
- Node-level: ~2-5 per node
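To compare these estimates with what a Prometheus server actually stores, the number of series per metric name can be counted. This sketch relies on the agones_ prefix shared by the metrics on this page.

```promql
# Number of active series per Agones metric name
count by (__name__) ({__name__=~"agones_.+"})
```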
Metric Labels
Common labels across Agones metrics:

| Label | Description | Example Values |
|---|---|---|
| name | Resource name | my-fleet, my-autoscaler |
| fleet_name | Parent Fleet name | game-servers, none |
| namespace | Kubernetes namespace | default, production |
| type | State or category | Ready, Allocated, allocated |
| status | Operation result | Allocated, NoGameServers |
| counter | Counter name | players, rooms |
| list | List name | sessions, maps |
Querying Best Practices
Use rate() for counters
Always use rate() or increase() when querying counter metrics.
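For instance, applied to the agones_gameservers_total counter described earlier, the two functions might be used as follows. The windows and the Allocated filter are illustrative.

```promql
# State transitions per second, by target state
sum by (type) (rate(agones_gameservers_total[5m]))

# Transitions into Allocated over the last hour, per Fleet
sum by (fleet_name) (increase(agones_gameservers_total{type="Allocated"}[1h]))
```

Graphing the raw counter instead would show an ever-growing line that resets whenever the exporting process restarts, which is why rate() and increase() are preferred.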
Choose appropriate time ranges
- Use 5m for real-time monitoring
- Use 1h for trending
- Use 1d+ for capacity planning
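The same metrics can be read at each of these horizons by changing the range selector. The queries below are a sketch; the Allocated and allocated filters are example choices.

```promql
# Real-time: allocation transitions per second over 5 minutes
sum(rate(agones_gameservers_total{type="Allocated"}[5m]))

# Trending: the same rate smoothed over 1 hour
sum(rate(agones_gameservers_total{type="Allocated"}[1h]))

# Capacity planning: peak allocated replicas per Fleet over 1 day
max_over_time(agones_fleets_replicas_count{type="allocated"}[1d])
```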
Aggregate across dimensions
Use aggregation operators to reduce cardinality.
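As a sketch, the by and without modifiers can collapse per-GameServer or per-state series into a handful of per-Fleet results:

```promql
# One series per Fleet instead of one per state
sum by (fleet_name) (agones_gameservers_count)

# Drop the per-GameServer name label, keeping all other labels
sum without (name) (agones_gameserver_player_connected_total)
```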
Next Steps
Monitoring Setup
Configure Prometheus and Grafana
Troubleshooting
Debug issues using metrics
