Overview
Gate is designed to scale horizontally, allowing you to handle thousands of concurrent players by adding more proxy instances. This guide covers autoscaling strategies and load balancing configurations.
Horizontal Pod Autoscaling
Prerequisites
Ensure the Metrics Server is installed:
```bash
# Check if metrics-server is running
kubectl get deployment metrics-server -n kube-system

# If not installed, deploy it
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```
CPU-Based Autoscaling
Create an HPA based on CPU utilization:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gate-hpa
  labels:
    app.kubernetes.io/name: gate
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gate
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 30
        - type: Pods
          value: 2
          periodSeconds: 30
      selectPolicy: Max
```
Apply the HPA:
```bash
kubectl apply -f gate-hpa.yaml

# Monitor autoscaling
kubectl get hpa gate-hpa --watch
```
Memory-Based Autoscaling
Scale based on memory usage:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gate-hpa-memory
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gate
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
Combined Metrics
Scale based on multiple metrics:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gate-hpa-combined
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gate
  minReplicas: 2
  maxReplicas: 20
  metrics:
    # CPU metric
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    # Memory metric
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  # Advanced scaling behavior
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
        - type: Pods
          value: 1
          periodSeconds: 120
      selectPolicy: Min
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max
```
Custom Metrics
Using Prometheus Adapter
Scale based on custom metrics like active player count:
Install Prometheus and the adapter:
```bash
# Add Prometheus Helm repo
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install Prometheus
helm install prometheus prometheus-community/kube-prometheus-stack

# Install Prometheus Adapter
helm install prometheus-adapter prometheus-community/prometheus-adapter
```
Configure custom metrics:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gate-hpa-players
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gate
  minReplicas: 2
  maxReplicas: 15
  metrics:
    # Scale based on active players per pod
    - type: Pods
      pods:
        metric:
          name: gate_active_players
        target:
          type: AverageValue
          averageValue: "100" # 100 players per pod
```
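With its default rules, the adapter will not necessarily expose `gate_active_players` through the custom metrics API; it usually needs a discovery rule. Below is a sketch of Helm values for prometheus-adapter, assuming Gate exports a gauge with that name carrying standard `namespace` and `pod` labels (the metric name and label set are assumptions, not confirmed by Gate's metrics reference):

```yaml
# prometheus-adapter-values.yaml (a sketch; adjust to your metric's actual labels)
# helm upgrade prometheus-adapter prometheus-community/prometheus-adapter -f prometheus-adapter-values.yaml
rules:
  custom:
    - seriesQuery: 'gate_active_players{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: { resource: "namespace" }
          pod: { resource: "pod" }
      name:
        matches: "^gate_active_players$"
        as: "gate_active_players"
      metricsQuery: 'max(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```

You can check whether the metric is discoverable with `kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1`.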
Load Balancing Strategies
NodePort Service
Basic load balancing using NodePort:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: gate
  labels:
    app.kubernetes.io/name: gate
spec:
  type: NodePort
  selector:
    app.kubernetes.io/component: proxy
    app.kubernetes.io/name: gate
  ports:
    - port: 25565
      targetPort: minecraft
      protocol: TCP
      name: minecraft
      nodePort: 32556
  # Session affinity for player connections
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
```
NodePort exposes the service on each node’s IP at a static port. Use this for testing or when you have external load balancing.
LoadBalancer Service
Cloud provider load balancer:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: gate
  labels:
    app.kubernetes.io/name: gate
  annotations:
    # Use an AWS Network Load Balancer (L4, suited to Minecraft's TCP traffic)
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/component: proxy
    app.kubernetes.io/name: gate
  ports:
    - port: 25565
      targetPort: minecraft
      protocol: TCP
      name: minecraft
  # Preserve client IP for proper player tracking
  externalTrafficPolicy: Local
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 7200 # 2 hours
```
Using externalTrafficPolicy: Local preserves client IPs but may cause uneven load distribution. Consider this trade-off based on your requirements.
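If even distribution matters more than preserving source IPs at the service layer (for example, when real client addresses reach Gate another way, such as via the PROXY protocol), the default policy avoids the imbalance. A sketch of the relevant service fields:

```yaml
spec:
  type: LoadBalancer
  # Cluster (the default) spreads connections across all pods,
  # at the cost of a possible extra network hop and SNAT of the client IP.
  externalTrafficPolicy: Cluster
```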
Cloud-Specific Configurations
AWS Network Load Balancer
```yaml
apiVersion: v1
kind: Service
metadata:
  name: gate
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "tcp"
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "3600"
    # Use internal NLB for private networks
    # service.beta.kubernetes.io/aws-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app.kubernetes.io/component: proxy
  ports:
    - port: 25565
      targetPort: minecraft
      protocol: TCP
```
Google Cloud Load Balancer
```yaml
apiVersion: v1
kind: Service
metadata:
  name: gate
  annotations:
    # An external load balancer is the default; uncomment for an internal one:
    # cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app.kubernetes.io/component: proxy
  ports:
    - port: 25565
      targetPort: minecraft
      protocol: TCP
```
Azure Load Balancer
```yaml
apiVersion: v1
kind: Service
metadata:
  name: gate
  annotations:
    # Idle timeout in minutes (Azure allows 4-30)
    service.beta.kubernetes.io/azure-load-balancer-tcp-idle-timeout: "30"
    # For internal load balancer
    # service.beta.kubernetes.io/azure-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app.kubernetes.io/component: proxy
  ports:
    - port: 25565
      targetPort: minecraft
      protocol: TCP
```
Pod Disruption Budget
Ensure high availability during maintenance:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gate-pdb
  labels:
    app.kubernetes.io/name: gate
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app.kubernetes.io/component: proxy
      app.kubernetes.io/name: gate
```
Alternatively, specify maximum unavailable:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gate-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: proxy
      app.kubernetes.io/name: gate
```
Pod Anti-Affinity
Distribute pods across nodes for better resilience:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gate
spec:
  replicas: 3
  template:
    spec:
      affinity:
        # Both rules live under a single podAntiAffinity key
        # (duplicate mapping keys are invalid YAML)
        podAntiAffinity:
          # Prefer spreading across nodes
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app.kubernetes.io/component
                      operator: In
                      values:
                        - proxy
                topologyKey: kubernetes.io/hostname
          # Require spreading across availability zones
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/component
                    operator: In
                    values:
                      - proxy
              topologyKey: topology.kubernetes.io/zone
      containers:
        - name: gate
          image: ghcr.io/minekube/gate:latest
          # ... rest of container spec
```
Topology Spread Constraints
Modern alternative to pod anti-affinity:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gate
spec:
  template:
    spec:
      topologySpreadConstraints:
        # Spread across nodes
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app.kubernetes.io/component: proxy
        # Spread across zones
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app.kubernetes.io/component: proxy
      containers:
        - name: gate
          # ... container spec
```
Monitoring Scaling
View HPA Status
```bash
# Watch HPA in real-time
kubectl get hpa --watch

# Describe HPA for detailed information
kubectl describe hpa gate-hpa

# View HPA events
kubectl get events --field-selector involvedObject.name=gate-hpa
```
Metrics
```bash
# View current resource usage
kubectl top pods -l app.kubernetes.io/component=proxy

# View node resource usage
kubectl top nodes
```
Resource Requests and Limits
Set appropriate resource requests for autoscaling:
```yaml
resources:
  requests:
    memory: "1Gi"
    cpu: "500m" # Critical for CPU-based HPA
  limits:
    memory: "2Gi"
    cpu: "2000m"
```
The HPA calculates target replicas based on resource requests, not limits. Ensure requests accurately reflect your baseline requirements.
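Concretely, the HPA converges toward `desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization)`, where utilization is measured as a percentage of the pod's requests. A quick illustration of the arithmetic (ignoring behavior policies and stabilization windows):

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float) -> int:
    """Replica count the HPA converges toward for a utilization metric."""
    return math.ceil(current_replicas * current_utilization / target_utilization)

# 4 pods averaging 90% CPU of their 500m request, targeting 70%:
print(desired_replicas(4, 90, 70))  # -> 6
```

This is why undersized requests cause premature scale-ups: the same absolute CPU usage reads as a higher utilization percentage.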
Quality of Service (QoS)
Ensure Guaranteed QoS for stable performance:
```yaml
resources:
  requests:
    memory: "2Gi"
    cpu: "1000m"
  limits:
    memory: "2Gi" # Same as requests
    cpu: "1000m" # Same as requests
```
Scaling Best Practices
- **Start Conservative**: Begin with 2-3 replicas and adjust based on actual load patterns.
- **Monitor Metrics**: Track CPU, memory, and player count metrics over time.
- **Set Appropriate Thresholds**: Target 60-70% CPU utilization for optimal scaling headroom.
- **Configure Session Affinity**: Use ClientIP session affinity to keep players on the same proxy.
- **Implement a PDB**: Ensure minimum availability during node maintenance.
- **Test Scaling**: Simulate load to verify autoscaling behavior before production.
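One way to simulate load without real clients is to open many raw TCP connections against the service. A minimal sketch (the address below is a placeholder; note this only exercises connection handling, not Minecraft protocol work, so it may not move a CPU-based HPA much):

```python
import socket

def open_player_connections(host: str, port: int, count: int) -> list:
    """Open `count` idle TCP connections to simulate concurrent players."""
    conns = []
    for _ in range(count):
        conns.append(socket.create_connection((host, port), timeout=5))
    return conns

def close_all(conns: list) -> None:
    """Release all simulated player connections."""
    for conn in conns:
        conn.close()

# Example: point at your LoadBalancer address while running
# `kubectl get hpa gate-hpa --watch` in another terminal:
# conns = open_player_connections("203.0.113.10", 25565, 500)
```

For realistic CPU load, a protocol-aware load tool that completes the Minecraft handshake is preferable.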
Production Scaling Example
Complete production-ready configuration:
```yaml
# Deployment with resource limits
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gate
spec:
  replicas: 3 # Will be managed by HPA
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app.kubernetes.io/component: proxy
      containers:
        - name: gate
          image: ghcr.io/minekube/gate:latest
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1500m"
---
# HPA configuration
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gate-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gate
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 25
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 50
          periodSeconds: 30
        - type: Pods
          value: 3
          periodSeconds: 30
      selectPolicy: Max
---
# Pod Disruption Budget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gate-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app.kubernetes.io/component: proxy
---
# LoadBalancer Service
apiVersion: v1
kind: Service
metadata:
  name: gate
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
  selector:
    app.kubernetes.io/component: proxy
  ports:
    - port: 25565
      targetPort: minecraft
      protocol: TCP
```
Troubleshooting
HPA Not Scaling
```bash
# Check metrics availability
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes

# Verify resource requests are set
kubectl get deployment gate -o jsonpath='{.spec.template.spec.containers[0].resources}'

# Check HPA conditions
kubectl describe hpa gate-hpa
```
Uneven Load Distribution
- Verify session affinity is configured
- Check the external traffic policy setting
- Review pod anti-affinity rules
- Examine the load balancer configuration
Pods Not Scaling Down
- Check stabilization window settings
- Review the PDB configuration
- Verify scale-down policies
- Check for active player connections
Next Steps
- **Monitoring**: Set up monitoring and alerting
- **Configuration**: Advanced configuration options