Overview
Load balancers distribute network traffic across multiple pods, providing high availability, scalability, and external access to your Kubernetes services.
Service Types
Kubernetes provides several service types for exposing applications:
ClusterIP (Default)
Exposes the service on an internal IP within the cluster. Only accessible from within the cluster.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: internal-service
spec:
  type: ClusterIP
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```
Use cases: internal microservice communication, databases, caches
NodePort
Exposes the service on each node’s IP at a static port (30000-32767 by default).
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nodeport-service
spec:
  type: NodePort
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
      nodePort: 30080  # Optional, auto-assigned if omitted
```
Use cases: development environments, or scenarios that need direct access to node IPs
LoadBalancer
Provisions an external load balancer (cloud provider dependent) with a stable external IP.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: loadbalancer-service
spec:
  type: LoadBalancer
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```
Use cases: production services requiring external access when Ingress is not suitable (e.g. non-HTTP protocols)
GKE Load Balancer Configuration
Network Load Balancer (L4)
GKE creates a regional Network Load Balancer by default for LoadBalancer services:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: exchange-api
  annotations:
    # External TCP (L4) load balancing; this is also the default
    # behavior when the annotation is omitted
    cloud.google.com/load-balancer-type: "External"
spec:
  type: LoadBalancer
  selector:
    app: exchange-api
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 8080
    - name: https
      protocol: TCP
      port: 443
      targetPort: 8443
```
Internal Load Balancer
For services that should only be accessible from within your VPC:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: internal-api
  annotations:
    # Legacy annotation; newer GKE versions use
    # networking.gke.io/load-balancer-type: "Internal"
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: internal-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```
Static External IP
Assign a reserved static IP to your load balancer:
```bash
# Reserve a static IP
gcloud compute addresses create exchange-api-ip --region=us-central1

# Get the IP address
gcloud compute addresses describe exchange-api-ip --region=us-central1 --format="get(address)"
```
```yaml
apiVersion: v1
kind: Service
metadata:
  name: exchange-api
spec:
  type: LoadBalancer
  loadBalancerIP: 34.123.45.67  # Your reserved IP
  selector:
    app: exchange-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```
Using Ingress with Load Balancer
The recommended approach is to use a single LoadBalancer for the NGINX Ingress Controller:
```yaml
# The Ingress Controller is exposed through a single LoadBalancer Service
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
```
Then route traffic to multiple services using Ingress rules:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
```
Using Ingress with a single LoadBalancer is more cost-effective than creating multiple LoadBalancer services, as cloud providers charge per load balancer.
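For example, one Ingress (and therefore one load balancer) can fan out across several hostnames; the hostnames and backend service names below are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: fanout-ingress
spec:
  ingressClassName: nginx
  rules:
    # Each host routes to a different backend Service,
    # all behind the same LoadBalancer
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
    - host: admin.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: admin-service
                port:
                  number: 80
```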
High Availability Patterns
Multi-Zone Deployment
Deploy pods across multiple availability zones:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: exchange-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: exchange-api
  template:
    metadata:
      labels:
        app: exchange-api
    spec:
      # Anti-affinity to spread pods across zones
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - exchange-api
                topologyKey: topology.kubernetes.io/zone
      containers:
        - name: api
          image: exchange-api:v1.0
          ports:
            - containerPort: 8080
```
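As an alternative (or complement) to anti-affinity, newer clusters can express the same intent more directly with `topologySpreadConstraints`; a pod-spec fragment sketch, assuming the same `app: exchange-api` labels:

```yaml
# Pod template fragment: spread replicas evenly across zones (sketch)
spec:
  topologySpreadConstraints:
    - maxSkew: 1                              # allow at most 1 pod imbalance between zones
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway       # prefer, but don't block, scheduling
      labelSelector:
        matchLabels:
          app: exchange-api
```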
Health Checks
Configure proper health checks for reliable load balancing:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: exchange-api
  annotations:
    # Container-native load balancing via network endpoint groups (NEGs),
    # so health checks target pods directly rather than nodes
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: LoadBalancer
  selector:
    app: exchange-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: exchange-api
spec:
  selector:
    matchLabels:
      app: exchange-api
  template:
    metadata:
      labels:
        app: exchange-api
    spec:
      containers:
        - name: api
          image: exchange-api:v1.0
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
```
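Related to health checking: setting `externalTrafficPolicy: Local` on a LoadBalancer Service makes the cloud health check pass only on nodes that actually run a ready pod, and preserves the client source IP (no extra SNAT hop). A sketch reusing the service above:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: exchange-api
spec:
  type: LoadBalancer
  # Only nodes with a local, ready pod receive traffic;
  # client source IPs are preserved
  externalTrafficPolicy: Local
  selector:
    app: exchange-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```

The trade-off is less even distribution when replicas are unevenly spread across nodes, so it pairs well with the multi-zone spreading shown earlier.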
Session Affinity
Maintain client connections to the same pod:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: stateful-service
spec:
  type: LoadBalancer
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
  selector:
    app: stateful-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```
Load Balancing Algorithms
Round Robin (Default)
Distributes requests across all healthy pods. Note that kube-proxy's default iptables mode actually picks a backend at random per connection, which only approximates round robin; true round robin requires IPVS mode.
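If a strict round-robin (or other) scheduler is required, kube-proxy can be run in IPVS mode; a minimal `KubeProxyConfiguration` sketch using the `kubeproxy.config.k8s.io/v1alpha1` API:

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"  # round robin; "lc" (least connection) is another option
```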
Long-Lived Connections
Long-lived connections (WebSockets, gRPC streams) stay pinned to the pod they were first routed to for the life of the TCP connection, so load only rebalances as new connections arrive. Topology-aware routing can additionally keep those connections within the client's zone:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: websocket-service
  annotations:
    # Topology-aware routing: prefer endpoints in the client's zone
    service.kubernetes.io/topology-aware-hints: "auto"
spec:
  type: LoadBalancer
  selector:
    app: websocket
  ports:
    - name: ws
      protocol: TCP
      port: 80
      targetPort: 8080
```
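If WebSocket traffic instead flows through the NGINX Ingress Controller, the proxy timeouts usually need raising so idle connections are not cut; the annotation names below are from ingress-nginx, and the hostname is illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: websocket-ingress
  annotations:
    # Keep long-lived connections open (values in seconds)
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  ingressClassName: nginx
  rules:
    - host: ws.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: websocket-service
                port:
                  number: 80
```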
Monitoring and Verification
Check Service Status
```bash
# List all services
kubectl get svc

# Get external IP (may take a few minutes; -w watches for changes)
kubectl get svc exchange-api -w

# Describe service for detailed information
kubectl describe svc exchange-api
```
Test Load Balancer
```bash
# Get the external IP
EXTERNAL_IP=$(kubectl get svc exchange-api -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Test connectivity
curl http://$EXTERNAL_IP

# Test with specific headers
curl -H "Host: api.example.com" http://$EXTERNAL_IP
```
GKE-Specific Checks
```bash
# List GCP load balancers
gcloud compute forwarding-rules list

# View backend health
gcloud compute backend-services get-health <backend-service-name> --global

# Check firewall rules
gcloud compute firewall-rules list --filter="name~gke"
```
Cost Optimization
Use Ingress: Share a single LoadBalancer across multiple services using Ingress instead of creating multiple LoadBalancer services
Internal Services: Use ClusterIP for internal services that don't need external access
Regional Load Balancers: Use regional load balancers instead of global where possible to reduce costs
Right-Size: Remove unused LoadBalancer services to avoid unnecessary charges
Best Practices
Use Ingress for HTTP/HTTPS : For most web applications, use Ingress with a single LoadBalancer instead of multiple LoadBalancer services
Configure Health Checks : Always define proper liveness and readiness probes
Enable Connection Draining : Ensure graceful shutdown with proper termination grace periods
Use Static IPs for Production : Reserve static IPs for production load balancers to maintain consistent DNS records
Implement TLS Termination : Use Ingress with cert-manager for automatic TLS certificate management
Monitor Traffic : Set up monitoring for load balancer metrics and backend health
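The connection-draining practice above can be sketched as a pod template fragment: a `preStop` sleep keeps the container accepting traffic while the load balancer deregisters it, before SIGTERM is delivered (the sleep and grace-period values here are assumptions to tune):

```yaml
# Pod template fragment: graceful shutdown for connection draining (sketch)
spec:
  terminationGracePeriodSeconds: 60   # must exceed preStop sleep + app shutdown time
  containers:
    - name: api
      image: exchange-api:v1.0
      lifecycle:
        preStop:
          exec:
            # Give the load balancer time to stop routing new
            # connections here before the container gets SIGTERM
            command: ["sleep", "15"]
```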
Troubleshooting
External IP Shows “Pending”
```bash
# Check service events
kubectl describe svc <service-name>

# Verify quota limits
gcloud compute project-info describe --project=<project-id>
```
Connection Refused
Verify pod selector matches deployment labels
Check if pods are running: kubectl get pods -l app=<app-name>
Verify target port matches container port
Check pod logs: kubectl logs <pod-name>
Intermittent Failures
Check readiness probe configuration
Verify backend health: kubectl get endpoints <service-name>
Review pod resource limits and usage