
Overview

Load balancers distribute network traffic across multiple pods, providing high availability, scalability, and external access to your Kubernetes services.

Service Types

Kubernetes provides several service types for exposing applications:

ClusterIP (Default)

Exposes the service on an internal IP within the cluster. Only accessible from within the cluster.
apiVersion: v1
kind: Service
metadata:
  name: internal-service
spec:
  type: ClusterIP
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
Use cases: Internal microservices communication, databases, caches

NodePort

Exposes the service on each node’s IP at a static port (30000-32767 by default).
apiVersion: v1
kind: Service
metadata:
  name: nodeport-service
spec:
  type: NodePort
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
      nodePort: 30080  # Optional, auto-assigned if omitted
Use cases: Development environments, or when direct access to node IPs is needed

LoadBalancer

Provisions an external load balancer (cloud provider dependent) with a stable external IP.
apiVersion: v1
kind: Service
metadata:
  name: loadbalancer-service
spec:
  type: LoadBalancer
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
Use cases: Production services requiring external access, when Ingress is not suitable
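
When client source IPs matter (for example, for rate limiting or audit logging), setting `externalTrafficPolicy: Local` preserves them by skipping the extra node-to-node hop. A minimal sketch (the service and app names are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: client-ip-service
spec:
  type: LoadBalancer
  # Route external traffic only to nodes running a local pod, preserving
  # the client's source IP (the default, Cluster, SNATs it away)
  externalTrafficPolicy: Local
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```

The trade-off is less even spreading: nodes without a ready pod fail the load balancer's health check and receive no traffic.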

GKE Load Balancer Configuration

Network Load Balancer (L4)

GKE creates a regional Network Load Balancer by default for LoadBalancer services:
apiVersion: v1
kind: Service
metadata:
  name: exchange-api
  # No annotation is needed: an external passthrough Network Load
  # Balancer is the default for LoadBalancer services on GKE
spec:
  type: LoadBalancer
  selector:
    app: exchange-api
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 8080
    - name: https
      protocol: TCP
      port: 443
      targetPort: 8443

Internal Load Balancer

For services that should only be accessible from within your VPC:
apiVersion: v1
kind: Service
metadata:
  name: internal-api
  annotations:
    # GKE 1.17+; older clusters use cloud.google.com/load-balancer-type
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: internal-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Static External IP

Assign a reserved static IP to your load balancer:
# Reserve a static IP
gcloud compute addresses create exchange-api-ip --region=us-central1

# Get the IP address
gcloud compute addresses describe exchange-api-ip --region=us-central1 --format="get(address)"
apiVersion: v1
kind: Service
metadata:
  name: exchange-api
spec:
  type: LoadBalancer
  # loadBalancerIP is deprecated in Kubernetes 1.24+, but still honored by GKE
  loadBalancerIP: 34.123.45.67  # Your reserved IP
  selector:
    app: exchange-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Using Ingress with Load Balancer

The recommended approach is to use a single LoadBalancer for the NGINX Ingress Controller:
# Ingress Controller exposes a LoadBalancer service
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
Then route traffic to multiple services using Ingress rules:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
Using Ingress with a single LoadBalancer is more cost-effective than creating multiple LoadBalancer services, as cloud providers charge per load balancer.
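
An Ingress is also the natural place to terminate TLS. A hedged sketch assuming cert-manager is installed with a ClusterIssuer named letsencrypt-prod (the issuer name and hostname are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress-tls
  annotations:
    # cert-manager watches this annotation and provisions the certificate
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-example-com-tls  # created and renewed by cert-manager
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
```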

High Availability Patterns

Multi-Zone Deployment

Deploy pods across multiple availability zones:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: exchange-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: exchange-api
  template:
    metadata:
      labels:
        app: exchange-api
    spec:
      # Anti-affinity to spread pods across zones
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - exchange-api
                topologyKey: topology.kubernetes.io/zone
      containers:
        - name: api
          image: exchange-api:v1.0
          ports:
            - containerPort: 8080
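
As an alternative to anti-affinity, Kubernetes 1.19+ supports topologySpreadConstraints, which express the same zone-spreading intent more directly. A sketch of a drop-in replacement for the affinity block in the pod spec above:

```yaml
# Replaces the affinity block in the pod template's spec
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    # ScheduleAnyway is a soft constraint, like preferred anti-affinity;
    # use DoNotSchedule to make the spread mandatory
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: exchange-api
```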

Health Checks

Configure proper health checks for reliable load balancing:
apiVersion: v1
kind: Service
metadata:
  name: exchange-api
  annotations:
    # Create network endpoint groups (NEGs) for container-native load
    # balancing; with GKE Ingress, health checks follow the readiness probe
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: LoadBalancer
  selector:
    app: exchange-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: exchange-api
spec:
  selector:
    matchLabels:
      app: exchange-api
  template:
    metadata:
      labels:
        app: exchange-api
    spec:
      containers:
        - name: api
          image: exchange-api:v1.0
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5

Session Affinity

Maintain client connections to the same pod:
apiVersion: v1
kind: Service
metadata:
  name: stateful-service
spec:
  type: LoadBalancer
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
  selector:
    app: stateful-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Load Balancing Algorithms

Round Robin (Default)

Distributes connections evenly across all healthy pods. Note that kube-proxy's default iptables mode actually picks a backend at random, which approximates round robin over many connections; IPVS mode supports true round-robin scheduling.

Connection-Based

Kubernetes balances at the connection level, so long-lived connections (WebSockets, gRPC) stay pinned to one pod for their lifetime. Topology-aware routing can additionally keep those connections within the client's zone:
apiVersion: v1
kind: Service
metadata:
  name: websocket-service
  annotations:
    # Prefer same-zone endpoints (topology-aware routing); deprecated in
    # Kubernetes 1.27+ in favor of service.kubernetes.io/topology-mode
    service.kubernetes.io/topology-aware-hints: "auto"
spec:
  type: LoadBalancer
  selector:
    app: websocket
  ports:
    - name: ws
      protocol: TCP
      port: 80
      targetPort: 8080
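
On GKE, HTTP(S) backend timeouts default to 30 seconds, which silently drops idle WebSocket connections. When the Service backs a GKE Ingress, a BackendConfig can raise the timeout; a sketch with illustrative names and values:

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: websocket-backendconfig
spec:
  # Keep idle WebSocket connections open for up to an hour
  timeoutSec: 3600
  connectionDraining:
    drainingTimeoutSec: 60
```

Attach it by annotating the Service with cloud.google.com/backend-config: '{"default": "websocket-backendconfig"}'.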

Monitoring and Verification

Check Service Status

# List all services
kubectl get svc

# Get external IP (may take a few minutes)
kubectl get svc exchange-api -w

# Describe service for detailed information
kubectl describe svc exchange-api

Test Load Balancer

# Get the external IP
EXTERNAL_IP=$(kubectl get svc exchange-api -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Test connectivity
curl http://$EXTERNAL_IP

# Test with specific headers
curl -H "Host: api.example.com" http://$EXTERNAL_IP

GKE-Specific Checks

# List GCP load balancers
gcloud compute forwarding-rules list

# View backend health (use --region=<region> for regional backend services)
gcloud compute backend-services get-health <backend-service-name> --global

# Check firewall rules
gcloud compute firewall-rules list --filter="name~gke"

Cost Optimization

Use Ingress

Share a single LoadBalancer across multiple services using Ingress instead of creating multiple LoadBalancer services

Internal Services

Use ClusterIP for internal services that don’t need external access

Regional Load Balancers

Use regional load balancers instead of global when possible to reduce costs

Right-Size

Remove unused LoadBalancer services to avoid unnecessary charges

Best Practices

  1. Use Ingress for HTTP/HTTPS: For most web applications, use Ingress with a single LoadBalancer instead of multiple LoadBalancer services
  2. Configure Health Checks: Always define proper liveness and readiness probes
  3. Enable Connection Draining: Ensure graceful shutdown with proper termination grace periods
  4. Use Static IPs for Production: Reserve static IPs for production load balancers to maintain consistent DNS records
  5. Implement TLS Termination: Use Ingress with cert-manager for automatic TLS certificate management
  6. Monitor Traffic: Set up monitoring for load balancer metrics and backend health
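
Connection draining (practice 3) can be sketched at the pod level: a preStop sleep gives the load balancer time to stop routing to the endpoint before the container receives SIGTERM, and the grace period must cover the sleep plus in-flight request time. Values here are illustrative:

```yaml
# Pod template fragment for graceful shutdown
spec:
  terminationGracePeriodSeconds: 60
  containers:
    - name: api
      image: exchange-api:v1.0
      lifecycle:
        preStop:
          exec:
            # Delay SIGTERM so the load balancer drains new connections
            # away before the server begins shutting down
            command: ["sleep", "15"]
```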

Troubleshooting

External IP Shows “Pending”

# Check service events
kubectl describe svc <service-name>

# Verify quota limits
gcloud compute project-info describe --project=<project-id>

Connection Refused

  • Verify pod selector matches deployment labels
  • Check if pods are running: kubectl get pods -l app=<app-name>
  • Verify target port matches container port
  • Check pod logs: kubectl logs <pod-name>

Intermittent Failures

  • Check readiness probe configuration
  • Verify backend health: kubectl get endpoints <service-name>
  • Review pod resource limits and usage
