
Overview

Load balancers distribute network traffic across multiple pods, providing high availability, scalability, and external access to your Kubernetes services.

Service Types

Kubernetes provides several service types for exposing applications:

ClusterIP (Default)

Exposes the service on an internal IP within the cluster. Only accessible from within the cluster.
apiVersion: v1
kind: Service
metadata:
  name: internal-service
spec:
  type: ClusterIP
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
Use cases: Internal microservices communication, databases, caches

NodePort

Exposes the service on each node’s IP at a static port (30000-32767 by default).
apiVersion: v1
kind: Service
metadata:
  name: nodeport-service
spec:
  type: NodePort
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
      nodePort: 30080  # Optional, auto-assigned if omitted
Use cases: Development environments, or when direct access to node IPs is needed

LoadBalancer

Provisions an external load balancer (cloud provider dependent) with a stable external IP.
apiVersion: v1
kind: Service
metadata:
  name: loadbalancer-service
spec:
  type: LoadBalancer
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
Use cases: Production services requiring external access, when Ingress is not suitable
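
When client source IPs matter (for example, for rate limiting or audit logging), setting `externalTrafficPolicy: Local` preserves them by skipping the extra node-to-node hop. A minimal sketch (the service and app names are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: client-ip-service
spec:
  type: LoadBalancer
  # Route external traffic only to nodes running a local pod, preserving
  # the client's source IP (the default, Cluster, SNATs it away)
  externalTrafficPolicy: Local
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
```

The trade-off is less even spreading: nodes without a ready pod fail the load balancer's health check and receive no traffic.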

GKE Load Balancer Configuration

Network Load Balancer (L4)

GKE creates a regional Network Load Balancer by default for LoadBalancer services:
apiVersion: v1
kind: Service
metadata:
  name: exchange-api
  # No annotation is needed: an external passthrough Network Load
  # Balancer is the default for LoadBalancer services on GKE
spec:
  type: LoadBalancer
  selector:
    app: exchange-api
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 8080
    - name: https
      protocol: TCP
      port: 443
      targetPort: 8443

Internal Load Balancer

For services that should only be accessible from within your VPC:
apiVersion: v1
kind: Service
metadata:
  name: internal-api
  annotations:
    # GKE 1.17+; older clusters use cloud.google.com/load-balancer-type
    networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: internal-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Static External IP

Assign a reserved static IP to your load balancer:
# Reserve a static IP
gcloud compute addresses create exchange-api-ip --region=us-central1

# Get the IP address
gcloud compute addresses describe exchange-api-ip --region=us-central1 --format="get(address)"
apiVersion: v1
kind: Service
metadata:
  name: exchange-api
spec:
  type: LoadBalancer
  # loadBalancerIP is deprecated in Kubernetes 1.24+, but still honored by GKE
  loadBalancerIP: 34.123.45.67  # Your reserved IP
  selector:
    app: exchange-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Using Ingress with Load Balancer

The recommended approach is to use a single LoadBalancer for the NGINX Ingress Controller:
# Ingress Controller exposes a LoadBalancer service
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
Then route traffic to multiple services using Ingress rules:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
Using Ingress with a single LoadBalancer is more cost-effective than creating multiple LoadBalancer services, as cloud providers charge per load balancer.
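
An Ingress is also the natural place to terminate TLS. A hedged sketch assuming cert-manager is installed with a ClusterIssuer named letsencrypt-prod (the issuer name and hostname are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress-tls
  annotations:
    # cert-manager watches this annotation and provisions the certificate
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-example-com-tls  # created and renewed by cert-manager
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
```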

High Availability Patterns

Multi-Zone Deployment

Deploy pods across multiple availability zones:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: exchange-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: exchange-api
  template:
    metadata:
      labels:
        app: exchange-api
    spec:
      # Anti-affinity to spread pods across zones
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - exchange-api
                topologyKey: topology.kubernetes.io/zone
      containers:
        - name: api
          image: exchange-api:v1.0
          ports:
            - containerPort: 8080
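
As an alternative to anti-affinity, Kubernetes 1.19+ supports topologySpreadConstraints, which express the same zone-spreading intent more directly. A sketch of a drop-in replacement for the affinity block in the pod spec above:

```yaml
# Replaces the affinity block in the pod template's spec
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    # ScheduleAnyway is a soft constraint, like preferred anti-affinity;
    # use DoNotSchedule to make the spread mandatory
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: exchange-api
```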

Health Checks

Configure proper health checks for reliable load balancing:
apiVersion: v1
kind: Service
metadata:
  name: exchange-api
  annotations:
    # Create network endpoint groups (NEGs) for container-native load
    # balancing; with GKE Ingress, health checks follow the readiness probe
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: LoadBalancer
  selector:
    app: exchange-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: exchange-api
spec:
  selector:
    matchLabels:
      app: exchange-api
  template:
    metadata:
      labels:
        app: exchange-api
    spec:
      containers:
        - name: api
          image: exchange-api:v1.0
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5

Session Affinity

Maintain client connections to the same pod:
apiVersion: v1
kind: Service
metadata:
  name: stateful-service
spec:
  type: LoadBalancer
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
  selector:
    app: stateful-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

Load Balancing Algorithms

Round Robin (Default)

Distributes connections evenly across all healthy pods. Note that kube-proxy's default iptables mode actually picks a backend at random, which approximates round robin over many connections; IPVS mode supports true round-robin scheduling.

Connection-Based

Kubernetes balances at the connection level, so long-lived connections (WebSockets, gRPC) stay pinned to one pod for their lifetime. Topology-aware routing can additionally keep those connections within the client's zone:
apiVersion: v1
kind: Service
metadata:
  name: websocket-service
  annotations:
    # Prefer same-zone endpoints (topology-aware routing); deprecated in
    # Kubernetes 1.27+ in favor of service.kubernetes.io/topology-mode
    service.kubernetes.io/topology-aware-hints: "auto"
spec:
  type: LoadBalancer
  selector:
    app: websocket
  ports:
    - name: ws
      protocol: TCP
      port: 80
      targetPort: 8080
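
On GKE, HTTP(S) backend timeouts default to 30 seconds, which silently drops idle WebSocket connections. When the Service backs a GKE Ingress, a BackendConfig can raise the timeout; a sketch with illustrative names and values:

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: websocket-backendconfig
spec:
  # Keep idle WebSocket connections open for up to an hour
  timeoutSec: 3600
  connectionDraining:
    drainingTimeoutSec: 60
```

Attach it by annotating the Service with cloud.google.com/backend-config: '{"default": "websocket-backendconfig"}'.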

Monitoring and Verification

Check Service Status

# List all services
kubectl get svc

# Get external IP (may take a few minutes)
kubectl get svc exchange-api -w

# Describe service for detailed information
kubectl describe svc exchange-api

Test Load Balancer

# Get the external IP
EXTERNAL_IP=$(kubectl get svc exchange-api -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Test connectivity
curl http://$EXTERNAL_IP

# Test with specific headers
curl -H "Host: api.example.com" http://$EXTERNAL_IP

GKE-Specific Checks

# List GCP load balancers
gcloud compute forwarding-rules list

# View backend health (use --region=<region> for regional backend services)
gcloud compute backend-services get-health <backend-service-name> --global

# Check firewall rules
gcloud compute firewall-rules list --filter="name~gke"

Cost Optimization

Use Ingress

Share a single LoadBalancer across multiple services using Ingress instead of creating multiple LoadBalancer services

Internal Services

Use ClusterIP for internal services that don’t need external access

Regional Load Balancers

Use regional load balancers instead of global when possible to reduce costs

Right-Size

Remove unused LoadBalancer services to avoid unnecessary charges

Best Practices

  1. Use Ingress for HTTP/HTTPS: For most web applications, use Ingress with a single LoadBalancer instead of multiple LoadBalancer services
  2. Configure Health Checks: Always define proper liveness and readiness probes
  3. Enable Connection Draining: Ensure graceful shutdown with proper termination grace periods
  4. Use Static IPs for Production: Reserve static IPs for production load balancers to maintain consistent DNS records
  5. Implement TLS Termination: Use Ingress with cert-manager for automatic TLS certificate management
  6. Monitor Traffic: Set up monitoring for load balancer metrics and backend health
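
Connection draining (practice 3) can be sketched at the pod level: a preStop sleep gives the load balancer time to stop routing to the endpoint before the container receives SIGTERM, and the grace period must cover the sleep plus in-flight request time. Values here are illustrative:

```yaml
# Pod template fragment for graceful shutdown
spec:
  terminationGracePeriodSeconds: 60
  containers:
    - name: api
      image: exchange-api:v1.0
      lifecycle:
        preStop:
          exec:
            # Delay SIGTERM so the load balancer drains new connections
            # away before the server begins shutting down
            command: ["sleep", "15"]
```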

Troubleshooting

External IP Shows “Pending”

# Check service events
kubectl describe svc <service-name>

# Verify quota limits
gcloud compute project-info describe --project=<project-id>

Connection Refused

  • Verify pod selector matches deployment labels
  • Check if pods are running: kubectl get pods -l app=<app-name>
  • Verify target port matches container port
  • Check pod logs: kubectl logs <pod-name>

Intermittent Failures

  • Check readiness probe configuration
  • Verify backend health: kubectl get endpoints <service-name>
  • Review pod resource limits and usage
