Load balancing is essential for distributing high-concurrency user requests across multiple application servers, enabling horizontal scaling and improved system reliability.

What is Load Balancing?

Load balancing distributes incoming network traffic across a cluster of servers to:
  • Maximize throughput: Utilize multiple servers’ computing resources
  • Minimize response time: Route requests to available servers
  • Avoid overload: Prevent any single server from becoming a bottleneck
  • Ensure high availability: Continue serving requests even if servers fail

Load Balancing Approaches

HTTP Redirect Load Balancing

Mechanism: Load balancer returns HTTP 302 redirect to selected application server

How It Works

1. Initial request: User sends an HTTP request to the load balancer
2. Server selection: Load balancer selects a target server using an algorithm (random, round-robin, etc.)
3. Redirect response: Load balancer returns HTTP 302 with the application server's address in the Location header
4. Direct connection: Browser sends a new request directly to the application server

Simple Implementation

@Override
protected void doGet(HttpServletRequest request, 
                    HttpServletResponse response) 
                    throws ServletException, IOException {
    // Select target server based on some routing condition
    // (someCondition() is a placeholder for the selection algorithm)
    String targetURL;
    if (someCondition()) {
        targetURL = "http://server1.example.com" + request.getServletPath();
    } else {
        targetURL = "http://server2.example.com" + request.getServletPath();
    }
    
    // Send HTTP 302 so the browser retries against the chosen server
    response.sendRedirect(targetURL);
}

Simplicity vs. Practicality

This approach can be implemented in roughly a dozen lines of Java code, making it extremely simple. However, it’s rarely used in production due to significant drawbacks.

Advantages

Simple Design

Easy to implement with minimal code

No Proxy Overhead

Load balancer doesn’t handle response traffic

Disadvantages

Critical Issues:
  1. Double Request Overhead
    • User makes TWO requests per operation
    • First to load balancer, then to application server
    • Doubles latency and network overhead
  2. Security Vulnerability
    • Application server IP addresses exposed to public
    • Direct external access to application servers
    • Cannot hide servers behind firewall
    • Increased attack surface
  3. Limited Control
    • Cannot inspect or modify response traffic
    • No SSL termination at load balancer
    • Difficult to implement sticky sessions
Industry Practice: HTTP redirect load balancing is rarely used in production. Modern systems prefer DNS load balancing combined with internal HTTP load balancers.

Load Balancing Strategies

Algorithms for selecting which server handles each request:
Round Robin

Pattern: Distribute requests sequentially in circular order
How it works:
Request 1 → Server 1
Request 2 → Server 2
Request 3 → Server 3
Request 4 → Server 1 (cycle repeats)
Pros:
  • Simple to implement
  • Fair distribution
  • No server state required
Cons:
  • Doesn’t account for server capacity
  • Ignores current server load
  • May overload slower servers
Best for: Homogeneous server clusters with similar capacity
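A round-robin selector can be sketched in a few lines of Python (the class and server names here are illustrative, not from any particular library):

```python
from itertools import count

class RoundRobinBalancer:
    """Cycle through servers in fixed order, wrapping at the end."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._counter = count()  # monotonically increasing request counter

    def select_server(self):
        # Modulo wraps the counter back around to the first server
        return self.servers[next(self._counter) % len(self.servers)]
```

Four consecutive calls over three servers return server 1, 2, 3, then server 1 again, matching the cycle above.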
Weighted Round Robin

Pattern: Round robin with different weights per server
How it works:
Server 1 (weight=3): Gets 3 out of every 6 requests
Server 2 (weight=2): Gets 2 out of every 6 requests
Server 3 (weight=1): Gets 1 out of every 6 requests
Configuration example:
upstream backend {
    server app1.example.com weight=3;  # Powerful server
    server app2.example.com weight=2;  # Medium server
    server app3.example.com weight=1;  # Smaller server
}
Pros:
  • Accounts for different server capacities
  • Better resource utilization
  • Flexible configuration
Best for: Heterogeneous clusters with varying server specifications
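The same weighting can be sketched in Python by repeating each server in the rotation according to its weight. This is a naive variant for illustration; Nginx itself uses a "smooth" weighted round robin that interleaves servers rather than bursting consecutive requests to the heaviest one.

```python
class WeightedRoundRobinBalancer:
    """Naive weighted round robin: repeat each server `weight` times."""

    def __init__(self, weighted_servers):
        # weighted_servers: iterable of (server, weight) pairs
        self.schedule = [s for s, w in weighted_servers for _ in range(w)]
        self.index = 0

    def select_server(self):
        server = self.schedule[self.index]
        self.index = (self.index + 1) % len(self.schedule)
        return server
```

With weights 3/2/1, every six requests split 3:2:1 across the three servers, as in the example above.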
Random

Pattern: Randomly select server for each request
How it works:
import random

def select_server(servers):
    return random.choice(servers)
Pros:
  • Simple implementation
  • Stateless
  • Naturally distributes over time
Cons:
  • Uneven distribution in short term
  • No capacity awareness
Best for: Large request volumes where statistical distribution evens out
Least Connections

Pattern: Route to server with fewest active connections
How it works:
Server 1: 10 active connections
Server 2: 15 active connections ← Skip
Server 3: 8 active connections  ← Choose this
Pros:
  • Dynamic load awareness
  • Better for long-lived connections
  • Adapts to varying request durations
Cons:
  • Requires connection tracking
  • More complex state management
Best for: Applications with variable request processing times
Nacos implementation: Supports “Least Connections with Slow Start” variant
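The connection bookkeeping can be sketched in Python as follows (illustrative only; a production balancer would also need thread-safe counters and health awareness):

```python
class LeastConnectionsBalancer:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # min() over the servers, ordered by current connection count
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the connection closes so counts stay accurate
        self.active[server] -= 1
```

With counts 10/15/8 as in the example above, the next request goes to the server holding 8 connections.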
IP Hash

Pattern: Hash client IP to consistently route to same server
How it works:
import hashlib

def select_server(client_ip, servers):
    # Use a stable hash: Python's built-in hash() is randomized per
    # process, so it would route the same client differently after a restart
    digest = hashlib.md5(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]
Pros:
  • Session affinity without cookies
  • Consistent routing per client
  • Simplified session management
Cons:
  • Uneven distribution if client IPs cluster
  • Server changes require rehashing
  • Not suitable behind proxies/NAT
Best for: Stateful applications requiring session persistence
Least Response Time

Pattern: Route to server with fastest response time
How it works:
  • Track average response time per server
  • Send new requests to fastest server
  • Continuously update metrics
Pros:
  • Performance-aware routing
  • Automatically avoids slow servers
  • Optimizes user experience
Cons:
  • Complex metric collection
  • Requires health monitoring
  • Can create hot spots
Best for: Distributed servers across geographic regions
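One way to sketch the metric tracking in Python is an exponential moving average of observed latencies per server (the smoothing factor and API below are assumptions, not any specific product's implementation):

```python
class LeastResponseTimeBalancer:
    """Route to the server with the lowest average observed latency."""

    def __init__(self, servers, alpha=0.2):
        self.avg_latency = {server: 0.0 for server in servers}
        self.alpha = alpha  # weight given to the newest sample

    def record(self, server, latency_seconds):
        # Exponential moving average: recent samples count more
        old = self.avg_latency[server]
        self.avg_latency[server] = (1 - self.alpha) * old + self.alpha * latency_seconds

    def select_server(self):
        return min(self.avg_latency, key=self.avg_latency.get)
```

Note that starting every average at zero makes untested servers look fastest; this is one face of the "hot spot" risk listed above.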

Comparison Matrix

| Strategy | Complexity | State Required | Distribution | Use Case |
| --- | --- | --- | --- | --- |
| Round Robin | Low | None | Even | Homogeneous clusters |
| Weighted RR | Low | Weights only | Proportional | Heterogeneous clusters |
| Random | Very Low | None | Statistical | High-volume traffic |
| Least Connections | Medium | Connection counts | Dynamic | Variable request times |
| IP Hash | Medium | Hash table | IP-based | Session persistence |
| Least Response Time | High | Metrics + health | Performance-based | Geographic distribution |

Design Considerations

Health Checks

Essential for reliability:
  • Active health probes
  • Passive failure detection
  • Automatic server removal
  • Graceful re-introduction
Example (Nginx):
upstream backend {
    server 192.168.1.10:8080 max_fails=3 fail_timeout=30s;
}

Session Persistence

Maintain user sessions:
  • Sticky sessions (cookie-based)
  • IP hash routing
  • Shared session storage (Redis)
  • Stateless design (JWT)
Trade-off: Stickiness vs. flexibility
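Cookie-based stickiness can be sketched as a session-to-server map (illustrative; real balancers typically encode the chosen server inside the cookie itself rather than keeping a server-side table):

```python
import random

class StickySessionBalancer:
    """Pin each session to the server chosen on its first request."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.assignments = {}  # session_id -> pinned server

    def select_server(self, session_id):
        if session_id not in self.assignments:
            # First request from this session: pick a server and remember it
            self.assignments[session_id] = random.choice(self.servers)
        return self.assignments[session_id]
```

Every request carrying the same session ID lands on the same server, which is exactly the trade-off against flexibility: that server's failure loses the pinned sessions.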

SSL Termination

Handle encryption at load balancer:
  • Reduce backend server load
  • Centralized certificate management
  • Simpler backend configuration
  • May decrypt sensitive data
Alternative: End-to-end encryption
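A minimal Nginx sketch of termination at the balancer (the certificate paths and the `backend` upstream name are placeholders):

```nginx
server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/nginx/certs/example.com.crt;
    ssl_certificate_key /etc/nginx/certs/example.com.key;

    location / {
        # Backend traffic travels as plain HTTP inside the private network
        proxy_pass http://backend;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```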

High Availability

Eliminate single point of failure:
  • Active-passive LB pairs
  • Active-active with shared VIP
  • DNS-level LB failover
  • Health check redundancy
Technologies: Keepalived, VRRP, BGP
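As one hedged example, an active-passive pair with Keepalived advertises a shared virtual IP via VRRP; the interface name, router ID, and addresses below are placeholders:

```
vrrp_instance VI_1 {
    state MASTER            # the passive peer uses state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100            # backup peer gets a lower priority
    advert_int 1
    virtual_ipaddress {
        192.168.1.100       # the VIP that clients actually connect to
    }
}
```

If the MASTER stops advertising, the BACKUP node claims the virtual IP and traffic fails over without a DNS change.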

Common Patterns

Multi-Tier Geographic Load Balancing

Benefits:
  • Reduced latency (geographic proximity)
  • Regulatory compliance (data residency)
  • Disaster recovery across regions

Best Practices

1. Start with DNS load balancing: Distribute traffic across load balancer clusters geographically
2. Use reverse proxy for application tier: Nginx/HAProxy to distribute to application servers with private IPs
3. Implement health checks: Both active probes and passive failure detection
4. Choose appropriate algorithm: Match strategy to your traffic patterns and server characteristics
5. Plan for high availability: Redundant load balancers with automatic failover
6. Monitor and tune: Collect metrics, identify bottlenecks, adjust configuration
Common Mistakes to Avoid:
  • Exposing application servers to internet (use internal IPs)
  • Single load balancer (creates single point of failure)
  • No health checks (routes to failed servers)
  • Wrong algorithm for workload (e.g., round-robin for stateful apps)
  • Insufficient monitoring (can’t diagnose issues)

Service Discovery

Dynamic service registration for automatic load balancer updates

Message Queues

Asynchronous load distribution through queuing
