Load Balancer

Load balancers distribute incoming client requests to computing resources such as application servers and databases. In each case, the load balancer returns the response from the computing resource to the appropriate client.

Benefits

Load balancers are effective at:

Preventing requests from going to unhealthy servers
Preventing overloading resources
Helping to eliminate a single point of failure

Load balancers can be implemented with hardware (expensive) or with software such as HAProxy.
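
As a rough illustration of the first benefit above, the sketch below tracks health-check results and stops offering a backend once it has failed several probes in a row. The `HealthTracker` class, the `max_failures` threshold, and the backend names are hypothetical and are not taken from HAProxy or any other product.

```python
class HealthTracker:
    """Tracks consecutive failed health checks per backend (illustrative only)."""

    def __init__(self, backends, max_failures=3):
        # max_failures is an assumed threshold, not a standard value.
        self.max_failures = max_failures
        self.failures = {backend: 0 for backend in backends}

    def record(self, backend, ok):
        # A successful probe resets the counter; a failed probe increments it.
        self.failures[backend] = 0 if ok else self.failures[backend] + 1

    def healthy(self):
        # Only backends below the failure threshold are eligible for traffic.
        return [b for b, n in self.failures.items() if n < self.max_failures]


tracker = HealthTracker(["app-1", "app-2"])
for _ in range(3):
    tracker.record("app-2", ok=False)
print(tracker.healthy())
```

A backend returns to the pool as soon as one probe succeeds; real load balancers often require several consecutive successes before doing so.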

Additional Benefits

SSL termination - Decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations
- Removes the need to install X.509 certificates on each server
Session persistence - Issue cookies and route a specific client’s requests to the same instance if the web apps do not keep track of sessions
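
Session persistence as described above can be sketched roughly as follows. This is a minimal illustration, assuming the load balancer pins a client through a cookie whose name (`lb_backend`) is made up for this example; new clients are assigned a backend in rotation.

```python
import itertools


class StickyBalancer:
    COOKIE = "lb_backend"  # hypothetical cookie name

    def __init__(self, backends):
        self.backends = list(backends)
        self._rotation = itertools.cycle(self.backends)

    def route(self, cookies):
        """Return (backend, cookies_to_set) for one request."""
        pinned = cookies.get(self.COOKIE)
        if pinned in self.backends:
            # Returning client: keep sending it to the same instance.
            return pinned, {}
        # New client (or its pinned backend is gone): pick one and issue a cookie.
        backend = next(self._rotation)
        return backend, {self.COOKIE: backend}


lb = StickyBalancer(["app-1", "app-2"])
backend, set_cookies = lb.route({})
print(backend, set_cookies)
print(lb.route(set_cookies))
```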

High Availability

To protect against failures, it’s common to set up multiple load balancers, either in active-passive or active-active mode.

Active-Passive

With active-passive failover, heartbeats are sent between the active server and the passive server on standby. If the heartbeat is interrupted, the passive server takes over the active server’s IP address and resumes service. The length of downtime is determined by whether the passive server is already running in ‘hot’ standby or whether it needs to start up from ‘cold’ standby. Only the active server handles traffic. Active-passive failover can also be referred to as master-slave failover.
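
The heartbeat logic above can be sketched from the passive server's point of view. This is a simplified model under stated assumptions: the `PassiveNode` class and the 3-second timeout are hypothetical, timestamps are passed in explicitly so the example is deterministic, and the IP takeover is represented only by a flag.

```python
class PassiveNode:
    """Standby node that promotes itself when heartbeats stop arriving."""

    def __init__(self, timeout_s=3.0):
        self.timeout_s = timeout_s  # assumed value; tune for the deployment
        self.last_heartbeat = None
        self.active = False

    def on_heartbeat(self, now):
        # Called whenever a heartbeat from the active node is received.
        self.last_heartbeat = now

    def tick(self, now):
        """Check the heartbeat deadline; return True once this node is active."""
        overdue = (
            self.last_heartbeat is not None
            and now - self.last_heartbeat > self.timeout_s
        )
        if not self.active and overdue:
            # A real system would take over the active node's IP address here.
            self.active = True
        return self.active


node = PassiveNode(timeout_s=3.0)
node.on_heartbeat(now=0.0)
print(node.tick(now=2.0))  # heartbeat still fresh
print(node.tick(now=4.0))  # heartbeat overdue, node promotes itself
```

In this sketch the node never promotes itself before it has seen at least one heartbeat, which is a design choice made for the example rather than a general rule.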

Active-Active

In active-active, both servers handle traffic, spreading the load between them. If the servers are public-facing, DNS would need to know about the public IPs of both servers. If the servers are internal-facing, application logic would need to know about both servers. Active-active failover can also be referred to as master-master failover.
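
For the internal-facing case, "application logic knowing about both servers" might look like the sketch below. It assumes a caller-supplied `send` function that raises `ConnectionError` when a load balancer is unreachable; the function name `call_with_fallback` and the endpoint names are hypothetical.

```python
import random


def call_with_fallback(endpoints, send, rng=random):
    """Try the known load balancers in random order until one responds."""
    order = list(endpoints)
    rng.shuffle(order)  # spreads load across both active servers
    last_error = None
    for endpoint in order:
        try:
            return send(endpoint)
        except ConnectionError as exc:
            last_error = exc  # remember the failure and try the next one
    raise ConnectionError("all load balancers unreachable") from last_error


def demo_send(endpoint):
    # Stand-in for a real network call: pretend "lb-a" is down.
    if endpoint == "lb-a":
        raise ConnectionError(f"{endpoint} is down")
    return f"ok via {endpoint}"


print(call_with_fallback(["lb-a", "lb-b"], demo_send))
```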

Load Balancing Methods

Load balancers can route traffic using various methods, including:

Random
Least loaded
Session/cookies
Round robin or weighted round robin
Layer 4
Layer 7
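
Three of the methods listed above can be sketched in a few lines. These are minimal illustrations rather than production algorithms: the weighted variant simply repeats each backend according to its weight, and "least loaded" is assumed here to mean the fewest active connections.

```python
import itertools


def round_robin(backends):
    """Yield backends in a fixed rotation."""
    return itertools.cycle(backends)


def weighted_round_robin(weights):
    """weights maps backend -> positive integer share of the traffic."""
    expanded = [b for b, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_loaded(active_connections):
    """Pick the backend with the fewest active connections."""
    return min(active_connections, key=active_connections.get)


rr = round_robin(["a", "b", "c"])
print([next(rr) for _ in range(4)])

wrr = weighted_round_robin({"a": 2, "b": 1})
print([next(wrr) for _ in range(6)])

print(least_loaded({"a": 5, "b": 2, "c": 7}))
```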

Layer 4 Load Balancing

Layer 4 load balancers look at information at the transport layer to decide how to distribute requests. Generally, this involves the source and destination IP addresses and ports in the header, but not the contents of the packet. Layer 4 load balancers forward network packets to and from the upstream server, performing Network Address Translation (NAT).
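
One way to make a decision from header fields alone is to hash the connection's addresses and ports, so that every packet of the same connection maps to the same backend. The text above does not prescribe this technique; the sketch below is an assumption for illustration, and `pick_backend_l4` and the example addresses are hypothetical.

```python
import hashlib


def pick_backend_l4(src_ip, src_port, dst_ip, dst_port, backends):
    """Choose a backend from transport-layer fields only (no payload)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as a stable integer.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


pool = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
first = pick_backend_l4("203.0.113.7", 51000, "198.51.100.1", 443, pool)
again = pick_backend_l4("203.0.113.7", 51000, "198.51.100.1", 443, pool)
print(first, first == again)
```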

Layer 7 Load Balancing

Layer 7 load balancers look at the application layer to decide how to distribute requests. This can involve the contents of the header, message, and cookies. Layer 7 load balancers terminate network traffic, read the message, make a load-balancing decision, then open a connection to the selected server. For example, a layer 7 load balancer can direct video traffic to servers that host videos while directing more sensitive user billing traffic to security-hardened servers.

Performance Consideration: At the cost of flexibility, layer 4 load balancing requires less time and fewer computing resources than layer 7, although the performance impact can be minimal on modern commodity hardware.
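
The video and billing example can be sketched as a routing decision on the request path. The URL prefixes and pool names below are hypothetical; a real layer 7 load balancer could equally inspect headers or cookies.

```python
# Hypothetical routing table: path prefix -> server pool.
ROUTES = [
    ("/video/", "video-servers"),
    ("/billing/", "hardened-servers"),
]


def pick_pool(path, default="general-servers"):
    """Return the pool for a request based on its application-layer path."""
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return default


print(pick_pool("/video/cats.mp4"))
print(pick_pool("/billing/invoices/42"))
print(pick_pool("/index.html"))
```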

Horizontal Scaling

Load balancers can also help with horizontal scaling, improving performance and availability. Scaling out using commodity machines is more cost-efficient and results in higher availability than vertical scaling, that is, scaling up a single server on more expensive hardware. It is also easier to hire talent for commodity hardware than for specialized enterprise systems.

Considerations for Horizontal Scaling

Scaling horizontally introduces complexity and involves cloning servers
- Servers should be stateless: they should not contain any user-related data like sessions or profile pictures
- Sessions can be stored in a centralized data store such as a database (SQL, NoSQL) or a persistent cache (Redis, Memcached)
Downstream servers such as caches and databases need to handle more simultaneous connections as upstream servers scale out
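
The stateless-server point can be illustrated with two app servers that share one session store. The in-memory `SessionStore` below is only a stand-in for a database or a cache such as Redis; the class names and the `hits` counter are made up for this example.

```python
class SessionStore:
    """In-memory stand-in for a centralized store (database, Redis, ...)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id, session):
        self._data[session_id] = dict(session)


class AppServer:
    """Stateless server: all session state lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id):
        session = self.store.get(session_id)
        session["hits"] = session.get("hits", 0) + 1
        self.store.put(session_id, session)
        return f"{self.name}: hit {session['hits']}"


store = SessionStore()
app1, app2 = AppServer("app-1", store), AppServer("app-2", store)
print(app1.handle("s1"))
print(app2.handle("s1"))  # a different server sees the same session
```

Because neither server keeps session data locally, the load balancer is free to send each request to any instance.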

Disadvantages of Load Balancers

The load balancer can become a performance bottleneck if it does not have enough resources or if it is not configured properly.
Introducing a load balancer to help eliminate a single point of failure results in increased complexity.
A single load balancer is a single point of failure; configuring multiple load balancers further increases complexity.

Layer 4 vs Layer 7 Trade-off: Layer 4 offers better performance with less flexibility, while Layer 7 provides content-based routing capabilities at a slight performance cost.
