Load Balancing Algorithms: L4/L7 Hydraulics & GSLB Forensics

The Distribution Split

1. Layer 4 vs. Layer 7: Speed vs. Context

The first decision in traffic engineering is which layer makes the routing decision. **Layer 4 (L4)** operates at the transport layer and sees only addresses and ports, while **Layer 7 (L7)** understands the application payload.

The Performance Trade-off

Layer 4 (Speed)

Routes based on source/destination IP and port. Latency is extremely low (ASIC/DPDK speed) because the balancer decides from the packet headers alone and never has to reassemble or parse the application payload. Ideal for simple load distribution.

Layer 7 (Logic)

Routes based on URLs, headers, and cookies. Consumes more CPU because the balancer must terminate the connection and parse each request, but allows 'Smart Routing' (e.g., sending /api to the Go pool and /images to the S3 bucket).
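The path-based routing described above can be sketched as a longest-prefix match. This is a minimal illustration, not any particular proxy's configuration: the pool names (`go-pool`, `s3-origin`, `web-pool`) and the `pick_pool` helper are hypothetical.

```python
# Hypothetical prefix-to-pool table; the names are illustrative only.
ROUTES = {
    "/api": "go-pool",
    "/images": "s3-origin",
}
DEFAULT_POOL = "web-pool"


def pick_pool(path, routes=ROUTES):
    """Return the pool for the longest prefix that matches the request path."""
    for prefix in sorted(routes, key=len, reverse=True):
        # Match on whole path segments so "/apiary" does not hit "/api".
        if path == prefix or path.startswith(prefix + "/"):
            return routes[prefix]
    return DEFAULT_POOL


print(pick_pool("/api/users"))        # go-pool
print(pick_pool("/images/logo.png"))  # s3-origin
print(pick_pool("/checkout"))         # web-pool
```

An L4 balancer cannot make this decision at all, because the path only becomes visible once the request has been parsed.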

Load Distribution Engine

Round Robin guarantees an equal number of requests sent to each server over time. However, it blindly sends traffic without considering the actual load (active connections) on the servers, which can lead to imbalance if some requests take longer to process than others.
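A small simulation makes the imbalance concrete. It is a sketch under simplifying assumptions: every third request is "slow" and holds its connection open for the whole run, while fast requests finish instantly. The server names and the workload are hypothetical.

```python
import itertools


def least_connections(active):
    """Return the server with the fewest active connections (ties: first listed)."""
    return min(active, key=active.get)


servers = ["s1", "s2", "s3"]
rr = itertools.cycle(servers)          # round robin: fixed rotation, load-blind
rr_active = {s: 0 for s in servers}    # open connections under round robin
lc_active = {s: 0 for s in servers}    # open connections under least connections

for i in range(9):
    slow = (i % 3 == 0)                # assumed workload: every third request is slow
    rr_target = next(rr)
    lc_target = least_connections(lc_active)
    if slow:                           # only slow requests stay open
        rr_active[rr_target] += 1
        lc_active[lc_target] += 1

print(rr_active)  # {'s1': 3, 's2': 0, 's3': 0}
print(lc_active)  # {'s1': 1, 's2': 1, 's3': 1}
```

Both policies hand out the same number of requests, but round robin happens to land every slow request on the same server, while least connections steers around the server that is already busy.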

The Hashing Ring

2. Consistent Hashing: Protecting the Cache

In a standard IP Hash (Source IP % N Servers), adding or removing a single server changes 'N', which re-maps almost every client to a different server. This destroys cache affinity. **Consistent Hashing** (Ketama is a widely used implementation) solves this.

The Ring Equation

\text{Server} = \text{Clockwise}(\text{Hash}(K)) \pmod{2^{160}}

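Servers and request keys are hashed onto a 160-bit ring (the output size of SHA-1, for example). Each key belongs to the first server found moving clockwise from the key's position. When a server is removed, only the keys that belonged to that specific server are reassigned, to its next clockwise neighbor, so on average only about 1/N of keys are disrupted. In practice each server is placed on the ring many times (virtual nodes) to even out the size of the arcs.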
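The ring lookup can be sketched in a few lines. This is a minimal illustration rather than the Ketama implementation: it assumes SHA-1 for ring positions, 100 virtual nodes per server, and a hypothetical `HashRing` class.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> server name
        for server in servers:
            self.add(server)

    @staticmethod
    def _hash(key):
        # SHA-1 yields a 160-bit integer position on the ring.
        return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")

    def add(self, server):
        for i in range(self.vnodes):
            point = self._hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key):
        # First point clockwise from the key, wrapping around at the end.
        i = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[i]]


ring = HashRing(["s1", "s2", "s3", "s4"])
keys = [f"user-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("s3")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

Every key that moves was previously owned by the removed server; keys on the surviving servers keep their assignment, which is exactly the property that protects the caches.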

P2C: Power of Two Choices

In massive clusters, comparing the load of 1,000 servers for every request is too slow. P2C picks 2 servers at random and sends the request to the less loaded of the pair. This achieves nearly the same balance as 'Least Connections' but with constant-time work per request.
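The effect is easy to see in a balls-into-bins style simulation. This is a sketch under stated assumptions: connections never close during the run, load is measured as open connections, and the helper names (`p2c_pick`, `random_pick`, `simulate`) are hypothetical.

```python
import random


def p2c_pick(servers, active, rng):
    """Sample two distinct servers and return the one with fewer open connections."""
    a, b = rng.sample(servers, 2)
    return a if active[a] <= active[b] else b


def random_pick(servers, active, rng):
    """Baseline: a single uniformly random choice that ignores load."""
    return rng.choice(servers)


def simulate(n_servers, n_requests, picker, seed=0):
    """Assign requests with the given picker and return the busiest server's load."""
    rng = random.Random(seed)
    servers = [f"s{i}" for i in range(n_servers)]
    active = {s: 0 for s in servers}
    for _ in range(n_requests):
        active[picker(servers, active, rng)] += 1
    return max(active.values())


print("random max load:", simulate(100, 10_000, random_pick))
print("p2c max load:   ", simulate(100, 10_000, p2c_pick))
```

Each decision reads the load of only two servers, regardless of cluster size, yet the busiest server ends up much closer to the average than with a purely random choice.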

Engineering Proximity

3. GSLB & Anycast: Global Traffic Steering

How does a user in London get a different server than a user in Tokyo? We use **GSLB** (Global Server Load Balancing) and **Anycast BGP**.

The TTL War: DNS Steering

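GSLB is essentially a smart authoritative DNS server. It returns the 'nearest' IP based on the source IP of the query, which is usually the user's recursive resolver rather than the user (unless the resolver forwards EDNS Client Subnet). The challenge is TTL (Time To Live). Records must already carry a short TTL, around 60s or less, before a data center dies; lowering the TTL after the failure does not help, because resolvers keep the old answer until its original TTL expires, and in the meantime users stay 'stuck' to a dead site.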
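The steering decision itself can be sketched as "nearest healthy site, answered with a short TTL". This is a simplified illustration: it assumes the querier's coordinates are already known (a real GSLB would derive them from a geo-IP lookup), and the `Site` type, the `resolve` function, and the documentation-range IP addresses are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    ip: str
    lat: float
    lon: float
    healthy: bool = True


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def resolve(client_lat, client_lon, sites, ttl=30):
    """Return (ip, ttl) for the nearest site that is currently healthy."""
    candidates = [s for s in sites if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy site available")
    best = min(candidates,
               key=lambda s: distance_km(client_lat, client_lon, s.lat, s.lon))
    return best.ip, ttl


sites = [
    Site("london", "192.0.2.10", 51.51, -0.13),
    Site("tokyo", "198.51.100.10", 35.68, 139.69),
]
print(resolve(48.86, 2.35, sites))   # querier near Paris -> London
sites[0].healthy = False             # London fails its health check
print(resolve(48.86, 2.35, sites))   # -> Tokyo, once cached answers expire
```

The second answer only reaches users after the first one has aged out of resolver caches, which is why the TTL returned with every answer has to be short from the start.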

BGP Anycast Paradox:

Anycast uses the same IP advertised from multiple locations. The network (BGP) naturally sends users to the 'closest' node. However, Anycast is blind to application health. If the 'closest' node is on fire, BGP will still send you there until the route is withdrawn.

The Friction of Stability

4. Adaptive Balancing: EWMA & Gray Failures

A server that is 'up' but slow is more dangerous than a server that is 'down.' We use **EWMA (Exponentially Weighted Moving Average)** to detect these 'Gray Failures.'

The Latency Tracker

\text{EWMA}_t = \alpha \cdot \text{Sample}_t + (1 - \alpha) \cdot \text{EWMA}_{t-1}

Because $\alpha$ sets how much weight the most recent response receives, the load balancer can notice that a server is starting to slow down within a handful of responses and 'Soft Drain' its traffic before a formal health check fails.
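A latency tracker built on the EWMA update is only a few lines. This is a minimal sketch: the `EwmaLatency` class, the `pick_fastest` helper, and the choice of seeding the average with the first sample are assumptions made for illustration.

```python
class EwmaLatency:
    """Tracks an exponentially weighted moving average of response latency."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.value = None   # no samples seen yet

    def update(self, sample_ms):
        if self.value is None:
            self.value = sample_ms   # seed with the first sample
        else:
            self.value = self.alpha * sample_ms + (1 - self.alpha) * self.value
        return self.value


def pick_fastest(trackers):
    """Prefer the server with the lowest EWMA; unseen servers count as 0 ms."""
    return min(trackers, key=lambda name: trackers[name].value or 0.0)


trackers = {"s1": EwmaLatency(alpha=0.5), "s2": EwmaLatency(alpha=0.5)}
for sample in (10, 10, 10):
    trackers["s1"].update(sample)
    trackers["s2"].update(sample)

# s2 starts to degrade: two slow responses are enough to shift traffic away.
trackers["s2"].update(80)
trackers["s2"].update(80)
print(trackers["s1"].value, trackers["s2"].value)  # 10.0 62.5
print(pick_fastest(trackers))                      # s1
```

A larger $\alpha$ reacts faster but is noisier; a smaller one smooths out single slow responses at the cost of slower detection.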

// Scientific Audit: Verified against NGINX/HAProxy best practices and ketama consistent hashing specs as of Q2 2026.

Technical Standards & References

Eisenbud, D., et al. (Google Research)

Maglev: A Fast and Reliable Software Network Load Balancer

Karger, D., et al. (Initial Paper)

Consistent Hashing and Random Trees

Mitzenmacher, M.

The Power of Two Choices in Randomized Load Balancing

IETF

RFC 7151: DNS-based Global Server Load Balancing

HAProxy Technologies

Direct Server Return (DSR) Best Practices

Maglev: Google's Consistent Hash Table at Global Scale

Google's Maglev (NSDI 2016) is a software-based load balancer that processes 1 Gbps per core with a connection lookup table that uses consistent hashing. Unlike hardware LBs that rely on TCAM-based flow tables limited to a few hundred thousand entries, Maglev uses a Consistent Hash Table (CHT) that maps the 5-tuple of each connection to one of the backend servers. The CHT is a lookup table of size $M = 65537$ (a prime number), where each entry points to one backend. When a backend is added or removed, the CHT is recomputed and the affected entries (approximately $M/N$ for $N$ backends) are updated:

P_{disruption} = \frac{\text{entries reassigned}}{M} \approx \frac{1}{N}

The key performance metric is the per-packet budget: a server receiving 10 Mpps of 100+ byte packets has at most 100 ns to classify and forward each packet. The fast path has three steps: (1) hash the 5-tuple, for example with a CRC32c hardware instruction (about 12 ns), (2) look up the CHT entry via array indexing (about 3 ns), and (3) forward the packet to the chosen backend via a pre-populated neighbor table. The total per-packet processing cost is 50-80 ns, under the 100 ns budget. The CHT must also be rebuilt quickly after a backend failure (within about 10 ms) so that new connections are not assigned to the dead backend.

Maglev does not walk a ring or use rendezvous hashing to fill the table. Each backend derives its own permutation of the $M$ slots from two hashes of its name (an offset and a skip), and the backends take turns claiming their next preferred empty slot until the table is full, so every backend ends up with an almost equal share of entries. When a backend fails, the entries it owned are redistributed among the survivors and a small number of other entries may move as well. A per-machine connection tracking table keeps established connections pinned to their backend, which minimizes the disruption to live traffic.
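The table population published in the Maglev paper can be sketched as follows: each backend gets a permutation of the slots from an offset and a skip, and backends take turns claiming their next free slot. This is an illustrative sketch, not Google's code: the SHA-256-based `_h` helper and the backend names are assumptions.

```python
import hashlib


def _h(name, salt):
    """Hypothetical hash helper: 64-bit integer from SHA-256 of salt + name."""
    digest = hashlib.sha256((salt + name).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def populate(backends, m=65537):
    """Build a lookup table of prime size m; needs 1 <= len(backends) <= m."""
    offsets = [_h(b, "offset") % m for b in backends]
    skips = [_h(b, "skip") % (m - 1) + 1 for b in backends]
    nxt = [0] * len(backends)   # position of each backend in its own permutation
    table = [None] * m
    filled = 0
    while True:
        for i, backend in enumerate(backends):
            # Walk this backend's permutation until a free slot is found.
            c = (offsets[i] + nxt[i] * skips[i]) % m
            while table[c] is not None:
                nxt[i] += 1
                c = (offsets[i] + nxt[i] * skips[i]) % m
            table[c] = backend
            nxt[i] += 1
            filled += 1
            if filled == m:
                return table


def lookup(table, five_tuple):
    """Map a connection 5-tuple to a backend with one hash and one array index."""
    return table[_h(repr(five_tuple), "flow") % len(table)]


backends = [f"b{i}" for i in range(5)]
table = populate(backends)
smaller = populate(backends[:-1])   # same table after "b4" fails
changed = sum(1 for x, y in zip(table, smaller) if x != y)
print(f"share of entries changed: {changed / len(table):.3f}")
print(lookup(table, ("203.0.113.7", 51234, "192.0.2.1", 443, "tcp")))
```

Because the table size is prime and every skip is between 1 and m - 1, each permutation visits every slot, so the fill loop always terminates. Taking turns is what gives each backend an almost equal share of entries.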

The Hashed Port Problem

When the hash covers the full 5-tuple, including src port and dst port, every packet of a TCP connection maps to the same backend. That is what keeps a connection intact, but it means an L4 balancer distributes connections, not requests. With HTTP/1.1 persistent connections the client opens one TCP connection (src port X) and sends its requests over it one after another; HTTP/2 goes further and multiplexes many concurrent requests over a single TCP connection. In both cases all of those requests land on one backend, so a few busy clients can create a load imbalance. The mitigation described here is a Two-Level Hash: the first hash maps the client IP to a "connection group," and the second uses a flow label to spread that client's flows across the backends in the group. Note that the flow label is an IPv6 header field rather than a TCP option, and that the scheme can only separate distinct flows: requests multiplexed inside one TCP connection still reach a single backend unless an L7 proxy terminates the connection. The quoted benefit is 10% better load distribution for HTTP/2 traffic, at the cost of requiring clients to set the flow label.

Direct Server Return: The Asymmetric Path Optimization

Direct Server Return (DSR), also known as Triangular Routing, eliminates the load balancer as a bottleneck for return traffic. In the standard proxy (NAT) model, the client sends a request to the VIP, the load balancer rewrites the destination IP to the backend server's address, the backend processes the request, and the response must flow back through the load balancer (which then rewrites the source IP back to the VIP). This creates a bottleneck: the LB must process both inbound and outbound traffic, doubling its throughput requirement when the two are similar in size. In DSR, the load balancer leaves the destination IP (the VIP) untouched and only rewrites the destination MAC to the backend's MAC, or tunnels the packet to the backend. The backend then sends the response directly to the client, bypassing the LB entirely. The path is asymmetric: request goes client → LB → server, response goes server → client:

R_{LB,\,proxy} = R_{ingress} + R_{egress}, \qquad R_{LB,\,DSR} = R_{ingress}

When request and response volumes are similar, this halves the LB throughput requirement: a 100 Gbps LB can front 100 Gbps of client requests instead of 50 Gbps. For typical web traffic, where responses are much larger than requests, the saving is larger still. The implementation requires that each backend server holds the VIP on a loopback interface, so that it accepts packets addressed to the VIP and the client sees the correct source IP on the response, and that it never answers ARP for the VIP, otherwise it would compete with the LB for the address. In Linux, this is typically done by adding the VIP to the loopback interface and setting arp_ignore = 1 and arp_announce = 2; rp_filter = 2 (loose mode) is needed when requests arrive on a different interface than the one used to reach the client, as in tunnel-based DSR. DSR is a common configuration for large-scale L4 LBs (Google's Maglev is a documented example) because it cuts the hardware cost and eliminates the LB as a latency bottleneck for response data. The trade-off is visibility: the LB sees only the inbound half of each flow, so it cannot inspect, buffer, or retry responses. If a backend fails, in-flight response packets are lost because the LB holds no copy of the TCP stream. The application must handle the retry at the client side or use a dual-LB configuration where a secondary LB monitors for server failure and injects RST packets on behalf of the failed server.
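Returning to the throughput argument, a short worked example shows how much traffic the load balancer itself has to forward in each mode. The function name and the traffic figures are hypothetical; the only assumption is that a proxying LB carries both directions while a DSR LB carries requests only.

```python
def lb_load_gbps(ingress_gbps, egress_gbps, dsr):
    """Traffic the load balancer itself must forward, in Gbps."""
    if dsr:
        return ingress_gbps               # responses bypass the LB
    return ingress_gbps + egress_gbps     # proxy mode carries both directions


# Symmetric traffic: DSR halves the requirement.
print(lb_load_gbps(50, 50, dsr=False))  # 100
print(lb_load_gbps(50, 50, dsr=True))   # 50

# Response-heavy traffic (e.g. downloads): the saving is much larger.
print(lb_load_gbps(10, 90, dsr=False))  # 100
print(lb_load_gbps(10, 90, dsr=True))   # 10
```

The "halving" figure is therefore the conservative case; the more response-heavy the workload, the more capacity DSR frees on the balancer.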
