In a Nutshell

As traffic grows beyond the capacity of a single server, we must distribute requests across a pool of resources. The Load Balancer (LB) is the traffic cop that makes this decision. This article analyzes common algorithms, from simple Round Robin to Least Connections, and the complexities of session persistence.

The "Traffic Cop" Problem

The goal of load balancing is to maximize throughput, minimize response time, and ensure no single server is overwhelmed.
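The two algorithms named above differ in how they make the "traffic cop" decision: Round Robin cycles through the pool in a fixed order, while Least Connections routes each new request to the server currently handling the fewest active connections. A minimal sketch (class and method names are illustrative, not from any particular LB product):

```python
import itertools

class RoundRobinBalancer:
    """Cycle through servers in a fixed order, ignoring current load."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Route each request to the server with the fewest active connections."""

    def __init__(self, servers):
        # Track active connection counts per server.
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1   # caller must call release() when done
        return server

    def release(self, server):
        self.active[server] -= 1
```

Round Robin is a good default for uniform, short-lived requests; Least Connections tends to win when request durations vary widely, because slow requests stop piling up on one server.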

Health Checks: The Pulse of the Pool

An algorithm is useless if it sends traffic to a dead server. Load balancers perform periodic Health Checks.

  • L3/L4 Check: Can I ping the server? Is port 443 open?
  • L7 (Active) Check: Does the `/health` endpoint return a `200 OK`? This detects application-level hangs even if the network stack is up.
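In practice a single failed probe should not eject a server, or transient packet loss would cause flapping. Real load balancers require several consecutive failures before marking a server down, and several consecutive successes before bringing it back. A sketch of that threshold logic (the `fall`/`rise` names mirror common LB terminology; the class itself is illustrative):

```python
from dataclasses import dataclass

@dataclass
class HealthChecker:
    """Track one server's health from periodic probe results.

    The server is marked down only after `fall` consecutive failed
    probes, and up again only after `rise` consecutive successes,
    which prevents flapping on a single dropped check.
    """
    fall: int = 3
    rise: int = 2
    healthy: bool = True
    _streak: int = 0   # consecutive probes disagreeing with current state

    def record(self, probe_ok: bool) -> bool:
        if probe_ok == self.healthy:
            self._streak = 0          # probe agrees with state; reset
        else:
            self._streak += 1
            needed = self.fall if self.healthy else self.rise
            if self._streak >= needed:
                self.healthy = not self.healthy
                self._streak = 0
        return self.healthy
```

The probe feeding `record()` would be the L7 check described above, e.g. an HTTP GET against `/health` that counts anything other than a `200 OK` (or a timeout) as a failure.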

Session Persistence (Sticky Sessions)

Many legacy applications store session state in local server memory. If a user's first request goes to Server A and their second goes to Server B, that state is missing and the user is effectively logged out.

We solve this with Session Affinity (using cookies or IP hashing), though modern architectures prefer Stateless Services, where session data lives in a shared Redis cache and any server can handle any request.
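The IP-hashing flavor of affinity can be sketched in a few lines: hash the client address deterministically, so the same client always maps to the same server while the pool is unchanged (the function name is illustrative):

```python
import hashlib

def pick_by_ip(client_ip: str, servers: list) -> str:
    """Map a client IP to a server deterministically.

    The same IP always hashes to the same index, so repeat requests
    stick to one server without the LB storing any per-client state.
    """
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Note the weakness this exposes: adding or removing a server changes `len(servers)` and reshuffles most clients, which is one reason stateless services backed by shared storage are the preferred modern design (consistent hashing is the usual mitigation when affinity must stay).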

Conclusion

Load balancing is the foundation of high availability. By choosing the right algorithm for your traffic pattern, you transform a fragile single point of failure into a resilient, scalable cluster.

