Microservices Latency (IPC)
The Cost of Distributed Intelligence
1. The Monolith vs. Microservice Tax
In a monolith, calling Method B from Method A is a function call on the stack. In a microservice architecture, that same call becomes a network request that involves:
- Serialization (JSON/Binary).
- The TCP Handshake (or connection reuse).
- Network propagation.
- Deserialization at the target.
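The steps above can be made concrete with a small sketch. It compares a direct function call with the same call wrapped in a JSON encode/decode round trip. The function names, the order payload, and the tax rule are hypothetical, and the network hop is deliberately left out, so this isolates only the serialization and deserialization steps.

```python
import json
import time


def price_with_tax(order: dict) -> dict:
    """Hypothetical 'Method B': trivial business logic (20% tax, integer cents)."""
    return {
        "order_id": order["order_id"],
        "total_cents": order["subtotal_cents"] * 120 // 100,
    }


def call_in_process(order: dict) -> dict:
    # Monolith: a plain function call on the stack.
    return price_with_tax(order)


def call_via_wire_format(order: dict) -> dict:
    # Microservice stand-in: encode the request, decode it at the "target",
    # run the logic, then encode and decode the response.
    # No socket is involved; only the (de)serialization cost is modeled.
    request_bytes = json.dumps(order).encode("utf-8")
    decoded_request = json.loads(request_bytes)
    response_bytes = json.dumps(price_with_tax(decoded_request)).encode("utf-8")
    return json.loads(response_bytes)


def time_calls(fn, order: dict, iterations: int = 10_000) -> float:
    """Return the mean wall-clock seconds per call."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn(order)
    return (time.perf_counter() - start) / iterations


if __name__ == "__main__":
    sample = {"order_id": 42, "subtotal_cents": 9950, "items": ["a", "b", "c"]}
    direct = time_calls(call_in_process, sample)
    wire = time_calls(call_via_wire_format, sample)
    print(f"in-process:  {direct * 1e6:.2f} us per call")
    print(f"JSON round trip: {wire * 1e6:.2f} us per call")
```

Both paths return the same result; the absolute timings depend on the machine and payload, so treat them as a relative comparison only.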
The Serialization Tax
Serialization Overhead
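In high-throughput environments, the CPU time spent translating objects to and from JSON strings can exceed the actual compute time of the microservice. gRPC reduces this cost by using Protocol Buffers, a compact binary wire format that is cheaper to encode and parse than text JSON. It still requires an encode/decode step; zero-copy formats such as FlatBuffers or Cap'n Proto go further by reading fields directly from the buffer's memory layout.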
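To illustrate why a binary encoding is cheaper than text, here is a minimal sketch that encodes the same record as JSON and as a fixed-layout binary struct. The record fields and the layout are assumptions made for illustration, and Python's `struct` module is only a stand-in: it is not the Protocol Buffers wire format, which uses field tags and variable-length integers.

```python
import json
import struct

# Hypothetical record layout, little-endian with no padding:
# user_id (uint32), timestamp_ms (uint64), score (float64) = 20 bytes.
RECORD = struct.Struct("<IQd")


def encode_json(user_id: int, timestamp_ms: int, score: float) -> bytes:
    """Text encoding: field names and digits are spelled out as characters."""
    payload = {"user_id": user_id, "timestamp_ms": timestamp_ms, "score": score}
    return json.dumps(payload).encode("utf-8")


def encode_binary(user_id: int, timestamp_ms: int, score: float) -> bytes:
    """Binary encoding: fixed-width fields, no field names on the wire."""
    return RECORD.pack(user_id, timestamp_ms, score)


def decode_binary(payload: bytes) -> tuple:
    """Inverse of encode_binary; returns (user_id, timestamp_ms, score)."""
    return RECORD.unpack(payload)


if __name__ == "__main__":
    record = (123456, 1700000000000, 0.75)
    print("JSON bytes:  ", len(encode_json(*record)))
    print("binary bytes:", len(encode_binary(*record)))
```

The binary form is always 20 bytes for this layout, while the JSON form grows with the field names and the number of digits. The trade-off is that both sides must agree on the schema in advance.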
2. The 'Sidecar' Tax: Envoy & Service Meshes
Modern service meshes like Istio and Linkerd run a sidecar proxy next to each service (Envoy in Istio's case, linkerd2-proxy in Linkerd's) to handle security, TLS termination, and observability. While powerful, this architectural pattern introduces a "tax" on every request.
| Hop Stage | Latency (Typical) | Accumulated |
|---|---|---|
| Source Service $\to$ Source Sidecar | ~0.5ms | 0.5ms |
| Source Sidecar $\to$ Destination Sidecar | ~1-5ms (Network) | 1.5 - 5.5ms |
| Dest Sidecar $\to$ Target Service | ~0.5ms | 2.0 - 6.0ms |
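In a deep microservice call chain (e.g., 5 services deep), these per-hop costs add up. At 2 to 6 ms per hop, five sequential hops cost roughly 10 to 30 ms in each direction. That overhead alone consumes a large share of the user's perception threshold (100ms), and combined with the services' own processing time it can push the total response time past it, even if the services themselves are highly optimized.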
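The accumulation can be worked through in a few lines. This sketch uses the typical figures from the table and assumes that calls are strictly sequential and that the response path pays the same per-hop cost as the request path; both are simplifying assumptions, and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyRange:
    """A best-case / worst-case latency pair in milliseconds."""
    low_ms: float
    high_ms: float

    def __add__(self, other: "LatencyRange") -> "LatencyRange":
        return LatencyRange(self.low_ms + other.low_ms, self.high_ms + other.high_ms)

    def scaled(self, factor: int) -> "LatencyRange":
        return LatencyRange(self.low_ms * factor, self.high_ms * factor)


# Typical per-stage figures taken from the table above.
SOURCE_TO_SIDECAR = LatencyRange(0.5, 0.5)
SIDECAR_TO_SIDECAR = LatencyRange(1.0, 5.0)
SIDECAR_TO_TARGET = LatencyRange(0.5, 0.5)


def per_hop() -> LatencyRange:
    """Overhead of one service-to-service hop through two sidecars."""
    return SOURCE_TO_SIDECAR + SIDECAR_TO_SIDECAR + SIDECAR_TO_TARGET


def chain_overhead(hops: int, round_trip: bool = True) -> LatencyRange:
    """Total overhead for `hops` sequential hops (assumed symmetric return path)."""
    one_way = per_hop().scaled(hops)
    return one_way.scaled(2) if round_trip else one_way


if __name__ == "__main__":
    print("per hop:          ", per_hop())
    print("5 hops, one way:  ", chain_overhead(5, round_trip=False))
    print("5 hops, round trip:", chain_overhead(5))
```

Under these assumptions, five hops cost 10 to 30 ms one way and 20 to 60 ms round trip, before any service has done useful work.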
3. eBPF: Bypassing the TCP Stack
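A promising approach to IPC latency is eBPF-based socket redirection (used in the Cilium project). In a standard sidecar setup, data sent from the application to its sidecar on the same host still travels down through the full TCP/IP stack, across the loopback interface, and back up the stack to the sidecar's socket.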
With eBPF, the kernel can "short-circuit" this path at the sockmap level. When the eBPF program finds that both sockets are on the same host, it moves the data directly from one socket to the other, bypassing the TCP/IP stack for that hop.
[Figure: eBPF Performance Gain. Typical sidecar latency drop when using eBPF socket redirection.]
By removing the traversal of the kernel network stack for same-host hops, eBPF lets sidecar-based architectures narrow the gap with monolithic applications, although the network latency between hosts remains.
Conclusion
Distributed systems are systems of tradeoffs. To build a high-performance cloud application, you must account for the microseconds lost in translation and the milliseconds lost in flight.