Quantifying RoCE v2 Header Overhead
The Encapsulation Tax
Unlike native InfiniBand, which runs on its own link and network layers, **RoCE v2 (RDMA over Converged Ethernet, version 2)** encapsulates RDMA traffic in IP and UDP so that it can traverse standard L3 routers. This lets RDMA scale across routed, multi-vendor Ethernet fabrics, but every packet pays an "encapsulation tax" in header bytes that reduces the network's effective goodput.
Ethernet + IP + UDP
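The outer envelope consists of Ethernet framing (a 14-byte header, or 18 bytes once the 4-byte FCS or an 802.1Q VLAN tag is counted), the IPv4 header (20 bytes without options), and the UDP header (8 bytes). In a RoCE v2 fabric, these 42-46 bytes are mandatory on every packet, regardless of payload size.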
BTH + ICRC
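The InfiniBand Base Transport Header (BTH, 12 bytes) and the Invariant CRC (ICRC, 4 bytes) travel inside the UDP datagram: the BTH at the start of the UDP payload and the ICRC at its end. The BTH carries the opcode, destination queue pair, and packet sequence number that the transport uses for reliable delivery and direct memory access. The ICRC protects the fields that do not change in transit.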
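To make the 12-byte figure concrete, the following is a minimal sketch of how a BTH could be packed in Python. The function name `pack_bth`, its parameters, and the example values (opcode `0x04`, queue pair `0x12`, PSN `100`) are illustrative assumptions, not part of any real library, and the upper 8 bits of the queue-pair word are simply left at zero.

```python
import struct

def pack_bth(opcode, dest_qp, psn, pkey=0xFFFF, ack_req=False,
             solicited=False, pad_count=0):
    """Pack a 12-byte Base Transport Header (simplified, hypothetical helper)."""
    if not (0 <= dest_qp < 1 << 24 and 0 <= psn < 1 << 24):
        raise ValueError("dest_qp and psn are 24-bit fields")
    # Byte 1: SE (1 bit) | M (1 bit) | PadCnt (2 bits) | TVer (4 bits, zero here)
    flags = (int(solicited) << 7) | ((pad_count & 0x3) << 4)
    # Bytes 4-7: upper 8 bits left at zero, lower 24 bits = destination QP
    qp_word = dest_qp
    # Bytes 8-11: AckReq (1 bit) + 7 reserved bits + 24-bit PSN
    psn_word = (int(ack_req) << 31) | psn
    # "!" selects network byte order with no padding: 1 + 1 + 2 + 4 + 4 = 12 bytes
    return struct.pack("!BBHII", opcode, flags, pkey, qp_word, psn_word)

# Example values are assumptions chosen only to show the layout.
bth = pack_bth(opcode=0x04, dest_qp=0x000012, psn=100, ack_req=True)
print(len(bth), bth.hex())
```

The header always comes out at 12 bytes because every field has a fixed width. The 4-byte ICRC would be appended after the RDMA payload, which is not modeled here.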
Comparative Efficiency Table
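| Header Layer | RoCE v2 Size | Impact |
|---|---|---|
| L2 Ethernet header + FCS | 18 bytes | Standard framing |
| L3 IPv4 header | 20 bytes | Allows routing across L3 |
| L4 UDP header | 8 bytes | Source port provides entropy for ECMP load balancing |
| IB BTH + ICRC | 16 bytes | RDMA transport header and integrity check |
| L1 inter-packet gap | 12 bytes | Minimum idle time between frames |
| Total overhead | 74 bytes | 62 bytes of headers plus the inter-packet gap; the 8-byte preamble/SFD is not counted |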
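The table's total can be reproduced with a short calculation. This sketch assumes the 74-byte figure is the 62 bytes of headers plus the 12-byte minimum Ethernet inter-packet gap. The dictionary keys and constant names are illustrative, and the 8-byte preamble/SFD is shown separately because the table does not appear to count it.

```python
# Per-packet overhead in bytes, taken from the table above.
HEADERS = {
    "ethernet_with_fcs": 18,
    "ipv4": 20,
    "udp": 8,
    "bth_plus_icrc": 16,
}
INTER_PACKET_GAP = 12  # assumption: minimum Ethernet idle time, in byte times
PREAMBLE_SFD = 8       # assumption: excluded from the table's 74-byte figure

header_bytes = sum(HEADERS.values())             # headers only
table_total = header_bytes + INTER_PACKET_GAP    # figure quoted in the table
wire_total = table_total + PREAMBLE_SFD          # full per-packet wire cost

print(header_bytes, table_total, wire_total)  # 62 74 82
```

If the preamble and start-of-frame delimiter are included, the per-packet cost on the wire rises to 82 bytes, which shifts the percentages below slightly upward.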
Optimizing Effective Bandwidth
To minimize the impact of the header tax, AI infrastructure teams typically focus on two variables:
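- Jumbo Frames (MTU 9000): Raising the Ethernet MTU from 1500 to 9000 bytes shrinks the relative weight of the 74-byte overhead from ~5% to less than 1%. In practice, the RDMA payload per packet follows the InfiniBand path MTU, which tops out at 4096 bytes, so the realized per-packet overhead is closer to 2%.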
- Hardware Offloading: Modern DPUs such as BlueField-3 handle header encapsulation and decapsulation in silicon, so per-packet header processing consumes no host CPU cycles. Offloading does not reduce the bytes on the wire; it only keeps the header work off the host.
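The jumbo-frame arithmetic above can be checked with a few lines of Python. This is a rough sketch: it treats the 74-byte figure as fixed per-packet overhead, and the 400 Gb/s line rate and the function names are assumptions chosen for illustration.

```python
OVERHEAD_BYTES = 74  # per-packet figure from the table

def overhead_ratio(payload_bytes, overhead=OVERHEAD_BYTES):
    """Overhead relative to payload size (how the ~5% figure is derived)."""
    return overhead / payload_bytes

def goodput_gbps(line_rate_gbps, payload_bytes, overhead=OVERHEAD_BYTES):
    """Share of the line rate left for payload, in Gb/s."""
    return line_rate_gbps * payload_bytes / (payload_bytes + overhead)

LINE_RATE_GBPS = 400  # assumption: an illustrative link speed

for payload in (1500, 9000):
    ratio = overhead_ratio(payload)
    goodput = goodput_gbps(LINE_RATE_GBPS, payload)
    print(f"{payload} B payload: {ratio:.2%} overhead, {goodput:.1f} Gb/s goodput")
```

Passing 4096 as the payload size gives the corresponding figures for the largest InfiniBand path MTU.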
