The Ethernet Evolution

Traditional Ethernet was built on the principle of best-effort delivery. If a switch buffer overflowed, the switch dropped frames and relied on TCP retransmission to fill the gaps. For RDMA-based AI clusters, this "drop-and-retry" cycle introduces millisecond-level tail latencies that kill performance. **Priority Flow Control (PFC)** and **Enhanced Transmission Selection (ETS)** turn lossy Ethernet into a high-performance lossless fabric: PFC prevents drops for selected traffic classes, and ETS divides link bandwidth among them.

[Interactive simulator: Lossless Fabric Simulator, real-time buffer management and scheduling. Traffic classes shown: RDMA (Priority 3, 80% bandwidth), NVMe-oF (Priority 4, 15% bandwidth), MGMT (Priority 0, 5% bandwidth), each with a buffer gauge and XOFF/XON markers.]

Priority Flow Control (PFC)

PFC operates at Layer 2 to pause specific traffic classes when buffers fill up (XOFF). This prevents frame loss without blocking the entire physical link, maintaining a 'lossless' environment for RDMA.
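The per-class pause behaviour described above can be sketched as a small state machine. This is an illustrative model only: the class name `PfcIngressQueue`, the byte-based thresholds, and the `update` method are assumptions made for this example, not a switch vendor's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PfcIngressQueue:
    """Buffer accounting for one PFC priority (hypothetical model)."""
    xoff: int              # occupancy in bytes at which the sender is paused
    xon: int               # occupancy in bytes at which the sender may resume
    occupancy: int = 0
    paused: bool = False

    def update(self, delta: int) -> Optional[str]:
        """Apply a change in occupancy; return "XOFF" or "XON" on a state flip."""
        self.occupancy = max(0, self.occupancy + delta)
        if not self.paused and self.occupancy >= self.xoff:
            self.paused = True
            return "XOFF"
        if self.paused and self.occupancy <= self.xon:
            self.paused = False
            return "XON"
        return None


# One queue per priority: pausing priority 3 leaves priority 0 untouched.
queues = {3: PfcIngressQueue(xoff=800, xon=400), 0: PfcIngressQueue(xoff=800, xon=400)}
signal = queues[3].update(900)   # RDMA burst crosses XOFF
print(signal, queues[3].paused, queues[0].paused)
```

Keeping XON below XOFF gives the queue hysteresis, so it does not flap between paused and resumed on every frame.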


Observe how PFC pauses specific traffic classes (RDMA) while ETS manages bandwidth allocation across the link.

PFC: Priority Flow Control (802.1Qbb)

PFC operates at the link layer and provides flow control independently for each of the eight 802.1p priorities. When a downstream switch’s ingress buffer for a priority reaches its XOFF threshold, the switch sends a PFC pause frame to the upstream sender for that priority only (e.g., Priority 3 for RDMA), while the other priorities keep flowing.
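To make the per-priority pause concrete, here is a minimal sketch that assembles the bytes of a PFC frame: the reserved MAC Control destination address, EtherType 0x8808, opcode 0x0101, a priority-enable vector, and eight 16-bit pause timers measured in quanta of 512 bit times. The function name `build_pfc_frame` and the all-zero source MAC in the usage line are placeholders for illustration; the frame check sequence, which the NIC normally appends, is left out.

```python
import struct

PFC_DST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PFC_OPCODE = 0x0101


def build_pfc_frame(src_mac: bytes, pause_quanta: dict) -> bytes:
    """Build a PFC frame (without FCS).

    pause_quanta maps priority (0-7) to a pause time in quanta of
    512 bit times; a value of 0 tells the sender it may resume.
    """
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    enable_vector = 0
    timers = [0] * 8
    for prio, quanta in pause_quanta.items():
        if not 0 <= prio <= 7:
            raise ValueError("priority must be 0-7")
        if not 0 <= quanta <= 0xFFFF:
            raise ValueError("pause time must fit in 16 bits")
        enable_vector |= 1 << prio   # mark this priority's timer as valid
        timers[prio] = quanta
    header = PFC_DST_MAC + src_mac
    body = struct.pack("!HHH8H", MAC_CONTROL_ETHERTYPE, PFC_OPCODE,
                       enable_vector, *timers)
    return (header + body).ljust(60, b"\x00")  # pad to minimum frame size


# Pause priority 3 for the maximum time; other priorities are unaffected.
frame = build_pfc_frame(bytes(6), {3: 0xFFFF})
print(frame.hex())
```

Because only bit 3 of the enable vector is set, a receiver ignores the timers of the other seven priorities, which is what lets management traffic keep flowing while RDMA is paused.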

XOFF/XON Thresholds

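The XOFF threshold triggers a pause, while the XON threshold signals that transmission may resume. At 800G, these thresholds must be tuned carefully: the buffer headroom above XOFF has to absorb the data still in flight while the pause takes effect, so setting XOFF too high risks a drop and setting it too low wastes buffer space.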
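A rough way to see why link speed matters is to estimate the headroom needed above XOFF. The sketch below uses a simplified model that is an assumption of this example, not a vendor formula: data in flight during one cable round trip (at roughly 5 ns per metre), plus an optional sender response time, plus two maximum-size frames (one the sender may have just started, one that may delay the pause frame itself). The function name and parameters are hypothetical.

```python
import math


def pfc_headroom_bytes(link_gbps: float, cable_m: float, mtu_bytes: int,
                       response_ns: float = 0.0, ns_per_m: float = 5.0) -> int:
    """Estimate buffer headroom above XOFF (simplified, illustrative model)."""
    bytes_per_ns = link_gbps / 8.0            # 1 Gb/s is 0.125 bytes per ns
    round_trip_ns = 2 * cable_m * ns_per_m    # pause travels out, data keeps arriving
    in_flight = (round_trip_ns + response_ns) * bytes_per_ns
    return math.ceil(in_flight + 2 * mtu_bytes)


print(pfc_headroom_bytes(100, 100, 4096))   # 20692
print(pfc_headroom_bytes(800, 100, 4096))   # 108192
```

Under these assumptions, the same 100 m cable needs roughly five times more headroom per lossless priority at 800G than at 100G, which is why threshold tuning gets tighter as link speeds rise.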

Pause Storm Risk

If a device continuously sends PAUSE frames without draining its buffer, the pause can propagate hop by hop and block the entire traffic path. **PFC watchdogs** are critical for detecting queues that stay paused for too long and for isolating the misbehaving endpoint.
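The detection half of a watchdog can be sketched as a timer over polled queue state. Real implementations are vendor-specific; the class `PfcWatchdog`, the `detect_ms` parameter, and the polling inputs below are hypothetical and only illustrate the idea of flagging a queue that has been paused without making progress for too long.

```python
from typing import Optional


class PfcWatchdog:
    """Flags a queue that stays paused with no transmit progress (sketch)."""

    def __init__(self, detect_ms: int):
        self.detect_ms = detect_ms
        self._stalled_since: Optional[int] = None

    def poll(self, now_ms: int, paused: bool, tx_progress: bool) -> bool:
        """Return True once the queue has been stalled for detect_ms or longer."""
        stalled = paused and not tx_progress
        if not stalled:
            self._stalled_since = None      # any progress resets the timer
            return False
        if self._stalled_since is None:
            self._stalled_since = now_ms    # first poll of this stall
        return now_ms - self._stalled_since >= self.detect_ms


wd = PfcWatchdog(detect_ms=200)
for t in (0, 100, 200):
    print(t, wd.poll(t, paused=True, tx_progress=False))
```

What happens after detection (dropping the stuck queue's traffic, ignoring incoming pause frames for a while, or shutting the port) is a policy choice outside this sketch.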

ETS: Enhanced Transmission Selection (802.1Qaz)

While PFC prevents drops, ETS ensures fair bandwidth distribution. It allows network architects to define **Bandwidth Groups** (traffic classes) and assign each a weight, so that pure strict-priority scheduling no longer starves management and storage traffic. The weights act as guarantees under contention; bandwidth that one class leaves unused is available to the others.

| Traffic Class | Weight (Example) | PFC Status |
| --- | --- | --- |
| Priority 3 (RoCE v2) | 80% | ENABLED |
| Priority 4 (Storage) | 15% | ENABLED |
| Priority 0 (Management) | 5% | DISABLED |
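The example weights above can be illustrated with a weighted scheduler. The sketch below uses deficit weighted round robin, which is one common way to realise weighted sharing; it is an assumption of this example and not something the ETS standard mandates. The function name, the quantum, and the 1000-byte frames are hypothetical.

```python
from collections import deque


def dwrr_schedule(queues: dict, weights: dict, quantum_bytes: int, rounds: int) -> dict:
    """Deficit weighted round robin; returns bytes sent per traffic class."""
    deficit = {name: 0 for name in queues}
    sent = {name: 0 for name in queues}
    for _ in range(rounds):
        for name, q in queues.items():
            if not q:
                deficit[name] = 0     # idle classes do not bank credit
                continue
            deficit[name] += quantum_bytes * weights[name]
            # Send frames while the head-of-line frame fits in the credit.
            while q and q[0] <= deficit[name]:
                size = q.popleft()
                deficit[name] -= size
                sent[name] += size
    return sent


# All three classes are backlogged with 1000-byte frames.
queues = {
    "rdma": deque([1000] * 1000),
    "storage": deque([1000] * 1000),
    "mgmt": deque([1000] * 1000),
}
weights = {"rdma": 80, "storage": 15, "mgmt": 5}
sent = dwrr_schedule(queues, weights, quantum_bytes=100, rounds=20)
print(sent)   # {'rdma': 160000, 'storage': 30000, 'mgmt': 10000}
```

With every class backlogged, the byte shares come out at 80/15/5. If a queue is empty, its turn is skipped, so the remaining classes absorb the spare bandwidth.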

RoCE Header Analysis

PFC and ETS are the link-layer mechanisms that carry RoCE v2 packets without loss. Explore our deep dive into the 74-byte header overhead that fuels the AI fabric.


Technical Standards & References

[ieee-802-1qbb] IEEE (2011). IEEE Std 802.1Qbb: Priority-based Flow Control.
[ieee-802-1qaz] IEEE (2011). IEEE Std 802.1Qaz: Enhanced Transmission Selection.