1. BGP EVPN vs. Legacy Flood-and-Learn
In traditional Layer 2 networking, a switch learns where a host is by inspecting the source MAC address of incoming frames. If the destination is unknown, the switch floods the frame out of every port except the one it arrived on. In a large datacenter, this BUM (Broadcast, Unknown Unicast, Multicast) traffic consumes bandwidth on every link and can escalate into a broadcast storm that degrades performance.
The Operational Forensics
Control Plane Learning (EVPN)
Uses Multiprotocol BGP (MP-BGP) to share MAC/IP bindings, so host locations are known before traffic flows. Flooding is kept to a minimum, and the fabric scales to far more hosts.
Data Plane Learning (Legacy)
Relies on flooding to discover hosts. Wastes bandwidth. Hard to troubleshoot. Susceptible to Loops and Spanning Tree failures.
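The data-plane behavior described above can be pictured with a toy model. This is a minimal sketch, not a real switch implementation: the class name `FloodAndLearnSwitch`, the port numbers, and the shortened MAC strings are all hypothetical.

```python
class FloodAndLearnSwitch:
    """Toy Layer 2 switch that learns MAC locations from the data plane."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port where it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn the sender's location from the source MAC.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown unicast: flood to every port except the ingress port.
            return sorted(self.ports - {in_port})
        return [out_port]


sw = FloodAndLearnSwitch(ports=[1, 2, 3, 4])
first = sw.receive(1, "aa:aa", "bb:bb")   # bb:bb is unknown, so the frame is flooded
sw.receive(2, "bb:bb", "aa:aa")           # the reply teaches the switch where bb:bb lives
second = sw.receive(1, "aa:aa", "bb:bb")  # now forwarded out of a single port
```

The first frame reaches three ports and the second only one. With control-plane learning, the table would already be populated before the first frame arrives.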
2. EVPN Route Types: The 5 Pillars of Connectivity
BGP EVPN uses specialized Network Layer Reachability Information (NLRI) to describe the network. There are five critical route types that every architect must master.
Route Type 2 (MAC/IP)
The core of host reachability. It maps a host's MAC and IP to a specific VTEP (Switch). This enables the 'Intelligence' of the fabric.
Route Type 5 (IP Prefix)
Advertises IP prefixes rather than individual host addresses. It is used for routing between different subnets or for external connectivity to the Internet or firewalls.
Route Type 1 & Type 4 (The ESI Combo)
Route Type 4 is used for switches to discover each other on a shared multi-homed link (ESI) and elect a Designated Forwarder. Route Type 1 is used for 'Mass Withdrawal'—if a link fails, a single BGP update can remove all MACs associated with that link, enabling sub-second convergence.
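The effect of mass withdrawal can be sketched as a two-level lookup on a remote VTEP: MACs point at an Ethernet Segment, and the segment points at the VTEPs that can reach it. This is a simplified illustration under that assumption; `RemoteMacTable` and its method names are hypothetical and do not correspond to any vendor's data structures.

```python
from collections import defaultdict


class RemoteMacTable:
    """Toy view of how a remote VTEP resolves MACs behind a multi-homed ESI."""

    def __init__(self):
        self.mac_to_esi = {}                   # filled from Type-2 routes
        self.esi_to_vteps = defaultdict(set)   # filled from Type-1 routes

    def on_type2(self, mac, esi):
        self.mac_to_esi[mac] = esi

    def on_type1_advertise(self, esi, vtep):
        self.esi_to_vteps[esi].add(vtep)

    def on_type1_withdraw(self, esi, vtep):
        # One withdrawal changes the next hops of every MAC on this ESI.
        self.esi_to_vteps[esi].discard(vtep)

    def next_hops(self, mac):
        esi = self.mac_to_esi.get(mac)
        return sorted(self.esi_to_vteps.get(esi, set()))
```

Because the per-MAC entries are never touched, the cost of reacting to a link failure does not grow with the number of hosts behind that link.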
3. Multi-Homing with ESI: Ending Convergence Delay
Legacy multi-chassis link aggregation (MLAG) required proprietary sync protocols. EVPN standardizes this using the Ethernet Segment Identifier (ESI).
The DF Election Logic
In an ESI multi-homing group, the switches perform a 'Designated Forwarder' (DF) election for every VNI. This ensures that only one switch handles the BUM traffic for a given network, preventing duplicate frames and loops without needing Spanning Tree.
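As an illustration, the default election procedure in RFC 7432 ("service carving") orders the candidate switches by IP address and picks the one whose position equals the VLAN or Ethernet tag value modulo the number of candidates. The sketch below assumes that default procedure; the function name `elect_df` is hypothetical, and real platforms may use other election algorithms.

```python
import ipaddress


def elect_df(vtep_ips, vlan_id):
    """Pick the Designated Forwarder with modulo-based service carving."""
    # Order candidates by numeric IP value, not by string comparison.
    ordered = sorted(vtep_ips, key=lambda ip: int(ipaddress.ip_address(ip)))
    # The candidate at position (vlan_id mod N) becomes the DF.
    return ordered[vlan_id % len(ordered)]


members = ["10.0.0.2", "10.0.0.1"]
df_for_100 = elect_df(members, 100)
df_for_101 = elect_df(members, 101)
```

With two members, even-numbered segments land on one switch and odd-numbered segments on the other, which spreads BUM forwarding duty across the group.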
Split Horizon Forensics:
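The VXLAN header has no field for an ESI label, so EVPN over VXLAN implements split horizon with a 'Local Bias' mechanism (RFC 8365). The egress VTEP inspects the outer source IP of the encapsulated packet and does not forward BUM traffic onto any ESI that it shares with the source VTEP. This ensures that a packet sent from one member of an ESI is never reflected back to the same ESI by another member of the group.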
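One way to picture the check is as a membership test on the egress VTEP: BUM traffic is kept off a locally attached Ethernet Segment when the VTEP that sent it is attached to the same segment. This is a minimal sketch; the function name and the `esi_members` mapping are hypothetical.

```python
def egress_may_forward_bum(source_vtep_ip, local_esi, esi_members):
    """Return False when the source VTEP is itself attached to local_esi."""
    return source_vtep_ip not in esi_members.get(local_esi, set())


# Hypothetical topology: two VTEPs share esi-1, a third VTEP does not.
esi_members = {"esi-1": {"10.0.0.1", "10.0.0.2"}}
from_peer = egress_may_forward_bum("10.0.0.1", "esi-1", esi_members)
from_other = egress_may_forward_bum("10.0.0.3", "esi-1", esi_members)
```

A frame that arrived from the ESI peer is suppressed, while a frame from an unrelated VTEP is still eligible for forwarding, subject to the DF election.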
4. ARP Suppression: The Proxy Forensics
ARP traffic is the 'Background Radiation' of a flat network. EVPN silences this noise by using the BGP control plane as a high-speed lookup engine.
The ARP Proxy Path
- H1 sends an ARP Request for H2.
- Switch A (VTEP) intercepts the ARP packet.
- Switch A looks up H2's IP in its BGP EVPN Type-2 table.
- If found, Switch A crafts an ARP Reply locally and sends it back to H1.
- The broadcast packet is dropped and never enters the fabric core.
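The steps above can be sketched as a simple lookup. The function and dictionary names are hypothetical, and the fallback for an unknown IP (flooding the request into the fabric) is an assumption, since the steps only describe the case where the binding is found.

```python
def handle_arp_request(target_ip, evpn_ip_to_mac):
    """Decide how the ingress VTEP treats an ARP request from a local host."""
    mac = evpn_ip_to_mac.get(target_ip)
    if mac is not None:
        # Binding known from a Type-2 route: answer locally, drop the broadcast.
        return {"action": "reply_locally", "target_mac": mac}
    # Assumed fallback: no binding yet, so the request is flooded as BUM traffic.
    return {"action": "flood_to_fabric"}


type2_bindings = {"192.168.10.20": "00:11:22:33:44:55"}
hit = handle_arp_request("192.168.10.20", type2_bindings)
miss = handle_arp_request("192.168.10.99", type2_bindings)
```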
7. VXLAN Data-Plane Encapsulation: VTEP Packet Walk and the Inner-to-Outer Header Mapping
While BGP EVPN provides the control-plane intelligence, the VXLAN (Virtual Extensible LAN, RFC 7348) data plane is what actually transports frames across the IP underlay. A deep understanding of the encapsulation mechanics — from the inner Ethernet frame to the outer UDP/IP header — is essential for MTU planning, hardware offload verification, and troubleshooting silent drops.
The VTEP Encapsulation Pipeline
When a VTEP (VXLAN Tunnel Endpoint) receives a frame from a local server belonging to VNI 100, the hardware ASIC follows a strict pipeline:
- Ingress Classification: The frame is received on a switchport belonging to VLAN 100. The ASIC tags the frame with the internal VLAN ID and performs a MAC lookup in VNI 100's MAC table.
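- VNI Mapping: The ASIC maps VLAN 100 to VNI 100 via the VLAN-to-VNI binding table, which is provisioned locally on the VTEP. Type-3 routes do not create this binding; they advertise which VTEPs participate in the VNI so that BUM traffic can be replicated to them.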
- Remote MAC Lookup: The destination MAC is looked up in the EVPN MAC table (populated by Type-2 routes). The lookup returns the Remote VTEP IP — the underlay IP of the VTEP owning the destination MAC.
- Encapsulation: The ASIC prepends the following headers to the original Ethernet frame: outer MAC (underlay next-hop MAC), outer IP (source = local VTEP IP, destination = remote VTEP IP), outer UDP (destination port = 4789, source port = hash of the inner packet headers for ECMP entropy), and the VXLAN header (VNI = 100).
- Underlay Forwarding: The resulting outer IP packet (1,550 bytes when the inner frame carries a 1,500-byte payload) is forwarded through the IP underlay using standard routing.
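The encapsulation step can be made concrete by building the 8-byte VXLAN header defined in RFC 7348 and deriving a UDP source port from the inner frame. This is a sketch under stated assumptions: hashing the whole inner frame with CRC32 stands in for whatever header fields a real ASIC hashes, and the function names are hypothetical.

```python
import struct
import zlib

VXLAN_UDP_PORT = 4789  # IANA-assigned destination port for VXLAN


def vxlan_header(vni):
    """Build the 8-byte VXLAN header: flags, reserved bits, 24-bit VNI, reserved byte."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    flags_word = 0x08 << 24  # I flag set: the VNI field is valid
    vni_word = vni << 8      # VNI occupies the upper 24 bits of the second word
    return struct.pack("!II", flags_word, vni_word)


def entropy_source_port(inner_frame):
    """Map a hash of the inner frame into the dynamic port range 49152-65535."""
    return 49152 + zlib.crc32(inner_frame) % 16384


header = vxlan_header(100)
port = entropy_source_port(b"example inner frame bytes")
```

Because the source port is a deterministic function of the inner flow, packets of one flow stay on one underlay path while different flows spread across ECMP links.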
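With an IPv4 underlay, the encapsulation adds 50 bytes: 14 (outer Ethernet) + 20 (outer IPv4) + 8 (UDP) + 8 (VXLAN). A standard 1,514-byte inner Ethernet frame (14-byte header + 1,500-byte payload) therefore becomes a 1,550-byte outer IP packet, or 1,564 bytes on the wire once the outer Ethernet header is counted. If the inner frame keeps an 802.1Q tag, the outer IP packet grows to 1,554 bytes. If the underlay MTU is 1,500 bytes, this packet will be fragmented or dropped. This is why the underlay MTU must be set to at least 1,554 bytes (or ideally 1,600 bytes for headroom) on all infrastructure links carrying VXLAN traffic. An undersized underlay MTU is one of the most common causes of "black-hole" traffic in new EVPN deployments.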
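The arithmetic can be checked with a short calculation. It assumes an IPv4 underlay with no outer VLAN tag and no IP options. Under those assumptions the outer IP packet is 1,550 bytes for an untagged inner frame and 1,554 bytes when the inner frame keeps an 802.1Q tag. The function names are hypothetical.

```python
INNER_ETHERNET = 14  # inner Ethernet header
DOT1Q_TAG = 4        # optional inner 802.1Q tag
VXLAN = 8            # VXLAN header
OUTER_UDP = 8        # outer UDP header
OUTER_IPV4 = 20      # outer IPv4 header without options
OUTER_ETHERNET = 14  # outer Ethernet header (not counted against the IP MTU)


def outer_ip_packet_size(inner_payload, inner_vlan_tag=False):
    """Size of the outer IP packet, which is what the underlay IP MTU must fit."""
    inner_frame = INNER_ETHERNET + (DOT1Q_TAG if inner_vlan_tag else 0) + inner_payload
    return inner_frame + VXLAN + OUTER_UDP + OUTER_IPV4


def wire_frame_size(inner_payload, inner_vlan_tag=False):
    """Size of the outer Ethernet frame on the wire, excluding the FCS."""
    return outer_ip_packet_size(inner_payload, inner_vlan_tag) + OUTER_ETHERNET


def fits_underlay(underlay_ip_mtu, inner_payload, inner_vlan_tag=False):
    """True when the encapsulated packet fits without fragmentation."""
    return outer_ip_packet_size(inner_payload, inner_vlan_tag) <= underlay_ip_mtu


untagged = outer_ip_packet_size(1500)
tagged = outer_ip_packet_size(1500, inner_vlan_tag=True)
```

Running `fits_underlay(1500, 1500)` returns False, which is the black-hole case, while an underlay MTU of 1,600 leaves room for both the tagged and untagged cases.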
Hardware VTEP Offload: SmartNIC and DPU Integration
In hypervisor environments, the VTEP function can run in software (for example, the Open vSwitch kernel module) or be offloaded to a SmartNIC or DPU (Data Processing Unit). NVIDIA's BlueField-3 DPU implements the full VXLAN encapsulation pipeline in hardware, achieving line-rate 200 GbE VTEP performance without spending host CPU cycles on encapsulation. The DPU maintains its own EVPN MAC table (up to 256,000 entries), populated by the control-plane agent running on the DPU's embedded Arm cores. The host OS is unaware of the VXLAN encapsulation: it sees a standard Ethernet interface with MTU 1,500, while the DPU adds the 50-byte overhead on its uplink. Because RFC 7348 states that VTEPs must not fragment VXLAN packets, this works only when the uplink and the underlay carry an MTU large enough for the encapsulated packet.
8. EVPN Route Target Filtering and Multi-Tenant VNI Scoping
In a multi-tenant EVPN fabric, a single BGP session carries routes for potentially hundreds of tenants, each with its own VNI. Without filtering, a VTEP that serves only VNI 100 would also receive the MAC/IP advertisements of VNI 200, install irrelevant MAC entries, and waste TCAM space. Route Target (RT) filtering is the EVPN mechanism for scoping route propagation to only the VNIs that a given VTEP participates in.
The Route Target Constraint Mechanism
Each VNI in EVPN is associated with a Route Target (RT), an extended community attribute (8 bytes: 2-byte Type + 6-byte Value, typically written in ASN:number form, such as 65000:100 for VNI 100). The RT is attached to every Type-2, Type-3, and Type-5 route belonging to that VNI. On the receiving VTEP, the BGP process checks each received route's RT against the import RT list configured for each local VNI. If the RT matches no local VNI, the route is not imported; it is discarded before it consumes MAC table memory.
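The import check amounts to a set intersection between the RTs on a route and the import list of each local VNI. This is a minimal sketch; the function name and the `vni_import_rts` layout are hypothetical, and the RT values follow the 65000:100 style used in this article.

```python
def importing_vnis(route_rts, vni_import_rts):
    """Return the local VNIs whose import RT list matches any RT on the route."""
    route_rts = set(route_rts)
    return sorted(vni for vni, rts in vni_import_rts.items() if rts & route_rts)


# Hypothetical VTEP that participates in VNI 100 and VNI 300 only.
local_imports = {100: {"65000:100"}, 300: {"65000:300"}}
accepted = importing_vnis(["65000:100"], local_imports)
rejected = importing_vnis(["65000:200"], local_imports)
```

An empty result means the route matches no local VNI and can be discarded.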
This filtering occurs in the BGP control plane, not the data plane. The BGP RT Constraint optimization (RFC 4684) goes a step further: instead of receiving all routes and filtering locally, the VTEP advertises a Route Target Membership NLRI (RTM-NLRI) to its Route Reflector, indicating which RTs it is interested in. The RR then sends only matching routes, which can reduce the BGP UPDATE rate on each VTEP by 60–90% in a multi-tenant fabric. A membership advertisement for VNI 100 carries:
AFI: 1 (IPv4), SAFI: 132 (RT Membership)
Route Target: 65000:100 (VNI 100)
Action: Import (the advertising VTEP wants to receive this VNI's routes)
The RT Constraint with Multiple Route Reflectors
In a fabric with multiple Route Reflectors for redundancy, each RR independently receives RTM-NLRIs from each VTEP. An RR advertises routes carrying RT 65000:100 only to the clients that sent a matching RTM-NLRI; if no client did, it leaves those routes out of its outbound UPDATEs entirely. This is called RT Pruning, and it is the mechanism that prevents a VNI configured on only 5 out of 500 leaf switches from wasting BGP processing on the other 495 leaves. Cumulus Networks reported that RT Constraint reduced BGP memory usage by 55% in a 1,000-leaf fabric with 200 distinct VNIs.
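The pruning behavior can be sketched as a per-RT membership table on the Route Reflector. This is a toy model under the assumption that the RR tracks interested clients per RT; the class `RouteReflector` and its method names are hypothetical and do not model a real BGP implementation.

```python
from collections import defaultdict


class RouteReflector:
    """Toy RR that sends a route only to clients interested in one of its RTs."""

    def __init__(self):
        self.membership = defaultdict(set)  # RT -> clients that advertised it

    def on_rt_membership(self, client, rt):
        self.membership[rt].add(client)

    def recipients(self, route_rts, sender):
        interested = set()
        for rt in route_rts:
            interested |= self.membership.get(rt, set())
        interested.discard(sender)  # never reflect a route back to its sender
        return sorted(interested)


rr = RouteReflector()
rr.on_rt_membership("leaf1", "65000:100")
rr.on_rt_membership("leaf2", "65000:100")
rr.on_rt_membership("leaf3", "65000:200")
to_vni100 = rr.recipients(["65000:100"], sender="leaf1")
to_unused = rr.recipients(["65000:999"], sender="leaf1")
```

Each RR in a redundant pair would hold its own copy of this table, built from the RTM-NLRIs it receives directly.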