IGMP Snooping & L2 Multicast
Optimizing Distribution in Switched Fabrics
The Multicast-to-Broadcast Fallacy
By default, an Ethernet switch is entirely unaware of IP Multicast groups. When a multicast frame arrives with a destination MAC address not present in the CAM (Content Addressable Memory) table, the switch follows standard unknown-unicast/multicast behavior: it floods the frame out of every port in the VLAN except the ingress port. In a 48-port access switch, one multicast stream floods 47 ports with unwanted traffic.
For a single video stream, this means 47 endpoints — servers, printers, VoIP phones, workstations — receive a stream none of them requested, consuming CPU cycles for packet processing and injecting noise onto every link. Scale this to 20 multicast streams in a broadcast domain, and available bandwidth collapses for every host on the VLAN.
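The flooding described above happens because an IPv4 group address is carried in a frame whose destination MAC is derived from the group, and nothing in ordinary MAC learning ever installs that address. A minimal sketch of the standard group-to-MAC mapping (prefix 01:00:5e plus the low 23 bits of the group address); the function name `ipv4_group_to_mac` is a hypothetical helper, not part of any switch API:

```python
import ipaddress

def ipv4_group_to_mac(group):
    """Map an IPv4 multicast group to its Ethernet destination MAC."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF          # only 23 bits of the group survive
    mac = (0x01005E << 24) | low23        # fixed 01:00:5e prefix
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))

print(ipv4_group_to_mac("239.1.1.1"))    # 01:00:5e:01:01:01
print(ipv4_group_to_mac("224.129.1.1"))  # 01:00:5e:01:01:01 (same MAC)
```

Because only 23 bits are copied, 32 different groups share each MAC address, which is one reason snooping implementations often track the IP group rather than the MAC alone.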
How IGMP Snooping Intervenes
IGMP Snooping allows the Layer 2 switch to examine (snoop) IGMP control messages exchanged between hosts and routers — without being a Layer 3 device itself. When the switch detects an IGMP Membership Report (Join) on a port, it records the port number and multicast group address in its Layer 2 Multicast Table (separate from the standard MAC address table).
From this point, traffic for that multicast group is forwarded only to ports with active memberships plus the configured MRouter port. Ports with no active members receive nothing — no wasted bandwidth, no phantom CPU load on end hosts. This is the definition of Layer 2 Multicast Pruning.
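The join, leave, and pruning behavior can be sketched as a small table keyed by VLAN and group. This is an illustrative model only: the class `SnoopingTable` and its method names are assumptions made for the example, not a vendor data structure.

```python
from collections import defaultdict

class SnoopingTable:
    """Toy Layer 2 multicast table: (vlan, group) -> set of member ports."""

    def __init__(self, mrouter_ports):
        self.mrouter_ports = set(mrouter_ports)
        self.members = defaultdict(set)

    def join(self, vlan, group, port):
        # Called when a Membership Report is snooped on `port`.
        self.members[(vlan, group)].add(port)

    def leave(self, vlan, group, port):
        # Called when membership on `port` ends; drop empty group entries.
        key = (vlan, group)
        self.members[key].discard(port)
        if not self.members[key]:
            del self.members[key]

    def egress_ports(self, vlan, group, ingress_port):
        # Member ports plus mrouter ports, never back out the ingress port.
        ports = self.members.get((vlan, group), set()) | self.mrouter_ports
        return sorted(ports - {ingress_port})

table = SnoopingTable(mrouter_ports=[48])
table.join(10, "239.1.1.1", 3)
table.join(10, "239.1.1.1", 7)
print(table.egress_ports(10, "239.1.1.1", ingress_port=48))  # [3, 7]
```

A group with no entry is forwarded only to the mrouter port in this model, which matches the pruning behavior described above.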
The Querier Mechanism: Keeping Membership Fresh
Membership tables would go stale if hosts left groups without announcing it (e.g., on power-off). IGMP defines a Querier role, typically the Layer 3 gateway, that periodically sends General Query messages to the all-hosts multicast address (224.0.0.1). Hosts with active group memberships must respond with Membership Reports within the Query Response Interval (default: 10 seconds).
If a host fails to respond to a General Query (or multiple queries), the switch removes its port from the membership entry. If no hosts respond for a group at all, the group entry is deleted entirely, stopping forwarding on that stream completely.
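- Query Interval: Default 125 seconds between General Queries.
- Query Response Interval: 10 seconds, the window for hosts to respond.
- Group Membership Timeout: 260 seconds (Robustness Variable × Query Interval + Query Response Interval = 2 × 125 + 10).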
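The relationship between these timers can be worked through in a few lines. The sketch below assumes the RFC 2236 defaults (Robustness Variable 2, Query Interval 125 s, Query Response Interval 10 s); the function names and the `last_report` dictionary are hypothetical and exist only for illustration.

```python
def group_membership_interval(robustness=2, query_interval=125.0,
                              query_response_interval=10.0):
    """Seconds without a report before a membership is considered dead."""
    return robustness * query_interval + query_response_interval

def expire_ports(last_report, now, timeout):
    """Keep only ports whose most recent report is within `timeout` seconds."""
    return {port: t for port, t in last_report.items() if now - t <= timeout}

timeout = group_membership_interval()
print(timeout)  # 260.0

# Port 3 last reported at t=0 s, port 7 at t=200 s; evaluate at t=270 s.
print(expire_ports({3: 0.0, 7: 200.0}, now=270.0, timeout=timeout))  # {7: 200.0}
```

Raising the robustness value makes the table more tolerant of lost reports at the cost of forwarding to a dead port for longer.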
IGMPv2 vs. IGMPv3: Source-Specific Multicast
IGMP has evolved through three versions, each adding precision to group membership management:
- IGMPv1: Basic join only. There is no explicit Leave Group message, so groups expire via timeout only. Slow convergence.
- IGMPv2: Introduces the explicit Leave Group message, triggering a Group-Specific Query before removing the entry. Enables Fast-Leave processing on ports with a single receiver, dramatically reducing channel-change latency for IPTV.
- IGMPv3: Adds source filtering, which enables Source-Specific Multicast (SSM). Instead of joining a group from any source (*, G), hosts can request traffic from a specific sender (S, G). This allows a network to enforce that a receiver only accepts video from the authorized encoder, blocking any rogue source injecting content into the same group address.
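IGMPv3 expresses this choice as a per-group filter mode with a source list. A minimal sketch of how a receiver's filter decides whether a sender is accepted; the `GroupFilter` class and the addresses are illustrative assumptions, not a real stack's API:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class GroupFilter:
    """IGMPv3-style per-group filter: a mode plus a set of source addresses."""
    mode: str                                   # "INCLUDE" or "EXCLUDE"
    sources: frozenset = field(default_factory=frozenset)

    def accepts(self, source):
        if self.mode == "INCLUDE":
            return source in self.sources       # (S, G): only listed senders
        return source not in self.sources       # EXCLUDE: all but listed senders

any_source = GroupFilter("EXCLUDE")                       # behaves like (*, G)
ssm = GroupFilter("INCLUDE", frozenset({"10.0.0.5"}))     # hypothetical encoder

print(any_source.accepts("10.0.0.99"))  # True
print(ssm.accepts("10.0.0.5"))          # True
print(ssm.accepts("10.0.0.99"))         # False
```

An EXCLUDE filter with an empty source list is how IGMPv3 represents a classic any-source join.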
CPU Exhaustion: The Control Plane Cost
IGMP Snooping has a hidden cost: IGMP Join, Leave, and Query packets are exception traffic — packets that cannot be handled by hardware ASICs and must be forwarded to the switch CPU for processing. In large VLANs with hundreds of active multicast groups and frequent leave/join cycling (e.g., IPTV channel zapping), this can generate thousands of IGMP messages per second, causing Control Plane Exhaustion.
Mitigation strategies include:
- IGMP Rate Limiting: Apply a CoPP (Control Plane Policing) policy specifically for IGMP traffic to cap the CPU-bound exception rate.
- IGMP Join Suppression: In environments with many receivers per group, suppress redundant Join messages to prevent CPU spikes when all receivers respond to a General Query simultaneously.
- VLAN Segmentation: Limit multicast group density per VLAN to reduce the per-VLAN IGMP table size and associated CPU overhead.
MLD Snooping: The IPv6 Equivalent
IPv6 replaces IGMP with MLD (Multicast Listener Discovery), defined in RFC 2710 (MLDv1) and RFC 3810 (MLDv2). MLD Snooping on a Layer 2 switch performs the same function, pruning multicast distribution based on listener reports, but operates on ICMPv6 messages (types 130 to 132 for MLDv1, type 143 for MLDv2 reports). In dual-stack networks, both IGMP Snooping (IPv4 multicast) and MLD Snooping (IPv6 multicast) must be enabled and configured independently. Many switch platforms enable IGMP Snooping by default but leave MLD Snooping disabled, causing IPv6 multicast to flood as broadcast while IPv4 multicast is correctly pruned.
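Since MLD rides inside ICMPv6, a snooping switch has to pick the relevant messages out by ICMPv6 type. The lookup below is a small sketch using the type codes assigned to MLD (130, 131, 132, and 143); the dictionary and function names are hypothetical.

```python
# ICMPv6 type codes used by MLD (MLDv1 and MLDv2 share the Query type).
MLD_MESSAGE_TYPES = {
    130: "Multicast Listener Query",
    131: "MLDv1 Multicast Listener Report",
    132: "MLDv1 Multicast Listener Done",
    143: "MLDv2 Multicast Listener Report",
}

def classify_icmpv6(icmpv6_type):
    """Return the MLD message name, or None for non-MLD ICMPv6 traffic."""
    return MLD_MESSAGE_TYPES.get(icmpv6_type)

print(classify_icmpv6(143))  # MLDv2 Multicast Listener Report
print(classify_icmpv6(128))  # None (Echo Request is not MLD)
```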
IGMP Querier Election and the Multicast Router Discovery Protocol
In a multi-access Layer 2 segment with multiple multicast routers (or switches acting as IGMP queriers), a single router must be elected as the Querier, the entity responsible for sending periodic IGMP General Query messages to discover active multicast listeners. The Querier election, defined in RFC 2236 (IGMPv2), uses a simple comparison: the router with the lowest IP address on the subnet becomes the Querier. If the Querier fails, the router with the next-lowest IP takes over after a Querier Timeout (the Other Querier Present Interval, default 255 seconds, roughly twice the Query Interval of 125 seconds).
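The election rule and the failover timer are simple enough to sketch directly. The candidate addresses below are made up, and the function names are hypothetical; the timer formula assumes the RFC 2236 definition (Robustness Variable × Query Interval + half the Query Response Interval).

```python
import ipaddress

def elect_querier(candidates):
    """Lowest IPv4 address wins; compare numerically, not as strings."""
    return min(candidates, key=ipaddress.IPv4Address)

def other_querier_present_interval(robustness=2, query_interval=125.0,
                                   query_response_interval=10.0):
    """Seconds a non-querier waits before taking over the Querier role."""
    return robustness * query_interval + query_response_interval / 2

print(elect_querier(["10.0.0.254", "10.0.0.2", "10.0.0.10"]))  # 10.0.0.2
print(other_querier_present_interval())                        # 255.0
```

Comparing the addresses as plain strings would pick 10.0.0.10 over 10.0.0.2, which is why the sketch converts them to `IPv4Address` objects first.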
The IGMP Query Message Format
The IGMP General Query is an IP packet with protocol number 2, destination IP 224.0.0.1 (all-hosts multicast), and a Type field of 0x11. The packet structure is:
+-- Type (1 byte): 0x11
+-- Max Resp Time (1 byte): 100 (10.0 seconds in units of 0.1 s)
+-- Checksum (2 bytes): 16-bit one's complement checksum over the IGMP message (no pseudo-header)
+-- Group Address (4 bytes): 0.0.0.0 (General Query) or the group address (Group-Specific Query)
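The layout above maps directly onto an 8-byte structure. The following sketch packs an IGMPv2 query with Python's `struct` module and fills in the checksum, assuming the standard Internet checksum computed over the IGMP message itself. The helper names are hypothetical, and the code only builds the IGMP payload, not the enclosing IP header.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_igmpv2_query(group="0.0.0.0", max_resp_tenths=100):
    """Type 0x11, Max Resp Time in 0.1 s units, checksum, group address."""
    grp = socket.inet_aton(group)
    unsummed = struct.pack("!BBH4s", 0x11, max_resp_tenths, 0, grp)
    csum = internet_checksum(unsummed)
    return struct.pack("!BBH4s", 0x11, max_resp_tenths, csum, grp)

query = build_igmpv2_query()
print(query.hex())               # 1164ee9b00000000
print(internet_checksum(query))  # 0 (a valid message sums to zero)
```

Passing a group address such as "239.1.1.1" produces a Group-Specific Query with the same layout.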
The Max Response Time field is the key convergence parameter. It tells listeners the maximum time they have to respond to the query. A listener picks a random delay between 0 and this value before sending its Membership Report. If another listener for the same group reports first, the random timer is cancelled; this is the Report Suppression mechanism that prevents all listeners from responding simultaneously. The Max Response Time defaults to 10 seconds but can be tuned to as low as 1 second for fast-leave environments, at the cost of increased control-plane traffic.
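In a VLAN with 500 multicast groups and 10 listeners per group, there are 5,000 memberships to refresh. Because a snooping switch typically does not relay one host's report to other hosts, report suppression rarely takes effect and every listener answers. Reducing the Max Response Time from 10 s to 1 s therefore raises the average report rate during the response window from approximately 500 reports/second to 5,000 reports/second, a 10x increase in CPU load on the snooping switch and the Querier. Modern switches mitigate this with IGMP rate limiting, which caps the number of IGMP messages punted to the CPU per second regardless of how many are received.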
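The arithmetic behind these figures can be checked with a short calculation. The sketch assumes reports are spread evenly across the response window and that, by default, every listener answers (5,000 reports in total). The `suppression` flag shows the alternative case where only one report per group is sent. The function name is hypothetical.

```python
def mean_report_rate(groups, listeners_per_group, max_resp_time_s,
                     suppression=False):
    """Average reports per second over one query response window."""
    reports = groups if suppression else groups * listeners_per_group
    return reports / max_resp_time_s

print(mean_report_rate(500, 10, 10))  # 500.0
print(mean_report_rate(500, 10, 1))   # 5000.0
print(mean_report_rate(500, 10, 1, suppression=True))  # 500.0
```

In both cases the rate scales inversely with the Max Response Time, so the 10x factor holds whether or not suppression is effective.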
The Multicast Router Discovery (MRD) Protocol
In complex topologies, a switch may need to discover which ports connect to multicast routers (upstream toward the source) and which connect to listeners (downstream). MRD (Multicast Router Discovery, RFC 4286) is an advisory protocol that operates independently of IGMP. MRD routers send periodic Multicast Router Advertisement (MRA) messages to 224.0.0.106 (the All-Snoopers multicast address). Switches listening for these messages mark the receiving port as an mrouter port: all multicast traffic is forwarded to that port regardless of IGMP membership reports.
If MRD is not deployed, the switch must rely on alternative mechanisms to identify mrouter ports: (1) observing IGMP Queries arriving on a port (if the switch sees an IGMP Query on Port A, Port A is an mrouter port), (2) observing PIM (Protocol Independent Multicast) Hello messages if PIM-SM is configured, or (3) statically configuring the mrouter port. The static approach is most common in data center fabrics where all uplinks toward the Spine switches are known mrouter ports for each Leaf VNI. Failure to correctly identify mrouter ports causes multicast black-holes: traffic from a source reaches the switch but is not forwarded upstream, because the unidentified router port has no membership entry and is pruned like any other port.
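The detection options above amount to treating certain control messages as evidence that a router sits behind a port. A minimal sketch of that bookkeeping; the message-kind strings and the function `learn_mrouter_ports` are hypothetical labels chosen for this example, and real platforms also age out dynamically learned ports.

```python
# Control messages treated as evidence of a multicast router on a port.
MROUTER_EVIDENCE = {"igmp_query", "pim_hello", "mrd_advertisement"}

def learn_mrouter_ports(observations, static_ports=()):
    """Combine statically configured ports with dynamically detected ones."""
    ports = set(static_ports)
    for port, kind in observations:
        if kind in MROUTER_EVIDENCE:
            ports.add(port)
    return ports

seen = [(1, "igmp_report"), (47, "pim_hello"), (48, "igmp_query")]
print(sorted(learn_mrouter_ports(seen, static_ports=[49])))  # [47, 48, 49]
```

Port 1 only sent a Membership Report, so it stays an ordinary listener port.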
IGMP Snooping Control-Plane Protection and TCAM Scaling at the Access Edge
The IGMP snooping function is implemented in the switch's control-plane CPU, not the data-plane ASIC. Each IGMP Membership Report, Leave, and Query must be punted from the ASIC to the CPU for processing. In a high-density access switch with 48 ports and 1,000+ simultaneous multicast groups, the punt rate can overwhelm the CPU. This is managed through three ASIC-level protection mechanisms.
Rate-Limiting and CoPP for IGMP
The first line of defense is IGMP Rate-Limiting on the ingress port. The ASIC's policer limits the rate of packets with a destination MAC in the IPv4 multicast range (01:00:5e:00:00:00 to 01:00:5e:7f:ff:ff) to a configurable PPS value. The typical configuration for an access switch is:
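! Per-port limits (applied on each access interface; exact syntax varies by platform)
ip igmp limit 100
ip igmp rate-limit 1000
!
! ACL referenced by the CoPP class below
ip access-list extended IGMP
 permit igmp any any
!
class-map IGMP-CLASS
 match access-group name IGMP
!
policy-map COPP-IGMP
 class IGMP-CLASS
  police 10000 pps conform-action transmit exceed-action drop
!
! Global CoPP protection
control-plane
 service-policy input COPP-IGMP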
This limits each port to 100 active IGMP groups and 1,000 IGMP packets per second, while the CoPP policy caps the aggregate IGMP rate reaching the CPU at 10,000 packets per second. If a limit is exceeded, excess packets are dropped in hardware, protecting the CPU from a multicast DoS attack. The trade-off is that legitimate multicast groups beyond 100 per port are silently ignored: the switch never learns about them, and traffic for those groups floods to all ports.
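A packets-per-second policer of this kind is commonly modeled as a token bucket. The sketch below is a software illustration of that model, not a description of any particular ASIC; the class name, the `burst` parameter, and the explicit `now` timestamps are assumptions made to keep the example deterministic.

```python
class TokenBucketPolicer:
    """Allow up to `rate_pps` packets per second with a bounded burst."""

    def __init__(self, rate_pps, burst):
        self.rate = rate_pps
        self.burst = burst
        self.tokens = float(burst)   # bucket starts full
        self.last = 0.0

    def allow(self, now):
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0       # conform-action: transmit
            return True
        return False                 # exceed-action: drop

policer = TokenBucketPolicer(rate_pps=1000, burst=10)
print(sum(policer.allow(0.0) for _ in range(20)))  # 10 (rest dropped)
```

Packets arriving after the bucket has drained are dropped until enough time passes for tokens to accumulate again.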
IGMP Group-Specific TCAM Regions
Modern switch ASICs (e.g., Broadcom Tomahawk 5, Jericho 3) implement IGMP snooping tables in dedicated TCAM regions that are separate from the MAC/ACL TCAM. A typical allocation is 4K–8K IGMP group entries per chip. When the IGMP group limit is reached, the ASIC enters Group-Exhaustion Mode: it drops all new IGMP Join requests and floods all multicast traffic for unregistered groups to all ports in the VLAN. How quickly table space is reclaimed depends on the ip igmp snooping last-member-query-count and ip igmp snooping robustness-variable parameters, which control how aggressively stale groups are pruned.
Fast-Leave Processing and the Leave Latency Equation
When a host sends an IGMP Leave message (IGMPv2) or a TO_IN state-change record with an empty source list (IGMPv3), the switch sends a Group-Specific Query (GSQ) to verify that no other listener exists on the port. The number of GSQs is controlled by the last-member-query-count (LMQC, default 2) and the last-member-query-interval (LMQI, default 1 second). The total leave latency is:
$T_{\text{leave}} = \text{LMQC} \times \text{LMQI}$
With defaults, $T_{\text{leave}} = 2 \times 1\,\text{s} = 2\,\text{s}$. In an IPTV environment where channel changes must happen in under 200 ms, this is unacceptable. Fast-Leave (or Immediate Leave) bypasses the GSQ process entirely: the switch immediately removes the port from the multicast group after receiving a Leave. Fast-Leave is only safe on ports with a single listener (access ports). Enabling it on a shared segment with multiple listeners causes premature group removal and traffic interruption for other listeners on the same port.
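The leave-latency relationship can be expressed as a short function, taking the latency as the last-member-query-count multiplied by the last-member-query-interval and treating Fast-Leave as zero added delay. The function and parameter names are hypothetical, and the model ignores propagation and processing time.

```python
def leave_latency(last_member_query_count=2, last_member_query_interval_s=1.0,
                  fast_leave=False):
    """Seconds between a Leave and the port being pruned from the group."""
    if fast_leave:
        return 0.0   # port removed immediately, no Group-Specific Queries
    return last_member_query_count * last_member_query_interval_s

print(leave_latency())                                  # 2.0
print(leave_latency(last_member_query_interval_s=0.1))  # 0.2
print(leave_latency(fast_leave=True))                   # 0.0
```

Shrinking the query interval to 100 ms only just reaches the 200 ms channel-change budget under this model, which is why Fast-Leave is the usual choice on single-listener access ports.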