In a Nutshell

Kubernetes networking is notoriously complex because it operates on a 'flat' IP-per-pod model — every Pod must communicate with every other Pod without NAT. The Container Networking Interface (CNI) is the standard that allows different networking providers (Calico, Flannel, Cilium) to plug into Kubernetes to handle pod-to-pod and pod-to-service communication. This article explores overlay networks, eBPF-based acceleration, MTU trade-offs, and the evolution from iptables-based to eBPF-based data planes.

The Pod-to-Pod Mandate

In Kubernetes, every Pod gets its own unique IP address. The fundamental networking requirement is that any Pod must be able to communicate with any other Pod on any Node without using Network Address Translation (NAT). This 'flat network' model is deceptively simple to state but complex to implement across physical servers spanning multiple subnets and data centers.

The challenge is that the underlying physical infrastructure was never designed for this requirement. A typical data center uses a traditional routed or VLAN-based network in which each host or VM has a single IP address, and routing between subnets requires explicit configuration. Kubernetes needs to transparently overlay a virtual pod network on top of this existing physical fabric.

How CNI Works

When a Pod is created, the Kubernetes node agent (the Kubelet) calls a CNI plugin. The plugin is a binary executable that speaks the CNI specification — a simple protocol in which a JSON network configuration is passed over stdin and operations are selected via environment variables. The plugin is responsible for:

  1. Assigning an IP address to the Pod from a pre-allocated CIDR block (e.g., 10.244.0.0/16).
  2. Creating a virtual ethernet pair (veth): one end lives inside the Pod's network namespace, the other on the host node.
  3. Updating the routing table on the host node so traffic destined for this Pod's IP routes to its veth interface.
  4. Establishing the tunnel (if using an overlay) to other nodes so cross-node Pod traffic can be forwarded.
  5. Programming any NetworkPolicy rules (iptables or eBPF maps) that restrict traffic based on pod labels.
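The IPAM portion of step 1 can be sketched with Python's ipaddress module. This is a toy model: real IPAM plugins (such as host-local or calico-ipam) persist allocations so they survive restarts, and the CIDRs here are only illustrative.

```python
import ipaddress

# Toy sketch of CNI step 1: carve the cluster-wide pod CIDR
# (e.g. 10.244.0.0/16) into one /24 block per node, then hand
# out pod IPs from the local node's block.
cluster_cidr = ipaddress.ip_network("10.244.0.0/16")
node_cidrs = list(cluster_cidr.subnets(new_prefix=24))  # 256 node blocks

def allocate_pod_ips(node_index: int, count: int) -> list[str]:
    """Return the first `count` usable pod IPs from a node's /24 block."""
    hosts = node_cidrs[node_index].hosts()
    return [str(next(hosts)) for _ in range(count)]

print(node_cidrs[1])           # 10.244.1.0/24
print(allocate_pod_ips(1, 2))  # ['10.244.1.1', '10.244.1.2']
```

This mirrors how Kubernetes itself assigns each node a `podCIDR` from the cluster CIDR, so that a destination node can be identified from a pod IP alone by simple prefix matching.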

[Diagram: Pod Networking Visualizer — CNI data plane and encapsulation logic. Worker Node A (10.1.0.10) hosts Pod A (10.244.1.2) and Pod B (10.244.1.3), each attached through its eth0 and a veth pair to the node's cbr0 bridge; Worker Node B (10.1.0.11) hosts Pod C (10.244.2.2) behind its own cbr0. The pod IP space (10.244.0.0/16) is overlaid on the node IP space (the underlay), which the nodes' physical eth0 interfaces connect.]
Local Communication: When Pod A talks to Pod B on the same node, the traffic never leaves the Linux internal bridge (cbr0). It's purely virtual switching via veth pairs.
Overlay Networking (VXLAN): To cross nodes, the pod packet is "encapsulated" inside a regular UDP packet from Node A to Node B. This is why you see the Pod IP inside the Node IP.
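The VXLAN header that wraps each pod packet is only 8 bytes. A minimal sketch of packing it (following the RFC 7348 layout: a flags byte with the I-bit set, a 24-bit VXLAN Network Identifier, and reserved bytes) and of the total per-packet wire overhead:

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VXLAN header: flags, 3 reserved bytes,
    24-bit VNI, 1 reserved byte (RFC 7348)."""
    flags = 0x08                           # I-bit: VNI field is valid
    return struct.pack("!B3xI", flags, vni << 8)

hdr = vxlan_header(42)
print(len(hdr))                            # 8

# Total overlay overhead added in front of the inner Ethernet frame:
overhead = 14 + 20 + 8 + 8                 # outer Ethernet + IP + UDP + VXLAN
print(overhead)                            # 50
```

The encapsulated packet is then sent as ordinary UDP (destination port 4789 by default) between node IPs, which is why the underlay network needs no knowledge of pod IPs at all.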

Service Networking & Kube-Proxy

Pods are ephemeral — they die and restart with new IPs constantly. We use Services to provide a stable virtual IP (ClusterIP) that persists regardless of which Pod instances are running behind it. The mapping of a Service IP to a set of Pod IPs happens via kube-proxy, which runs on every node and installs iptables or IPVS rules that perform load-balanced DNAT (Destination NAT) when traffic hits the Service ClusterIP.
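A toy model of that DNAT step: in iptables mode, kube-proxy selects a backend per connection using equal-probability `statistic` rules, which `random.choice` approximates here. The Service and Pod addresses are purely illustrative.

```python
import random

# Endpoint table a kube-proxy-like agent would maintain:
# Service ClusterIP:port -> ready Pod endpoints.
endpoints = {
    "10.96.0.10:80": ["10.244.1.2:8080", "10.244.2.2:8080", "10.244.2.3:8080"],
}

def dnat(cluster_ip_port: str) -> str:
    """Rewrite a Service destination to one concrete Pod endpoint,
    chosen uniformly at random (per new connection)."""
    backends = endpoints[cluster_ip_port]
    return random.choice(backends)

print(dnat("10.96.0.10:80"))  # one of the three backends
```

Because the rewrite happens in the kernel on the client's own node, the Service IP never appears on the wire between nodes — only pod-to-pod traffic does.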

The MTU Tax: A Hidden Performance Trap

When using overlay encapsulation (VXLAN), every packet gains an additional header: 8 bytes VXLAN, 8 bytes UDP, 20 bytes IP, 14 bytes Ethernet = 50 bytes of overhead per packet. On a standard 1500-byte MTU network, this leaves only 1450 bytes for actual pod payload. If the Pod MTU is not explicitly reduced to 1450, Pods will emit 1500-byte packets that exceed the physical MTU after encapsulation — causing fragmentation, or silent drops when the Don't Fragment bit is set and ICMP "fragmentation needed" messages are filtered.
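The arithmetic, spelled out. Subtracting the inner IP and TCP headers as well gives the TCP MSS a pod should advertise — the value MSS-clamping rules target.

```python
# MTU arithmetic for a VXLAN overlay, following the numbers in the text.
PHYS_MTU = 1500
VXLAN_OVERHEAD = 14 + 20 + 8 + 8       # outer Ethernet + IP + UDP + VXLAN = 50

pod_mtu = PHYS_MTU - VXLAN_OVERHEAD    # what the Pod's eth0 must be set to
tcp_mss = pod_mtu - 20 - 20            # minus inner IPv4 and TCP headers

print(pod_mtu)  # 1450
print(tcp_mss)  # 1410
```

Most CNI plugins compute this automatically, but it breaks when the underlay MTU is itself non-standard (e.g. jumbo frames, or a cloud network already paying an encapsulation tax).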

Conclusion

Kubernetes networking is the ultimate abstraction layer. It hides the complexity of physical routing from the application developer, but it requires the platform engineer to deeply understand the tunnels, veth interfaces, iptables chains, and eBPF maps that make that abstraction possible. The evolution from Flannel to Calico to Cilium mirrors the broader industry shift from rule-based kernel networking to programmable, kernel-native data planes — a shift that will define cloud-native infrastructure for the next decade.
