CNI: Networking in Kubernetes
The Overlay Architecture of Containers
The Pod-to-Pod Mandate
In Kubernetes, every Pod gets its own unique IP address. The fundamental networking requirement is that any Pod must be able to communicate with any other Pod on any Node without using Network Address Translation (NAT). This 'flat network' model is deceptively simple to state but complex to implement across physical servers spanning multiple subnets and data centers.
The challenge is that the underlying physical infrastructure was never designed for this requirement. A typical data center uses a traditional routed or VLAN-based network in which each host or VM carries a single IP, and routing between subnets requires explicit configuration. Kubernetes needs to transparently overlay a virtual pod network on top of this existing physical fabric.
How CNI Works
When a Pod is created, the Kubernetes agent (Kubelet) calls a CNI Plugin. The plugin is a binary executable that speaks the CNI specification — a simple JSON API. The plugin is responsible for:
- Assigning an IP address to the Pod from a pre-allocated CIDR block (e.g., 10.244.0.0/16).
- Creating a virtual ethernet pair (veth): one end lives inside the Pod's network namespace, the other on the host node.
- Updating the routing table on the host node so traffic destined for this Pod's IP routes to its veth interface.
- Establishing the tunnel (if using an overlay) to other nodes so cross-node Pod traffic can be forwarded.
- Programming any NetworkPolicy rules (iptables or eBPF maps) that restrict traffic based on pod labels.
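The first of these responsibilities, IP allocation, can be sketched in a few lines. This is a simplified illustration in the spirit of host-local IPAM, not a real CNI plugin: the function name and in-memory bookkeeping are hypothetical, and actual plugins persist allocations on disk and reserve addresses such as the gateway.

```python
import ipaddress

def allocate_pod_ip(pod_cidr: str, allocated: set) -> str:
    """Return the next free IP from the node's pod CIDR block.

    Simplified sketch: walks the usable host addresses in order
    and hands out the first one not already in use.
    """
    network = ipaddress.ip_network(pod_cidr)
    for ip in network.hosts():
        candidate = str(ip)
        if candidate not in allocated:
            allocated.add(candidate)
            return candidate
    raise RuntimeError(f"pod CIDR {pod_cidr} exhausted")

# Example: two pods scheduled to a node that owns 10.244.1.0/24
in_use = set()
first = allocate_pod_ip("10.244.1.0/24", in_use)
second = allocate_pod_ip("10.244.1.0/24", in_use)
print(first, second)  # 10.244.1.1 10.244.1.2
```

In a real deployment each node is assigned a slice of the cluster-wide pod CIDR (e.g., a /24 out of 10.244.0.0/16), so allocation never needs cross-node coordination.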
Service Networking & Kube-Proxy
Pods are ephemeral — they die and restart with new IPs constantly. We use Services to provide a stable virtual IP (ClusterIP) that persists regardless of which Pod instances are running behind it. The magic of mapping a Service IP to a set of Pod IPs happens via Kube-Proxy, which runs on every node and installs iptables or IPVS rules that perform load-balanced DNAT (Destination NAT) when traffic hits the Service ClusterIP.
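The DNAT step can be illustrated with a toy dispatcher. This is an explanatory sketch, not kube-proxy's actual code: the service table and addresses are hypothetical, and in a real cluster the rewrite happens inside the kernel via iptables statistic-mode rules or IPVS schedulers, not in userspace.

```python
import random

# Hypothetical service table: (ClusterIP, port) -> live pod endpoints.
SERVICES = {
    ("10.96.0.10", 80): [("10.244.1.5", 8080), ("10.244.2.7", 8080)],
}

def dnat(dst_ip: str, dst_port: int) -> tuple:
    """Rewrite a Service destination to one backend pod (DNAT).

    Mimics equal-probability backend selection: traffic not aimed
    at a Service VIP passes through unchanged.
    """
    backends = SERVICES.get((dst_ip, dst_port))
    if not backends:
        return (dst_ip, dst_port)  # not a Service VIP; pass through
    return random.choice(backends)

pod_ip, pod_port = dnat("10.96.0.10", 80)
print(pod_ip, pod_port)
```

The key property to notice is that the client only ever sees the stable ClusterIP; the rewrite to a concrete, ephemeral Pod IP happens per-connection on the node where the traffic originates.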
The MTU Tax: A Hidden Performance Trap
When using overlay encapsulation (VXLAN), every packet gains additional headers: 8 bytes VXLAN, 8 bytes UDP, 20 bytes IP, and 14 bytes Ethernet, for 50 bytes of overhead per packet. On a standard 1500-byte MTU network, this leaves only 1450 bytes for actual pod payload. If the Pod MTU is not explicitly reduced to 1450, pods will emit 1500-byte packets that exceed the physical MTU after encapsulation, causing fragmentation — or, when the Don't Fragment bit is set, silent packet drops.
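The arithmetic is worth making explicit, since a wrong Pod MTU is one of the most common overlay misconfigurations:

```python
# VXLAN encapsulation overhead per packet, in bytes.
OUTER_ETHERNET = 14   # outer Ethernet header
OUTER_IP = 20         # outer IPv4 header
OUTER_UDP = 8         # outer UDP header (VXLAN runs over UDP)
VXLAN_HEADER = 8      # VXLAN header itself

overhead = OUTER_ETHERNET + OUTER_IP + OUTER_UDP + VXLAN_HEADER
physical_mtu = 1500
safe_pod_mtu = physical_mtu - overhead

print(overhead, safe_pod_mtu)  # 50 1450
```

Most CNI plugins expose this as a setting; the general rule is simply pod MTU = physical MTU minus the encapsulation overhead of whatever tunnel protocol is in use.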
Conclusion
Kubernetes networking is the ultimate abstraction layer. It hides the complexity of physical routing from the application developer, but it requires the platform engineer to deeply understand the tunnels, veth interfaces, iptables chains, and eBPF maps that make that abstraction possible. The evolution from Flannel to Calico to Cilium mirrors the broader industry shift from rule-based kernel networking to programmable, kernel-native data planes — a shift that will define cloud-native infrastructure for the next decade.