Networking in a Containerized Data Center - the Gotchas
This was presented at the Microservices for the Enterprise Meetup in March 2016. I present ten "gotchas" in applying virtual networking concepts to containers, all of which were traps people were falling into at the time.
Gotcha #1: the port-mapping approach to networking
§ All containers on a machine share the same IP address
§ [Diagram: WWW1 and WWW2 each listen on port 80 inside their containers; a proxy reaches them via host ports 8080 and 8081 on the shared IP]
§ Most container deployments still use this method!
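To make the shared-IP constraint concrete, here is a minimal Python sketch (the container names, host IP and port numbers are illustrative, not from the talk) of the bookkeeping port mapping forces on you: two web containers that both want port 80 have to be handed distinct ports on the host's single IP.

    # Illustrative sketch of port mapping on a single shared host IP.
    # Container names, the host IP and port numbers are hypothetical.
    HOST_IP = "192.0.2.10"   # the one IP every container on this host shares

    allocated = set()

    def map_port(container: str, container_port: int, start: int = 8080) -> int:
        """Pick a free host port for a container port; two web containers that
        both listen on 80 end up reachable on different host ports."""
        host_port = start
        while host_port in allocated:
            host_port += 1
        allocated.add(host_port)
        print(f"{container}: {HOST_IP}:{host_port} -> container port {container_port}")
        return host_port

    map_port("www1", 80)  # e.g. 192.0.2.10:8080 -> 80
    map_port("www2", 80)  # e.g. 192.0.2.10:8081 -> 80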
Route IP from the container, over a virtual underlay
§ [Diagram: pHost 1 and pHost 2, each running a VM (VM1, VM2) hosting three containers (A-C, D-F); each container attaches through a veth (veth0-veth2), the Linux kernel routes between them with no encapsulation, and the hosts' vNICs/pNICs connect across a virtual underlay on top of the physical underlay]
From the host, no virtual underlay, straight-up IP
§ [Diagram: the same two-host topology, but with no virtual underlay; the Linux kernel routes each container's traffic (no encapsulation) from the VM's vNIC straight onto the physical underlay]
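As a rough illustration of what this routed model asks of each host, here is a toy Python sketch (all addresses, interface names and host names are made up): the host just keeps an ordinary routing table, delivering locally owned container IPs to their veths and forwarding everything else, unencapsulated, towards the host that owns it.

    # Toy model of per-container IP routing on one host; addresses and
    # interface names are hypothetical.
    local_routes = {              # container IP -> local veth carrying it
        "10.1.1.1": "veth0",      # Container A
        "10.1.1.2": "veth1",      # Container B
        "10.1.1.3": "veth2",      # Container C
    }
    remote_routes = {             # remote container prefixes -> next-hop host
        "10.1.2.": "pHost 2",     # Containers D-F live behind the other host
    }

    def route(dst_ip: str) -> str:
        """Deliver locally if we own the address, otherwise forward the
        plain IP packet to the next-hop host -- no encapsulation involved."""
        if dst_ip in local_routes:
            return f"deliver via {local_routes[dst_ip]}"
        for prefix, nexthop in remote_routes.items():
            if dst_ip.startswith(prefix):
                return f"forward to {nexthop} (no encapsulation)"
        return "no route"

    print(route("10.1.1.2"))  # deliver via veth1
    print(route("10.1.2.3"))  # forward to pHost 2 (no encapsulation)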
Gotcha #5: IP per container not yet universally supported
§ Some services still assume port mapping
§ E.g. Marathon load balancer service (but being fixed…)
§ Some PaaSes not yet supporting IP per container
§ But several are moving to build on Kubernetes, and will likely pick it up
Gotcha #6: running on public cloud
§ Easy to get your configuration wrong and get sub-optimal performance, e.g.:
§ select the wrong Flannel back-end for your fabric
§ forget to turn off AWS src-dest IP checks
§ get the MTU size wrong for the underlay…
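On the MTU point specifically, a small sketch helps: if an overlay adds headers, the container-facing MTU has to shrink by the encapsulation overhead, otherwise packets get fragmented or dropped. The overhead values below are commonly used defaults (about 20 bytes for IP-in-IP, about 50 bytes for VXLAN over IPv4), not figures from the talk.

    # Illustrative only: typical per-packet overhead added by common
    # encapsulations (bytes). Widely used defaults, not from the talk.
    ENCAP_OVERHEAD = {
        "none": 0,      # straight-up IP routing, no encapsulation
        "ipip": 20,     # extra IPv4 header
        "vxlan": 50,    # outer Ethernet + IPv4 + UDP + VXLAN headers
    }

    def container_mtu(underlay_mtu: int, encap: str) -> int:
        """MTU to configure on container interfaces so that encapsulated
        packets still fit within the underlay MTU."""
        return underlay_mtu - ENCAP_OVERHEAD[encap]

    # A 1500-byte underlay with VXLAN leaves only 1450 bytes for the container;
    # configuring 1500 inside the container causes fragmentation or drops.
    print(container_mtu(1500, "vxlan"))  # 1450
    print(container_mtu(9001, "ipip"))   # 8981 (e.g. on a jumbo-frame AWS VPC)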
Gotcha #7: IP addresses aren't infinite
§ Allocate a /24 per Kubernetes node (=> 254 pods)
§ Run 10 VMs per server, each with a Kubernetes node
§ 40 servers per rack
§ 20 racks per data center
§ 4 data centers
§ => now need a /15 for the rack, a /10 for the data center, and the entire 10/8 RFC 1918 range to cover 4 data centers
§ … and hope your business doesn't expand to need a 5th data center!
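The arithmetic above is easy to reproduce; the small helper below (my own illustration, not part of the talk) computes the smallest covering prefix at each level and lands on the same /15, /10 and /8.

    import math

    def covering_prefix(count: int, child_prefix: int) -> int:
        """Smallest prefix length whose block holds `count` child blocks of
        length `child_prefix` (e.g. 400 /24s fit inside a /15)."""
        return child_prefix - math.ceil(math.log2(count))

    # One /24 per Kubernetes node => 254 usable pod addresses per node.
    nodes_per_rack = 10 * 40                            # 10 VMs/server * 40 servers/rack
    rack_prefix = covering_prefix(nodes_per_rack, 24)   # 400 /24s  -> /15
    dc_prefix = covering_prefix(20, rack_prefix)        # 20 racks  -> /10
    total_prefix = covering_prefix(4, dc_prefix)        # 4 DCs     -> /8

    print(rack_prefix, dc_prefix, total_prefix)  # 15 10 8 -- all of 10.0.0.0/8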
Gotcha #8: orchestration platform support still evolving
§ Kubernetes – fairly stable
§ Fine-grained policy being added – will move from alpha (annotation-based) to a first-class citizen API
§ Mesos – multiple ways to network your container
§ Net-modules – but only supports the Mesos containerizer
§ Docker networking – but then not fully integrated, e.g. into MesosDNS
§ CNI – possible future, but not here today
§ Roll-your-own orchestrator-network co-ordination – the approach some of our users have taken
§ Docker
§ Swarm / Docker Datacenter still early; libnetwork evolution? policy?
Gotcha #9: Docker libnetwork is "special"
§ Limited functionality / visibility exposed to plug-ins
§ E.g. the network name you specify as a user is NOT passed to the underlying SDN
§ Consequences:
§ Diagnostics hard to correlate
§ Hard to enable "side-loaded" commands referring to networks created on the Docker command line (e.g. Calico advanced policy)
§ Hard to network between the Docker virtual network domain and non-containerized workloads
Gotcha #10: at cloud scale, nothing ever converges
§ "Can you give me a function that tells me when all nodes have caught up to the global state?"
§ Sure…

    def is_converged():
        return False