Networking in a Containerized Data Center - the Gotchas

This was presented at the Microservices for the Enterprise Meetup in March 2016. I walk through ten "gotchas" that arise when applying virtual networking concepts to containers, all of them traps people were falling into at the time.

Andy Randall

March 31, 2016
Transcript

  1. Networking in a Containerized Data Center: the Gotchas! Microservices for Enterprises Meetup, Palo Alto, March 31, 2016. Andy Randall | @andrew_randall. Sponsored by Project Calico (@projectcalico).
  2. gotcha (n.) North American: "an instance of publicly tricking someone or exposing them to ridicule, especially by means of an elaborate deception."
  3. The original "container approach" to networking: all containers on a machine share the same IP address. Gotcha #1: most container deployments still use this method! [Diagram: WWW1 and WWW2 each listen on port 80 inside their containers; a proxy remaps host ports 8080 and 8081 onto them.]
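
To make gotcha #1 concrete: when every container shares the host's IP they all compete for one port space, so each service must be remapped to a unique host port. A minimal Python sketch of the collision (names and ports taken from the diagram):

    import socket

    # www1 and www2 both serve on port 80 *inside* their containers, but on a
    # shared host IP only one socket can own any given port.
    www1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    www1.bind(("0.0.0.0", 8080))      # proxy forwards host:8080 -> www1:80

    www2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        www2.bind(("0.0.0.0", 8080))  # a second bind on the same port fails...
    except OSError as err:
        print("collision:", err)      # EADDRINUSE: Address already in use
        www2.bind(("0.0.0.0", 8081))  # ...so www2 is remapped to host:8081

With IP per container, each workload has its own address and both could simply listen on port 80.
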
  4. The world is moving to "IP per container": the Container Network Interface (CNI); the Container Network Model (libnetwork, Docker 1.9); net-modules (Mesos 0.26; future: CNI?).
  5. We've solved "IP per VM" before… [Diagram: VM 1, VM 2, and VM 3 attached to a virtual switch on a single host.]
  6. We've solved "IP per VM" before… [Diagram: two hosts, each with VM 1–3 attached to its own virtual switch.]
  7. Consequences for containers (gotcha #2): scale. VM networking had to handle hundreds of servers with low churn; container networking must handle millions of containers with high churn.
  8. Consequences for containers (gotcha #3): layering. Packets are double-encap'd! [Diagram: on each physical host, a virtual switch encapsulates VM traffic onto the pNIC, and inside each VM a second virtual switch encapsulates container traffic from veth0/veth1/veth2, so container-to-container packets cross the physical switch wrapped twice.]
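
A back-of-the-envelope for what gotcha #3 costs, assuming a typical ~50-byte VXLAN overhead per layer (outer Ethernet + IP + UDP + VXLAN headers; the exact figure depends on the encapsulation in use):

    PHYSICAL_MTU = 1500   # bytes per frame on the wire
    VXLAN_OVERHEAD = 50   # approximate outer Ethernet + IP + UDP + VXLAN headers

    vm_mtu = PHYSICAL_MTU - VXLAN_OVERHEAD    # first overlay (VM layer): 1450
    container_mtu = vm_mtu - VXLAN_OVERHEAD   # second overlay (container layer): 1400

    print("container payload per frame:", container_mtu)
    print("header tax: {:.1%}".format(2 * VXLAN_OVERHEAD / PHYSICAL_MTU))  # ~6.7%

And that is only the bytes; every encapsulation and decapsulation step also costs CPU.
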
  9. Consequences for containers (gotcha #4): walled gardens. [Diagram: containers A–C sit behind two layers of virtual switch/encapsulation on pHost 1, while a legacy app hangs directly off the physical switch, outside the overlay.]
  10. "Any intelligent fool can make things bigger, more complex… It takes a touch of genius – and a lot of courage – to move in the opposite direction."
  11. A saner approach: just route IP from the container. [Diagram: on each physical host, the Linux kernel routes container traffic from the veth interfaces with no encapsulation, over a virtual underlay in the VMs and the physical underlay between hosts.]
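
A toy model of the forwarding state this buys you (addresses and interface names are invented for illustration): each local container gets a /32 route via its veth, everything else follows the default route to the fabric, and the kernel's longest-prefix match does the rest, with nothing to wrap or unwrap:

    import ipaddress

    # Hypothetical routing state on pHost 1.
    routes = {
        ipaddress.ip_network("10.65.0.2/32"): "veth0",  # Container A
        ipaddress.ip_network("10.65.0.3/32"): "veth1",  # Container B
        ipaddress.ip_network("10.65.0.4/32"): "veth2",  # Container C
        ipaddress.ip_network("0.0.0.0/0"): "pNIC",      # default: out to the underlay
    }

    def next_hop(dst: str) -> str:
        """Longest-prefix match, as the kernel does; no encapsulation involved."""
        addr = ipaddress.ip_address(dst)
        best = max((net for net in routes if addr in net),
                   key=lambda net: net.prefixlen)
        return routes[best]

    print(next_hop("10.65.0.3"))  # veth1: delivered locally
    print(next_hop("10.65.1.7"))  # pNIC: plain IP across the fabric to another host
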
  12. Variant: one VM per host, no virtual underlay, straight-up IP. [Diagram: as before, but with a single VM per physical host routing directly onto the physical underlay.]
  13. Results: bare-metal performance from virtual networks. [Charts: throughput in Gbps (0–10 scale) and CPU % per Gbps (0–120 scale) for bare metal, Calico, and OVS+VXLAN.] Source: https://www.projectcalico.org/calico-dataplane-performance/
  14. Gotcha #5: IP per container is not yet universally supported.
     § Some container frameworks still assume port mapping, e.g. the Marathon load-balancer service (but this is being fixed…).
     § Some PaaSes do not yet support IP per container, but several are moving to build on Kubernetes and will likely pick it up.
  15. Gotcha #6: running on public cloud. You can easily get your configuration wrong and see sub-optimal performance, e.g.:
     § select the wrong Flannel back-end for your fabric
     § forget to turn off AWS src/dest IP checks (see the sketch below)
     § get the MTU size wrong for the underlay…
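
On the AWS item above: with routed (un-encapsulated) container IPs, packets leave an instance with source or destination addresses that are not the instance's own, and EC2's source/dest check silently drops them unless it is disabled. A sketch using boto3 (region and instance ID are placeholders):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-west-2")  # placeholder region

    # Allow this instance to forward traffic from/to IPs that are not its own,
    # which routed container addressing requires.
    ec2.modify_instance_attribute(
        InstanceId="i-0123456789abcdef0",  # placeholder instance ID
        SourceDestCheck={"Value": False},
    )
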
  16. Consequences of MTU size… [Chart: qperf bandwidth (0–300 scale) on t2.micro and m4.xlarge, bare metal vs. Calico.]
  17. Consequences of MTU size… [Chart: the same comparison with Calico split into MTU=1440 and MTU=8980 configurations.]
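
The fix behind those two charts is mechanical: derive the workload MTU from the underlay MTU minus the encapsulation overhead, rather than leaving a 1500-byte default on a jumbo-frame fabric. A sketch of the arithmetic, assuming an IP-in-IP overlay (one 20-byte outer IPv4 header) and AWS's 9001-byte jumbo-frame MTU:

    IPIP_OVERHEAD = 20  # one extra outer IPv4 header

    def workload_mtu(underlay_mtu: int, overhead: int = IPIP_OVERHEAD) -> int:
        """Largest workload MTU whose encapsulated packets still fit one underlay frame."""
        return underlay_mtu - overhead

    print(workload_mtu(1500))  # 1480 on a classic Ethernet underlay
    print(workload_mtu(9001))  # 8981 with AWS jumbo frames

The deck's 1440/8980 settings sit a little below these ceilings for extra headroom; the point is that the value must be derived from the underlay, not assumed.
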
  18. Gotcha #7: IP addresses aren't infinite.
     § Suppose we assign a /24 per Kubernetes node (=> 254 pods).
     § Run 10 VMs per server, each a Kubernetes node; 40 servers per rack; 20 racks per data center; 4 data centers.
     § => you now need a /15 per rack, a /10 per data center, and the entire 10/8 RFC 1918 range to cover the 4 data centers…
     § …and hope your business doesn't expand to need a 5th data center! (The arithmetic is worked below.)
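
The arithmetic from that slide, worked through: hierarchical allocation rounds every level up to the next power of two, which is how 254-pod nodes end up consuming all of 10/8. A sketch:

    import math

    def parent_prefix(child_prefix: int, n_children: int) -> int:
        """Prefix length of a block that holds n_children child blocks
        (allocation rounds up to the next power of two)."""
        return child_prefix - math.ceil(math.log2(n_children))

    node = 24                            # /24 per Kubernetes node (254 pods)
    rack = parent_prefix(node, 10 * 40)  # 400 nodes per rack -> /15
    dc = parent_prefix(rack, 20)         # 20 racks per data center -> /10
    region = parent_prefix(dc, 4)        # 4 data centers -> /8: the whole 10/8

    print(rack, dc, region)  # 15 10 8
    # A 5th data center would need more than a /8, and RFC 1918 has nothing bigger.
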
  19. Gotcha #8: orchestration-platform support is still evolving.
     § Kubernetes: CNI is fairly stable; fine-grained policy is being added and will move from alpha (annotation-based) to a first-class API.
     § Mesos: multiple ways to network your container:
       § net-modules, but it only supports the Mesos containerizer
       § Docker networking, but then not fully integrated, e.g. into Mesos-DNS
       § CNI, a possible future, but not here today
       § roll-your-own orchestrator/network co-ordination, the approach some of our users have taken
     § Docker: Swarm / Docker Datacenter are still early; libnetwork evolution? policy?
  20. Gotcha #9: Docker libnetwork is "special".
     § libnetwork exposes limited functionality and visibility to plug-ins; e.g. the network name you specify as a user is NOT passed to the underlying SDN.
     § Consequences:
       § diagnostics are hard to correlate
       § hard to enable "side-loaded" commands that refer to networks created on the Docker command line (e.g. Calico advanced policy)
       § hard to network between the Docker virtual-network domain and non-containerized workloads
  21. Gotcha #10: at cloud scale, nothing ever converges. "Can you write a function that tells me when all nodes have caught up to the global state?" Sure…

         def is_converged() -> bool:
             # At cloud scale there is always churn somewhere.
             return False