Slide 1

Slide 1 text

Kubernetes Networking
Seattle Kubernetes Meetup
CJ Cullen, Software Engineer
@cj_cullen
github.com/cjcullen

Slide 2

Slide 2 text

Docker Networking

Slide 3

Slide 3 text

Docker networking docker start ...

Slide 5

Slide 5 text

Docker networking docker0 172.16.1.0/24

Slide 6

Slide 6 text

Docker networking docker0 172.16.1.0/24 docker run ...

Slide 7

Slide 7 text

Docker networking docker0 172.16.1.0/24

Slide 8

Slide 8 text

Docker networking docker0 172.16.1.0/24 172.16.1.1 vethAQ2IT eth0

Slide 9

Slide 9 text

Docker networking docker0 172.16.1.0/24 172.16.1.1 vethAQ2IT eth0 docker run ...

Slide 10

Slide 10 text

Docker networking docker0 172.16.1.0/24 172.16.1.1 vethAQ2IT eth0 172.16.1.2 vethS1LUI eth0

Slide 11

Slide 11 text

172.16.1.1 172.16.1.2 Docker networking 172.16.1.1 172.16.1.1

Slide 12

Slide 12 text

172.16.1.1 172.16.1.2 Docker networking 172.16.1.1 172.16.1.1 NAT NAT NAT NAT NAT
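The address overlap on the slides above can be sketched in a few lines. This is only an illustration, assuming every host hands out container IPs sequentially from the same docker0 subnet; the `BridgeAllocator` class is hypothetical and is not how Docker itself is implemented.

```python
import ipaddress


class BridgeAllocator:
    """Hypothetical per-host allocator for a host-local bridge subnet."""

    def __init__(self, cidr="172.16.1.0/24"):
        # hosts() yields the usable addresses of the subnet in order.
        self._free = iter(ipaddress.ip_network(cidr).hosts())

    def allocate(self):
        """Return the next unused container IP as a string."""
        return str(next(self._free))


# Two hosts, each with its own private docker0 bridge on the same range.
host_a = BridgeAllocator()
host_b = BridgeAllocator()
```

Because both hosts draw from the same private range, the first container on each host ends up with the same address, so traffic between hosts has to be translated (NAT) to a host address to be unambiguous.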

Slide 13

Slide 13 text

Host ports A: 172.16.1.1 3306 B: 172.16.1.2 80 9376 11878 SNAT SNAT C: 172.16.1.1 8000

Slide 14

Slide 14 text

Host ports A: 172.16.1.1 3306 B: 172.16.1.2 80 9376 11878 SNAT SNAT C: 172.16.1.1 8000 REJECTED
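One possible reading of the REJECTED label is a host-port conflict: a host port can be mapped to only one container at a time. The sketch below assumes that reading; the `HostPortTable` class and the port assignments in the usage note are hypothetical.

```python
class HostPortTable:
    """Hypothetical table of host port -> (container IP, container port)."""

    def __init__(self):
        self._bindings = {}

    def bind(self, host_port, container_ip, container_port):
        """Map a host port to a container, refusing ports already in use."""
        if host_port in self._bindings:
            raise ValueError(f"host port {host_port} is already in use")
        self._bindings[host_port] = (container_ip, container_port)

    def lookup(self, host_port):
        """Return the (container IP, container port) behind a host port."""
        return self._bindings[host_port]
```

With this table, binding host port 80 for one container and then asking for host port 80 again for another raises an error, which is the kind of port brokering that Kubernetes avoids by giving each pod its own IP.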

Slide 15

Slide 15 text

Kubernetes Networking

Slide 16

Slide 16 text

Kubernetes networking
IPs are routable
• vs docker default private IP
Pods can reach each other without NAT
• even across nodes
No brokering of port numbers
• too complex, why bother?
This is a fundamental requirement
• can be L3 routed
• can be underlayed (cloud)
• can be overlayed (SDN)

Slide 17

Slide 17 text

10.1.1.0/24 10.1.1.1 10.1.1.2 Kubernetes networking 10.1.2.0/24 10.1.2.1 10.1.3.0/24 10.1.3.1

Slide 18

Slide 18 text

10.1.1.0/24 10.1.1.1 10.1.1.2 Kubernetes networking 10.1.2.0/24 10.1.2.1 10.1.3.0/24 10.1.3.1 ?

Slide 19

Slide 19 text

Kubernetes networking
On GCE/GKE
• GCE Advanced Routes (program the fabric)
• “Everything to 10.1.1.0/24, send to this VM”
Plenty of other ways
• AWS: Route Tables
• Weave
• Calico
• Flannel
• OVS
• OpenContrail
• Cisco Contiv
• Others...
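The "everything to 10.1.1.0/24, send to this VM" rule amounts to a per-node route table. A minimal sketch, assuming one /24 per node as on the earlier diagram; the node names and the `next_hop` helper are hypothetical, and a real fabric does this lookup in the network rather than in Python.

```python
import ipaddress

# Hypothetical route table: pod subnet -> node (VM) that hosts it.
ROUTES = {
    "10.1.1.0/24": "node-a",
    "10.1.2.0/24": "node-b",
    "10.1.3.0/24": "node-c",
}


def next_hop(dst_ip, routes=ROUTES):
    """Return the node for the most specific route that contains dst_ip."""
    addr = ipaddress.ip_address(dst_ip)
    best = None
    for cidr, node in routes.items():
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, node)
    return best[1] if best else None
```

A packet for pod 10.1.3.1 is handed to the node that owns 10.1.3.0/24 without any address translation; an address outside every pod subnet returns `None` here.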

Slide 22

Slide 22 text

Pods

Slide 23

Slide 23 text

Pods
Small group of containers & volumes
Tightly coupled
The atom of scheduling & placement
Shared namespace
• share IP address & localhost
• share IPC, etc.
Managed lifecycle
• bound to a node, restart in place
• can die, cannot be reborn with same ID
Example: data puller & web server
[Diagram: Consumers, Content Manager, File Puller, Web Server, Volume, Pod]

Slide 24

Slide 24 text

Pods
Small group of containers & volumes
Tightly coupled
The atom of scheduling & placement
Shared namespace
• share IP address & localhost
• share IPC, etc.
Managed lifecycle
• bound to a node, restart in place
• can die, cannot be reborn with same ID
Example: data puller & web server
[Diagram: pod with IP 10.1.1.2]

Slide 25

Slide 25 text

Pods
Small group of containers & volumes
Tightly coupled
The atom of scheduling & placement
Shared namespace
• share IP address & localhost
• share IPC, etc.
Managed lifecycle
• bound to a node, restart in place
• can die, cannot be reborn with same ID
Example: data puller & web server
[Diagram: infra container at 10.1.1.2; c1 and c2 each started with --net=container:infra --ipc=container:infra]
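The shared-namespace flags on the diagram can be turned into command lines. The sketch below only builds argument lists and does not run Docker; the image names and the `pod_docker_commands` helper are hypothetical, while the `--net=container:infra` and `--ipc=container:infra` flags come from the slide.

```python
def pod_docker_commands(infra_image, containers):
    """Build `docker run` argument lists for one pod.

    `containers` is a list of (name, image) pairs. The infra container is
    started first so the others can join its network and IPC namespaces.
    """
    commands = [["docker", "run", "-d", "--name", "infra", infra_image]]
    for name, image in containers:
        commands.append([
            "docker", "run", "-d", "--name", name,
            "--net=container:infra",   # share the infra container's IP and localhost
            "--ipc=container:infra",   # share the infra container's IPC namespace
            image,
        ])
    return commands
```

Calling it with two application containers yields three commands; the two application containers have no network identity of their own and appear at the infra container's address.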

Slide 26

Slide 26 text

Services

Slide 27

Slide 27 text

Services
A group of pods that work together
• grouped by a selector
Defines access policy
• “load balanced” or “headless”
Gets a stable virtual IP and port
• sometimes called the service portal
• also a DNS name
VIP is managed by kube-proxy
• watches all services
• updates iptables when backends change
Hides complexity - ideal for non-native apps
[Diagram: Client, Virtual IP]
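Grouping pods by a selector can be sketched as a label match. This is an illustration only, assuming equality-based selectors; the pod records, label keys, and helper names are hypothetical.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints_for(selector, pods):
    """Return the IPs of all pods whose labels satisfy the selector."""
    return [pod["ip"] for pod in pods if matches(selector, pod["labels"])]


# Hypothetical pods; the IPs reuse addresses from the later slides.
PODS = [
    {"ip": "10.1.0.6", "labels": {"app": "foo", "tier": "web"}},
    {"ip": "10.1.3.1", "labels": {"app": "foo", "tier": "web"}},
    {"ip": "10.1.6.3", "labels": {"app": "bar"}},
]
```

The service's backends are whatever pods currently match, so the set changes as pods come and go while the virtual IP stays the same.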

Slide 28

Slide 28 text

kube-proxy

Slide 29

Slide 29 text

kube-proxy (legacy) iptables kube-proxy apiserver Node X

Slide 30

Slide 30 text

iptables apiserver Node X watch services & endpoints kube-proxy (legacy) kube-proxy

Slide 31

Slide 31 text

iptables apiserver Node X kubectl run ... watch kube-proxy (legacy) kube-proxy

Slide 32

Slide 32 text

iptables apiserver Node X schedule watch kube-proxy (legacy) kube-proxy

Slide 33

Slide 33 text

iptables apiserver Node X watch kubectl expose ... kube-proxy (legacy) kube-proxy

Slide 34

Slide 34 text

iptables apiserver Node X new service! update kube-proxy (legacy) kube-proxy

Slide 35

Slide 35 text

iptables apiserver Node X watch kube-proxy (legacy) kube-proxy listen

Slide 37

Slide 37 text

iptables apiserver Node X watch configure kube-proxy (legacy) kube-proxy

Slide 38

Slide 38 text

iptables apiserver Node X watch VIP kube-proxy (legacy) kube-proxy

Slide 39

Slide 39 text

iptables apiserver Node X new endpoints! update VIP kube-proxy (legacy) kube-proxy

Slide 40

Slide 40 text

iptables apiserver Node X VIP watch kube-proxy (legacy) kube-proxy

Slide 41

Slide 41 text

iptables apiserver Node X VIP watch kube-proxy (legacy) kube-proxy Client

Slide 45

Slide 45 text

kube-proxy (legacy)
Userspace proxy isn’t ideal
Burns CPU copying bytes
• “Proxy” is just parallel copy loops.
Loses source IP
• Everything looks like it’s from the node IP.
Userspace TCP listening = higher latency
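The "parallel copy loops" remark can be made concrete. The sketch below is not the kube-proxy code (which is written in Go); it is a minimal Python illustration, with hypothetical function names, of a userspace proxy that copies bytes in both directions between a client socket and a backend socket.

```python
import socket
import threading


def copy_loop(src, dst, bufsize=4096):
    """Copy bytes from src to dst until src reports end of stream."""
    while True:
        data = src.recv(bufsize)
        if not data:
            break
        dst.sendall(data)
    try:
        # Pass the end-of-stream on to the other side.
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass


def proxy(client, backend):
    """Start one copy loop per direction and return the two threads."""
    threads = [
        threading.Thread(target=copy_loop, args=(client, backend)),
        threading.Thread(target=copy_loop, args=(backend, client)),
    ]
    for thread in threads:
        thread.start()
    return threads
```

Every byte crosses from kernel to userspace and back twice, and the backend sees a connection opened by the proxy rather than by the original client, which is where the CPU cost and the lost source IP come from.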

Slide 46

Slide 46 text

iptables kube-proxy

Slide 47

Slide 47 text

iptables kube-proxy iptables kube-proxy apiserver Node X

Slide 48

Slide 48 text

iptables kube-proxy iptables kube-proxy apiserver Node X watch services & endpoints

Slide 49

Slide 49 text

iptables kube-proxy iptables kube-proxy apiserver Node X kubectl run ... watch

Slide 50

Slide 50 text

iptables kube-proxy iptables kube-proxy apiserver Node X schedule watch

Slide 51

Slide 51 text

iptables kube-proxy iptables kube-proxy apiserver Node X watch kubectl expose ...

Slide 52

Slide 52 text

iptables kube-proxy iptables kube-proxy apiserver Node X new service! update

Slide 53

Slide 53 text

iptables kube-proxy iptables kube-proxy apiserver Node X watch configure

Slide 54

Slide 54 text

iptables kube-proxy iptables kube-proxy apiserver Node X watch VIP

Slide 55

Slide 55 text

iptables kube-proxy iptables kube-proxy apiserver Node X new endpoints! update VIP

Slide 56

Slide 56 text

iptables kube-proxy iptables kube-proxy apiserver Node X VIP watch configure

Slide 57

Slide 57 text

iptables kube-proxy iptables kube-proxy apiserver Node X VIP watch

Slide 58

Slide 58 text

iptables kube-proxy iptables kube-proxy apiserver Node X VIP watch Client

Slide 62

Slide 62 text

iptables kube-proxy
Mean Latency
contrib/for-tests/netperf-tester --number=1000
[Chart: mean latency in microseconds, iptables kube-proxy vs. legacy kube-proxy]

Slide 63

Slide 63 text

Services
Services are just an abstraction
• Only requirement: route (and maybe load balance) a virtual IP to a set of backends.
Kube-proxy is an implementation
• Kube-proxy watches apiserver.
• iptables is re-configured on changes.
There could be other ways
• Userspace, iptables, IP Virtual Servers?
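The single requirement, routing a virtual IP to a set of backends, fits in a small table. This sketch assumes random backend selection and a service port of 80; the `ServiceTable` class is hypothetical and stands in for whatever mechanism (userspace, iptables, or something else) implements the abstraction.

```python
import random


class ServiceTable:
    """Hypothetical mapping of (virtual IP, port) -> list of backend IPs."""

    def __init__(self, rng=None):
        self._backends = {}
        self._rng = rng or random.Random()

    def update(self, vip, port, backends):
        """Replace the backend set, as a watcher would on an endpoints change."""
        self._backends[(vip, port)] = list(backends)

    def resolve(self, vip, port):
        """Pick one backend for a new connection, or None if there are none."""
        backends = self._backends.get((vip, port))
        if not backends:
            return None
        return self._rng.choice(backends)
```

A watcher calls `update` whenever the apiserver reports new endpoints, and each new connection to the virtual IP calls `resolve`; clients only ever need to know the virtual IP.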

Slide 64

Slide 64 text

DNS
Run SkyDNS as a pod in the cluster
• kube2sky bridges Kubernetes API -> SkyDNS
• Tell kubelets about it (static service IP)
Strictly optional, but practically required
• LOTS of things depend on it
• Probably will become more integrated
Or plug in your own!
Example names:
• kubernetes
• kubernetes.default
• kubernetes.default.svc.cluster.local
• foo.my-namespace.svc.cluster.local

Slide 65

Slide 65 text

DNS
Run SkyDNS as a pod in the cluster
• kube2sky bridges Kubernetes API -> SkyDNS
• Tell kubelets about it (static service IP)
Strictly optional, but practically required
• LOTS of things depend on it
• Probably will become more integrated
Or plug in your own!
[Diagram: apiserver, watch, pod kube-dns-qxin containing etcd, kube2sky, skyDNS]

Slide 66

Slide 66 text

DNS
Run SkyDNS as a pod in the cluster
• kube2sky bridges Kubernetes API -> SkyDNS
• Tell kubelets about it (static service IP)
Strictly optional, but practically required
• LOTS of things depend on it
• Probably will become more integrated
Or plug in your own!
/etc/resolv.conf: nameserver 10.0.0.10 ...
[Diagram: apiserver, watch, pod kube-dns-qxin containing etcd, kube2sky, skyDNS]

Slide 67

Slide 67 text

DNS
Run SkyDNS as a pod in the cluster
• kube2sky bridges Kubernetes API -> SkyDNS
• Tell kubelets about it (static service IP)
Strictly optional, but practically required
• LOTS of things depend on it
• Probably will become more integrated
Or plug in your own!
/etc/resolv.conf: nameserver 10.0.0.10 ...
[Diagram: apiserver, watch, pod kube-dns-qxin containing etcd, kube2sky, skyDNS, service IP 10.0.0.10]
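Short names such as `kubernetes` or `foo.my-namespace` resolve because the resolver appends search domains. The sketch below assumes a search list of the namespace domain, `svc.cluster.local`, and `cluster.local`; the slides elide the rest of `/etc/resolv.conf`, so that list is an assumption, and the simplified function ignores the resolver's `ndots` option.

```python
def candidate_names(name, search_domains):
    """Return the fully qualified names a resolver would try, in order.

    Simplified: every search domain is tried first, then the name as given.
    A name that already ends in "." is treated as fully qualified.
    """
    if name.endswith("."):
        return [name]
    candidates = [f"{name}.{domain}." for domain in search_domains]
    candidates.append(f"{name}.")
    return candidates


# Assumed search list for a pod in the "default" namespace.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
```

Under that assumption, `kubernetes` is found on the first attempt as `kubernetes.default.svc.cluster.local.`, and `foo.my-namespace` is found on the second attempt as `foo.my-namespace.svc.cluster.local.`.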

Slide 68

Slide 68 text

Putting it Together What happens when I... $ curl foo.my-namespace Client

Slide 69

Slide 69 text

What happens when I... $ curl foo.my-namespace Putting it Together nameserver 10.0.0.10 ... /etc/resolv.conf Client 10.1.0.1

Slide 70

Slide 70 text

10.1.0.1 Putting it Together What happens when I... $ curl foo.my-namespace etcd kube-dns-qxin kube2sky skyDNS 10.0.0.10 foo.my-namespace? Client

Slide 71

Slide 71 text

Putting it Together What happens when I... $ curl foo.my-namespace etcd kube-dns-qxin kube2sky skyDNS 10.0.0.10 10.0.123.45 Client 10.1.0.1

Slide 72

Slide 72 text

Putting it Together What happens when I... $ curl foo.my-namespace 10.0.123.45 Client 10.1.0.1

Slide 73

Slide 73 text

Putting it Together What happens when I... $ curl foo.my-namespace Client VIP 10.0.123.45 10.1.0.1

Slide 74

Slide 74 text

Putting it Together What happens when I... $ curl foo.my-namespace Client VIP 10.0.123.45 iptables 10.1.0.1

Slide 75

Slide 75 text

Putting it Together What happens when I... $ curl foo.my-namespace Client VIP 10.0.123.45 iptables 10.1.0.1 10.1.0.6 10.1.3.1 10.1.6.3

Slide 76

Slide 76 text

Putting it Together What happens when I... $ curl foo.my-namespace Client VIP 10.0.123.45 iptables 10.1.3.1 10.1.0.1 10.1.0.6 10.1.3.1 10.1.6.3

Slide 77

Slide 77 text

Putting it Together What happens when I... $ curl foo.my-namespace Client VIP 10.0.123.45 iptables 10.1.3.1 10.1.0.1 10.1.0.6 10.1.3.1 10.1.6.3 10.1.3.0/24 -> Node X

Slide 78

Slide 78 text

Putting it Together What happens when I... $ curl foo.my-namespace Client VIP 10.0.123.45 iptables 10.1.3.1 10.1.0.1 10.1.0.6 10.1.3.1 10.1.6.3

Slide 79

Slide 79 text

Putting it Together What happens when I... $ curl foo.my-namespace Client VIP 10.0.123.45 iptables 10.1.3.1 10.1.0.1 10.1.0.6 10.1.3.1 10.1.6.3 Hello World!

Slide 80

Slide 80 text

Putting it Together What happens when I... $ curl foo.my-namespace Client iptables Hello World! 10.1.0.1 10.1.0.6 10.1.3.1 10.1.6.3 10.1.0.1

Slide 81

Slide 81 text

Putting it Together What happens when I... $ curl foo.my-namespace Client iptables Hello World! 10.1.0.1 10.1.0.6 10.1.3.1 10.1.6.3 10.1.0.0/24 -> Node Y 10.1.0.1

Slide 83

Slide 83 text

Putting it Together What happens when I... $ curl foo.my-namespace Hello World! Client iptables Hello World! 10.1.0.1 10.1.0.6 10.1.3.1 10.1.6.3 10.1.0.0/24 -> Node Y 10.1.0.1
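The iptables steps in this walkthrough (rewrite the virtual IP to a backend on the way out, restore it on the reply) can be sketched as destination NAT with a connection table. This is a simplified model, not real iptables behavior in full: the `Dnat` class is hypothetical, packets are reduced to (source, destination) pairs, and ports are ignored.

```python
class Dnat:
    """Hypothetical model of DNAT for one service virtual IP."""

    def __init__(self, vip, backends, choose):
        self.vip = vip
        self.backends = list(backends)
        self._choose = choose        # function that picks one backend
        self._conntrack = {}         # (client IP, backend IP) -> virtual IP

    def outbound(self, src, dst):
        """Client -> service: replace the virtual IP with a chosen backend."""
        if dst != self.vip:
            return (src, dst)
        backend = self._choose(self.backends)
        self._conntrack[(src, backend)] = dst
        return (src, backend)

    def inbound(self, src, dst):
        """Backend -> client: put the virtual IP back as the source."""
        vip = self._conntrack.get((dst, src))
        return (vip, dst) if vip else (src, dst)
```

With the addresses from the slides, a request from 10.1.0.1 to 10.0.123.45 leaves as a packet to 10.1.3.1, is routed to the node that owns 10.1.3.0/24, and the reply arrives at the client looking as if it came from 10.0.123.45.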

Slide 84

Slide 84 text

What about external?

Slide 85

Slide 85 text

External Services
Service IPs are only available inside the cluster
Need to receive traffic from “the outside world”
Builtin: Service “type”
• nodePort: expose on a port on every node
• loadBalancer: provision a cloud load-balancer
DiY load-balancer solutions
• socat (for nodePort remapping)
• haproxy
• nginx

Slide 86

Slide 86 text

The Bleeding Edge

Slide 87

Slide 87 text

Ingress (L7)
Services are assumed L3/L4
Lots of apps want HTTP/HTTPS
Ingress maps incoming traffic to backend services
• by HTTP host headers
• by HTTP URL paths
HAProxy and GCE implementations
No SSL yet
Status: BETA in Kubernetes v1.1
[Diagram: Client, URL Map]

Slide 88

Slide 88 text

Ingress (L7)
Services are assumed L3/L4
Lots of apps want HTTP/HTTPS
Ingress maps incoming traffic to backend services
• by HTTP host headers
• by HTTP URL paths
HAProxy and GCE implementations
No SSL yet
Status: BETA in Kubernetes v1.1
[Diagram: Client, URL Map with api.company.com, api.company.com/foo, api.company.com/bar, othercompany.com/*]
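The URL map on the diagram can be sketched as a host and path lookup. The hosts and paths come from the slide; the service names, the longest-prefix rule, and the `route` helper are assumptions made for illustration, and real Ingress implementations may match paths differently.

```python
# (host, path prefix, backend service); service names are hypothetical.
RULES = [
    ("api.company.com", "/foo", "foo-svc"),
    ("api.company.com", "/bar", "bar-svc"),
    ("api.company.com", "/", "api-default-svc"),
    ("othercompany.com", "/", "other-svc"),
]


def route(host, path, rules=RULES):
    """Return the service for the longest matching path prefix on this host."""
    best = None
    for rule_host, prefix, service in rules:
        if host == rule_host and path.startswith(prefix):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, service)
    return best[1] if best else None
```

Note that this plain prefix match also sends `/foobar` to `foo-svc`; a stricter matcher would compare whole path segments.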

Slide 89

Slide 89 text

Network Plugins

Slide 90

Slide 90 text

Network Plugins
Introduced in Kubernetes v1.0
• VERY experimental
Uses CNI (CoreOS) in v1.1
• Simple exec interface
• Not using Docker libnetwork
• but can defer to Docker for networking
Cluster admins can customize their installs
• DHCP, MACVLAN, Flannel, custom net
[Diagram: Plugin, Plugin, Plugin]
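The "simple exec interface" means the runtime runs a plugin binary, passing parameters in environment variables and the network configuration as JSON on standard input. The sketch below only prepares such an invocation and does not execute anything. The environment variable names follow the CNI specification as I understand it; the paths, container ID, network name, and the `cni_add_invocation` helper are hypothetical.

```python
import json


def cni_add_invocation(plugin_dir, container_id, netns_path, net_conf):
    """Prepare argv, environment, and stdin for a CNI ADD call.

    The plugin binary is looked up by the "type" field of the network
    configuration inside plugin_dir.
    """
    argv = [f"{plugin_dir}/{net_conf['type']}"]
    env = {
        "CNI_COMMAND": "ADD",            # attach the container to the network
        "CNI_CONTAINERID": container_id,
        "CNI_NETNS": netns_path,         # path to the container's network namespace
        "CNI_IFNAME": "eth0",            # interface name to create in the container
        "CNI_PATH": plugin_dir,          # where plugin binaries are searched for
    }
    stdin = json.dumps(net_conf).encode()
    return argv, env, stdin
```

A caller could hand the three values to `subprocess.run(argv, env=env, input=stdin)` and read the plugin's JSON result from standard output; that step is left out here because it needs a real plugin binary.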

Slide 91

Slide 91 text

Kubernetes is Open
- open community
- open design
- open source
- open to ideas
Networking is Hard
- help guide us!
http://kubernetes.io
https://github.com/kubernetes/kubernetes
slack: kubernetes
twitter: @kubernetesio