Slide 1

Slide 1 text

Kubernetes: Container Orchestration at Scale
Max Forbes, Container Days Boston 2015
Thanks to Brendan Burns and Tim Hockin for nearly all of the slides.
Google confidential │ Do not distribute

Slide 3

Slide 3 text

Everything at Google runs in containers:
• Gmail, Web Search, Maps, ...
• MapReduce, batch, ...
• GFS, Colossus, ...
• Even GCE itself: VMs in containers
We launch over 2 billion containers per week.

Slide 13

Slide 13 text

More than just “running” containers
• Scheduling: Where should my job be run?
• Lifecycle: Keep my job running
• Discovery: Where is my job now?
• Constituency: Who is part of my job?
• Scale-up: Making my jobs bigger or smaller
• Auth{n,z}: Who can do things to my job?
• Monitoring: What’s happening with my job?
• Health: How is my job feeling?
• ...

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

Kubernetes
Greek for “Helmsman”; also the root of the word “Governor”
• Container orchestration
• Runs Docker containers
• Supports multiple cloud and bare-metal environments
• Inspired and informed by Google’s experiences and internal systems
• Open source, written in Go
Manage applications, not machines

Slide 16

Slide 16 text

A 50000 foot view

Slide 17

Slide 17 text

A 50000 foot view
[Diagram: users (CLI, API, UI); master (apiserver, scheduler); nodes (kubelet, kubelet, kubelet)]

Slide 18

Slide 18 text

A 50000 foot view
[Diagram step: Run X (Replicas = 2, Memory = 4Gi, CPU = 2.5)]

Slide 19

Slide 19 text

A 50000 foot view
[Diagram step: SUCCESS, UID=8675309]

Slide 20

Slide 20 text

A 50000 foot view
[Diagram step: Which nodes for X?]

Slide 21

Slide 21 text

A 50000 foot view
[Diagram step: Run X, Run X]

Slide 22

Slide 22 text

A 50000 foot view
[Diagram step: Registry; pull X, pull X]

Slide 23

Slide 23 text

A 50000 foot view
[Diagram step: Status X, Status X; two X instances shown]

Slide 24

Slide 24 text

A 50000 foot view
[Diagram step: GET X; two X instances shown]

Slide 25

Slide 25 text

A 50000 foot view
[Diagram step: Status X; two X instances shown]

Slide 26

Slide 26 text

All you really care about
[Diagram labels: Run X, Status X, Master, Container Cluster; two X instances shown]
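
The "Which nodes for X?" step can be pictured as a placement decision over free node capacity. The sketch below is a naive first-fit placement, not the Kubernetes scheduler's actual algorithm. The `Node` class, the `place_replicas` function, and the node capacities are hypothetical; only the request (Replicas = 2, Memory = 4Gi, CPU = 2.5) comes from the slides.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of a node's free capacity."""
    name: str
    free_cpu: float      # cores
    free_mem_gi: float   # GiB


def place_replicas(nodes, replicas, cpu, mem_gi):
    """Naive first-fit: assign each replica to the first node with room.

    Returns a list of node names, one per placed replica. Raises
    RuntimeError if a replica does not fit on any node.
    """
    placements = []
    for _ in range(replicas):
        for node in nodes:
            if node.free_cpu >= cpu and node.free_mem_gi >= mem_gi:
                # Reserve the resources so later replicas see the reduced capacity.
                node.free_cpu -= cpu
                node.free_mem_gi -= mem_gi
                placements.append(node.name)
                break
        else:
            raise RuntimeError("no node has enough free CPU and memory")
    return placements


# Made-up capacities for three nodes.
nodes = [Node("node-1", 2.0, 8.0), Node("node-2", 4.0, 8.0), Node("node-3", 6.0, 4.0)]
# The request from the slides: Run X, Replicas = 2, Memory = 4Gi, CPU = 2.5.
print(place_replicas(nodes, replicas=2, cpu=2.5, mem_gi=4.0))
```

With these made-up capacities, node-1 lacks CPU, so the two replicas land on node-2 and node-3. A real scheduler weighs many more constraints; the point is only that the user states what to run and the system decides where.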

Slide 35

Slide 35 text

Pets vs. Cattle

Slide 37

Slide 37 text

Design principles
• Declarative > imperative: State your desired results, let the system actuate
• Control loops: Observe, rectify, repeat
• Simple > Complex: Try to do as little as possible
• Modularity: Components, interfaces, & plugins
• Legacy compatible: Requiring apps to change is a non-starter
• No grouping: Labels are the only groups
• Cattle > Pets: Manage your workload in bulk
• Open > Closed: Open Source, standards, REST, JSON, etc.

Slide 44

Slide 44 text

Primary concepts
0. Container: A sealed application package (Docker)
1. Pod: A small group of tightly coupled Containers
   example: content syncer & web server
2. Controller: A loop that drives current state towards desired state
   example: replication controller
3. Service: A set of running pods that work together
   example: load-balanced backends
4. Labels: Identifying metadata attached to other objects
   example: phase=canary vs. phase=prod
5. Selector: A query against labels, producing a set result
   example: all pods where label phase == prod

Slide 53

Slide 53 text

Pods
Small group of containers & volumes
Tightly coupled
The atom of cluster scheduling & placement
Shared namespace
• share IP address & localhost
Ephemeral
• can die and be replaced
Example: data puller & web server
[Diagram labels: Pod, File Puller, Web Server, Volume, Consumers, Content Manager]

Slide 54

Slide 54 text

Why pods?

Slide 55

Slide 55 text

Why pods?
[Diagram labels: Pod, Web Server, Volume, Consumers, Content Manager, File Puller]

Slide 56

Slide 56 text

Why pods?
[Diagram labels: Pod, File Puller, Web Server, Volume, Consumers, Content Manager]
• infeasible for provider to build and maintain all variants of this “as a service”

Slide 57

Slide 57 text

Why pods?
[Diagram labels: Pod, Scary C program, data collector :-(]

Slide 58

Slide 58 text

Why pods?
[Diagram labels: Pod, Scary C program, adapter, data collector]

Slide 59

Slide 59 text

Why pods?
[Diagram labels: Pod, component A, component B]

Slide 60

Slide 60 text

Why pods?
[Diagram labels: Pod, component A, component B]

Slide 61

Slide 61 text

Why pods?
[Diagram labels: Pod, app, DB client, DB]

Slide 62

Slide 62 text

Why IP-per-pod?
No port mangling.

Slide 63

Slide 63 text

Why not put everything in one container?
- transparency
- decouple software dependencies
- ease of use
- efficiency

Slide 64

Slide 64 text

Why not something besides pods, like co-scheduling?
- simpler to have scheduling atom
- other benefits of pods
  - resource sharing
  - IPC
  - shared fate
  - simplified management

Slide 68

Slide 68 text

Pod lifecycle
Once scheduled to a node, pods do not move
• restart policy means restart in-place
Pods can be observed pending, running, succeeded, or failed
• failed is really the end - no more restarts
• no complex state machine logic
Pods are not rescheduled by the scheduler or apiserver
• even if a node dies
• controllers are responsible for this
• keeps the scheduler simple
Apps should consider these rules
• Services hide this
• Makes pod-to-pod communication more formal
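
The lifecycle rules above can be sketched as a very small state model. This is a simplification that assumes a single-container pod; the `Phase` enum and both function names are hypothetical. The policy strings `Always`, `OnFailure`, and `Never` are the Kubernetes `restartPolicy` values, but the decision logic here is only an illustration of "restart in place, otherwise terminal".

```python
from enum import Enum


class Phase(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# Phases a pod never leaves once it reaches them.
TERMINAL = {Phase.SUCCEEDED, Phase.FAILED}


def after_container_exit(exit_code, restart_policy):
    """Return the pod phase after its (single) container exits.

    A restart happens in place on the same node, so the pod stays RUNNING.
    Without a restart the pod reaches a terminal phase and stays there.
    """
    failed = exit_code != 0
    if restart_policy == "Always" or (restart_policy == "OnFailure" and failed):
        return Phase.RUNNING  # restarted in place; the pod does not move
    return Phase.FAILED if failed else Phase.SUCCEEDED


def needs_replacement(phase):
    """Terminal pods are never revived; a controller must create a new pod."""
    return phase in TERMINAL


print(after_container_exit(1, "OnFailure"))  # restarted in place
print(after_container_exit(1, "Never"))      # terminal: failed
print(needs_replacement(Phase.FAILED))
```

Keeping replacement out of the pod's own state model is what lets a separate controller, rather than the scheduler or apiserver, decide when a new pod is needed.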

Slide 76

Slide 76 text

Labels
- "release" : "stable", "canary", ...
- "environment" : "dev", "qa", "production", ...
- "tier" : "frontend", "backend", "middleware", ...
- "partition" : "customerA", "customerB", ...
- "track" : "daily", "weekly", ...

Slide 78

Slide 78 text

Labels
Arbitrary metadata
Attached to any API object
Generally represent identity
Queryable by selectors
• think SQL ‘select ... where ...’
The only grouping mechanism
• pods under a ReplicationController
• pods in a Service
• capabilities of a node (constraints)
Example: “phase: canary”

Slide 79

Slide 79 text

Selectors
[Diagram: four labeled objects]
• App: Nifty, Phase: Dev, Role: FE
• App: Nifty, Phase: Test, Role: FE
• App: Nifty, Phase: Dev, Role: BE
• App: Nifty, Phase: Test, Role: BE

Slide 80

Slide 80 text

Selectors
App == Nifty
[Diagram: the same four labeled objects as on Slide 79]

Slide 81

Slide 81 text

Selectors
App == Nifty, Role == FE
[Diagram: the same four labeled objects as on Slide 79]

Slide 82

Slide 82 text

Selectors
App == Nifty, Role == BE
[Diagram: the same four labeled objects as on Slide 79]

Slide 83

Slide 83 text

Selectors
App == Nifty, Phase == Dev
[Diagram: the same four labeled objects as on Slide 79]

Slide 84

Slide 84 text

Selectors
App == Nifty, Phase == Test
[Diagram: the same four labeled objects as on Slide 79]
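
The selector queries on these slides amount to an equality match over label key/value pairs. A minimal sketch, assuming the four labeled objects are pods with made-up names (`pod-a` to `pod-d`) and hypothetical helper names `matches` and `select`:

```python
def matches(labels, selector):
    """True if every key/value in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects, selector):
    """Return the names of objects whose labels satisfy the selector."""
    return [name for name, labels in objects.items() if matches(labels, selector)]


# The four label sets from the slides; the pod names are made up.
pods = {
    "pod-a": {"App": "Nifty", "Phase": "Dev", "Role": "FE"},
    "pod-b": {"App": "Nifty", "Phase": "Test", "Role": "FE"},
    "pod-c": {"App": "Nifty", "Phase": "Dev", "Role": "BE"},
    "pod-d": {"App": "Nifty", "Phase": "Test", "Role": "BE"},
}

print(select(pods, {"App": "Nifty"}))                   # all four pods
print(select(pods, {"App": "Nifty", "Role": "FE"}))     # pod-a and pod-b
print(select(pods, {"App": "Nifty", "Phase": "Test"}))  # pod-b and pod-d
```

Because a selector is just a query, the same object can belong to several overlapping groups at once, which is why the deck treats labels as the only grouping mechanism.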

Slide 85

Slide 85 text

Replication Controllers
Canonical example of control loops
Runs out-of-process wrt API server
Have 1 job: ensure N copies of a pod
• if too few, start new ones
• if too many, kill some
• group == selector
Cleanly layered on top of the core
• all access is by public APIs
Replicated pods are fungible
• No implied ordinality or identity
Replication Controller
- Name = “nifty-rc”
- Selector = {“App”: “Nifty”}
- PodTemplate = { ... }
- NumReplicas = 4
[Diagram: exchange with the API Server: How many? 3; Start 1 more; OK; How many? 4]
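
One pass of the "observe, rectify" loop described on this slide can be sketched as follows. The sketch works on an in-memory dictionary instead of the API server, and the `reconcile` function, the `nifty-N` naming scheme, and the action tuples are all hypothetical; it only illustrates "if too few, start new ones; if too many, kill some".

```python
import itertools

# Hypothetical name generator for newly started pods.
_ids = itertools.count(1)


def reconcile(pods, selector, template_labels, desired):
    """One pass of the loop: count matching pods, then start or kill to reach `desired`.

    `pods` maps pod name -> labels and is modified in place.
    Returns a list of (action, pod_name) tuples describing what was done.
    """
    # Observe: the group is defined by the selector, nothing else.
    matching = [name for name, labels in pods.items()
                if all(labels.get(k) == v for k, v in selector.items())]
    actions = []
    # Too few: start new pods from the template.
    for _ in range(desired - len(matching)):
        name = f"nifty-{next(_ids)}"
        pods[name] = dict(template_labels)
        actions.append(("start", name))
    # Too many: kill the surplus; replicas are fungible, so any will do.
    for name in matching[desired:]:
        del pods[name]
        actions.append(("kill", name))
    return actions


pods = {
    "a": {"App": "Nifty"},
    "b": {"App": "Nifty"},
    "c": {"App": "Nifty"},
    "other": {"App": "Other"},
}
# Mirrors the slide: three matching pods observed, four desired, so one is started.
print(reconcile(pods, {"App": "Nifty"}, {"App": "Nifty"}, desired=4))
# A second pass finds four matching pods and does nothing.
print(reconcile(pods, {"App": "Nifty"}, {"App": "Nifty"}, desired=4))
```

Running the same pass repeatedly is what makes this a control loop: each pass compares observed state with desired state and acts only on the difference.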

Slide 86

Slide 86 text

Replication Controllers
[Diagram labels: node 1, node 2, node 3, node 4; pods f0118, d9376, b0111, a1209]
Replication Controller
- Desired = 4
- Current = 4

Slide 87

Slide 87 text

Replication Controllers
[Diagram labels: node 1, node 2, node 3, node 4; pods f0118, d9376, b0111, a1209]
Replication Controller
- Desired = 4
- Current = 4

Slide 88

Slide 88 text

Replication Controllers
[Diagram labels: node 1, node 3, node 4; pods f0118, b0111, a1209]
Replication Controller
- Desired = 4
- Current = 3

Slide 89

Slide 89 text

Replication Controllers
[Diagram labels: node 1, node 3, node 4; pods f0118, b0111, a1209, c9bad]
Replication Controller
- Desired = 4
- Current = 4

Slide 90

Slide 90 text

Replication Controllers
[Diagram labels: node 1, node 2, node 3, node 4; pods f0118, d9376, b0111, a1209, c9bad]
Replication Controller
- Desired = 4
- Current = 5

Slide 91

Slide 91 text

Replication Controllers
[Diagram labels: node 1, node 2, node 3, node 4; pods f0118, d9376, b0111, a1209, c9bad]
Replication Controller
- Desired = 4
- Current = 4

Slide 92

Slide 92 text

Pod networking
Pod IPs are routable
• Docker default is private IP
Pods can reach each other without NAT
• even across nodes
No brokering of port numbers
This is a fundamental requirement
• several SDN solutions

Slide 93

Slide 93 text

Services
A group of pods that act as one == Service
• group == selector
Defines access policy
• only “load balanced” for now
Gets a stable virtual IP and port
• called the service portal
• also a DNS name
VIP is captured by kube-proxy
• watches the service constituency
• updates when backends change
Hide complexity - ideal for non-native apps
[Diagram labels: Client, Portal (VIP)]

Slide 94

Slide 94 text

Services
[Diagram labels: Client; 10.0.0.1 : 9376; TCP / UDP; iptables DNAT; kube-proxy; apiserver (watch); Portal IP is assigned; backends 10.240.1.1 : 8080, 10.240.2.2 : 8080, 10.240.3.3 : 8080]
Service
- Name = “nifty-svc”
- Selector = {“App”: “Nifty”}
- Port = 9376
- ContainerPort = 8080

Slide 96

Slide 96 text

Services
[Diagram labels: kube-proxy, apiserver; WATCH Services, Endpoints]

Slide 97

Slide 97 text

Services
[Diagram labels: kube-proxy, apiserver; POST pods; WATCH Services, Endpoints]
Pod
- Name = “pod1”
- Labels = {“App”: “Nifty”}
- Port = 9376

Slide 98

Slide 98 text

Services
[Diagram labels: kube-proxy, apiserver; run pods; WATCH Services, Endpoints]
pod1 10.240.1.1 : 9376
pod2 10.240.2.2 : 9376
pod3 10.240.3.3 : 9376
Pod
- Name = “pod1”
- Labels = {“App”: “Nifty”}
- Port = 9376

Slide 99

Slide 99 text

Services
[Diagram labels: kube-proxy, apiserver; POST service; WATCH Services, Endpoints]
pod1 10.240.1.1 : 9376
pod2 10.240.2.2 : 9376
pod3 10.240.3.3 : 9376
Service
- Name = “nifty-svc”
- Selector = {“App”: “Nifty”}
- Port = 80
- TargetPort = 9376
- PortalIP = 10.9.8.7

Slide 100

Slide 100 text

Services
[Diagram step: new service! (WATCH Services, Endpoints); pods and Service as on Slide 99]

Slide 101

Slide 101 text

Services
[Diagram step: Linux; listen on port X (random); WATCH Services, Endpoints; pods and Service as on Slide 99]

Slide 102

Slide 102 text

Services
[Diagram step: Linux; listen on port X; iptables: redirect 10.9.8.7:80 to localhost:X; WATCH Services, Endpoints; pods and Service as on Slide 99]

Slide 103

Slide 103 text

Services
[Diagram step: new endpoints! (WATCH Services, Endpoints); listen on port X; iptables: redirect 10.9.8.7:80 to localhost:X; pods and Service as on Slide 99]

Slide 104

Slide 104 text

Services
[Diagram step: Linux; listen on port X; iptables: redirect 10.9.8.7:80 to localhost:X; pods and Service as on Slide 99]

Slide 105

Slide 105 text

Services
[Diagram step: Client: connect to 10.9.8.7:80; listen on port X; iptables: redirect 10.9.8.7:80 to localhost:X; pods and Service as on Slide 99]

Slide 106

Slide 106 text

Services
[Diagram step: Client: connect to 10.9.8.7:80; listen on port X; iptables: redirect 10.9.8.7:80 to localhost:X; pods and Service as on Slide 99]

Slide 107

Slide 107 text

Services
[Diagram step: Client; iptables; connect to localhost:X; pods and Service as on Slide 99]

Slide 108

Slide 108 text

Services
[Diagram step: Client; iptables; listen on port X; proxy for client; pods and Service as on Slide 99]
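
The walk-through above (resolve the selector to endpoints, then proxy each client connection to one backend) can be sketched in a few lines. This is a toy model with no real networking or iptables: `endpoints_for`, `PortalProxy`, and the dictionary shapes are hypothetical, and round-robin backend choice is an assumption, since the slides only say "load balanced". The names, IPs, and ports are the ones shown on the slides.

```python
import itertools


def endpoints_for(service, pods):
    """Resolve a service to "ip:port" backends using its label selector."""
    selector = service["selector"]
    return [f'{pod["ip"]}:{service["target_port"]}'
            for pod in pods
            if all(pod["labels"].get(k) == v for k, v in selector.items())]


class PortalProxy:
    """Toy stand-in for the proxy behind the portal IP (round-robin is an assumption)."""

    def __init__(self, backends):
        self.set_backends(backends)

    def set_backends(self, backends):
        # Called when a watch reports "new endpoints!".
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        """Choose the backend for the next client connection."""
        return next(self._cycle)


service = {"name": "nifty-svc", "selector": {"App": "Nifty"},
           "port": 80, "target_port": 9376}
pods = [
    {"name": "pod1", "ip": "10.240.1.1", "labels": {"App": "Nifty"}},
    {"name": "pod2", "ip": "10.240.2.2", "labels": {"App": "Nifty"}},
    {"name": "pod3", "ip": "10.240.3.3", "labels": {"App": "Nifty"}},
]

proxy = PortalProxy(endpoints_for(service, pods))
# Four client connections to the portal are spread over the three backends.
print([proxy.pick() for _ in range(4)])
```

The client only ever sees the stable portal address; the backend list can change underneath it whenever pods matching the selector come and go.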

Slide 109

Slide 109 text

Events
A central place for information about your cluster
• filed by any component: kubelet, scheduler, etc.
Real-time information on the current state of your pod
• kubectl describe pod foo
Real-time information on the current state of your cluster
• kubectl get --watch-only events
• You can also ask only for events that mention some object you care about.

Slide 110

Slide 110 text

Monitoring
Optional add-on to Kubernetes clusters
Run cAdvisor as a pod on each node
• gather stats from all containers
• export via REST
Run Heapster as a pod in the cluster
• just another pod, no special access
• aggregate stats
Run Influx and Grafana in the cluster
• more pods
• alternately: store in Google Cloud Monitoring

Slide 111

Slide 111 text

Logging
Optional add-on to Kubernetes clusters
Run fluentd as a pod on each node
• gather logs from all containers
• export to Elasticsearch
Run Elasticsearch as a pod in the cluster
• just another pod, no special access
• aggregate logs
Run Kibana in the cluster
• yet another pod
• alternately: store in Google Cloud Logging

Slide 112

Slide 112 text

DEMO

Slide 113

Slide 113 text

Kubernetes is Open Source
We want your help!
http://kubernetes.io
https://github.com/GoogleCloudPlatform/kubernetes
irc.freenode.net #google-containers
@kubernetesio

Slide 114

Slide 114 text

Questions?
http://kubernetes.io
Images by Connie Zhou