Slide 1

Slide 1 text

Google Cloud Platform
A Crash Course on Container Orchestration & Kubernetes
Interop ITX, May 15, 2017
Tim Hockin, Principal Software Engineer
@thockin

Slide 2

Slide 2 text

Containers are a great way to package and run apps:
• Self-contained
• Low overhead
• Fast starting
• Easy to compose
• Easy to replace

Slide 3

Slide 3 text

Almost everything at Google runs in containers:
• Gmail, Web Search, Maps, ...
• MapReduce, batch, ...
• GFS, Colossus, ...
• Even Google’s Cloud Platform: our VMs run in containers!

Slide 4

Slide 4 text

Almost everything at Google runs in containers:
• Gmail, Web Search, Maps, ...
• MapReduce, batch, ...
• GFS, Colossus, ...
• Even Google’s Cloud Platform: our VMs run in containers!
We launch billions of containers every week

Slide 5

Slide 5 text

For the rest of this presentation, I am going to assume that you are bought in to containers.
Images by Connie Zhou

Slide 6

Slide 6 text

How do I deploy containers?

Slide 7

Slide 7 text

Deployment options
Understanding how containers work is not the same as using them in production (duh!)
Different organizations will have very different needs (duh!)
• Not everyone needs to operate “at scale”
There are many ways to run and manage containers (duh!)

Slide 8

Slide 8 text

Humans
Never underestimate the value of manual solutions
SSH into machines and run docker
Pro: simple, available everywhere, no special tools needed, easily understood
Con: not automated, not reproducible (humans make mistakes), doesn’t scale, doesn’t self-heal

Slide 9

Slide 9 text

Scripts
A very common first step into containers
Puppet, Chef, Ansible, Salt, or just bespoke scripts
Pro: integrates with existing environments, easily understood results, reproducible
Con: manual scheduling, doesn’t self-heal, doesn’t scale, generally non-portable

Slide 10

Slide 10 text

Orchestration systems
Automatically match containers to available machines
Mesos, Kubernetes, Docker Swarm, Nomad, or even home-grown systems
Pro: automated, reproducible, self-healing, scalable, generally portable
Con: some overhead, requires new tooling and training, more complex results

Slide 11

Slide 11 text

What’s in an orchestration system?

Slide 12

Slide 12 text

Basic features
Scheduling: match containers to machines
• by resource needs (CPU, memory)
• by affinity requirements (put X near Y)
• by labels (put X on a “test” machine)
Replication: run N copies
Handle machine failures
Discovery: find peers and services in other containers
Inspection: tell me what is happening
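To make "match containers to machines" concrete, here is a minimal first-fit sketch that filters by free CPU, free memory, and required labels. It is an illustration only, not how any particular orchestrator schedules: the Machine and Container classes, the field names, and the first-fit policy are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Machine:
    name: str
    cpu_free: float                     # free CPU cores (hypothetical unit)
    mem_free: int                       # free memory in MiB (hypothetical unit)
    labels: dict = field(default_factory=dict)


@dataclass
class Container:
    name: str
    cpu: float
    mem: int
    required_labels: dict = field(default_factory=dict)


def fits(machine: Machine, container: Container) -> bool:
    """True if the machine has both the resources and the labels requested."""
    has_resources = (machine.cpu_free >= container.cpu
                     and machine.mem_free >= container.mem)
    has_labels = all(machine.labels.get(key) == value
                     for key, value in container.required_labels.items())
    return has_resources and has_labels


def schedule(container: Container, machines: List[Machine]) -> Optional[str]:
    """First-fit placement: reserve resources on the first machine that fits."""
    for machine in machines:
        if fits(machine, container):
            machine.cpu_free -= container.cpu
            machine.mem_free -= container.mem
            return machine.name
    return None  # nothing fits; a real system would keep the container pending


machines = [
    Machine("m1", cpu_free=1.0, mem_free=1024, labels={"env": "prod"}),
    Machine("m2", cpu_free=4.0, mem_free=8192, labels={"env": "test"}),
]
web = Container("web", cpu=0.5, mem=512, required_labels={"env": "test"})
placed_on = schedule(web, machines)
print(placed_on)
```

Here m1 has enough CPU and memory but the wrong label, so the container lands on the "test" machine. Affinity ("put X near Y") is left out to keep the sketch short.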

Slide 13

Slide 13 text

Advanced features
Built-in load-balancers
Automated updates
Cluster auto-scaling: better utilization
App auto-scaling: handle spikes and troughs
Provisioning storage
Re-packing machines
Late-binding configuration

Slide 14

Slide 14 text

Making better use of human time
These are things that were historically managed by humans at human speed

Slide 15

Slide 15 text

Making better use of human time
These are things that were historically managed by humans at human speed
Who carries an on-call duty pager, or has a dev/ops team that does?

Slide 16

Slide 16 text

Making better use of human time
These are things that were historically managed by humans at human speed
Who carries an on-call duty pager, or has a dev/ops team that does?
When does that pager usually go off?
• In my experience, usually 3am

Slide 17

Slide 17 text

Making better use of human time
These are things that were historically managed by humans at human speed
Who carries an on-call duty pager, or has a dev/ops team that does?
When does that pager usually go off?
• In my experience, usually 3am
Orchestration can handle a lot of situations for you automatically - turn 3am pages into advisory emails

Slide 18

Slide 18 text

What options do I have?

Slide 19

Slide 19 text

Orchestrators, at a glance
Mesos:
• Most mature (predates Docker)
• Two-level system
Docker Swarm Mode:
• Built into Docker
• Easy to set up
Nomad:
• HashiCorp
• Youngest of the bunch
Kubernetes:
• Derives from Google’s Borg & Omega
• Rapidly growing adoption

Slide 20

Slide 20 text

Mesos, at a glance
Started as a UC Berkeley research project
Now owned by the Apache Software Foundation
Commercial support by Mesosphere (DC/OS)
Two-level scheduler - core and “frameworks” (e.g. Spark, Cassandra, Marathon)
Used by big tech shops (Twitter, Uber, Apple, Netflix)
Scales very well (10k+ machines)
Complex to set up and administer

Slide 21

Slide 21 text

Docker Swarm Mode, at a glance
Built into Docker
Focuses on ease of use & easy setup
Very similar to Kubernetes in many ways
• First version “Docker Swarm” was totally different
Less mature than Mesos or Kubernetes
Primarily developed by Docker, Inc.
Scales to thousands of machines

Slide 22

Slide 22 text

Nomad, at a glance
From HashiCorp
Integrates with Consul and Vault (both very well regarded)
Designed to be simple
Least mature of the bunch
Scales to thousands of machines
Not much adoption, yet

Slide 23

Slide 23 text

Kubernetes, at a glance
Derives ideas from Google’s Borg & Omega
Owned by the Cloud Native Computing Foundation
Designed to be composable
More complex than some others
Scales to thousands of machines
Very rapid adoption
Large community - thousands of developers

Slide 24

Slide 24 text

Diving deeper into Kubernetes

Slide 25

Slide 25 text

Kubernetes
Greek for “Helmsman”; also the root of the words “governor” and “cybernetic”
• Manages container clusters
• Inspired and informed by Google’s experiences and internal systems
• Supports multiple cloud and bare-metal environments
• Supports multiple container runtimes
• 100% Open source, written in Go
Manage applications, not machines

Slide 26

Slide 26 text

The 10000 foot view
[Diagram labels: users; UI, CLI, API; master: apiserver, scheduler, controllers, etcd; nodes: kubelet (x3)]

Slide 27

Slide 27 text

All you really care about
[Diagram labels: UI, API, Container Cluster]

Slide 28

Slide 28 text

Running a container

Slide 29

Slide 29 text

[Diagram labels: apiserver]

Slide 30

Slide 30 text

[Diagram labels: apiserver, etcd]

Slide 31

Slide 31 text

[Diagram labels: apiserver, etcd]

Slide 32

Slide 32 text

[Diagram labels: apiserver, scheduler, controller manager, etcd]

Slide 33

Slide 33 text

[Diagram labels: apiserver, scheduler, controller manager, etcd]

Slide 34

Slide 34 text

[Diagram labels: apiserver, scheduler, controller manager, etcd, kubelet]

Slide 35

Slide 35 text

[Diagram labels: apiserver, scheduler, controller manager, etcd, kubelet, docker]

Slide 36

Slide 36 text

[Diagram labels: apiserver, scheduler, controller manager, etcd, kubelet, docker]

Slide 37

Slide 37 text

[Diagram labels: apiserver, scheduler, controller manager, etcd, kubelet, docker, cloud provider]

Slide 38

Slide 38 text

[Diagram labels: apiserver, scheduler, controller manager, etcd, kubelet, docker, cloud provider]

Slide 39

Slide 39 text

Co-scheduling

Slide 40

Slide 40 text

Highly-coupled containers
[Diagram labels: File Puller, Web Server, ?]

Slide 41

Slide 41 text

Highly-coupled containers
[Diagram labels: File Puller, Web Server]

Slide 42

Slide 42 text

Highly-coupled containers
[Diagram labels: File Puller, Web Server, REJECTED]

Slide 43

Slide 43 text

Pods
Small group of containers & volumes
Tightly coupled
The atom of scheduling & placement
Shared namespaces
• share IP address & localhost
• share IPC, etc.
Managed lifecycle
• bound to a node, restart in place
• can die, cannot be reborn with same ID
[Diagram labels: Consumers, Content Manager, File Puller, Web Server, Volume, Pod]

Slide 44

Slide 44 text

Pods
Examples:
• data syncer (e.g. from git) & server
• log producer & log saver
• monitoring adapter
• policy-enforcing (e.g. auth) proxy
• cache
[Diagram labels: Consumers, Content Manager, File Puller, Web Server, Volume, Pod]

Slide 45

Slide 45 text

Finding things

Slide 46

Slide 46 text

Physical view

Slide 47

Slide 47 text

Physical view

Slide 48

Slide 48 text

Physical view

Slide 49

Slide 49 text

Logical view

Slide 50

Slide 50 text

Labels and selectors
Arbitrary metadata
Attached to any API object
Generally represent identity
Queryable by selectors
• think SQL ‘select ... where ...’
The only grouping mechanism
• pods in a ReplicaSet
• pods in a Service
• capabilities of a node (constraints)

Slide 51

Slide 51 text

Selectors
• App: MyApp, Phase: prod, Role: FE
• App: MyApp, Phase: test, Role: FE
• App: MyApp, Phase: prod, Role: BE
• App: MyApp, Phase: test, Role: BE

Slide 52

Slide 52 text

Selectors
• App: MyApp, Phase: prod, Role: FE
• App: MyApp, Phase: test, Role: FE
• App: MyApp, Phase: prod, Role: BE
• App: MyApp, Phase: test, Role: BE
Selector: App = MyApp

Slide 53

Slide 53 text

Selectors
• App: MyApp, Phase: prod, Role: FE
• App: MyApp, Phase: test, Role: FE
• App: MyApp, Phase: prod, Role: BE
• App: MyApp, Phase: test, Role: BE
Selector: App = MyApp, Role = FE

Slide 54

Slide 54 text

Selectors
• App: MyApp, Phase: prod, Role: FE
• App: MyApp, Phase: test, Role: FE
• App: MyApp, Phase: prod, Role: BE
• App: MyApp, Phase: test, Role: BE
Selector: App = MyApp, Role = BE

Slide 55

Slide 55 text

Selectors
• App: MyApp, Phase: prod, Role: FE
• App: MyApp, Phase: test, Role: FE
• App: MyApp, Phase: prod, Role: BE
• App: MyApp, Phase: test, Role: BE
Selector: App = MyApp, Phase = prod

Slide 56

Slide 56 text

Selectors
• App: MyApp, Phase: prod, Role: FE
• App: MyApp, Phase: test, Role: FE
• App: MyApp, Phase: prod, Role: BE
• App: MyApp, Phase: test, Role: BE
Selector: App = MyApp, Phase = test
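The selector slides can be read as a simple filter: an object matches when every key in the selector has the same value in the object's labels. The sketch below reproduces the four labeled pods from the slides. The pod names and the helper functions are assumptions made for this example, and only equality matching is shown.

```python
# The four pods from the slides; the names pod-a .. pod-d are hypothetical.
pods = [
    {"name": "pod-a", "labels": {"App": "MyApp", "Phase": "prod", "Role": "FE"}},
    {"name": "pod-b", "labels": {"App": "MyApp", "Phase": "test", "Role": "FE"}},
    {"name": "pod-c", "labels": {"App": "MyApp", "Phase": "prod", "Role": "BE"}},
    {"name": "pod-d", "labels": {"App": "MyApp", "Phase": "test", "Role": "BE"}},
]


def matches(labels, selector):
    """Equality-based match: every selector key must be present with that value."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects, selector):
    """Return the names of all objects whose labels satisfy the selector."""
    return [obj["name"] for obj in objects if matches(obj["labels"], selector)]


everything = select(pods, {"App": "MyApp"})
frontends = select(pods, {"App": "MyApp", "Role": "FE"})
production = select(pods, {"App": "MyApp", "Phase": "prod"})
print(everything, frontends, production)
```

Adding a key to the selector narrows the set, much like adding a condition to the SQL "where" clause mentioned earlier.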

Slide 57

Slide 57 text

Logical view

Slide 58

Slide 58 text

Logical view

Slide 59

Slide 59 text

Logical view

Slide 60

Slide 60 text

Logical view

Slide 61

Slide 61 text

Discovery

Slide 62

Slide 62 text

Services
A group of pods that work together
• grouped by a selector
Defines access policy
• “load balanced” or “headless”
Can have a stable virtual IP and port
• also a DNS name!
VIP is managed by kube-proxy
• watches all services
• updates iptables when backends change
• default implementation - can be replaced!
Hides complexity
[Diagram labels: Client, Virtual IP]
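As a rough mental model of a "load balanced" Service, the sketch below keeps a backend list derived from a selector and picks one backend per connection. It is a toy, not kube-proxy: the Service class, its method names, the pod IPs, and the random-choice policy are all assumptions made for this example.

```python
import random


class Service:
    """A stable name in front of whichever pods currently match a selector."""

    def __init__(self, name, selector):
        self.name = name
        self.selector = selector
        self.backends = []

    def update_backends(self, pods):
        """Re-resolve the selector; stands in for reacting to a backend change."""
        self.backends = [
            pod["ip"]
            for pod in pods
            if all(pod["labels"].get(k) == v for k, v in self.selector.items())
        ]

    def route(self):
        """Pick one backend for a new connection (random choice, for simplicity)."""
        return random.choice(self.backends)


pods = [
    {"ip": "10.0.0.1", "labels": {"App": "MyApp", "Role": "FE"}},
    {"ip": "10.0.0.2", "labels": {"App": "MyApp", "Role": "FE"}},
    {"ip": "10.0.0.3", "labels": {"App": "MyApp", "Role": "BE"}},
]
frontend = Service("frontend", {"App": "MyApp", "Role": "FE"})
frontend.update_backends(pods)
chosen = frontend.route()
print(chosen)
```

Clients only ever address the Service. Pods can come and go behind it, and the backend list is refreshed when they do.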

Slide 63

Slide 63 text

External services
Service VIPs are only available inside the cluster
Need to receive traffic from “the outside world”
Service “type”
• NodePort: expose on a port on every node
• LoadBalancer: provision a cloud load-balancer
DIY load-balancer solutions
• socat (for nodePort remapping)
• haproxy
• nginx

Slide 64

Slide 64 text

Replication

Slide 65

Slide 65 text

ReplicaSets
Declaration of intent: run N copies of a pod
Simple control loop
One job: ensure N copies
• too few? start some
• too many? kill some
Layered on top of Pods
[Diagram: a ReplicaSet (name = “my-rc”, selector = {“App”: “MyApp”}, template = { ... }, replicas = 4) talking to the API Server: “How many?” “3” “Start 1 more” “OK” “How many?” “4”]
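The "ensure N copies" control loop can be sketched as a single reconcile step that compares the desired count with what is running. This is an illustrative toy rather than the actual controller: the reconcile function, the generated pod names, and the choice of which pod to kill are assumptions made for this example.

```python
import itertools

_pod_ids = itertools.count(1)  # source of hypothetical names for new pods


def reconcile(desired, running):
    """One pass of the loop: return a pod list with exactly `desired` entries."""
    running = list(running)
    while len(running) < desired:
        running.append(f"pod-{next(_pod_ids)}")  # too few? start some
    while len(running) > desired:
        running.pop()                            # too many? kill some
    return running


# 3 running, 4 wanted: start 1 more (the exchange shown in the diagram).
pods = reconcile(4, ["pod-a", "pod-b", "pod-c"])

# Two pods are lost to a machine failure; the next pass replaces them.
after_failure = reconcile(4, pods[:2])

# The user lowers replicas to 2; the next pass removes the surplus.
scaled_down = reconcile(2, after_failure)
print(len(pods), len(after_failure), len(scaled_down))
```

The loop never records how it got into the current state. It only compares intent with reality, which is why the same code handles scale-up, failure recovery, and scale-down.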

Slide 66

Slide 66 text

Deployments
Manages replica changes for you
• stable object name
• simply edit the object
• configurable server-side rolling-updates
Can have multiple updates in flight
Layered on top of ReplicaSets

Slide 67

Slide 67 text

Updates

Slide 68

Slide 68 text

Rolling Update
ReplicaSet
- name: my-app-v1
- replicas: 3
- selector:
  - app: MyApp
  - version: v1
Service
- app: MyApp

Slide 69

Slide 69 text

Rolling Update
ReplicaSet
- name: my-app-v1
- replicas: 3
- selector:
  - app: MyApp
  - version: v1
ReplicaSet
- name: my-app-v2
- replicas: 0
- selector:
  - app: MyApp
  - version: v2
Service
- app: MyApp

Slide 70

Slide 70 text

Rolling Update
ReplicaSet
- name: my-app-v1
- replicas: 3
- selector:
  - app: MyApp
  - version: v1
ReplicaSet
- name: my-app-v2
- replicas: 1
- selector:
  - app: MyApp
  - version: v2
Service
- app: MyApp

Slide 71

Slide 71 text

Rolling Update
ReplicaSet
- name: my-app-v1
- replicas: 2
- selector:
  - app: MyApp
  - version: v1
ReplicaSet
- name: my-app-v2
- replicas: 1
- selector:
  - app: MyApp
  - version: v2
Service
- app: MyApp

Slide 72

Slide 72 text

Rolling Update
ReplicaSet
- name: my-app-v1
- replicas: 2
- selector:
  - app: MyApp
  - version: v1
ReplicaSet
- name: my-app-v2
- replicas: 2
- selector:
  - app: MyApp
  - version: v2
Service
- app: MyApp

Slide 73

Slide 73 text

Rolling Update
ReplicaSet
- name: my-app-v1
- replicas: 1
- selector:
  - app: MyApp
  - version: v1
ReplicaSet
- name: my-app-v2
- replicas: 2
- selector:
  - app: MyApp
  - version: v2
Service
- app: MyApp

Slide 74

Slide 74 text

Rolling Update
ReplicaSet
- name: my-app-v1
- replicas: 1
- selector:
  - app: MyApp
  - version: v1
ReplicaSet
- name: my-app-v2
- replicas: 3
- selector:
  - app: MyApp
  - version: v2
Service
- app: MyApp

Slide 75

Slide 75 text

Rolling Update
ReplicaSet
- name: my-app-v1
- replicas: 0
- selector:
  - app: MyApp
  - version: v1
ReplicaSet
- name: my-app-v2
- replicas: 3
- selector:
  - app: MyApp
  - version: v2
Service
- app: MyApp

Slide 76

Slide 76 text

Rolling Update
ReplicaSet
- name: my-app-v2
- replicas: 3
- selector:
  - app: MyApp
  - version: v2
Service
- app: MyApp
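The sequence of replica counts in the Rolling Update slides can be reproduced with a small generator: bring up one new-version replica, then retire one old-version replica, and repeat. This is only a sketch of the stepping shown on the slides. The function name and the one-at-a-time policy are assumptions, and real Deployments have configurable surge and unavailability limits that are not modeled here.

```python
def rolling_update(old, target):
    """Yield (old_replicas, new_replicas) after each step of the update."""
    new = 0
    yield old, new                # both ReplicaSets exist; the new one is empty
    while old > 0 or new < target:
        if new < target:
            new += 1              # bring up one replica of the new version
            yield old, new
        if old > 0:
            old -= 1              # then retire one replica of the old version
            yield old, new


steps = list(rolling_update(old=3, target=3))
for old_count, new_count in steps:
    print(f"my-app-v1: {old_count}  my-app-v2: {new_count}")
```

Because the Service selects only on app: MyApp, it sends traffic to both versions throughout, and total capacity never drops below the original three replicas in this sketch.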

Slide 77

Slide 77 text

Configuration and secrets

Slide 78

Slide 78 text

ConfigMaps & Secrets
Goal: manage app configuration & secrets
• ...without making overly-brittle container images
12-factor says config comes from the environment
• Kubernetes is the environment
Manage configs via the Kubernetes API
Inject them as virtual volumes into your Pods
• late-binding, live-updated (atomic)
• also available as env vars
[Diagram labels: node, API, Pod, Config Map]
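From the application's side, late-bound configuration usually means reading a value at startup from an environment variable or from a file in a mounted volume. The sketch below shows one way an app might do that. The load_setting helper, the /etc/config mount path, the LOG_LEVEL key, and the "env var wins over file" precedence are all assumptions made for this example.

```python
import os
from pathlib import Path


def load_setting(name, mount_dir="/etc/config", default=None):
    """Look up a setting: environment variable first, then a mounted file."""
    if name in os.environ:
        return os.environ[name]
    path = Path(mount_dir) / name       # one file per key in the mounted volume
    if path.is_file():
        return path.read_text().strip()
    return default


# Falls back to "info" when neither the env var nor the file exists.
log_level = load_setting("LOG_LEVEL", default="info")
print(log_level)
```

The container image stays the same across environments. Only the injected values differ.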

Slide 79

Slide 79 text

Auto-scaling

Slide 80

Slide 80 text

HorizontalPodAutoscalers
Automatically scale number of pods as needed
• based on CPU utilization (for now)
• custom metrics coming
Efficiency now, capacity when you need it
Operates within user-defined min/max bounds
Set it and forget it
[Diagram labels: ..., Stats]
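Scaling on CPU utilization within min/max bounds can be sketched as a proportional rule followed by a clamp. The rule used here (scale the replica count by observed over target utilization, rounded up) is an assumption modeled on the documented autoscaler behavior. The function and parameter names are hypothetical, and refinements such as tolerances and cooldowns are omitted.

```python
import math


def desired_replicas(current, cpu_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Proportional scaling on CPU utilization, clamped to user-defined bounds."""
    wanted = math.ceil(current * cpu_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, wanted))


# 4 pods at 90% CPU with a 60% target: grow.
scale_up = desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10)

# 4 pods at 15% CPU: shrink, but never below the minimum.
scale_down = desired_replicas(4, 15, 60, min_replicas=2, max_replicas=10)

# 8 pods at 120% CPU: grow, but never above the maximum.
capped = desired_replicas(8, 120, 60, min_replicas=2, max_replicas=10)
print(scale_up, scale_down, capped)
```

The min/max clamp is what makes "set it and forget it" safe: a runaway metric cannot scale the app to zero or without limit.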

Slide 81

Slide 81 text

Cluster Autoscaler
Automatically scale number of nodes as needed
• based on scheduler backlog & idleness
Efficiency now, capacity when you need it
Operates within user-defined min/max bounds
Set it and forget it
[Diagram labels: ..., Sched]

Slide 82

Slide 82 text

Storage

Slide 83

Slide 83 text

PersistentVolumes
Manage storage with its own lifecycle
Driver plugins - more than 20 supported
• Google Persistent Disk
• Amazon EBS
• Azure Volumes
• Gluster
• Ceph
• iSCSI
• Cinder
• ScaleIO
• Portworx
• ...
Dynamic provisioning - allocate on-demand
Local disk volumes in development
Containers are not just for stateless apps!

Slide 84

Slide 84 text

About the Kubernetes Project

Slide 85

Slide 85 text

Community
• 1500+ Contributors
• 400+ Person-Years of Effort
• Top 0.001% of all GitHub Projects
• 4000+ External Projects Based on K8s
[Diagram labels: Contributors, Users]

Slide 86

Slide 86 text

Kubernetes is Open
• open community
• open design
• open source
• open to ideas
https://kubernetes.io
Code: github.com/kubernetes/kubernetes
Chat: slack.k8s.io
Twitter: @kubernetesio