cluster management, cognitive shifts, & kubernetes

How did we get into the cluster management game? This talk briefly covers some of the lessons we have learned over the years from the system known internally as Borg. Then we step through an application running on kubernetes and see how those lessons show up in what kubernetes does and how it's designed.

juliaferraioli

July 07, 2015

Transcript

  10. 16.

    @juliaferraioli developer view

    job hello_world = {
      runtime = { cell = 'ic' }               // Cell (cluster) to run in
      binary = '.../hello_world_webserver'    // Program to run
      args = { port = '%port%' }              // Command line parameters
      requirements = {                        // Resource requirements
        ram = 100M
        disk = 100M
        cpu = 0.1
      }
      replicas = 5                            // Number of tasks
    }
  11. 17.

    @juliaferraioli developer view

    job hello_world = {
      runtime = { cell = 'ic' }               // Cell (cluster) to run in
      binary = '.../hello_world_webserver'    // Program to run
      args = { port = '%port%' }              // Command line parameters
      requirements = {                        // Resource requirements
        ram = 100M
        disk = 100M
        cpu = 0.1
      }
      replicas = 5                            // Number of tasks
    }

    Overlay on the slide: 10000
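To make the arithmetic behind the replica count concrete, here is a minimal Python sketch that models the job spec above and totals what the whole job asks the cell for. The `JobSpec` class and its field names are hypothetical stand-ins, not Borg's configuration API. `100M` is assumed to mean 100 MiB, and the `10000` on the last build is read here as a new replica count, which is an interpretation of the slide.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class JobSpec:
    """Hypothetical model of the hello_world job (not a Borg API)."""
    ram_mib: int    # per-task RAM, assuming 100M means 100 MiB
    disk_mib: int   # per-task disk, same assumption
    cpu: float      # per-task CPU, in cores
    replicas: int   # number of tasks

    def total_request(self) -> dict:
        """Multiply the per-task requirements by the number of tasks."""
        return {
            "ram_mib": self.ram_mib * self.replicas,
            "disk_mib": self.disk_mib * self.replicas,
            "cpu": self.cpu * self.replicas,
        }


hello_world = JobSpec(ram_mib=100, disk_mib=100, cpu=0.1, replicas=5)
# Scaling up changes one field; the rest of the declaration stays the same.
scaled = replace(hello_world, replicas=10000)

print(hello_world.total_request())
print(scaled.total_request())
```

The point of the sketch is that the developer states per-task needs and a count, and the total footprint follows from those two numbers alone.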
  12. 18.

    @juliaferraioli Hello world! (repeated 8 times across the slide)

    Image by Connie Zhou
  13. 19.

    @juliaferraioli Hello world! (repeated many times, filling the slide)

    Image by Connie Zhou
  14. 24.

    @juliaferraioli efficiency: advanced bin-packing algorithms

    Figure: experimental placement of production VM workload, July 2014. Labels: available resources, one machine.
  15. 25.

    @juliaferraioli efficiency: advanced bin-packing algorithms

    Figure: experimental placement of production VM workload, July 2014. Labels: stranded resources, available resources, one machine.
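As a rough illustration of how packing leaves stranded resources behind, the sketch below places tasks with a plain first-fit heuristic. This is not Borg's scheduler, which the slide only describes as "advanced". The `Machine` class, the task shapes, and the working definition of "stranded" (free capacity on a machine that cannot fit a given task shape) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical machine with two resource dimensions."""
    cpu: int
    ram: int
    tasks: list = field(default_factory=list)

    def fits(self, cpu, ram):
        return cpu <= self.cpu and ram <= self.ram

    def place(self, name, cpu, ram):
        self.cpu -= cpu
        self.ram -= ram
        self.tasks.append(name)


def first_fit(tasks, machines):
    """Place each (name, cpu, ram) task on the first machine with room."""
    unplaced = []
    for name, cpu, ram in tasks:
        for machine in machines:
            if machine.fits(cpu, ram):
                machine.place(name, cpu, ram)
                break
        else:
            unplaced.append(name)
    return unplaced


def stranded(machines, cpu, ram):
    """Free (cpu, ram) on machines that cannot fit a task of this shape."""
    return [
        (m.cpu, m.ram) for m in machines
        if (m.cpu > 0 or m.ram > 0) and not m.fits(cpu, ram)
    ]


machines = [Machine(cpu=4, ram=8), Machine(cpu=4, ram=8)]
tasks = [("a", 3, 2), ("b", 3, 2), ("c", 2, 4)]

unplaced = first_fit(tasks, machines)
leftover = stranded(machines, cpu=2, ram=4)
print(unplaced, leftover)
```

In this run the two machines together still have 2 CPUs and 12 units of RAM free, enough for task `c` in aggregate, yet no single machine can host it. That leftover capacity is what the sketch counts as stranded.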
  16. 26.

    @juliaferraioli observations:

    1. resiliency is achieved only by ruthless attention to detail
       a. ubiquitous software fault tolerance
       b. persistent, declarative specs
    2. we get efficiency by:
       a. sharing resources
       b. reclaiming unused allocations
    3. containers make users more productive

    Borg paper: http://goo.gl/1C4nuo
    Images by Connie Zhou
  17. 27.

    @juliaferraioli not so long ago: virtual machines

    Diagram: four stacks, each labeled app, libs, kernel.
  18. 35.

    @juliaferraioli kubernetes cluster

    Pod: A small group of tightly coupled containers
    Replication Controller: A loop that drives current state towards desired state
    Service: A set of running pods that work together
    Labels: Arbitrary metadata to organize components
  19. 36.

    @juliaferraioli labels

    behavior
    • metadata with semantic meaning
    • membership identifier
    • the only grouping mechanism

    benefits
    ➔ allow for intent of many users (e.g. dashboards)
    ➔ build higher level systems
    ➔ queryable by selectors

    Diagram labels: a frontend Pod and Objects labeled type = FE; a Dashboard with show: type = FE.
  20. 37.

    @juliaferraioli labels

    behavior
    • metadata with semantic meaning
    • membership identifier
    • the only grouping mechanism

    benefits
    ➔ allow for intent of many users (e.g. dashboards)
    ➔ build higher level systems
    ➔ queryable by selectors

    Diagram labels: frontend Pods and Objects labeled type = FE, some also labeled version = v2; one Dashboard with show: type = FE and another with show: version = v2.
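The "queryable by selectors" idea can be sketched in a few lines of Python. This only models equality-based matching: a selector is a set of key/value pairs, and an object matches when its labels contain all of them. The object names and the `type = BE` label are made up for the example, and the functions are illustrative helpers, not part of any Kubernetes client library.

```python
def matches(labels, selector):
    """True when every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects, selector):
    """Names of the objects whose labels satisfy the selector."""
    return [name for name, labels in objects if matches(labels, selector)]


# Hypothetical objects; only type = FE and version = v2 come from the slide.
objects = [
    ("frontend-1", {"type": "FE", "version": "v2"}),
    ("frontend-2", {"type": "FE"}),
    ("backend-1", {"type": "BE", "version": "v2"}),
]

by_type = select(objects, {"type": "FE"})
by_version = select(objects, {"version": "v2"})
by_both = select(objects, {"type": "FE", "version": "v2"})
print(by_type, by_version, by_both)
```

Two dashboards can slice the same objects in different ways without the objects knowing about either dashboard, which is the sense in which labels "allow for intent of many users".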
  21. 38.

    @juliaferraioli kubernetes cluster

    Replication Controller: A loop that drives current state towards desired state
    Service: A set of running pods that work together
    Labels: Arbitrary metadata to organize components
    Pod: A small group of tightly coupled containers
  22. 39.

    @juliaferraioli pods

    small group of containers & volumes
    containers are tightly coupled
    the atom of cluster scheduling
    shared namespace
    • Shared network IP and port namespace
    ephemeral
    • can die and be replaced

    Diagram labels: Pod, Content Manager, Site generator, Web Server, Volume, Consumers.
  23. 40.

    @juliaferraioli pods

    template:
      metadata:
        labels:
          name: frontend
      spec:
        containers:
        - name: php-redis
          image: kubernetes/example-guestbook-php-redis:v2
          ports:
          - containerPort: 80
  24. 41.

    @juliaferraioli kubernetes cluster

    Service: A set of running pods that work together
    Labels: Arbitrary metadata to organize components
    Pod: A small group of tightly coupled containers
    Replication Controller: A loop that drives current state towards desired state
  25. 42.

    @juliaferraioli replication controllers

    Behavior
    • keeps pods running
    • gives direct control of pod #s
    • grouped by label selector

    Benefits
    ➔ recreates pods, maintains desired state
    ➔ fine-grained control for scaling
    ➔ standard grouping semantics

    Diagram labels: two Replication Controllers (#pods = 1, version = v2 and #pods = 2, version = v1); frontend Pods labeled version = v1 or version = v2; show: version = v2.
  26. 43.

    @juliaferraioli replication controllers

    canonical example of control loops
    have one job: ensure N copies of a pod
    replicated pods are fungible

    Replication Controller
    - Name = “frontend”
    - Selector = {“Name”: “frontend”}
    - PodTemplate = { ... }
    - NumReplicas = 3

    Diagram (Replication Controller and API Server): How many? 2. Start 1 more. OK. How many? 3.
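The "How many? / Start 1 more" exchange above can be sketched as a single reconcile pass. This is a toy model under stated assumptions: `FakeApiServer` is a hypothetical in-memory stand-in rather than the real API server or any client library, pods are reduced to their label dicts, and the loop is driven by hand instead of running continuously.

```python
class FakeApiServer:
    """In-memory stand-in for the API server (hypothetical, for illustration)."""

    def __init__(self):
        self.pods = []  # each pod is reduced to its label dict

    @staticmethod
    def _matches(labels, selector):
        return all(labels.get(k) == v for k, v in selector.items())

    def count(self, selector):
        """Answer "How many?" for pods whose labels match the selector."""
        return sum(1 for labels in self.pods if self._matches(labels, selector))

    def start(self, labels):
        """Handle "Start 1 more" by recording a new pod from the template."""
        self.pods.append(dict(labels))

    def stop_one(self, selector):
        """Remove one matching pod; replicas are fungible, so any will do."""
        for index, labels in enumerate(self.pods):
            if self._matches(labels, selector):
                del self.pods[index]
                return


def reconcile(api, selector, template_labels, replicas):
    """One pass of the loop: observe the current count, then close the gap."""
    running = api.count(selector)
    for _ in range(replicas - running):  # too few: start the difference
        api.start(template_labels)
    for _ in range(running - replicas):  # too many: stop the difference
        api.stop_one(selector)
    return api.count(selector)


api = FakeApiServer()
frontend = {"name": "frontend"}
api.start(frontend)
api.start(frontend)  # 2 pods running, as in the diagram

after_first_pass = reconcile(api, frontend, frontend, replicas=3)
api.stop_one(frontend)  # simulate a pod dying
after_second_pass = reconcile(api, frontend, frontend, replicas=3)
print(after_first_pass, after_second_pass)
```

The controller never tracks individual pods. It only compares a count against the desired number, which is why a dead pod and a scale-up look the same to it.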
  27. 44.

    @juliaferraioli replication controllers

    kind: ReplicationController
    metadata:
      name: frontend
      labels:
        name: frontend
    spec:
      replicas: 3
      selector:
        name: frontend
      template: <snip!>

    $ kubectl create -f frontend-controller.yaml
  28. 45.

    @juliaferraioli kubernetes cluster

    Labels: Arbitrary metadata to organize components
    Pod: A small group of tightly coupled containers
    Replication Controller: A loop that drives current state towards desired state
    Service: A set of running pods that work together
  29. 46.

    @juliaferraioli services

    a group of pods that act as one == Service
    defines access policy
    gets a stable virtual IP and port
    VIP is captured by kube-proxy

    Diagram labels: Client, Portal (VIP), three Pods with four Containers each.
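A small Python sketch can show why a stable virtual IP matters when pods are ephemeral. It models only the backend selection step, with round-robin chosen as an assumption. The real kube-proxy intercepts traffic at the network level and is not reproduced here. All addresses, the `redis` label, and the function names are made up for the example.

```python
from itertools import cycle


def endpoints(pods, selector):
    """IPs of pods whose labels contain every key/value in the selector."""
    return [
        pod["ip"] for pod in pods
        if all(pod["labels"].get(k) == v for k, v in selector.items())
    ]


def route(pods, selector, requests):
    """Pick a backend for each request sent to the VIP, round-robin."""
    backends = endpoints(pods, selector)
    if not backends:
        return []  # nothing behind the service right now
    chooser = cycle(backends)
    return [next(chooser) for _ in range(requests)]


# All names and addresses below are made up for illustration.
VIP = ("10.0.0.100", 80)  # the stable address clients use
pods = [
    {"ip": "10.1.0.1", "labels": {"name": "frontend"}},
    {"ip": "10.1.0.2", "labels": {"name": "frontend"}},
    {"ip": "10.1.0.3", "labels": {"name": "redis"}},
]

before = route(pods, {"name": "frontend"}, 3)

# A pod dies and is replaced with a new IP; clients still only know the VIP.
pods[0] = {"ip": "10.1.0.9", "labels": {"name": "frontend"}}
after = route(pods, {"name": "frontend"}, 3)
print(VIP, before, after)
```

The set of backends changes between the two calls while the address the client dials does not, which is the sense in which a group of pods "acts as one".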
  30. 47.

    @juliaferraioli services

    Diagram labels: Service with label selectors version = 1.0, type = Frontend; Service with label selectors version = v1, type = FE; VIP; Replication Controller (#pods = 2, show: version = v2); frontend Pods labeled version = v1, type = FE.
  31. 48.

    @juliaferraioli services

    kind: Service
    metadata:
      name: frontend
      labels:
        name: frontend
    spec:
      type: LoadBalancer
      ports:
      - port: 80
      selector:
        name: frontend

    $ kubectl create -f frontend-service.yaml
  32. 50.

    @juliaferraioli visualizing kubernetes

    Diagram labels: Master (APIs, REST (pods, services, controllers), AuthN, Scheduling, Scheduler, Replication Controller); three nodes (labeled Node3, Node3, Node1 on the slide), each with a Kubelet, a Proxy, and two Pods of four Containers.

    $ kubectl proxy --www=k8s-visualizer/
  33. 53.

    @juliaferraioli where we are today

    open sourced in June, 2014
    Google launched Google Container Engine (GKE)
    • https://cloud.google.com/container-engine/
    roadmap:
    • https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/roadmap.md
    http://www.kuberneteslaunch.com/
  34. 55.

    @juliaferraioli thanks to...

    Brian Dorsey - Developer Advocate, Google
    Aja Hammerly - Developer Advocate, Google
    Amy Unruh - Developer Programs Engineer, Google
    Mandy Waite - Developer Advocate, Google
    John Wilkes - Principal Engineer, Google