Kubernetes at Scale @ DevOps Meetup Singapore

Ian Lewis
February 21, 2017

Transcript

  1. Kubernetes & Google Container Engine Overview DevOps Meetup Singapore

  2. Ian Lewis

    Developer Advocate - Google Cloud Platform
    Tokyo, Japan
    @IanMLewis
  3. For the last 15 years Google has been building the

    world’s fastest, most powerful infrastructure.
  4. Google’s World Spanning Backbone Network

  5. None
  6. Google-Grade Networking

    33 Countries
    70 Edge Locations
    The most of any Cloud Provider
  7. Monitoring Mobile Development Compute Network Big Data Storage

  8. 2012 2015 MapReduce Spanner 2003 2006 2010 2011 GFS Borg

    Colossus Dremel Bigtable Chubby 2004
  9. Google has been running all our services in Containers for 10 years.

    We start over 2 billion containers every week.
    Images by Connie Zhou
    Copyright 2015 Google Inc
  10. http://research.google.com/pubs/pub43438.html

  11. Image by Connie Zhou

  12. Developer View

    job hello_world = {
      runtime = { cell = 'ic' }             // Cell (cluster) to run in
      binary = '.../hello_world_webserver'  // Program to run
      args = { port = '%port%' }            // Command line parameters
      requirements = {                      // Resource requirements
        ram = 100M
        disk = 100M
        cpu = 0.1
      }
      replicas = 5                          // Number of tasks
    }
    10000
  13. Developer View

  14. Developer View: What just happened?

    Diagram labels: web browsers, borgcfg, Config file, Binary, BorgMaster (shown several times, each with a link shard and a UI shard), persistent store (Paxos), Scheduler, Borglet (four shown)
  15. Hello world!

    (“Hello world!” repeated many times across the slide)
    Image by Connie Zhou
  16. What are Containers?

    Containers encapsulate application code and all of its dependencies, so applications depend less on the infrastructure where they run.
    • In traditional IT environments, applications needed specific infrastructure. Dependencies needed to be installed beforehand.
    • Containers incorporate applications and their dependencies, which makes deployment to development, test, and production easier.
    • Applications don’t need to depend on on-premise, private, or public cloud environments.
    Diagram: Container Image, Dependencies, Application Code
  17. Why Containers?

    Fast: Simple and fast compared to VMs. Can be started in just a few milliseconds.
    Portable: Can be run in many environments.
    Efficiency: Low overhead. Resources used by containers can be limited.
  18. None
  19. Container Management

    Diagram: Cluster, Node, Node, Node, ???
    • How to deploy to multiple nodes?
    • How to deal with node failures?
    • How to deal with container failures?
    • How do you update your applications?
  20. Kubernetes κυβερνήτης: Greek for “pilot” or “helmsman of a ship”

    the open source cluster manager from Google
  21. CNCF(Cloud Native Computing Foundation)

  22. Workload portability

    Goal: Avoid vendor lock-in
    Runs in many environments, including “bare metal” and “your laptop”
    The API and the implementation are 100% open
    The whole system is modular and replaceable
  23. Workload portability

    Goal: Write once, run anywhere*
    Don’t force apps to know about concepts that are cloud-provider-specific
    Examples of this:
    • Network model
    • Ingress
    • Service load-balancers
    • PersistentVolumes
    * approximately
  24. Workload portability

    Goal: Avoid coupling
    Don’t force apps to know about concepts that are Kubernetes-specific
    Examples of this:
    • Namespaces
    • Services / DNS
    • Downward API
    • Secrets / ConfigMaps
  25. Workload portability

    Result: Portability
    Build your apps on-prem, lift-and-shift into cloud when you are ready
    Don’t get stuck with a platform that doesn’t work for you
    Put your app on wheels and move it whenever and wherever you need
  26. Container Engine Google Cloud Platform

  27. Pods

    Small group of containers & volumes
    Tightly coupled
    The atom of scheduling & placement
    Shared namespace
    • share IP address & localhost
    • share IPC, etc.
    Managed lifecycle
    • bound to a node, restart in place
    • can die, cannot be reborn with same ID
    Example: data puller & web server
    Diagram: Consumers, Content Manager, File Puller, Web Server, Volume, Pod
  28. Google Cloud Platform Docker Containers IPC Network PID Hostname Mount

    nginx IPC Network PID Hostname Mount nginx IPC Network PID Hostname Mount nginx
  29. Google Cloud Platform IPC Network PID Hostname Mounts nginx IPC

    Network PID Hostname Mount git pull IPC Network PID Hostname Mount nginx Docker Containers
  30. Google Cloud Platform IPC Network PID Hostname Mounts nginx IPC

    Network PID Hostname Mount git pull IPC Network PID Hostname Mount nginx Docker Containers VOLUME VOLUME Host Volume
  31. Google Cloud Platform Host NIC Network IPC Network PID Hostname

    Mounts nginx IPC Network PID Hostname Mount git pull IPC Network PID Hostname Mount nginx Docker Containers NAT NAT
  32. Google Cloud Platform 172.16.1.1 172.16.1.2 172.16.1.1 172.16.1.1 NAT NAT NAT

    NAT NAT Docker networking
  33. Google Cloud Platform A: 172.16.1.1 3306 B: 172.16.1.2 80 9376

    11878 SNAT SNAT C: 172.16.1.1 8000 Port mapping
  34. Pods & Docker?

    Diagram labels: confd, nginx, HUP, WRITE, READ, etcd, CHANGE, nginx.conf, app (three instances), IP Address, LB
  35. Pods & Docker?

    Diagram labels: Container, Container, confd, nginx, HUP, WRITE, READ, etcd, CHANGE, ? ? ? ?
  36. Google Cloud Platform Pods & Docker? Container nginx confd foreman

  37. Google Cloud Platform Container foreman Pods & Docker? nginx confd

  38. Google Cloud Platform Container foreman Pods & Docker? Everything’s A-OK!!

    nginx confd Crash-Restart Loop
  39. Google Cloud Platform IPC Network Pods docker … --net=container:id --ipc=container:id

    Hostname cgroup Web Server Pod cgroup File Puller localhost
  40. Google Cloud Platform Pods (TODO) docker … --net=container:id --ipc=container:id --pid=container:id

    https://github.com/docker/docker/issues/10163 IPC Network PID Hostname cgroup Web Server cgroup File Puller localhost
  41. Kubernetes networking

    IPs are cluster-scoped
    • vs docker default private IP
    Pods can reach each other directly
    • even across nodes
    No brokering of port numbers
    • too complex, why bother?
    This is a fundamental requirement
    • can be L3 routed
    • can be underlayed (cloud)
    • can be overlayed (SDN)
  42. Google Cloud Platform 10.1.1.0/24 10.1.1.1 10.1.1.2 10.1.2.0/24 10.1.2.1 10.1.3.0/24 10.1.3.1

    Kubernetes networking
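The addressing on the slide above (one /24 per node, one IP per pod) can be sketched in a few lines. This is only an illustration of the idea, not how any particular network plugin works: the node names, the `NODE_SUBNETS` table, and the `allocate` and `node_for` helpers are all hypothetical.

```python
import ipaddress

# Hypothetical node names; the subnets are the ones shown on the slide.
NODE_SUBNETS = {
    "node-1": ipaddress.ip_network("10.1.1.0/24"),
    "node-2": ipaddress.ip_network("10.1.2.0/24"),
    "node-3": ipaddress.ip_network("10.1.3.0/24"),
}


def allocate(node, used):
    """Hand out the first free host address from the node's subnet."""
    for host in NODE_SUBNETS[node].hosts():
        if host not in used:
            used.add(host)
            return host
    raise RuntimeError(f"no free pod IPs left on {node}")


def node_for(ip):
    """Find the node whose subnet contains the pod IP (plain L3 routing)."""
    for node, subnet in NODE_SUBNETS.items():
        if ip in subnet:
            return node
    return None


used = set()
pod_a = allocate("node-1", used)
pod_b = allocate("node-1", used)
pod_c = allocate("node-2", used)
print(pod_a, pod_b, pod_c, node_for(pod_c))
```

Because every pod IP is unique across the cluster, no port brokering or NAT table is needed to tell pods apart; the destination address alone identifies the node.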
  43. ConfigMaps

    Goal: manage app configuration
    • ...without making overly-brittle container images
    12-factor says config comes from the environment
    • Kubernetes is the environment
    Manage config via the Kubernetes API
    Inject config as a virtual volume into your Pods
    • late-binding, live-updated (atomic)
    • also available as env vars
    Status: GA in Kubernetes v1.2
    Diagram: node, API, Pod, Config Map
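From the application's side, config injected this way is just environment variables and files. The sketch below shows one way an app might read a setting from either source. The `load_setting` helper, its parameters, and the choice to let the environment variable win over the file are assumptions made for illustration, not part of Kubernetes.

```python
import os
from pathlib import Path


def load_setting(name, config_dir, default=None, environ=None):
    """Read a setting from an env var, else from a file in a mounted directory.

    When a ConfigMap is mounted as a volume, each key appears as a file
    whose content is the value. The precedence used here (env var first)
    is an assumption of this sketch.
    """
    env = os.environ if environ is None else environ
    if name in env:
        return env[name]
    path = Path(config_dir) / name
    if path.is_file():
        return path.read_text().strip()
    return default
```

The container image itself carries no configuration, which is the point of the slide: the same image can run with different settings in different environments.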
  44. Secrets

    Goal: grant a pod access to a secured something
    • don’t put secrets in the container image!
    12-factor says config comes from the environment
    • Kubernetes is the environment
    Manage secrets via the Kubernetes API
    Inject secrets as virtual volumes into your Pods
    • late-binding, tmpfs - never touches disk
    • also available as env vars
    Diagram: node, API, Pod, Secret
  45. PersistentVolumes

    A higher-level storage abstraction
    • insulation from any one cloud environment
    Admin provisions them, users claim them
    • NEW: auto-provisioning (alpha in v1.2)
    Independent lifetime from consumers
    • lives until user is done with it
    • can be handed-off between pods
    Dynamically “scheduled” and managed, like nodes and pods
    Diagram: Claim
  46. Google Cloud Platform Cluster Admin PersistentVolumes

  47. Google Cloud Platform Provision Cluster Admin PersistentVolumes PersistentVolumes

  48. Google Cloud Platform User Cluster Admin PersistentVolumes PersistentVolumes

  49. Google Cloud Platform User PVClaim Create Cluster Admin PersistentVolumes PersistentVolumes

  50. Google Cloud Platform User PVClaim Binder Cluster Admin PersistentVolumes PersistentVolumes

  51. Google Cloud Platform User PVClaim Pod Create Cluster Admin PersistentVolumes

    PersistentVolumes
  52. Google Cloud Platform User PVClaim Pod Cluster Admin PersistentVolumes *

    PersistentVolumes
  53. Google Cloud Platform User PVClaim Pod Delete * Cluster Admin

    PersistentVolumes * PersistentVolumes
  54. Google Cloud Platform User PVClaim Cluster Admin PersistentVolumes * PersistentVolumes

  55. Google Cloud Platform User PVClaim Pod Create Cluster Admin PersistentVolumes

    * PersistentVolumes
  56. Google Cloud Platform User PVClaim Pod Cluster Admin PersistentVolumes *

    PersistentVolumes
  57. Google Cloud Platform User PVClaim Pod Delete Cluster Admin PersistentVolumes

    * PersistentVolumes
  58. Google Cloud Platform User PVClaim Delete Cluster Admin PersistentVolumes *

    PersistentVolumes
  59. Google Cloud Platform User Recycler Cluster Admin PersistentVolumes PersistentVolumes
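The sequence in the PersistentVolumes slides above (admin provisions, user creates a claim, binder matches it, claim is deleted, recycler cleans up) can be modelled as a small state machine. This is a toy model under stated assumptions: the `Volume` class, the function names, and the smallest-fit matching rule are invented for the sketch and are not the Kubernetes implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Volume:
    name: str
    size_gb: int
    claimed_by: Optional[str] = None  # name of the bound claim, if any
    dirty: bool = False               # released but not yet recycled


def bind(volumes: List[Volume], claim_name: str, request_gb: int) -> Optional[Volume]:
    """Binder: attach the claim to the smallest clean, free volume that fits."""
    candidates = [
        v for v in volumes
        if v.claimed_by is None and not v.dirty and v.size_gb >= request_gb
    ]
    if not candidates:
        return None  # the claim stays pending
    chosen = min(candidates, key=lambda v: v.size_gb)
    chosen.claimed_by = claim_name
    return chosen


def release(volume: Volume) -> None:
    """Claim deleted: the volume is free again but still holds old data."""
    volume.claimed_by = None
    volume.dirty = True


def recycle(volume: Volume) -> None:
    """Recycler: scrub the volume so it can be bound to a new claim."""
    volume.dirty = False
```

Note that pods never appear in the model: a pod can be created and deleted many times against the same claim, which is how the volume's lifetime stays independent of its consumers.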

  60. Google Cloud Platform Deployments ReplicaSet - replicas: 3 - selector:

    - app: MyApp - version: v1 Deployment - name: MyApp kubectl create ...
  61. Google Cloud Platform Deployments ReplicaSet - replicas: 4 - selector:

    - app: MyApp - version: v1 Deployment - name: MyApp kubectl create ...
  62. Google Cloud Platform Deployments ReplicaSet - replicas: 3 - selector:

    - app: MyApp - version: v1 Deployment - name: MyApp kubectl create ...
  63. Google Cloud Platform Deployments ReplicaSet - replicas: 3 - selector:

    - app: MyApp - version: v1 Deployment - name: MyApp kubectl create ...
  64. Google Cloud Platform Rolling Updates ReplicaSet - replicas: 3 -

    selector: - app: MyApp - version: v1 Deployment - name: MyApp kubectl apply ...
  65. Google Cloud Platform ReplicaSet - replicas: 3 - selector: -

    app: MyApp - version: v1 Rolling Updates ReplicaSet - replicas: 0 - selector: - app: MyApp - version: v2 Deployment - name: MyApp
  66. Google Cloud Platform ReplicaSet - replicas: 3 - selector: -

    app: MyApp - version: v1 ReplicaSet - replicas: 1 - selector: - app: MyApp - version: v2 Rolling Updates Deployment - name: MyApp
  67. Google Cloud Platform ReplicaSet - replicas: 2 - selector: -

    app: nginx - ver: 1.10 ReplicaSet - replicas: 1 - selector: - app: nginx - ver: 1.11 Deployment - app: nginx Rolling Updates
  68. Google Cloud Platform ReplicaSet - replicas: 2 - selector: -

    app: MyApp - version: v1 ReplicaSet - replicas: 2 - selector: - app: MyApp - version: v2 Rolling Updates Deployment - name: MyApp
  69. Google Cloud Platform ReplicaSet - replicas: 1 - selector: -

    app: MyApp - version: v1 ReplicaSet - replicas: 2 - selector: - app: MyApp - version: v2 Rolling Updates Deployment - name: MyApp
  70. Google Cloud Platform ReplicaSet - replicas: 1 - selector: -

    app: MyApp - version: v1 ReplicaSet - replicas: 3 - selector: - app: MyApp - version: v2 Rolling Updates Deployment - name: MyApp
  71. Google Cloud Platform ReplicaSet - replicas: 0 - selector: -

    app: MyApp - version: v1 ReplicaSet - replicas: 3 - selector: - app: MyApp - version: v2 Rolling Updates Deployment - name: MyApp
  72. Google Cloud Platform ReplicaSet - replicas: 3 - selector: -

    app: MyApp - version: v2 Rolling Updates Deployment - name: MyApp
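The replica counts in the Rolling Updates slides above follow a simple pattern: scale the new ReplicaSet up by one, then scale the old one down by one, until the old one reaches zero. A minimal sketch of that pattern follows. The `rolling_update` function is hypothetical; its `max_surge` parameter is named after the Deployment `maxSurge` setting, but the sketch ignores `maxUnavailable` and pod readiness, which a real controller takes into account.

```python
def rolling_update(desired, max_surge=1):
    """Return the (old_replicas, new_replicas) pairs seen during a rollout.

    The new ReplicaSet grows whenever the total stays within
    desired + max_surge; otherwise the old ReplicaSet shrinks by one.
    """
    old, new = desired, 0
    steps = [(old, new)]
    while old > 0 or new < desired:
        if new < desired and old + new < desired + max_surge:
            new += 1
        else:
            old -= 1
        steps.append((old, new))
    return steps


for old, new in rolling_update(3):
    print(f"v1 replicas: {old}  v2 replicas: {new}")
```

With three desired replicas and a surge of one, the total never drops below three, so capacity is preserved throughout the update.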
  73. Services

    A group of pods that work together
    • grouped by a selector
    Defines access policy
    • “load balanced” or “headless”
    Gets a stable virtual IP and port
    • sometimes called the service portal
    • also a DNS name
    VIP is managed by kube-proxy
    • watches all services
    • updates iptables when backends change
    Hides complexity - ideal for non-native apps
    Diagram: Client, Virtual IP
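To make the Service idea concrete, here is a toy model: a stable virtual IP in front of whichever pods currently match a selector. The `Service` class and its `sync` and `route` methods are hypothetical, and round-robin is used only because it is easy to follow; this is not a description of how kube-proxy picks a backend.

```python
class Service:
    """A stable virtual IP that forwards to pods matching a label selector."""

    def __init__(self, vip, selector):
        self.vip = vip
        self.selector = selector
        self.backends = []
        self._next = 0

    def sync(self, pods):
        """Recompute backends from a {pod_ip: labels} mapping.

        Stands in for the proxy watching the API and updating its rules
        when backends change.
        """
        self.backends = sorted(
            ip for ip, labels in pods.items()
            if all(labels.get(k) == v for k, v in self.selector.items())
        )

    def route(self):
        """Pick the backend for the next connection to the virtual IP."""
        if not self.backends:
            raise LookupError("no backends match the selector")
        backend = self.backends[self._next % len(self.backends)]
        self._next += 1
        return backend
```

Clients only ever see `vip`; pods can come and go behind it, which is what makes the abstraction useful for apps that know nothing about Kubernetes.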
  74. Labels

    Arbitrary metadata
    Attached to any API object
    Generally represent identity
    Queryable by selectors
    • think SQL ‘select ... where ...’
    The only grouping mechanism
    • pods under a ReplicationController
    • pods in a Service
    • capabilities of a node (constraints)
  75. Google Cloud Platform App: MyApp Phase: prod Role: FE App:

    MyApp Phase: test Role: FE App: MyApp Phase: prod Role: BE App: MyApp Phase: test Role: BE Selectors
  76. Google Cloud Platform App: MyApp Phase: prod Role: FE App:

    MyApp Phase: test Role: FE App: MyApp Phase: prod Role: BE App: MyApp Phase: test Role: BE App = MyApp Selectors
  77. Google Cloud Platform App: MyApp Phase: prod Role: FE App:

    MyApp Phase: test Role: FE App: MyApp Phase: prod Role: BE App: MyApp Phase: test Role: BE App = MyApp, Role = FE Selectors
  78. Google Cloud Platform App: MyApp Phase: prod Role: FE App:

    MyApp Phase: test Role: FE App: MyApp Phase: prod Role: BE App: MyApp Phase: test Role: BE App = MyApp, Role = BE Selectors
  79. Google Cloud Platform App: MyApp Phase: prod Role: FE App:

    MyApp Phase: test Role: FE App: MyApp Phase: prod Role: BE App: MyApp Phase: test Role: BE App = MyApp, Phase = prod Selectors
  80. Google Cloud Platform App: MyApp Phase: prod Role: FE App:

    MyApp Phase: test Role: FE App: MyApp Phase: prod Role: BE App: MyApp Phase: test Role: BE App = MyApp, Phase = test Selectors
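The Selectors slides above show the same four pods filtered by different label queries. An equality-based selector is just "every key in the selector must match the object's label", which can be sketched as below. The pod names and the `matches` and `select` helpers are made up for the example; the labels are the ones on the slides.

```python
def matches(labels, selector):
    """True when every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(pods, selector):
    """Return the names of the pods whose labels satisfy the selector."""
    return [name for name, labels in pods.items() if matches(labels, selector)]


# Hypothetical pod names carrying the label sets from the slides.
PODS = {
    "fe-prod": {"App": "MyApp", "Phase": "prod", "Role": "FE"},
    "fe-test": {"App": "MyApp", "Phase": "test", "Role": "FE"},
    "be-prod": {"App": "MyApp", "Phase": "prod", "Role": "BE"},
    "be-test": {"App": "MyApp", "Phase": "test", "Role": "BE"},
}

print(select(PODS, {"App": "MyApp", "Role": "FE"}))
print(select(PODS, {"App": "MyApp", "Phase": "prod"}))
```

The same objects fall into overlapping groups depending on the query, which is why labels can serve as the single grouping mechanism for controllers and Services alike.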
  81. Jobs

    Run-to-completion, as opposed to run-forever
    • Express parallelism vs. required completions
    • Workflow: restart on failure
    • Build/test: don’t restart on failure
    Aggregates success/failure counts
    Built for batch and big-data work
    Status: GA in Kubernetes v1.2
    ...
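The difference between parallelism and required completions on the Jobs slide can be illustrated with a small sequential simulation. Everything here is hypothetical: `run_job`, its parameters, and the `max_attempts` safety cap are names chosen for this sketch, and tasks run one after another rather than truly in parallel.

```python
def run_job(task, completions, parallelism, restart_on_failure=True, max_attempts=100):
    """Run task(attempt_number) until enough attempts have succeeded.

    At most `parallelism` attempts are started per round. Failed attempts
    are retried only when restart_on_failure is set. max_attempts bounds
    the total work so a task that always fails cannot loop forever.
    """
    succeeded = failed = attempts = 0
    while succeeded < completions and attempts < max_attempts:
        batch = min(parallelism, completions - succeeded, max_attempts - attempts)
        results = [bool(task(attempts + i)) for i in range(batch)]
        attempts += batch
        succeeded += results.count(True)
        failed += results.count(False)
        if not restart_on_failure and False in results:
            break
    return {"succeeded": succeeded, "failed": failed}
```

For example, a task that fails only on attempt number 1 still reaches three completions in "workflow" mode, but stops after the first round in "build/test" mode, and in both cases the success and failure counts are aggregated in the result.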
  82. Namespaces

    Problem: I have too much stuff!
    • name collisions in the API
    • poor isolation between users
    • don’t want to expose things like Secrets
    Solution: Slice up the cluster
    • create new Namespaces as needed
    • per-user, per-app, per-department, etc.
    • part of the API - NOT private machines
    • most API objects are namespaced
    • part of the REST URL path
    • Namespaces are just another API object
    • One-step cleanup - delete the Namespace
    • Obvious hook for policy enforcement (e.g. quota)
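"Part of the REST URL path" means the namespace appears as a path segment in front of the resource. The sketch below builds such paths for the core `v1` API group only. The `resource_path` helper and the namespace names are hypothetical.

```python
def resource_path(resource, name=None, namespace=None):
    """Build a core-API path; namespaced objects carry the namespace in the URL."""
    parts = ["/api/v1"]
    if namespace is not None:
        parts.append(f"namespaces/{namespace}")
    parts.append(resource)
    if name is not None:
        parts.append(name)
    return "/".join(parts)


# The same object name in two namespaces yields two distinct URLs,
# so there is no collision in the API.
print(resource_path("pods", "web", namespace="team-a"))
print(resource_path("pods", "web", namespace="team-b"))
# Cluster-scoped objects such as nodes have no namespace segment.
print(resource_path("nodes"))
```

Because everything a team owns sits under one path prefix, deleting the Namespace object is enough to clean up all of it, and policies such as quota have an obvious place to attach.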
  83. slack.kubernetes.io

  84. Thank You