Kubernetes Overview

Vish Kannan
November 27, 2014

Kubernetes presentation at a Docker meetup in Bangalore.

Transcript

  1. Google confidential │ Do not distribute. Kubernetes & cAdvisor: Docker meetup, Bangalore

    Vishnu Kannan ([email protected]), Software Engineer, Google Inc. GitHub, IRC: vishh
  2. Google has been developing and using containers to manage our applications for over 10 years.

    Images by Connie Zhou
  3. Traditional computing

    • Server per component • Configuration & management • Plan ahead • OpEx • Scalability limits • Utilization [Diagram: six machines, each with its own app, libs, and kernel]
  4. Cluster Computing

    Solution: Use Computers! • Automation • Scalability • Think about resources • Diverse workloads • Ease of management. Omega, Mesos, Kubernetes, etc. [Diagram: six machines, each with its own app, libs, and kernel]
  5. Typical parts of a Cluster Management System

    • Scheduler • Node manager • Binary deployment service • Application discovery • Application config management • Node and application monitoring
  6. One application per machine? Can we do better?

    1. Place multiple applications on one machine 2. Partition the physical machine: VMs 3. Partition the resources on a physical machine: cgroups, namespaces (isolation). Smarter node management. Capacity vs. usage.
  7. Old Way: Shared Machines

    No isolation • No namespacing • Common libs • Highly coupled apps and OS [Diagram: four apps sharing common libs and one kernel]
  8. Old Way: Virtual Machines

    Some isolation • Expensive and inefficient • Still highly coupled to the OS • Hard to manage [Diagram: apps running in VMs, with libs and a kernel per VM]
  9. New Way: Containers

    Think of lightweight VMs • Isolate CPU, RAM, disk, users, network, etc. • Powered by Linux APIs: cgroups, namespaces, capabilities, chroots • Better resource utilization [Diagram: four apps, each with its own libs, sharing one kernel]
  10. cAdvisor

    Understand resource usage and performance of applications • Google OSS project • Written in Go; tiny resource footprint • Supports Docker containers natively; LXC and raw cgroups are also supported • Understands CPU, memory, filesystem, and network utilization • Easy-to-use REST API • Runs in a Docker container
  11. Heapster

    Cluster container monitoring using cAdvisor • Default monitoring solution in Kubernetes • Filesystem-based API to support other cluster management systems • CoreOS support using the filesystem API • Discovers and collects stats from cAdvisors running on all the nodes • Pushes data to InfluxDB or BigQuery • Typical setup: Heapster + InfluxDB + Grafana
  12. Kubernetes

    Greek for “Helmsman”; also the root of the word “Governor” • Container orchestrator • Runs Docker containers • Supports multiple cloud and bare-metal environments • Inspired and informed by Google’s experiences • Open source, written in Go. Manage applications, not machines.
  13. High Level Design

    [Diagram labels: users; CLI, API, UI; master with apiserver and scheduler; nodes, each running a kubelet]
  14. Primary Concepts

    • Container: A sealed application package (Docker) • Pod: A small group of tightly coupled containers (example: content syncer & web server) • Controller: A loop that drives current state towards desired state (example: replication controller) • Service: A set of running pods that work together (example: load-balanced backends) • Labels: Identifying metadata attached to other objects (example: phase=canary vs. phase=prod) • Selector: A query against labels, producing a set result (example: all pods where label phase == prod)
  15. Design Principles

    • Declarative > imperative: State your desired results, let the system actuate • Control loops: Observe, rectify, repeat • Simple > complex: Try to do as little as possible • Modularity: Components, interfaces, & plugins • Legacy compatible: Requiring apps to change is a non-starter • Network-centric: IP addresses are cheap • No grouping: Labels are the only groups • Cattle > pets: Manage your workload in bulk • Open > closed: Open source, standards, REST, JSON, etc.
  16. Control Loops

    Drive current state -> desired state • Act independently • APIs: no shortcuts or back doors • Observed state is truth • Recurring pattern in the system • Example: ReplicationController [Diagram: a cycle of observe, diff, act]
  17. Atomic Storage

    Backing store for all master state • Hidden behind an abstract interface • Stateless means scalable • Watchable: this is a fundamental primitive; don’t poll, watch • Using CoreOS etcd
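The "don't poll, watch" idea on the Atomic Storage slide can be sketched in a few lines. This is a minimal in-memory illustration, not the etcd API: the `WatchableStore` class, its method names, and the `replicas` key are all hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for a watchable key-value store.
# It is NOT the etcd client API; it only illustrates push-style notification.
Callback = Callable[[str, str], None]


class WatchableStore:
    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Callback]] = defaultdict(list)

    def watch(self, key: str, callback: Callback) -> None:
        """Register a callback that fires on every later write to `key`."""
        self._watchers[key].append(callback)

    def put(self, key: str, value: str) -> None:
        """Store the value, then notify watchers of that key."""
        self._data[key] = value
        for callback in self._watchers[key]:
            callback(key, value)

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


# A component records changes as they happen instead of polling get().
events: List[Tuple[str, str]] = []
store = WatchableStore()
store.watch("replicas", lambda k, v: events.append((k, v)))
store.put("replicas", "3")
store.put("replicas", "4")
store.put("other", "x")  # no watcher registered for this key
```

The watcher sees only the two writes to the key it registered for, and it never has to ask the store whether anything changed.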
  18. Pods

    Small group of containers & volumes • Tightly coupled • Scheduling atom • Shared namespace: share IP address & localhost • Ephemeral: can die and be replaced • Example: data puller & web server [Diagram labels: Pod, File Puller, Web Server, Volume, Consumers, Content Manager]
  19. Pod Networking

    Pod IPs are routable (Docker default is private IP) • Pods can reach each other without NAT, even across nodes • Pods can egress traffic, if allowed by cloud environment • No brokering of port numbers • Fundamental requirement; several SDN solutions
  20. Volumes

    Pod scoped • Share pod’s lifetime & fate • Support various types of volumes: empty directory (default), host file/directory, Git repository, GCE Persistent Disk, ...more to come, suggestions welcome [Diagram labels: Pod, Container, Container, Empty, Git (GitHub), Host (Host’s FS), GCE (GCE PD)]
  21. Pod Lifecycle

    Once scheduled to a node, pods do not move: restart policy means restart in-place • Pods can be observed pending, running, succeeded, or failed: failed is really the end (no more restarts); no complex state machine logic • Pods are not rescheduled by the scheduler or apiserver, even if a node dies: controllers are responsible for this; keeps the scheduler simple
  22. Labels

    Arbitrary metadata • Attached to any API object • Generally represent identity • Queryable by selectors (think SQL ‘select ... where ...’) • The only grouping mechanism: pods under a ReplicationController, pods in a Service, capabilities of a node (constraints) • Example: “phase: canary” [Diagram: four pods labeled (App: Nifty, Phase: Dev, Role: FE), (App: Nifty, Phase: Dev, Role: BE), (App: Nifty, Phase: Test, Role: FE), (App: Nifty, Phase: Test, Role: BE)]
  23. Selectors

    [Diagram: four pods labeled (App: Nifty, Phase: Dev, Role: FE), (App: Nifty, Phase: Test, Role: FE), (App: Nifty, Phase: Dev, Role: BE), (App: Nifty, Phase: Test, Role: BE)]
  24. Selectors: App == Nifty

    [Diagram: the same four pods; all four carry App: Nifty]
  25. Selectors: App == Nifty, Role == FE

    [Diagram: the same four pods; two carry Role: FE]
  26. Selectors: App == Nifty, Role == BE

    [Diagram: the same four pods; two carry Role: BE]
  27. Selectors: App == Nifty, Phase == Dev

    [Diagram: the same four pods; two carry Phase: Dev]
  28. Selectors: App == Nifty, Phase == Test

    [Diagram: the same four pods; two carry Phase: Test]
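The selector slides can be read as equality queries over label maps. The sketch below is an illustration of that idea, not Kubernetes code: the `matches` and `select` helpers and the pod names `pod-a` through `pod-d` are hypothetical, while the labels are the ones shown on the slides.

```python
from typing import Dict, List

Labels = Dict[str, str]


def matches(selector: Labels, labels: Labels) -> bool:
    """True when every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(selector: Labels, objects: Dict[str, Labels]) -> List[str]:
    """Return the names of all objects whose labels satisfy the selector."""
    return [name for name, labels in objects.items() if matches(selector, labels)]


# The four labeled pods from the slides; the pod names are made up.
pods: Dict[str, Labels] = {
    "pod-a": {"App": "Nifty", "Phase": "Dev", "Role": "FE"},
    "pod-b": {"App": "Nifty", "Phase": "Test", "Role": "FE"},
    "pod-c": {"App": "Nifty", "Phase": "Dev", "Role": "BE"},
    "pod-d": {"App": "Nifty", "Phase": "Test", "Role": "BE"},
}

all_nifty = select({"App": "Nifty"}, pods)
frontends = select({"App": "Nifty", "Role": "FE"}, pods)
dev_pods = select({"App": "Nifty", "Phase": "Dev"}, pods)
```

Because groups are defined only by queries like these, the same pod can belong to several groups at once without any explicit membership list.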
  29. Replication Controllers

    Canonical example of control loops • Runs out-of-process wrt API server • Has one job: ensure N copies of a pod (if too few, start new ones; if too many, kill some; group == selector) • Cleanly layered on top of the core: all access is by public APIs. Example: Replication Controller with Name = “nifty-rc”, Selector = {“App”: “Nifty”}, PodTemplate = { ... }, NumReplicas = 4 [Diagram: exchange with the API Server: “How many?” “3” “Start 1 more” “OK” “How many?” “4”]
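One pass of the loop described on this slide (count the matching pods, compare with the desired number, start or kill the difference) might look like the sketch below. `FakeApiServer` and `reconcile_once` are hypothetical names for an in-memory stand-in; nothing here is the real Kubernetes API, and which surplus pods get killed is an arbitrary choice made for the example.

```python
from typing import Dict, List


class FakeApiServer:
    """Hypothetical in-memory stand-in for the API server (pods only)."""

    def __init__(self) -> None:
        self.pods: Dict[str, Dict[str, str]] = {}  # pod name -> labels
        self._next_id = 0

    def list_pods(self, selector: Dict[str, str]) -> List[str]:
        return sorted(
            name
            for name, labels in self.pods.items()
            if all(labels.get(k) == v for k, v in selector.items())
        )

    def create_pod(self, labels: Dict[str, str]) -> str:
        self._next_id += 1
        name = f"pod-{self._next_id}"
        self.pods[name] = dict(labels)
        return name

    def delete_pod(self, name: str) -> None:
        del self.pods[name]


def reconcile_once(api: FakeApiServer, selector: Dict[str, str],
                   template_labels: Dict[str, str], desired: int) -> int:
    """Observe, diff, act. Returns the diff that was acted on."""
    current = api.list_pods(selector)      # observe: "How many?"
    diff = desired - len(current)          # diff
    if diff > 0:                           # act: too few, start new ones
        for _ in range(diff):
            api.create_pod(template_labels)
    elif diff < 0:                         # act: too many, kill some
        for name in current[:-diff]:
            api.delete_pod(name)
    return diff


api = FakeApiServer()
labels = {"App": "Nifty"}
for _ in range(3):
    api.create_pod(labels)

first_diff = reconcile_once(api, labels, labels, desired=4)   # 3 running, want 4
count_after_scale_up = len(api.list_pods(labels))

api.create_pod(labels)                                        # a stray fifth pod appears
second_diff = reconcile_once(api, labels, labels, desired=4)
count_after_scale_down = len(api.list_pods(labels))
```

The controller only talks to the store through list, create, and delete calls, which matches the slide's point that all access goes through public APIs.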
  30. Services

    A group of pods that act as one (group == selector) • Defines access policy: only “load balanced” for now • Gets a stable virtual IP and port: called the service portal; soon to have DNS • VIP is captured by kube-proxy: watches the service constituency; updates when backends change • Hide complexity: ideal for non-native apps [Diagram labels: Client, Portal (VIP)]
  31. Cluster Services

    Logging, monitoring, DNS, etc. • All run as pods in the cluster: no special treatment, no back doors • Open-source solutions for everything: cAdvisor + InfluxDB + Heapster == cluster monitoring; fluentd + Elasticsearch + Kibana == cluster logging; SkyDNS + kube2sky == cluster DNS • Can be easily replaced by custom solutions: modular clusters to fit your needs
  32. Status & Plans

    Open sourced in June 2014 • Google just launched Google Container Engine (GKE): hosted Kubernetes, https://cloud.google.com/container-engine/ • Roadmap: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/roadmap.md • Driving towards a 1.0 release in O(months)
  33. The Goal: Shake Things Up

    Containers are a new way of working • Requires new concepts and new tools • Google has a lot of experience... • ...but we are listening to the users • Workload portability is important!
  34. cAdvisor & Kubernetes are Open Source

    We want your help! • http://kubernetes.io • https://github.com/google/cadvisor • https://github.com/GoogleCloudPlatform/heapster • irc.freenode.net #google-containers
  35. Why containers?

    • Performance • Repeatability • Isolation • Quality of service • Accounting • Visibility • Portability. A fundamentally different way of managing applications. Images by Connie Zhou
  36. cAdvisor Internals

    cAdvisor: • Collect • Measure • Analyze • Export [Diagram labels: Docker, Kernel, Users, LXC, lmctfy]
  37. Docker

    Dramatically simplifies node management • Easy to use • Build, test and deploy anywhere • Provides resource isolation and security • Big ecosystem exists around Docker • WIP: better resource isolation, hardening, performance, etc.
  38. cAdvisor roadmap

    • Better signals and more resources: memory, disk I/O, network • More suggestions: insufficient resources, performance effects • Start applying suggestions
  39. Heapster roadmap

    • Auto scaling: nodes, containers • Recognize antagonists: bad interactions between containers; current work: CPI2 • React to signals: migrate containers (CRIU)
  40. Docker Networking

    [Diagram: three subnets: 10.1.1.0/24 with 10.1.1.93 and 10.1.1.113; 10.1.2.0/24 with 10.1.2.118; 10.1.3.0/24 with 10.1.3.129]
  41. Docker Networking

    [Diagram: the same subnets and addresses, with NAT marked at five points]
  42. Pod Networking

    [Diagram: the same subnets and addresses, with no NAT marked]
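The addresses on these three slides suggest one /24 per node, with each pod IP drawn from its node's range so that any pod can be reached directly. The sketch below checks that reading with Python's standard `ipaddress` module. The node names and the one-subnet-per-node mapping are assumptions made for illustration; the slides show only the subnets and addresses.

```python
import ipaddress
from typing import Optional

# Assumption: each subnet on the slide belongs to one node.
# The node names are hypothetical.
node_subnets = {
    "node-1": ipaddress.ip_network("10.1.1.0/24"),
    "node-2": ipaddress.ip_network("10.1.2.0/24"),
    "node-3": ipaddress.ip_network("10.1.3.0/24"),
}


def node_for_pod_ip(pod_ip: str) -> Optional[str]:
    """Return the node whose subnet contains pod_ip, or None if no subnet does."""
    addr = ipaddress.ip_address(pod_ip)
    for node, subnet in node_subnets.items():
        if addr in subnet:
            return node
    return None


# The four pod addresses shown on the slides.
placements = {ip: node_for_pod_ip(ip)
              for ip in ["10.1.1.93", "10.1.1.113", "10.1.2.118", "10.1.3.129"]}
```

Under this reading, routing between pods needs only a per-node subnet route, which is consistent with the earlier slide's claim that pods reach each other without NAT.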
  43. Replication Controllers

    [Diagram: nodes 1 to 4 with pods f0118, d9376, b0111, a1209. Replication Controller: Desired = 4, Current = 4]
  44. Replication Controllers

    [Diagram: nodes 1 to 4 with pods f0118, d9376, b0111, a1209. Replication Controller: Desired = 4, Current = 3]
  45. Replication Controllers

    [Diagram: nodes 1 to 4 with pods f0118, d9376, b0111, a1209, c9bad. Replication Controller: Desired = 4, Current = 4]
  46. Replication Controllers

    [Diagram: nodes 1 to 4 with pods f0118, d9376, b0111, a1209, c9bad. Replication Controller: Desired = 4, Current = 5]
  47. Replication Controllers

    [Diagram: nodes 1 to 4 with pods f0118, d9376, b0111, a1209, c9bad. Replication Controller: Desired = 4, Current = 4]
  48. Services

    Service: Name = “nifty-svc”, Selector = {“App”: “Nifty”}, Port = 9376, ContainerPort = 8080 • Portal IP is assigned [Diagram labels: Client; TCP / UDP; portal 10.0.0.1 : 9376; iptables DNAT; kube-proxy; apiserver (watch); TCP / UDP; backends 10.240.2.2 : 8080, 10.240.1.1 : 8080, 10.240.3.3 : 8080]
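The service portal on the final slide can be sketched as a stable address in front of a changing backend list. The slides say only that services are "load balanced"; round-robin selection is an assumption here, and the `ServicePortal` class is a hypothetical illustration, not kube-proxy's implementation. The addresses are the ones on the slide.

```python
from typing import List


class ServicePortal:
    """Hypothetical sketch: one stable portal address, many backends."""

    def __init__(self, portal: str, backends: List[str]) -> None:
        self.portal = portal
        self._backends = list(backends)
        self._next = 0

    def update_backends(self, backends: List[str]) -> None:
        """Called when the watched set of backend pods changes."""
        self._backends = list(backends)
        self._next = 0

    def pick_backend(self) -> str:
        """Choose a backend for a new connection (round-robin, by assumption)."""
        if not self._backends:
            raise RuntimeError("service has no backends")
        backend = self._backends[self._next % len(self._backends)]
        self._next += 1
        return backend


portal = ServicePortal(
    "10.0.0.1:9376",
    ["10.240.1.1:8080", "10.240.2.2:8080", "10.240.3.3:8080"],
)
first_four = [portal.pick_backend() for _ in range(4)]

# One backend goes away; clients keep using the same portal address.
portal.update_backends(["10.240.1.1:8080", "10.240.3.3:8080"])
next_three = [portal.pick_backend() for _ in range(3)]
```

Clients only ever see `10.0.0.1:9376`, so backend churn stays hidden from them, which is the "hide complexity" point made on the earlier Services slide.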