The Magic of Kubernetes Self-Healing Capabilities

KubeCon EU 2019 Barcelona talk.
Abstract: https://sched.co/MPcA
Video: https://youtu.be/91dgNqma7-Q

Saad Ali

May 22, 2019

Transcript

  1. The Magic of Kubernetes Self-Healing Capabilities
    Saad Ali, Senior Software Engineer, Google
    May 22, 2019
    github.com/saad-ali
    twitter.com/the_saad_ali
  2. Problem
    • Kubernetes manages clusters ranging from a single node to 1000s of nodes
    • Failure is inevitable
    • Humans can’t keep up!
  3. Agenda
    • How Kubernetes Self Healing Works
    • Examples of Self Healing in Kubernetes
    • Areas for Improvement
  4. Declarative APIs
    Imperative API - Manual
    • You: provide exact set of instructions to drive to desired state
    • System: executes instructions
    • You: monitor system, and provide further instructions if it deviates.
    Declarative API - Automatic
    • You: define desired state
    • System: works to drive towards that state
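The contrast on this slide can be sketched in a few lines. This is a toy model, not Kubernetes code: the `cluster` set, the `frontend-N` naming scheme, and both function names are assumptions made up for illustration.

```python
def imperative_create(cluster, name):
    """Imperative style: the caller names each exact step and must re-check the result."""
    cluster.add(name)


def declarative_converge(cluster, desired_replicas, prefix="frontend"):
    """Declarative style: the caller states a replica count; this function finds the steps."""
    actions = []
    running = sorted(p for p in cluster if p.startswith(prefix + "-"))
    index = 0
    # Too few replicas: create pods under the first unused names.
    while len(running) < desired_replicas:
        name = f"{prefix}-{index}"
        if name not in cluster:
            cluster.add(name)
            running.append(name)
            actions.append(("create", name))
        index += 1
    # Too many replicas: delete from the end of the list.
    while len(running) > desired_replicas:
        victim = running.pop()
        cluster.remove(victim)
        actions.append(("delete", victim))
    return actions


# Hypothetical toy cluster: a set of running pod names.
cluster = set()
declarative_converge(cluster, 2)   # system works out that two pods are missing
cluster.discard("frontend-0")      # simulated failure
declarative_converge(cluster, 2)   # same call again repairs the deviation
```

With the imperative function, the caller would have to notice the lost pod and issue a new instruction. With the declarative one, repeating the same call is enough, because the steps are derived from the gap between desired and current state.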
  5. Declarative APIs - Creating a pod
    The Kubernetes way!
    • You: create API object that is persisted on kube API server until deletion
    • System: all components work in parallel to drive to that state
    Diagram labels: Master: API Server; Node A; Node B; kubectl create -f replica.yaml
  6. Declarative APIs - Creating a pod
    The Kubernetes way!
    • You: create API object that is persisted on kube API server until deletion
    • System: all components work in parallel to drive to that state
    Diagram labels: Master: API Server; Node A; Node B; kubectl create -f replica.yaml
    replica.yaml:
        apiVersion: apps/v1
        kind: ReplicaSet
        metadata:
          name: frontend
        spec:
          replicas: 1
          template:
            metadata:
              ...
            spec:
              ...
              containers:
              - name: nginx
                image: internal.mycorp.com:5000/mycontainer:1.7.9
  7. Declarative APIs - Creating a pod
    The Kubernetes way!
    • You: create API object that is persisted on kube API server until deletion
    • System: all components work in parallel to drive to that state
    Diagram labels: Master: API Server; Node A; Node B; Pod A definition
  8. Declarative APIs - Creating a pod
    The Kubernetes way!
    • You: create API object that is persisted on kube API server until deletion
    • System: all components work in parallel to drive to that state
    Diagram labels: Master: API Server; Node A; Node B; Pod A definition; Pod A
  9. Declarative APIs - Creating a pod
    All components watch the Kubernetes API, and figure out what they need to do.
    Diagram labels: Master: API Server; Node A; Node B; Master: Scheduler
  10. Declarative APIs - Creating a pod
    All components watch the Kubernetes API, and figure out what they need to do.
    Diagram labels: Master: API Server; Node A; Node B; Master: Scheduler; kubectl create pod
  11. Declarative APIs - Creating a pod
    All components watch the Kubernetes API, and figure out what they need to do.
    Diagram labels: Master: API Server; Node A; Node B; Master: Scheduler; Pod A
  12. Declarative APIs - Creating a pod
    All components watch the Kubernetes API, and figure out what they need to do.
    Diagram labels: Master: API Server; Node A; Node B; Master: Scheduler; Pod A; Node: B
  13. Declarative APIs - Creating a pod
    All components watch the Kubernetes API, and figure out what they need to do.
    Diagram labels: Master: API Server; Node A; Node B; Master: Scheduler; Pod A; Node: B; Pod A
  14. Declarative APIs - Creating a pod
    All components watch the Kubernetes API, and figure out what they need to do.
    Diagram labels: Master: API Server; Node A; Node B; Master: Scheduler; Pod A; Node: B; Pod A; kubectl delete pod A
  15. Declarative APIs - Creating a pod
    All components watch the Kubernetes API, and figure out what they need to do.
    Diagram labels: Master: API Server; Node A; Node B; Master: Scheduler; Pod A
  16. Declarative APIs - Creating a pod
    All components watch the Kubernetes API, and figure out what they need to do.
    Diagram labels: Master: API Server; Node A; Node B; Master: Scheduler
  17. Benefits of Declarative API
    • Level triggered instead of edge triggered: no “missing events” issues.
    • No single point of failure.
    • Simple master components.
    • Automatic recovery!
    Resulting in a simpler, more robust system that can easily recover from failure of components.
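The "missing events" point can be illustrated with a small simulation. This is a sketch under assumed names (`EdgeTriggeredCounter`, `LevelTriggeredCounter`, `simulate`, and the event strings are all hypothetical), not how any Kubernetes component is implemented. The edge-triggered consumer updates its view only from individual events; the level-triggered consumer reads the full current state.

```python
class EdgeTriggeredCounter:
    """Tracks the pod count from events only; a dropped event leaves it wrong."""

    def __init__(self):
        self.count = 0

    def on_event(self, kind):
        if kind == "pod_added":
            self.count += 1
        elif kind == "pod_deleted":
            self.count -= 1


class LevelTriggeredCounter:
    """Recomputes the pod count from the full current state on every sync."""

    def __init__(self):
        self.count = 0

    def sync(self, current_pods):
        self.count = len(current_pods)


def simulate(events, dropped_index):
    """Replay events, dropping the notification at dropped_index (None drops nothing).

    Returns (true pod count, edge-triggered view, level-triggered view).
    """
    pods = set()
    edge = EdgeTriggeredCounter()
    level = LevelTriggeredCounter()
    for i, (kind, name) in enumerate(events):
        # The real state always changes, whether or not the notification arrives.
        if kind == "pod_added":
            pods.add(name)
        else:
            pods.discard(name)
        if i != dropped_index:
            edge.on_event(kind)
    # The level-triggered consumer only needs one look at the current state.
    level.sync(pods)
    return len(pods), edge.count, level.count


events = [("pod_added", "a"), ("pod_added", "b"), ("pod_deleted", "a")]
truth, edge_view, level_view = simulate(events, dropped_index=2)
```

When the delete notification is dropped, the edge-triggered view stays one pod too high, while the level-triggered view matches the true count after its next sync.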
  18. Controllers
    • In-memory cache
      ◦ Desired State
      ◦ Actual State
    • Reconciler loop
    • Populator: adds and removes from desired state.
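The three pieces named on this slide can be put together in a minimal sketch. The `Controller` class, its method names, and the two plain sets standing in for the API server and the real system are assumptions for illustration; actual Kubernetes controllers are written in Go and are considerably more involved.

```python
class Controller:
    """Toy controller with a desired-state cache, an actual-state cache,
    a populator, and a reconciler."""

    def __init__(self, api_objects, world):
        self.api_objects = api_objects  # hypothetical stand-in for API server contents
        self.world = world              # hypothetical stand-in for the real system
        self.desired = set()            # desired state cache
        self.actual = set()             # actual state cache

    def populate(self):
        """Populator: add and remove entries so desired state mirrors the API."""
        for name in self.api_objects - self.desired:
            self.desired.add(name)
        for name in self.desired - self.api_objects:
            self.desired.remove(name)

    def reconcile(self):
        """Reconciler: act on the difference between desired and actual state."""
        actions = []
        for name in sorted(self.desired - self.actual):
            self.world.add(name)
            self.actual.add(name)
            actions.append(("start", name))
        for name in sorted(self.actual - self.desired):
            self.world.discard(name)
            self.actual.remove(name)
            actions.append(("stop", name))
        return actions


api = {"pod-a"}
world = set()
controller = Controller(api, world)
controller.populate()
controller.reconcile()   # starts pod-a
api.remove("pod-a")      # the user deletes the API object
controller.populate()
controller.reconcile()   # stops pod-a
```

In a real controller the populator and reconciler would run repeatedly in a loop; here each pass is called by hand so the effect of a single iteration is visible.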
  19. Example of Automatic Recovery
    Diagram labels: Master: API Server; Node A; Node B; Master: Scheduler; Master: Node Controller; Master: Replica Controller
  20. Actual State Drift
    • Should have a way to observe and rectify Actual State Cache
    • Not always easy to implement
      ◦ Example: Orphaned volume mounts.
      ◦ Room for improvement.
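One way to picture "observe and rectify" is a periodic resync that compares the cache against what is really there. The function name `resync_actual_state`, the `observe` callback, and the volume names below are hypothetical; the sketch also assumes that observing reality is cheap and reliable, which the slide notes is often not the case.

```python
def resync_actual_state(actual_cache, observe):
    """Correct the actual-state cache in place from a fresh observation.

    observe is a hypothetical callback returning the names really present.
    Returns (orphaned, stale): items found in reality but missing from the
    cache, and items in the cache that no longer exist.
    """
    observed = set(observe())
    orphaned = observed - actual_cache   # e.g. a mount the cache forgot about
    stale = actual_cache - observed      # e.g. a mount that vanished unnoticed
    actual_cache.update(orphaned)
    actual_cache.difference_update(stale)
    return sorted(orphaned), sorted(stale)


cache = {"vol-1", "vol-2"}
orphaned, stale = resync_actual_state(cache, lambda: ["vol-2", "vol-3"])
```

After the resync the cache matches the observation, so the next reconcile pass can clean up the orphaned entry instead of never learning that it exists.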
  21. Node Shutdown
    • Detection
      ◦ 5 minutes
    • Force detach
      ◦ 6 minutes
    • Attach volume
      ◦ Seconds to minutes
    • Starting new pod
      ◦ Seconds to minutes
    10+ minutes to detect a shut-down node and move its workload.
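The total on this slide can be checked with simple arithmetic. The sketch below assumes the four stages run back to back, which the slide does not state explicitly; the 30-second figures used for attach and pod start are made-up sample values within the "seconds to minutes" range.

```python
from datetime import timedelta

# Fixed stages taken from the slide.
DETECTION = timedelta(minutes=5)
FORCE_DETACH = timedelta(minutes=6)


def recovery_time(attach, start_pod):
    """Total time to move a workload off a shut-down node,
    assuming the stages happen strictly one after another."""
    return DETECTION + FORCE_DETACH + attach + start_pod


# Lower bound: attach and pod start take no time at all.
lower_bound = recovery_time(timedelta(0), timedelta(0))

# Sample case with assumed 30-second attach and 30-second pod start.
sample = recovery_time(timedelta(seconds=30), timedelta(seconds=30))
```

Under that assumption the two fixed stages alone account for 11 minutes, which is consistent with the "10+ minutes" figure on the slide.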