Kubernetes Resource and Eviction Management

1. Resource types of Pod
2. Resource requests and limits
3. QoS classes
4. Node Behavior : Eviction Policy & oom_killer
5. Experience Sharing

ydFu(Ader Fu)

May 21, 2020

  1. Outline
    ⚫ Resource types of Pod
    ⚫ Resource requests and limits
    ⚫ QoS classes
    ⚫ Node Behavior:
      ➢ Eviction Policy & oom_killer
    ⚫ Experience Sharing
  2. Resource types
    • The Kubernetes scheduler uses resource types to figure out where to run your pods.
    • CPU and memory are collectively referred to as compute resources. Compute resources are measurable quantities that can be requested, allocated, and consumed.
    • CPU is specified in CPU units (cores), commonly written in millicores (e.g. 100m).
    • Memory is specified in units of bytes.
    https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#resource-types
  3. Compressible resource
    • Compressible resource
      ◆ Hold no state
      ◆ Can be taken away very quickly
      ◆ “Merely” cause slowness when revoked
      ◆ e.g. CPU, disk time
    • Incompressible resource
      ◆ Hold state
      ◆ Are slower to be taken away
      ◆ Can fail to be revoked
      ◆ e.g. Memory, disk space
    https://www.slideshare.net/damianigbe/kubernetes-scheduling-and-qos
  4. Resource types: CPU
    • CPU resources are measured in cpu units.
    • One CPU, in Kubernetes, is equivalent to:
      ◆ 1 AWS vCPU
      ◆ 1 GCP Core
      ◆ 1 Azure vCore
      ◆ 1 IBM vCPU
      ◆ 1 Hyperthread on a bare-metal Intel processor with Hyperthreading
    • Unit form: fractional values are allowed, and 0.1 is equivalent to 100m; the form 100m might be preferred.
    • CPU is considered a “compressible” resource.
    https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#resource-types
  5. Resource types: Memory
    • Memory resources are measured in bytes.
    • Unit form:
      ◆ a plain integer or a fixed-point number using one of these suffixes: E, P, T, G, M, k
      ◆ the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki
    • Memory is considered an “incompressible” resource.
    https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#resource-types
  6. Unit of Resource types
    Decimal (Metric)                   Binary (IEC)
    Value     Unit                     Value     Unit
    1000      KB   kilobyte            1024      KiB  kibibyte
    1000^2    MB   megabyte            1024^2    MiB  mebibyte
    1000^3    GB   gigabyte            1024^3    GiB  gibibyte
    1000^4    TB   terabyte            1024^4    TiB  tebibyte
    1000^5    PB   petabyte            1024^5    PiB  pebibyte
    1000^6    EB   exabyte             1024^6    EiB  exbibyte
    1000^7    ZB   zettabyte           1024^7    ZiB  zebibyte
    1000^8    YB   yottabyte           1024^8    YiB  yobibyte
    https://en.wikipedia.org/wiki/Mebibyte
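The suffix rules above can be made concrete with a small converter. This is a minimal sketch, not the Kubernetes quantity parser: the function names `memory_to_bytes` and `cpu_to_millicores` are hypothetical, and only the suffixes listed on the slides are handled.

```python
from decimal import Decimal

# Multipliers for the decimal (metric) and binary (IEC) suffixes.
# Kubernetes spells the decimal kilo suffix as a lowercase "k".
DECIMAL = {"k": 1000, "M": 1000**2, "G": 1000**3,
           "T": 1000**4, "P": 1000**5, "E": 1000**6}
BINARY = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3,
          "Ti": 1024**4, "Pi": 1024**5, "Ei": 1024**6}


def memory_to_bytes(quantity):
    """Convert a memory quantity such as "200Mi" or "1G" to bytes."""
    for suffix, factor in BINARY.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[:-len(suffix)]) * factor)
    for suffix, factor in DECIMAL.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[:-len(suffix)]) * factor)
    return int(Decimal(quantity))  # plain number of bytes


def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as "700m", "0.1" or "2" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)
```

For example, `memory_to_bytes("200Mi")` and `memory_to_bytes("200M")` differ by almost 10 MB, which is why the binary and decimal suffixes should not be mixed up in a manifest.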
  7. Resource requests and limits
    • Cgroups are used to map Pod CPU and Memory Resources
    Layer                        CPU                                      Memory
    Kubernetes (Pod Manifest)    CPU Requests, CPU Limits                 MEM Requests, MEM Limits
    OS (Linux Kernel)            CPU Shares, CPU Quota, CPU Period        OOM Score Adj., MEM Limits
    ESXi (Host)                  CPU Shares, CPU Reservation, CPU Limit   MEM Shares, MEM Reservation, MEM Limit
    https://schd.ws/hosted_files/kccnceu18/33/Inside%20Kubernetes%20QoS%20M.%20Gasch%20KubeCon%20EU%20FINAL.pdf
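How CPU requests and limits turn into CPU shares and a CPU quota per period can be sketched numerically. The constants below (1024 shares per CPU, a 100 ms CFS period, and the minimum and maximum clamps) are assumptions based on the commonly documented cgroup v1 behavior of the kubelet, not values taken from the slides, and the function names are hypothetical.

```python
# Assumed constants (cgroup v1 conventions, not stated in the slides).
SHARES_PER_CPU = 1024
MIN_SHARES = 2
MAX_SHARES = 262144
CFS_PERIOD_US = 100_000   # 100 ms scheduling period
MIN_QUOTA_US = 1_000


def cpu_request_to_shares(millicores):
    """Map a CPU request in millicores to a relative cpu.shares weight."""
    if millicores == 0:
        return MIN_SHARES
    shares = millicores * SHARES_PER_CPU // 1000
    return min(max(shares, MIN_SHARES), MAX_SHARES)


def cpu_limit_to_quota(millicores, period_us=CFS_PERIOD_US):
    """Map a CPU limit in millicores to a CFS quota in microseconds.

    A result of 0 stands for "no limit set" in this sketch.
    """
    if millicores == 0:
        return 0
    return max(millicores * period_us // 1000, MIN_QUOTA_US)
```

Under these assumptions, a container with a request and limit of `700m` gets a weight of 716 shares and may run for 70 ms out of every 100 ms period.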
  8. How Pods with resources are scheduled and run
    ➢ How are Pods with resource requests scheduled?
      • When you create a Pod, the Kubernetes scheduler selects a node for the Pod to run on. Each node has a maximum capacity for each of the resource types: the amount of CPU and memory it can provide for Pods. The scheduler ensures that, for each resource type, the sum of the requests of the scheduled Containers stays below the capacity of the node.
    ➢ How are Pods with resource limits run?
      • When the kubelet starts a Container of a Pod, it passes the CPU and memory limits to the container runtime.
    https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#how-pods-with-resource-requests-are-scheduled
  9. Note
    • Node: It is important to remember that you cannot set requests that are larger than the resources provided by your nodes. For example, if you have a cluster of dual-core machines, a Pod with a request of 2.5 cores will never be scheduled!
    • Pod: Each container in the Pod can set its own requests and limits, and these are all additive.
    https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits
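Both notes can be checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the real scheduler compares against node allocatable resources and applies many more filters.

```python
def pod_request_millicores(container_requests):
    """Container requests are additive: the Pod request is their sum."""
    return sum(container_requests)


def fits_on_node(node_capacity_m, already_requested_m, pod_request_m):
    """A Pod fits only if its request plus existing requests stay within capacity."""
    return already_requested_m + pod_request_m <= node_capacity_m


# Two containers requesting 1.5 and 1 core add up to 2.5 cores,
# which can never fit on a dual-core (2000m) node, even when it is empty.
pod = pod_request_millicores([1500, 1000])
schedulable = fits_on_node(node_capacity_m=2000, already_requested_m=0, pod_request_m=pod)
```

Here `schedulable` is `False`, so such a Pod would stay Pending no matter how idle the nodes are.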
  10. QoS classes
    Guaranteed
      + Predictable SLA and highest Priority (Eviction)
      - Lower Efficiency (Resources capped, no Overcommit)
    Burstable
      + Increase Overcommit Level, use idle Resources*
      - Medium Priority (Eviction), unbounded Resources*
    Best Effort
      + High Resource Efficiency & Utilization
      - Resource Starvation and Eviction very likely
    https://schd.ws/hosted_files/kccnceu18/33/Inside%20Kubernetes%20QoS%20M.%20Gasch%20KubeCon%20EU%20FINAL.pdf
  11. QoS classes
    QoS Examples (R = Requests, L = Limits, for CPU and Memory)
    Class          Example Pod           Condition
    Best Effort    Pod (1 Container)     0 = R = L (all Containers)
    Guaranteed     Pod (1 Container)     0 < R = L (all Containers)
    Burstable      Pod (2 Containers)    0 < R <= (L) (at least one Container)
    • Classes are calculated based on CPU and Memory Resource Specifications (Requests/Limits)
    https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#qos-classes
  12. Assigned a QoS class of Guaranteed
    • For a Pod to be given a QoS class of Guaranteed:
      ◆ Every Container in the Pod must have a memory limit and a memory request, and they must be the same.
      ◆ Every Container in the Pod must have a CPU limit and a CPU request, and they must be the same.
    https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-guaranteed
  13. Specification: Guaranteed Pod
      apiVersion: v1
      kind: Pod
      metadata:
        name: qos-demo
        namespace: qos-example
      spec:
        containers:
        - name: qos-demo-ctr
          image: nginx
          resources:
            limits:
              memory: "200Mi"
              cpu: "700m"
            requests:
              memory: "200Mi"
              cpu: "700m"
    https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-guaranteed
  14. Note: Guaranteed Pod
    • If a Container specifies its own memory limit, but does not specify a memory request, Kubernetes automatically assigns a memory request that matches the limit.
    • If a Container specifies its own CPU limit, but does not specify a CPU request, Kubernetes automatically assigns a CPU request that matches the limit.
    https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-guaranteed
  15. Assigned a QoS class of Burstable
    • A Pod is given a QoS class of Burstable if:
      ◆ The Pod does not meet the criteria for QoS class Guaranteed.
      ◆ At least one Container in the Pod has a memory or CPU request.
    https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-burstable
  16. Specification: Burstable Pod
      apiVersion: v1
      kind: Pod
      metadata:
        name: qos-demo-2
        namespace: qos-example
      spec:
        containers:
        - name: qos-demo-2-ctr
          image: nginx
          resources:
            limits:
              memory: "200Mi"
            requests:
              memory: "100Mi"
    https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-burstable
  17. Assigned a QoS class of BestEffort
    • For a Pod to be given a QoS class of BestEffort:
      ◆ The Containers in the Pod must not have any memory or CPU limits or requests.
    https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-besteffort
  18. Specification: BestEffort Pod
      # pods/qos/qos-pod-3.yaml
      apiVersion: v1
      kind: Pod
      metadata:
        name: qos-demo-3
        namespace: qos-example
      spec:
        containers:
        - name: qos-demo-3-ctr
          image: nginx
    https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-besteffort
  19. Summary: QoS classes setting
    Resource types (set up; O = set, X = not set)
      limits    requests    QoS class
      O         O           Guaranteed
      O         X           Guaranteed
      X         O           Burstable
      X         X           Best Effort
    Resource types (value)
      limits vs. requests    QoS class
      limits=requests        Guaranteed
      limits>requests        Burstable
      limits<requests        Burstable (as computed by qos.go; the API server rejects requests above limits)
    • QoS classes Source Code: kubernetes/pkg/apis/core/v1/helper/qos/qos.go
      https://github.com/kubernetes/kubernetes/blob/5713c22eecff4610026643fbd3d37c33a43c168d/pkg/apis/core/v1/helper/qos/qos.go
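The classification rules from the preceding slides can be condensed into one function. This is a simplified sketch, not a port of `qos.go`: the name `qos_class` and the dict-based container shape are hypothetical, and quantities are compared as written strings rather than as parsed values.

```python
def qos_class(containers):
    """Classify a Pod from its containers' requests and limits.

    Each container is a dict with optional "requests" and "limits" dicts
    keyed by "cpu" and "memory".
    """
    anything_set = False
    guaranteed = True
    for container in containers:
        limits = container.get("limits", {})
        # A missing request defaults to the matching limit.
        requests = {**limits, **container.get("requests", {})}
        if requests or limits:
            anything_set = True
        for resource in ("cpu", "memory"):
            # Guaranteed needs a limit for both resources, equal to the request.
            if resource not in limits or requests.get(resource) != limits[resource]:
                guaranteed = False
    if not anything_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

Applied to the three example manifests, `qos-demo` is classified as Guaranteed, `qos-demo-2` as Burstable, and `qos-demo-3` as BestEffort.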
  20. QoS Classes and Node Behavior
    • Resources are either compressible (CPU) or incompressible (Memory, Storage)
    • Compressible = Throttling (Weight: cpu.shares)
    • Incompressible = Evict (Kubelet) or OOM_kill (“OutOfMemory Killer” by Kernel)
    • Kubelet Eviction Thresholds can be “hard” (instantly) and “soft” (allow Pod Termination Grace Period)
    • Note:
      ◆ If Kubelet cannot react fast enough, e.g. Memory Spike, Kernel OOM kills Container
      ◆ There’s no Coordination between Eviction and OOM Killer (Race Condition possible)
    https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#qos-classes
  21. Eviction Policy
    • The kubelet needs to preserve node stability when available compute resources are low. This is especially important when dealing with incompressible compute resources, such as memory or disk space.
    Eviction Signal         Default hard eviction threshold
    memory.available        memory.available<100Mi
    nodefs.available        nodefs.available<10%
    nodefs.inodesFree       nodefs.inodesFree<5%
    imagefs.available       imagefs.available<15%
    imagefs.inodesFree      (none listed)
    https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#eviction-policy
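To illustrate how the default hard thresholds are read, the sketch below compares observed signal values against them. It is a toy model with hypothetical names (`HARD_THRESHOLDS`, `signals_under_threshold`); percentage thresholds are interpreted relative to the capacity passed in with each observation.

```python
MI = 1024 ** 2
GI = 1024 ** 3

# Default hard eviction thresholds from the table above.
HARD_THRESHOLDS = {
    "memory.available": ("bytes", 100 * MI),
    "nodefs.available": ("percent", 10),
    "nodefs.inodesFree": ("percent", 5),
    "imagefs.available": ("percent", 15),
}


def signals_under_threshold(observed):
    """Return the signals whose available amount is below the hard threshold.

    `observed` maps a signal name to an (available, capacity) pair.
    """
    breached = []
    for signal, (kind, value) in HARD_THRESHOLDS.items():
        if signal not in observed:
            continue
        available, capacity = observed[signal]
        limit = value if kind == "bytes" else capacity * value / 100
        if available < limit:
            breached.append(signal)
    return breached


breached = signals_under_threshold({
    "memory.available": (80 * MI, 8 * GI),    # 80Mi free: below 100Mi
    "nodefs.available": (20 * GI, 100 * GI),  # 20% free: above 10%
})
```

In this example only `memory.available` is reported, so memory would be the resource the kubelet tries to reclaim.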
  22. Soft Eviction Thresholds
    • A soft eviction threshold pairs an eviction threshold with a required administrator-specified grace period. No action is taken by the kubelet to reclaim resources associated with the eviction signal until that grace period has been exceeded.
    • The following flags configure soft eviction thresholds:
      ◆ eviction-soft
      ◆ eviction-soft-grace-period
      ◆ eviction-max-pod-grace-period
    https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#soft-eviction-thresholds
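The grace-period behavior can be modeled as a small state machine: the clock starts when the threshold is first crossed and resets as soon as the signal recovers. The class below is a hypothetical illustration of that idea, not kubelet code.

```python
class SoftThreshold:
    """Track how long a soft eviction threshold has been continuously crossed."""

    def __init__(self, grace_period_s):
        self.grace_period_s = grace_period_s
        self.crossed_since = None  # time of the first observation under threshold

    def observe(self, crossed, now):
        """Record one observation; return True once eviction may start."""
        if not crossed:
            self.crossed_since = None  # signal recovered, reset the clock
            return False
        if self.crossed_since is None:
            self.crossed_since = now
        return now - self.crossed_since > self.grace_period_s


threshold = SoftThreshold(grace_period_s=90)
history = [
    threshold.observe(True, now=0),     # crossed, clock starts
    threshold.observe(True, now=60),    # still inside the grace period
    threshold.observe(False, now=70),   # recovered, clock resets
    threshold.observe(True, now=80),    # crossed again, clock restarts
    threshold.observe(True, now=171),   # 91 s under threshold: grace exceeded
]
```

Only the last observation returns `True`, because the brief recovery at 70 s restarted the grace period.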
  23. Hard Eviction Thresholds
    • A hard eviction threshold has no grace period, and if observed, the kubelet will take immediate action to reclaim the associated starved resource. If a hard eviction threshold is met, the kubelet kills the Pod immediately with no graceful termination.
    • The following flag configures hard eviction thresholds:
      ◆ eviction-hard
    https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#hard-eviction-thresholds
  24. Evicting end-user Pods
    • If the kubelet is unable to reclaim sufficient resources on the node, it begins evicting Pods.
    • The kubelet ranks and evicts Pods in the following order:
      1. BestEffort
      2. Burstable
      3. Guaranteed
    https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#evicting-end-user-pods
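The ordering above can be expressed as a sort key. This sketch encodes only the QoS order listed on the slide; the actual kubelet ranking also considers how far a Pod's usage exceeds its requests and the Pod priority, which are left out here. The names `EVICTION_ORDER` and `rank_for_eviction` are hypothetical.

```python
# Lower rank means evicted earlier.
EVICTION_ORDER = {"BestEffort": 0, "Burstable": 1, "Guaranteed": 2}


def rank_for_eviction(pods):
    """Sort (name, qos_class) pairs so that the first entry is evicted first."""
    return sorted(pods, key=lambda pod: EVICTION_ORDER[pod[1]])


ranked = rank_for_eviction([
    ("db", "Guaranteed"),
    ("batch-job", "BestEffort"),
    ("web", "Burstable"),
])
```

With this input the BestEffort Pod `batch-job` is the first candidate and the Guaranteed Pod `db` is the last.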
  25. Node OOM Behavior
    • The kubelet sets an oom_score_adj value for each container based on the quality of service for the Pod.
    Quality of Service    oom_score_adj
    Guaranteed            -998
    BestEffort            1000
    Burstable             min(max(2, 1000 - (1000 * memoryRequestBytes) / machineMemoryCapacityBytes), 999)
    https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#node-oom-behavior
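A worked example makes the Burstable formula easier to read. The function below transcribes the table directly; the name `oom_score_adj` and the use of integer division are assumptions of this sketch.

```python
MI = 1024 ** 2
GI = 1024 ** 3


def oom_score_adj(qos, memory_request_bytes=0, machine_memory_capacity_bytes=None):
    """Return the oom_score_adj value for a container of the given QoS class."""
    if qos == "Guaranteed":
        return -998
    if qos == "BestEffort":
        return 1000
    # Burstable: the larger the memory request relative to the node,
    # the lower the score and the less likely the container is killed.
    adj = 1000 - (1000 * memory_request_bytes) // machine_memory_capacity_bytes
    return min(max(2, adj), 999)


small = oom_score_adj("Burstable", 100 * MI, 8 * GI)  # requests about 1.2% of the node
large = oom_score_adj("Burstable", 4 * GI, 8 * GI)    # requests 50% of the node
```

On an 8Gi node, the container requesting 100Mi ends up with a score of 988 and the one requesting 4Gi with 500, so the kernel OOM killer prefers the smaller requester when memory runs out.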
  26. Node OOM Behavior
    • The kubelet may not observe memory pressure right away.
    • The kubelet currently polls cAdvisor to collect memory usage stats at a regular interval. If memory usage increases rapidly within that window, the kubelet may not observe MemoryPressure fast enough, and the OOMKiller will still be invoked.
    • Viable workaround: set eviction thresholds at approximately 75% capacity.
  27. Experience Sharing: cpuset
    • The static policy allows containers in Guaranteed pods with integer CPU requests access to exclusive CPUs on the node.
      spec:
        containers:
        - name: nginx
          image: nginx
          resources:
            limits:
              memory: "200Mi"
              cpu: "2"
            requests:
              memory: "200Mi"
              cpu: "2"
    https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy
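The eligibility rule for exclusive CPUs reduces to two checks, sketched here with a hypothetical helper that takes the Pod's QoS class and the container's CPU request in millicores.

```python
def gets_exclusive_cpus(qos, cpu_request_millicores):
    """True when a container qualifies for exclusive CPUs under the static policy.

    The Pod must be Guaranteed and the CPU request a whole number of cores.
    """
    return (
        qos == "Guaranteed"
        and cpu_request_millicores >= 1000
        and cpu_request_millicores % 1000 == 0
    )


pinned = gets_exclusive_cpus("Guaranteed", 2000)   # cpu: "2"
shared = gets_exclusive_cpus("Guaranteed", 700)    # cpu: "700m"
```

A Guaranteed container with `cpu: "2"` qualifies, while one with `cpu: "700m"` stays in the shared CPU pool even though its Pod is also Guaranteed.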
  28. Experience Sharing: DaemonSet
    • Protect critical (System) Pods (DaemonSets, Controllers, Master Components)
    • It is never desired for kubelet to evict a DaemonSet Pod, since the Pod is immediately recreated and rescheduled back to the same node.
    • Instead, a DaemonSet should ideally launch Guaranteed Pods.
    https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#daemonset
  29. Ref.
    ➢ Inside Kubernetes Resource Management (QoS) – Mechanics and Lessons from the Field - Michael Gasch
      • https://www.youtube.com/watch?v=8-apJyr2gi0
      • https://schd.ws/hosted_files/kccnceu18/33/Inside%20Kubernetes%20QoS%20M.%20Gasch%20KubeCon%20EU%20FINAL.pdf