§ CPU cores vs. threads
§ CPU core frequencies: base, turbo, throttling
§ CPU usage: Shared, Exclusive, Isolated
§ Additional CPU resources: Cache, Memory Bandwidth
§ Interconnect is an expensive resource
§ “C” in “NUMA” stands for “CPU”
  – Vendors have BIOS-configurable settings that redefine what NUMA means (SNC, NPS, …)
§ Workload migration cost: low to very low
§ CPUs as seen by the Kubelet might not mean the same thing inside VM-based runtimes
Topology-Aware CPU policy
[Figure: topology tree of sockets, dies, and CPUs, with levels labeled physical_package_id, die_id, core_id; for example, Socket 1 holds Die 0 (CPU 4, CPU 5) and Die 1 (CPU 6, CPU 7)]
For each leaf node
§ Groups of CPU + Memory
§ Dynamic pools
  – Shared
  – Exclusive
  – Isolated
  – “Throttled”
  – …
Parent nodes
§ Sum of subtree resources
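The tree on this slide can be sketched in a few lines: leaf nodes own CPUs, and every parent reports the union of its subtree. This is a minimal illustration, not the policy's actual implementation. The `Node` class, `build_tree`, and the `SAMPLE_TOPOLOGY` table are hypothetical; on Linux the per-CPU `physical_package_id` and `die_id` values would normally be read from sysfs, but here they are hard-coded so the example runs anywhere.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the topology tree (hypothetical helper type)."""
    name: str
    cpus: set = field(default_factory=set)       # CPUs owned directly (leaves only)
    children: list = field(default_factory=list)

    def total_cpus(self):
        """Return the CPUs of this node plus everything in its subtree."""
        result = set(self.cpus)
        for child in self.children:
            result |= child.total_cpus()
        return result


def build_tree(cpu_topology):
    """Build root -> socket -> die from {cpu_id: (physical_package_id, die_id)}."""
    root = Node("root")
    sockets, dies = {}, {}
    for cpu, (pkg, die) in sorted(cpu_topology.items()):
        if pkg not in sockets:
            sockets[pkg] = Node(f"socket{pkg}")
            root.children.append(sockets[pkg])
        if (pkg, die) not in dies:
            dies[(pkg, die)] = Node(f"socket{pkg}/die{die}")
            sockets[pkg].children.append(dies[(pkg, die)])
        dies[(pkg, die)].cpus.add(cpu)
    return root


# Assumed sample: 2 sockets x 2 dies x 2 CPUs, numbered 0..7.
SAMPLE_TOPOLOGY = {cpu: (cpu // 4, (cpu // 2) % 2) for cpu in range(8)}

if __name__ == "__main__":
    tree = build_tree(SAMPLE_TOPOLOGY)
    for socket in tree.children:
        print(socket.name, sorted(socket.total_cpus()))
```

A real policy would also track memory and the per-pool split (shared, exclusive, isolated) at each leaf; the sketch only covers the "parent = sum of subtree" rule.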
§ Memory types
  – DRAM
  – “Persistent”, in “volatile, system RAM mode” (PMEM)
  – High Bandwidth (HBM)
§ Kernel’s “Normal” vs. “Movable”
§ NUMA
  – Distances
  – Have CPU
  – Have “normal” memory
§ Workload migration costs: medium to HIGH
MemTier Topology-Aware policy
[Figure: hierarchy of sockets, dies, cores, and threads; each die has “IO and Memory” and “CPUs” blocks, with DRAM, PMEM, and HBM attached]
Each node
§ CPU
  – CPU-less NUMA nodes are linked to nodes with CPUs
§ All memory types
  – DRAM
  – PMEM
  – HBM
§ Placement cost calculated based on
  – Requested memory type(s) and amount of available memory
  – Later: bandwidth (BW), working set size (WSS)
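The slide does not give the cost formula, so the following is only a sketch of what "cost based on requested memory type(s) and available memory" could look like. The scoring rule (fraction of free memory consumed, infinite cost when a type is missing or too small) and the names `placement_cost` and `pick_node` are assumptions made for illustration.

```python
import math


def placement_cost(node_free, request):
    """Hypothetical cost: sum over requested types of requested/free.

    node_free and request both map a memory type ("DRAM", "PMEM", "HBM")
    to an amount in the same unit. A node that cannot satisfy some
    requested type gets an infinite cost.
    """
    cost = 0.0
    for mem_type, amount in request.items():
        if amount == 0:
            continue
        free = node_free.get(mem_type, 0)
        if amount > free:
            return math.inf
        cost += amount / free
    return cost


def pick_node(nodes, request):
    """Return the name of the cheapest node, or None if nothing fits."""
    best = min(nodes, key=lambda name: placement_cost(nodes[name], request))
    if math.isinf(placement_cost(nodes[best], request)):
        return None
    return best


if __name__ == "__main__":
    # Assumed free memory per node, in GiB.
    nodes = {
        "node0": {"DRAM": 8, "PMEM": 64},
        "node1": {"DRAM": 16},
        "node2": {"DRAM": 4, "HBM": 2},
    }
    print(pick_node(nodes, {"DRAM": 4}))
    print(pick_node(nodes, {"DRAM": 4, "PMEM": 16}))
    print(pick_node(nodes, {"HBM": 4}))
```

Bandwidth and working set size, listed as later inputs on the slide, would enter as extra terms in the same sum.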
The story of jar, rocks, pebbles and sand…
§ … public cloud, VMs, bare metal is hard
  – Especially for non-trivial resources (block I/O, caches, …)
§ Workload placement can run into situations where it can’t be done
  – Reject?
  – Rebalance?
§ Rebalancing running workloads can also be hard
  – Assigned devices
  – Memory migrations
  – Priorities
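One common reading of the jar analogy is that placement order matters: put the rocks in first and the sand still fits, but not the other way round. The sketch below shows this with a plain first-fit placer over a single resource dimension. The function name `first_fit` and the sample sizes are made up for illustration; real placement also has to handle devices, memory, and priorities, as the slide notes.

```python
def first_fit(requests, capacities, largest_first):
    """Place each request on the first node with enough room.

    Returns (placed, rejected): placed is a list of (size, node_index),
    rejected is the list of sizes that did not fit anywhere.
    """
    free = list(capacities)
    order = sorted(requests, reverse=True) if largest_first else list(requests)
    placed, rejected = [], []
    for size in order:
        for index, room in enumerate(free):
            if size <= room:
                free[index] -= size
                placed.append((size, index))
                break
        else:
            # No node had room: the "reject or rebalance?" question arises here.
            rejected.append(size)
    return placed, rejected


if __name__ == "__main__":
    requests = [2, 2, 6, 6]      # sand arrives before the rocks
    capacities = [8, 8]          # two nodes, 8 units each

    _, rejected = first_fit(requests, capacities, largest_first=False)
    print("arrival order, rejected:", rejected)    # [6]

    _, rejected = first_fit(requests, capacities, largest_first=True)
    print("largest first, rejected:", rejected)    # []
```

In arrival order the two small requests fragment the first node and one large request is rejected, even though total capacity is sufficient. A scheduler rarely sees all requests up front, which is why the rebalance option exists.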
§ Users should expect “it just works great” by default
§ Advanced users should be able to use good patterns on resource groups
  – Affinity/anti-affinity pattern
  – Device pipelines
§ Solutions we build now should be aligned with where hardware is evolving
§ Maybe not in Kubelet…?
a Container Runtime Interface proxy
• sits between CRI clients and the CRI runtime
• applies (hardware) resource policies to containers
  CPU, Memory, Cache, Memory Bandwidth, Block I/O, …
• policies are applied by
  • modifying proxied container requests, or
  • generating container update requests, or
  • triggering extra policy-specific actions during request processing
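The first mechanism on this slide, modifying proxied container requests, can be sketched as follows. This is a toy model, not the real CRI gRPC API: requests are plain dictionaries, and `FakeRuntime`, `Proxy`, `pin_cpus_policy`, and the field names are all hypothetical.

```python
class FakeRuntime:
    """Stand-in for the CRI runtime; records the requests it receives."""

    def __init__(self):
        self.received = []

    def handle(self, request):
        self.received.append(request)
        return {"ok": True}


def pin_cpus_policy(request):
    """Hypothetical policy: pin every new container to CPUs 4-5."""
    if request["method"] != "CreateContainer":
        return request
    modified = dict(request)                      # do not mutate the caller's request
    resources = dict(request.get("resources", {}))
    resources["cpuset_cpus"] = "4-5"
    modified["resources"] = resources
    return modified


class Proxy:
    """Sits between the client and the runtime and applies policies in order."""

    def __init__(self, runtime, policies):
        self.runtime = runtime
        self.policies = policies

    def handle(self, request):
        for policy in self.policies:
            request = policy(request)
        return self.runtime.handle(request)


if __name__ == "__main__":
    runtime = FakeRuntime()
    proxy = Proxy(runtime, [pin_cpus_policy])
    proxy.handle({"method": "CreateContainer", "resources": {"cpu_shares": 1024}})
    print(runtime.received[0]["resources"])
```

The other two mechanisms (generating update requests and triggering policy-specific actions) would need the proxy to originate calls of its own rather than only rewriting what passes through.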