CPU shielding on Docker and Kubernetes

Kenta Tada
December 05, 2020

Transcript

  1. About me
    ⚫ Kenta Tada
    ⚫ Software Engineer, Sony
    ⚫ CloudNative Days Tokyo 2020
      • https://speakerdeck.com/kentatada/embedded-container-runtime-using-linux-capabilities-seccomp-cgroups
  2. Background
    ⚫ We want to run realtime (RT) processes in our embedded container environment.
    ⚫ There are many things to think about when RT processes run in a container environment.
      • Integrate tools for RT into Kubernetes : Today’s Topic
      • Security : https://blogs.oracle.com/linux/dealing-with-realtime-processes-in-linux-user-namespaces
    [Diagram: Linux host with Core 0 to Core 3. Labels: dockerd, RT process (inside container), Non-RT process, kernel thread, Interrupt, CPU isolation]
  3. What is an RT process
    ⚫ First in my words, real-time is not about the lowest possible latency or the maximum possible throughput. Real-time is deterministic execution time. Deterministic execution time means performing tasks within a certain time, this not being affected by any external process.
      • https://www.redhat.com/en/blog/going-full-deterministic-using-real-time-openstack
    ⚫ Making a process real-time involves many tasks, and containerization makes it more difficult.
    ⚫ Today, I cover only the issues of CPU shielding in a container environment.
  4. What is CPU shielding
    ⚫ CPU shielding is a practice where on a multiprocessor system or on a CPU with multiple cores, real-time tasks can run on one CPU or core while non-real-time tasks run on another.
      • https://en.wikipedia.org/wiki/CPU_shielding
    ⚫ Use cases
      • Isolating RT processes
      • Thermal throttling
        – This use case needs only cpuset.
        – Reduce power consumption by pinning background threads that are not performance-critical to LITTLE CPUs.
      • NFV (Network Functions Virtualization)
        – Improve NFV performance and prevent spurious packet loss.
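
The split described on this slide can be sketched for plain user processes with CPU affinity alone. This is a minimal illustration rather than anything from the original deck: `shield_plan` and `pin` are hypothetical helper names, the four-core layout with core 0 reserved for RT work is an assumption, and `os.sched_setaffinity` is available on Linux only.

```python
import os

def shield_plan(online_cpus, rt_cpus):
    """Split the online CPUs into a shielded set for RT tasks and the rest."""
    online = set(online_cpus)
    shielded = set(rt_cpus) & online
    return {"rt": shielded, "non_rt": online - shielded}

def pin(pid, cpus):
    """Restrict a process to the given CPUs (Linux only; pid 0 means the caller)."""
    os.sched_setaffinity(pid, cpus)

# Assumed layout: four cores, core 0 reserved for the RT task.
plan = shield_plan(range(4), {0})
```

Calling `pin(rt_pid, plan["rt"])` and `pin(other_pid, plan["non_rt"])` would apply the plan to two user processes. Affinity set this way does not touch kernel threads or interrupts.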
  5. CPU shielding
    ⚫ User processes
      • Isolate the specified core to launch RT processes
    ⚫ Kernel threads
      • Move kernel threads off the isolated core
        – Ex. Use cset. The “isolcpus” kernel boot option cannot isolate kernel threads. Since kernel version 5.9, “nohz_full” supports it, except for kernel threads bound to a specific CPU.
      • Set dynamic tickless behaviour
        – Ex. Set the “nohz_full” kernel boot option
      • Stop RCU callbacks
        – Ex. Set the “rcu_nocbs” kernel boot option
      • Set CPU affinity for workqueues
        – Ex. Modify the cpumasks in /sys/devices/virtual/workqueue
      and so on…
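
Both “nohz_full” and “rcu_nocbs” take a CPU list such as `1-3`. As a rough sketch, the fragment of the kernel command line for a given set of isolated cores could be generated like this. The helper names `format_cpu_list` and `shield_boot_params` are made up for illustration, and which cores to isolate is an assumption that depends on your system.

```python
def format_cpu_list(cpus):
    """Render CPU numbers in the kernel's list format, e.g. {0, 2, 3} -> '0,2-3'."""
    ranges = []
    for cpu in sorted(set(cpus)):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)

def shield_boot_params(isolated_cpus):
    """Build the nohz_full and rcu_nocbs boot options for the isolated CPUs."""
    cpu_list = format_cpu_list(isolated_cpus)
    return f"nohz_full={cpu_list} rcu_nocbs={cpu_list}"

# Assumed example: isolate cores 1 to 3 and keep core 0 for housekeeping.
params = shield_boot_params({1, 2, 3})
```

The resulting string would be appended to the existing kernel command line in the bootloader configuration; how that is done differs per distribution and board.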
  6. CPU shielding
    ⚫ Interrupts
      • Set CPU affinity for interrupts
        – Ex. Modify files under /proc/irq
      • Change the interrupt handler from irq context to kernel thread
        – Ex. Set the “threadirqs” kernel boot option
      and so on…
    You should adjust these settings to your use case.
    OK! What about CPU shielding inside a container??
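
The files under /proc/irq include `smp_affinity`, which holds a hexadecimal CPU bitmask per interrupt. The sketch below computes that mask for a set of housekeeping cores. It assumes a machine with at most 32 CPUs (larger masks are split into comma-separated groups), and the helper names and IRQ number 42 are hypothetical.

```python
from pathlib import Path

def cpu_mask_hex(cpus):
    """Encode a set of CPU numbers as a hex bitmask, e.g. {1, 2, 3} -> 'e'."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

def irq_affinity_target(irq, housekeeping_cpus, proc_root="/proc"):
    """Return the file to write and the mask that keeps one IRQ on housekeeping CPUs."""
    path = Path(proc_root) / "irq" / str(irq) / "smp_affinity"
    return path, cpu_mask_hex(housekeeping_cpus)

def apply_irq_affinity(irq, housekeeping_cpus, proc_root="/proc"):
    """Write the mask. Needs root, and some interrupts reject affinity changes."""
    path, mask = irq_affinity_target(irq, housekeeping_cpus, proc_root)
    path.write_text(mask)

# Assumed example: keep IRQ 42 away from a shielded core 0.
path, mask = irq_affinity_target(42, {1, 2, 3})
```

Splitting the computation from the write keeps the mask logic checkable without root privileges.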
  7. Isolate the specified core to launch RT processes inside a container
    ⚫ cpuset alone is incomplete for CPU shielding.
      • When the --cpuset-cpus argument is used, Docker can set the CPU affinity of the container.
      • But it cannot keep anything other than user processes off the isolated CPUs.
    [Diagram: Linux host with Core 0 to Core 3. Labels: dockerd, RT process, Non-RT process, kernel thread, Interrupt, "Outside the scope of this presentation", "How to move??"]
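
One way to confirm what --cpuset-cpus did is to read the `Cpus_allowed_list` field of /proc/<pid>/status for a process in the container. The parser below is a small sketch, with `parse_cpu_list` and `allowed_cpus` being names chosen here for illustration.

```python
def parse_cpu_list(text):
    """Parse the kernel's CPU list format, e.g. '0-2,5' -> {0, 1, 2, 5}."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus

def allowed_cpus(status_text):
    """Extract the allowed CPU set from the contents of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return parse_cpu_list(line.split(":", 1)[1])
    raise ValueError("Cpus_allowed_list not found")
```

For example, `allowed_cpus(open("/proc/self/status").read())` reports the affinity of the calling process. It says nothing about where kernel threads or interrupts run, which is the gap this slide points out.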
  8. Move kernel threads from the isolated core using cset
    ⚫ cset is a tool to manipulate cpusets.
      • cset can isolate both user processes and kernel threads, except for kernel threads bound to a specific CPU.
    ⚫ How to isolate
      • cset creates the directories 'system' and 'user' at the root of the cpuset controller to operate on cpusets.
      • The 'system' cpuset contains the CPUs used for unimportant tasks.
      • The 'user' cpuset contains the CPUs used for important tasks.
        – The 'user' cpuset is the shield.
    https://github.com/lpechacek/cpuset/blob/v1.6/doc/tutorial.txt
  9. The problem of Docker Shielding with cset
    ⚫ Docker cannot launch a container on the core isolated by cset.
    ⚫ What happened??
      • cset creates the directories 'system' and 'user'.
      • Docker launches the container with the --cpuset-cpus argument
        – Docker (runc) also creates its own cpuset directory (Ex. /sys/fs/cgroup/cpuset/docker) and tries to launch the container from it.
        – But cset has already made its cpuset exclusive by default.
        – # echo 1 > cpuset.cpu_exclusive
        – So Docker fails to launch the container.
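
The failure follows from the cgroup v1 cpuset rule that sibling cpusets may not share CPUs when either of them is marked cpu_exclusive. The toy model below only illustrates that rule and does not touch the real cgroup filesystem. The `Cpuset` class and `conflicting_siblings` function are hypothetical, and the CPU numbers (core 0 shielded, and only 'user' treated as exclusive) are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cpuset:
    name: str
    cpus: frozenset
    cpu_exclusive: bool = False

def conflicting_siblings(new, siblings):
    """Names of siblings that overlap with `new` while either side is exclusive."""
    return [
        s.name
        for s in siblings
        if (s.cpu_exclusive or new.cpu_exclusive) and s.cpus & new.cpus
    ]

# Assumed state after cset has shielded core 0.
cset_sets = [
    Cpuset("user", frozenset({0}), cpu_exclusive=True),
    Cpuset("system", frozenset({1, 2, 3})),
]

# Docker's own cpuset directory asks for the shielded core as well.
docker = Cpuset("docker", frozenset({0}))
conflicts = conflicting_siblings(docker, cset_sets)
```

A non-empty `conflicts` list corresponds to the kernel refusing the write to the 'docker' cpuset, which is why the container does not start.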
  10. The problem of Docker Shielding with cset
    [Diagram: Core 0 to Core 3 and the cpuset hierarchy under /sys/fs/cgroup/cpuset. Created by cset: /user/cpuset.cpu_exclusive and /system/cpuset.cpu_exclusive. Created by Docker: /docker/cpuset.cpu_exclusive. Labels: "Shielded by cset", "exclusive"]
  11. How to integrate cset into Docker
    ⚫ How to fix this when you use the cgroupfs driver
      1. Create the isolated cpuset as 'docker'
         # cset shield --userset=docker -c 0 -k on
      2. Launch your Docker container
      3. If needed, move processes to the non-isolated cpuset when you launch an unimportant container
    ⚫ It is difficult to maintain cpuset…
      • Using systemd driver
      • Using KVM
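
Steps 1 and 2 can be scripted. The sketch below only builds the command lines so they can be reviewed before running them as root. The cset arguments are the ones from the slide and `--cpuset-cpus` is Docker's flag, while the helper names and the image name `rt-app:latest` are placeholders invented for this example.

```python
import shlex

def cset_shield_argv(userset, cpus):
    """Step 1: shield `cpus` in a cpuset named `userset` and move kernel threads out."""
    return ["cset", "shield", f"--userset={userset}", "-c", cpus, "-k", "on"]

def docker_run_argv(image, cpus):
    """Step 2: start a container whose cpuset is the shielded CPUs."""
    return ["docker", "run", f"--cpuset-cpus={cpus}", image]

steps = [
    cset_shield_argv("docker", "0"),
    docker_run_argv("rt-app:latest", "0"),  # hypothetical image name
]
commands = [shlex.join(argv) for argv in steps]
```

Each list in `steps` can be passed to `subprocess.run(argv, check=True)`. Step 3 is left out because which processes to move depends on the deployment.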
  12. Users launch the container in the isolated core
    [Diagram: Core 0 to Core 3 and the cpuset hierarchy under /sys/fs/cgroup/cpuset. Paths shown: /docker/cpuset.cpu_exclusive and /system/cpuset.cpu_exclusive. Labels: "Created by cset", "Created by Docker", "Shielded by cset", "exclusive"]
  13. Explicitly Reserved CPU List
    ⚫ What about CPU shielding and cpuset in Kubernetes?
    ⚫ An explicitly reserved CPU list is supported since Kubernetes v1.17
      • A new kubelet flag defines an explicit CPU set for OS system daemons and Kubernetes system daemons.
      • This option is specifically designed for Telco/NFV.
      • Moving the system daemons, Kubernetes daemons and interrupts/timers onto that CPU set is out of scope of this option.
        – In CentOS, you can do this using the tuned toolset.
    https://v1-18.docs.kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list
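
In the linked Kubernetes documentation the flag is `--reserved-cpus`, and the matching KubeletConfiguration field is `reservedSystemCPUs`. As a hedged sketch, both forms can be generated from one CPU list. The helper names are hypothetical, and reserving cores 1 to 3 for daemons is an assumption chosen to leave core 0 free.

```python
import json

def kubelet_reserved_flag(reserved_cpus):
    """Command-line form of the explicitly reserved CPU list."""
    return f"--reserved-cpus={reserved_cpus}"

def kubelet_config(reserved_cpus):
    """Config-file form; the kubelet reads YAML, and JSON is valid YAML."""
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        "reservedSystemCPUs": reserved_cpus,
    }

# Assumed example: daemons on cores 1 to 3, core 0 left for the workload.
flag = kubelet_reserved_flag("1-3")
config_text = json.dumps(kubelet_config("1-3"), indent=2)
```

This only tells the kubelet which CPUs are reserved. As the slide notes, actually moving daemons and interrupts onto them has to be done with other tools.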
  14. Goal
    ⚫ Make the architecture simple to reduce maintenance costs
      • Try new kernel features to reduce the number of necessary tools
    ⚫ Integrate tools for RT into Kubernetes
    [Diagram: Linux host with Core 0 to Core 3. Labels: dockerd, kubelet, RT process (inside container), Non-RT process, kernel thread, Interrupt, "Reserved CPU list", "isolcpus, nohz_full and so on", CPU isolation]
  15. Key takeaways
    ⚫ There are many caveats when isolating CPU cores.
      • Processes
      • Kernel threads
      • Interrupts
    ⚫ Containerization makes CPU shielding more difficult.
      • Integrate tools for RT into Kubernetes
      • Consider security
    ⚫ Diversity is important for OSS.
      • To get patches into mainline, we need to understand different use cases.