
CPU shielding on Docker and Kubernetes

Kenta Tada
December 05, 2020


Transcript

  1. Copyright 2020 Sony Corporation
    CPU shielding on Docker/Kubernetes
    13th Container Technology Information Exchange Meeting @ Online (第13回 コンテナ技術の情報交換会@オンライン)
    Kenta Tada
    R&D Center
    Sony Corporation


  2. About me
    ⚫Kenta Tada
    ⚫Software Engineer, Sony
    ⚫CloudNative Days Tokyo 2020
    • https://speakerdeck.com/kentatada/embedded-container-runtime-using-linux-capabilities-seccomp-cgroups

  3. Agenda
    ⚫Overview of CPU shielding
    ⚫CPU shielding on Docker
    ⚫CPU shielding on Kubernetes

  4. Background
    ⚫We want to run real-time (RT) processes in our embedded
    container environment.
    ⚫Running RT processes in a container environment raises
    several issues.
    • Integrating tools for RT into Kubernetes: today's topic
    • Security: https://blogs.oracle.com/linux/dealing-with-realtime-processes-in-linux-user-namespaces
    [Figure: Linux host with Core 0 to Core 3, with Core 0 highlighted; labels: dockerd, RT process (inside a container, CPU isolation), Non-RT process (x2), kernel thread, Interrupt.]


  5. What is an RT process?
    ⚫ "First in my words, real-time is not about the lowest possible latency
    or the maximum possible throughput. Real-time is deterministic
    execution time. Deterministic execution time means performing
    tasks within a certain time, this not being affected by any external
    process."
    https://www.redhat.com/en/blog/going-full-deterministic-using-real-time-openstack
    ⚫Making a process real-time involves many tasks, and
    containerization makes it harder.
    ⚫Today, I will cover only the issues of CPU shielding in
    a container environment.


  6. Overview of CPU shielding

  7. What is CPU shielding
    ⚫ CPU shielding is a practice where on a multiprocessor system or on a
    CPU with multiple cores, real-time tasks can run on one CPU or core
    while non-real-time tasks run on another.
    ⚫Use cases
    • Isolating RT processes
    • Thermal throttling
    –This use case uses only cpuset.
    –Reduce power consumption by pinning background threads that
    are not performance-critical to LITTLE CPUs.
    • NFV (Network Functions Virtualization)
    –Improve NFV performance and prevent spurious packet loss.
    https://en.wikipedia.org/wiki/CPU_shielding


  8. CPU shielding
    ⚫User processes
    • Isolate the specified core and launch RT processes on it
    ⚫Kernel threads
    • Move kernel threads off the isolated core
    – Ex. Use cset. The “isolcpus” kernel boot option cannot isolate kernel threads.
    Since kernel version 5.9, “nohz_full” can, except for kernel threads bound to a specific CPU.
    • Enable dynamic tickless behaviour
    – Ex. Set the “nohz_full” kernel boot option
    • Offload RCU callbacks
    – Ex. Set the “rcu_nocbs” kernel boot option
    • Set CPU affinity for workqueues
    – Ex. Modify cpumasks in /sys/devices/virtual/workqueue
    and so on…
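
As a rough illustration of the boot options listed above, the following Python sketch builds the kernel command-line fragment for a given set of isolated CPUs. The helper names `format_cpu_list` and `isolation_boot_params` are hypothetical, and whether all three options suit a given system depends on the kernel configuration and the use case.

```python
def format_cpu_list(cpus):
    """Render CPU ids as a kernel-style list, e.g. {1, 2, 3, 5} -> '1-3,5'."""
    ids = sorted(set(cpus))
    ranges = []
    for cpu in ids:
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def isolation_boot_params(isolated_cpus):
    """Return the boot options named on the slide for the given CPUs.

    Hypothetical helper: it only formats strings and does not touch
    the bootloader configuration.
    """
    cpu_list = format_cpu_list(isolated_cpus)
    return [
        f"isolcpus={cpu_list}",
        f"nohz_full={cpu_list}",
        f"rcu_nocbs={cpu_list}",
    ]


# Example: isolate cores 2 and 3 on a four-core machine.
print(" ".join(isolation_boot_params({2, 3})))
```

The resulting string would be appended to the kernel command line in the bootloader configuration. As the slide notes, `isolcpus` by itself does not move kernel threads.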

  9. CPU shielding
    ⚫Interrupts
    • Set CPU affinity for interrupts
    – Ex. Modify files under /proc/irq
    • Move interrupt handlers from IRQ context to kernel threads
    – Ex. Set the “threadirqs” kernel boot option
    and so on…
    Adjust these settings to your use case.
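
Interrupt affinity under /proc/irq is written as a hexadecimal CPU bitmask (the per-IRQ `smp_affinity` file). The sketch below, with hypothetical helper names, computes the mask that keeps interrupts on the non-isolated ("housekeeping") cores. It only computes the value, because writing it requires root and a real IRQ number.

```python
def cpus_to_hex_mask(cpus):
    """Convert CPU ids to the hex bitmask format used by smp_affinity.

    Masks wider than 32 bits are written as comma-separated 32-bit
    groups, most significant group first.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    groups = []
    while True:
        groups.append(f"{mask & 0xFFFFFFFF:08x}")
        mask >>= 32
        if mask == 0:
            break
    groups.reverse()
    # Drop leading zeros of the most significant group only.
    groups[0] = groups[0].lstrip("0") or "0"
    return ",".join(groups)


def housekeeping_mask(total_cpus, isolated_cpus):
    """Mask of every CPU except the isolated ones (hypothetical helper)."""
    housekeeping = set(range(total_cpus)) - set(isolated_cpus)
    return cpus_to_hex_mask(housekeeping)


# Four cores with core 0 isolated: interrupts stay on cores 1 to 3.
print(housekeeping_mask(4, {0}))
```

The same mask format is accepted by the workqueue cpumask files under /sys/devices/virtual/workqueue. Some interrupts cannot be moved, so a write may be rejected for individual IRQs.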
    OK!
    But what about CPU shielding inside a container?


  10. CPU shielding on Docker

  11. Isolate the specified core to launch RT processes inside a container
    ⚫cpuset alone is not enough for CPU shielding.
    • With the --cpuset-cpus argument, Docker can set the CPU
    affinity of a container.
    • But it cannot keep anything other than user processes off those CPUs.
    [Figure: Linux host with Core 0 to Core 3, with Core 0 highlighted; labels: dockerd, RT process, Non-RT process (x2), kernel thread ("How to move??"), Interrupt ("Outside the scope of this presentation").]
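
To see what --cpuset-cpus does and does not do, a process inside the container can compare its own CPU affinity with the list that was passed to Docker. This is a minimal sketch: `parse_cpu_list` and `affinity_within` are hypothetical helpers, and `os.sched_getaffinity` is only available on Linux.

```python
import os


def parse_cpu_list(text):
    """Parse a cpuset-style list such as '0-2,4' into a set of CPU ids."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def affinity_within(expected_list, pid=0):
    """True if the process may only run on CPUs from expected_list."""
    return os.sched_getaffinity(pid) <= parse_cpu_list(expected_list)


if hasattr(os, "sched_getaffinity"):
    # Inside a container started with --cpuset-cpus=0 this would list only CPU 0.
    print(sorted(os.sched_getaffinity(0)))
```

The check only covers the user process that runs it. It says nothing about kernel threads or interrupts that may still run on the same core, which is the gap described above.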


  12. Move kernel threads from the isolated core using cset
    ⚫cset is a tool for manipulating cpusets.
    • cset can isolate both user processes and kernel threads, except
    for kernel threads bound to a specific CPU.
    ⚫How to isolate
    • cset creates the 'system' and 'user' directories under the root
    of the cpuset controller.
    • The 'system' cpuset contains the CPUs used for unimportant
    tasks.
    • The 'user' cpuset contains the CPUs used for important tasks.
    – The 'user' cpuset is the shield.
    https://github.com/lpechacek/cpuset/blob/v1.6/doc/tutorial.txt


  13. The problem of Docker Shielding with cset
    ⚫Docker cannot launch a container on a core isolated by
    cset.
    ⚫What happens?
    • cset creates the 'system' and 'user' directories.
    • Docker launches the container with the --cpuset-cpus argument.
    –Docker (runc) also creates its own cpuset directory (Ex.
    /sys/fs/cgroup/cpuset/docker) and tries to launch the container
    from it.
    –But cset has already made its cpusets exclusive by default.
    – # echo 1 > cpuset.cpu_exclusive
    –So Docker fails to launch the container.
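
The failure can be understood through the cgroup v1 cpuset rule that a cpuset marked `cpu_exclusive` may not share CPUs with any sibling. The following is a small model of that rule in Python, not the kernel's code. The `Cpuset` class, the helper name and the CPU assignments are assumptions for illustration, with both cset cpusets marked exclusive as in the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpuset:
    name: str
    cpus: frozenset
    cpu_exclusive: bool = False


def can_add_sibling(siblings, new):
    """Model of the cgroup v1 rule: if either cpuset is exclusive,
    the two siblings must not overlap."""
    for other in siblings:
        if (other.cpu_exclusive or new.cpu_exclusive) and other.cpus & new.cpus:
            return False
    return True


# cset has already created the shield ('user') and 'system', both exclusive.
system = Cpuset("system", frozenset({1, 2, 3}), cpu_exclusive=True)
user = Cpuset("user", frozenset({0}), cpu_exclusive=True)

# Docker's own cpuset directory wants CPU 0, which 'user' holds exclusively.
docker = Cpuset("docker", frozenset({0}))
print(can_add_sibling([system, user], docker))  # False
```

Under this model any CPU assignment for a sibling 'docker' cpuset is rejected, because every core already belongs to an exclusive cpuset.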


  14. The problem of Docker Shielding with cset
    [Figure: cpuset hierarchy under /sys/fs/cgroup/cpuset on a host with Core 0 to Core 3, with Core 0 highlighted; /user/cpuset.cpu_exclusive and /system/cpuset.cpu_exclusive (created by cset, exclusive, shielded by cset) and /docker/cpuset.cpu_exclusive (created by Docker).]


  15. How to integrate cset into Docker
    ⚫How to fix this when you use the cgroupfs driver
    1. Create the isolated cpuset with the name 'docker'
    # cset shield --userset=docker -c 0 -k on
    2. Launch your Docker container
    3. If needed, move the processes of unimportant containers to
    the non-isolated cpuset after launching them
    ⚫It is difficult to maintain cpusets…
    • Using the systemd driver
    • Using KVM
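
The three steps for the cgroupfs driver can be sketched as follows. The cset command line is the one from the slide. The `docker run` arguments, the image name `my-rt-image`, the PID and the helper names are assumptions for illustration. Step 3 is shown as a direct write to the cgroup v1 `cgroup.procs` file of the target cpuset, which is one possible way to move a process.

```python
import os
import shlex


def shield_commands(cpu="0", image="my-rt-image"):
    """Return the argv lists for steps 1 and 2; nothing is executed here.

    'my-rt-image' is a placeholder image name.
    """
    return [
        ["cset", "shield", "--userset=docker", "-c", cpu, "-k", "on"],
        ["docker", "run", "-d", f"--cpuset-cpus={cpu}", image],
    ]


def move_process(pid, cpuset_dir):
    """Step 3: move a whole process by writing its PID to cgroup.procs."""
    with open(os.path.join(cpuset_dir, "cgroup.procs"), "w") as f:
        f.write(f"{pid}\n")


for argv in shield_commands():
    print(shlex.join(argv))

# On a real host (as root), step 3 might look like this, where 1234 is a
# placeholder PID of a process from an unimportant container:
# move_process(1234, "/sys/fs/cgroup/cpuset/system")
```

The commands are only printed so they can be reviewed first. Running them requires root, and moving container processes by hand is part of the maintenance burden the slide mentions.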

  16. Users launch the container in the isolated core
    [Figure: cpuset hierarchy under /sys/fs/cgroup/cpuset on a host with Core 0 to Core 3, with Core 0 highlighted; /docker/cpuset.cpu_exclusive and /system/cpuset.cpu_exclusive; labels: Created by cset, Created by Docker, Shielded by cset, exclusive.]


  17. CPU shielding on Kubernetes

  18. Explicitly Reserved CPU List
    ⚫What about CPU shielding and cpuset in Kubernetes?
    ⚫An explicitly reserved CPU list is supported since Kubernetes v1.17
    • A new kubelet flag defines an explicit CPU set for OS system
    daemons and Kubernetes system daemons.
    • This option is specifically designed for Telco/NFV.
    • Moving the system daemons, Kubernetes daemons and
    interrupts/timers to that CPU set is out of scope for Kubernetes.
    –In CentOS, you can do this using the tuned toolset.
    https://v1-18.docs.kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list
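
The reserved CPU list is set with the kubelet flag `--reserved-cpus` or the `reservedSystemCPUs` field of the kubelet configuration file. As a sketch under those assumptions, the Python below renders a minimal configuration fragment as JSON, which is also valid YAML, and lists the cores left for workloads. The helper names and the choice of cores are illustrative only.

```python
import json


def kubelet_config(reserved_cpus):
    """Minimal KubeletConfiguration fragment with an explicit reserved CPU list."""
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        "reservedSystemCPUs": ",".join(str(c) for c in sorted(reserved_cpus)),
    }


def workload_cpus(total_cpus, reserved_cpus):
    """CPUs that are not reserved for system and Kubernetes daemons."""
    return sorted(set(range(total_cpus)) - set(reserved_cpus))


# Four cores: keep cores 1 to 3 for daemons, leaving core 0 for the RT workload.
print(json.dumps(kubelet_config([1, 2, 3]), indent=2))
print(workload_cpus(4, [1, 2, 3]))
```

This only tells the kubelet which CPUs are reserved. As the slide says, actually moving daemons and interrupts onto them is left to other tools.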


  19. Goal
    ⚫Keep the architecture simple to reduce maintenance costs
    • Try new kernel features to reduce the number of tools needed
    ⚫Integrate tools for RT into Kubernetes
    [Figure: Linux host with Core 0 to Core 3, with Core 0 highlighted; labels: dockerd, kubelet, RT process (inside a container, CPU isolation), Non-RT process, kernel thread, Interrupt, "Reserved CPU list", "isolcpus, nohz_full and so on".]


  20. Key takeaways
    ⚫There are many caveats when isolating CPU cores.
    • Processes
    • Kernel threads
    • Interrupts
    ⚫Containerization makes CPU shielding more difficult.
    • Integrate tools for RT into Kubernetes
    • Consider security
    ⚫Diversity is important for OSS.
    • To get patches into mainline, we need to understand different
    use cases.

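  21. SONY is a registered trademark or trademark of Sony Corporation.
    The product and service names of Sony products are registered trademarks or trademarks of Sony Corporation or its group companies. Other product and company names are the trade names, registered trademarks or trademarks of their respective companies.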
