Why talk about cgroupv2?
aka unified hierarchy. It was released on 29 October 2019. This talk was initially planned for Paris Container Day 2020, but we will see that things have evolved a lot since then and that adoption is much wider than in 2019. Stay tuned, it will be fun 🙂
What is the Paris Container Day?
What are cgroups (v1)?
that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes.
It provides:
- Resource limiting
- Prioritization
- Accounting
- Control
It started at Google in 2006 under the name “process containers”. It was renamed “control groups” to avoid confusion caused by the multiple meanings of the term “container” in the Linux kernel context, and the control groups functionality was merged into the Linux kernel mainline in version 2.6.24, released in January 2008.
What are cgroups (v1)?
aspect of the system:
- cpu for managing user / system CPU time and usage.
- cpuacct
- hugetlb for accounting usage of huge pages by process group.
- cpuset for binding a group to specific CPUs. Useful for real-time applications and NUMA systems with localized memory per CPU.
- freezer for freezing a group. Useful for cluster batch scheduling, process migration and debugging without affecting ptrace.
- net_cls, net_prio for tagging network traffic for traffic control.
- devices for controlling read / write access to devices.
- pids for controlling the number of processes.
- perf_event for per-cgroup perf monitoring.
- rdma for distribution and accounting of RDMA resources.
- memory for managing accounting, limits and notifications.
- blkio for measuring and limiting the amount of block I/O by group.
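To see which of these controllers a given kernel actually exposes, one option is to read /proc/cgroups. The sketch below parses that file's four-column format; the function name and the sample values are illustrative assumptions, not output from a real machine.

```python
def parse_proc_cgroups(text):
    """Parse the contents of /proc/cgroups into {controller: info}."""
    controllers = {}
    for line in text.splitlines():
        line = line.strip()
        # The first line is a header starting with '#'.
        if not line or line.startswith("#"):
            continue
        name, hierarchy, num_cgroups, enabled = line.split()
        controllers[name] = {
            "hierarchy": int(hierarchy),      # 0 means not mounted on a v1 hierarchy
            "num_cgroups": int(num_cgroups),
            "enabled": enabled == "1",
        }
    return controllers


# Made-up sample in the /proc/cgroups layout (illustrative values only).
SAMPLE = (
    "#subsys_name\thierarchy\tnum_cgroups\tenabled\n"
    "cpuset\t5\t1\t1\n"
    "cpu\t3\t64\t1\n"
    "memory\t9\t120\t1\n"
    "pids\t4\t70\t1\n"
)

parsed = parse_proc_cgroups(SAMPLE)
print(sorted(parsed))
```

On a real system the same function could be fed the text of /proc/cgroups instead of SAMPLE.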
What are cgroups (v1)?
to set up virtual OSes (or containers ;-)).
Each cgroup can only use as much of a resource as its configured limits allow.
What are cgroups (v1)?
can have a cgroup that deals with CPU or memory.
How to use cgroupv1?
we mount the memory controller to limit a process's memory.
Files starting with cgroup are common interfaces that allow interaction with the group.
Files starting with memory are interfaces specific to the controller that has been activated.
Directories named *.slice are other cgroups, created automatically.
How to use cgroupv1?
Here we limit the memory used by the processes in this cgroup to 100 KiB.
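A minimal sketch of what setting such a limit involves, assuming the v1 memory controller's memory.limit_in_bytes and cgroup.procs interface files. The function name and the PID are hypothetical, and the demo runs against a temporary directory rather than a real cgroup mount, so it needs no privileges.

```python
import tempfile
from pathlib import Path

KIB = 1024


def limit_memory_v1(cgroup_dir, limit_bytes, pid=None):
    """Write a memory limit into a v1 memory cgroup and optionally attach a PID."""
    cgroup = Path(cgroup_dir)
    # On a real cgroup mount, creating the directory creates the cgroup.
    cgroup.mkdir(parents=True, exist_ok=True)
    (cgroup / "memory.limit_in_bytes").write_text(f"{limit_bytes}\n")
    if pid is not None:
        # Writing a PID to cgroup.procs moves that process into the cgroup.
        (cgroup / "cgroup.procs").write_text(f"{pid}\n")


# Demo against a scratch directory standing in for the memory hierarchy mount.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo"
    limit_memory_v1(demo, 100 * KIB, pid=1234)  # 1234 is a placeholder PID
    written_limit = (demo / "memory.limit_in_bytes").read_text().strip()
    written_pid = (demo / "cgroup.procs").read_text().strip()

print(written_limit)
```

On a real system the directory would live under the mount point of the memory controller and writing to it would require the appropriate privileges.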
Problems with cgroupv1
tree. A process can join independent cgroups, for example cgroup foo for CPU and cgroup bar for memory. This was designed to provide flexibility, but it did not prove useful in practice.
Utility controllers (e.g., freezer) that might be useful in all hierarchies could be used in only one.
Allowing thread granularity for cgroup membership proved problematic (e.g., for the memory controller, since threads share memory).
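The per-controller membership described above is visible in /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:cgroup-path. The following sketch parses that format; the function name and the sample content (a process in foo for CPU and bar for memory) are illustrative assumptions.

```python
def parse_pid_cgroup(text):
    """Map each controller to the cgroup path the process belongs to."""
    membership = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at most twice: the cgroup path itself may contain ':'.
        _hierarchy_id, controllers, path = line.split(":", 2)
        # Co-mounted controllers (e.g. cpu,cpuacct) share one hierarchy.
        # A cgroup v2 entry ("0::/path") has an empty controller list and
        # ends up under the key "".
        for controller in controllers.split(","):
            membership[controller] = path
    return membership


# Made-up sample: one process, a different cgroup per hierarchy.
SAMPLE = "4:memory:/bar\n3:cpu,cpuacct:/foo\n1:name=systemd:/user.slice\n"

membership = parse_pid_cgroup(SAMPLE)
print(membership["cpu"], membership["memory"])
```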
A new model: unified hierarchy
controllers. When you create a new cgroup such as newcgroup, all controllers enabled for newcgroup take control of its processes.
cgroupv2 allows only process-granularity membership by default.
cgroupv2 has consistent names and values for interface files, and consistent inheritance rules for all controllers.
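One common heuristic for telling whether a host runs the unified hierarchy is to check for a cgroup.controllers file at the root of the cgroup mount. The sketch below does that and also extracts the single 0::<path> membership entry that a v2 process has. Function names are hypothetical and the demo uses scratch directories instead of /sys/fs/cgroup.

```python
import tempfile
from pathlib import Path


def is_unified_hierarchy(mount_point="/sys/fs/cgroup"):
    """Return True if mount_point looks like the root of a cgroup v2 tree."""
    return (Path(mount_point) / "cgroup.controllers").is_file()


def unified_cgroup_path(proc_cgroup_text):
    """Return the path from the single '0::<path>' entry, or None if absent."""
    for line in proc_cgroup_text.splitlines():
        if line.startswith("0::"):
            return line[len("0::"):]
    return None


# Demo with fake trees: one that has cgroup.controllers, one that does not.
with tempfile.TemporaryDirectory() as tmp:
    fake_v2 = Path(tmp) / "v2"
    fake_v2.mkdir()
    (fake_v2 / "cgroup.controllers").write_text("cpu io memory pids\n")
    fake_v1 = Path(tmp) / "v1"
    fake_v1.mkdir()
    v2_detected = is_unified_hierarchy(fake_v2)
    v1_detected = is_unified_hierarchy(fake_v1)

single_membership = unified_cgroup_path("0::/user.slice/newcgroup\n")
print(v2_detected, v1_detected, single_membership)
```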
A new model: unified hierarchy
successor to v1 cpu and cpuacct controllers
- cpuacct
- hugetlb -> successor to v1 hugetlb controller
- cpuset -> successor to v1 cpuset controller
- freezer
- net_cls, net_prio -> no direct equivalent
- devices -> successor to v1 devices controller
- pids -> exactly the same as v1 controller
- perf_event -> same as v1 controller
- rdma -> same as v1 controller
- memory -> successor to v1 memory controller
- blkio
- io -> successor to v1 blkio controller
- misc -> new cgroup controller
A new model: unified hierarchy
in a unified hierarchy.
No need to explicitly bind controllers to a mount point.
Each v2 cgroup has a (read-only) cgroup.controllers file, which lists the controllers this cgroup can enable.
Controllers are enabled or disabled by writing a subset of the available controllers, each prefixed with + (enable) or - (disable), to cgroup.subtree_control.
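The two interface files can be sketched as follows: read cgroup.controllers to learn what is available, then write +name / -name tokens to cgroup.subtree_control. Function names are hypothetical, and the demo uses a temporary directory, so the file simply keeps the literal text written to it (a real kernel would report the resulting controller list instead).

```python
import tempfile
from pathlib import Path


def available_controllers(cgroup_dir):
    """Return the controllers listed in cgroup.controllers."""
    return (Path(cgroup_dir) / "cgroup.controllers").read_text().split()


def subtree_control_request(enable=(), disable=()):
    """Build the '+cpu -io' style string expected by cgroup.subtree_control."""
    return " ".join([f"+{c}" for c in enable] + [f"-{c}" for c in disable])


def enable_controllers(cgroup_dir, controllers):
    """Enable controllers for the children of cgroup_dir, if available."""
    available = set(available_controllers(cgroup_dir))
    missing = [c for c in controllers if c not in available]
    if missing:
        raise ValueError(f"controllers not available here: {missing}")
    request = subtree_control_request(enable=controllers)
    (Path(cgroup_dir) / "cgroup.subtree_control").write_text(request + "\n")


# Demo on a scratch directory standing in for a v2 cgroup.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cgroup.controllers").write_text("cpu io memory pids\n")
    available = available_controllers(tmp)
    enable_controllers(tmp, ["cpu", "memory"])
    written = (Path(tmp) / "cgroup.subtree_control").read_text().strip()

print(available, written)
```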
A new model: unified hierarchy
resource to be controlled in child cgroups.
Creates controller-specific attribute files in each child directory.
If a controller is disabled in a cgroup (i.e., not written to cgroup.subtree_control in the parent cgroup), it cannot be enabled in any descendants of the cgroup.
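The top-down rule can be illustrated with a small in-memory model: a cgroup's available controllers are exactly what its parent wrote to cgroup.subtree_control. This is a simplified toy (the class and its attributes are hypothetical, and it ignores disabling and other kernel constraints), not a reimplementation of the kernel logic.

```python
class Cgroup:
    """Toy model of controller availability in a cgroup v2 tree."""

    def __init__(self, name, parent=None, root_controllers=()):
        self.name = name
        self.parent = parent
        self.subtree_control = set()  # controllers enabled for the children
        self._root_controllers = frozenset(root_controllers)

    @property
    def controllers(self):
        """Mirror cgroup.controllers: what this cgroup may pass down."""
        if self.parent is None:
            return set(self._root_controllers)
        return set(self.parent.subtree_control)

    def enable(self, controller):
        """Mirror writing '+controller' to cgroup.subtree_control."""
        if controller not in self.controllers:
            raise ValueError(f"{controller!r} is not available in {self.name!r}")
        self.subtree_control.add(controller)


root = Cgroup("root", root_controllers=("cpu", "memory", "pids"))
root.enable("memory")                 # only memory is passed down
child = Cgroup("child", parent=root)
grandchild = Cgroup("grandchild", parent=child)
child.enable("memory")                # allowed: the parent enabled it

try:
    child.enable("cpu")               # refused: root never enabled cpu
    cpu_refused = False
except ValueError:
    cpu_refused = True

print(sorted(child.controllers), sorted(grandchild.controllers), cpu_refused)
```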
cgroupv2: eBPF oriented
In cgroupv2, device access control is implemented by attaching an eBPF program to the cgroup.
Here is the same configuration in cilium-flavored assembler syntax.
eBPF is a kernel technology that can, among other things, analyze network traffic.
cgroupv2: rootless containers
to the ability for an unprivileged user to create, run and otherwise manage containers.
When we say rootless containers, we mean running the entire container runtime, as well as the containers, without root privileges.
Allowing a non-root user to access /var/run/docker.sock by adding the user to the docker group (sudo usermod -aG docker somebody) is NOT an example of a rootless container.
Enabling cgroupv2
default, a non-root user can only have the memory and pids controllers delegated.
Create a new file /etc/systemd/system/user@.service.d/delegate.conf to delegate cpu and io as well.
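As a rough illustration of what the delegation drop-in contains, the sketch below renders a [Service] section with a Delegate= line for a chosen set of controllers. The helper names are hypothetical, and the path returned by delegated_controllers_file is an assumption based on the usual systemd user-slice layout; it may differ on some distributions.

```python
def render_delegate_dropin(controllers):
    """Render the text of a systemd drop-in delegating the given controllers."""
    return "[Service]\nDelegate=" + " ".join(controllers) + "\n"


def delegated_controllers_file(uid):
    """Assumed location of the file listing controllers delegated to a user."""
    return (
        f"/sys/fs/cgroup/user.slice/user-{uid}.slice/"
        f"user@{uid}.service/cgroup.controllers"
    )


dropin = render_delegate_dropin(["cpu", "io", "memory", "pids"])
check_path = delegated_controllers_file(1000)  # 1000 is an example UID
print(dropin, end="")
```

After installing such a drop-in, systemd has to reload its configuration (systemctl daemon-reload), and the user typically needs to log in again before the new delegation is visible.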
Adoption status
version 4.5 of the kernel in March 2016.
However, it wasn’t considered useful for containers until the release of kernel 5.2 (July 7, 2019), due to the lack of support for the device controller and the freezer.
With the introduction of the cgroupv2 device controller in kernel 4.15 (Jan 28, 2018) and the cgroupv2 freezer in kernel 5.2, cgroupv2 is now considered ready for containers.
Although there is a “hybrid” configuration that allows mounting both the v1 hierarchies and the cgroupv2 hierarchy, this mode is rarely used for containers because you can’t enable cgroupv2 controllers that are already enabled for cgroupv1.
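The kernel milestones above can be turned into a small version check. This is only a sketch: the function names are hypothetical, and distribution kernels sometimes backport features, so the release string is a hint rather than proof.

```python
import re

# Kernel versions quoted in the text above.
CGROUP_V2_MIN = (4, 5)
DEVICE_CONTROLLER_MIN = (4, 15)
FREEZER_MIN = (5, 2)


def kernel_version(release):
    """Extract (major, minor) from a release string such as '5.4.0-42-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def cgroup_v2_features(release):
    """Report which cgroup v2 milestones a given kernel release has reached."""
    version = kernel_version(release)
    return {
        "cgroup_v2": version >= CGROUP_V2_MIN,
        "device_controller": version >= DEVICE_CONTROLLER_MIN,
        "freezer": version >= FREEZER_MIN,
    }


recent = cgroup_v2_features("5.4.0-42-generic")
older = cgroup_v2_features("4.19.0")
print(recent, older)
```

On a live system, platform.release() from the standard library returns a string suitable for cgroup_v2_features.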
Adoption status: low-level runtimes
User namespaces must be compiled in and enabled in your kernel. Confirm that CONFIG_USER_NS=y is set in your kernel configuration.
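A minimal sketch of that check, assuming the kernel configuration is available as plain text (commonly under /boot/config-<release>, or compressed in /proc/config.gz on kernels that expose it). The function name and the sample snippets are illustrative.

```python
def user_namespaces_enabled(config_text):
    """Return True if the kernel config text contains CONFIG_USER_NS=y."""
    for line in config_text.splitlines():
        # Disabled options appear as '# CONFIG_USER_NS is not set'.
        if line.strip() == "CONFIG_USER_NS=y":
            return True
    return False


# Made-up config excerpts for illustration.
ENABLED_SAMPLE = "CONFIG_UTS_NS=y\nCONFIG_USER_NS=y\nCONFIG_PID_NS=y\n"
DISABLED_SAMPLE = "CONFIG_UTS_NS=y\n# CONFIG_USER_NS is not set\n"

enabled = user_namespaces_enabled(ENABLED_SAMPLE)
disabled = user_namespaces_enabled(DISABLED_SAMPLE)
print(enabled, disabled)
```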
Adoption status: low-level runtimes
It is written in C, is much smaller (around 300 KB vs 15 MB) and is twice as fast as runc.
It has supported cgroupv2 since late 2019 and has been the default runtime since Fedora 31.
Adoption status: high-level runtimes
You can start containerd as a non-root user with containerd-rootless-setuptool.sh.
Don’t forget to install the CNI plugins in /opt/cni/bin.
To start or stop the daemon: systemctl --user <start|stop> containerd
To enable resource limits: nerdctl run --cpus | --memory | --blkio-weight | --pids-limit
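To give an idea of how such flags relate to cgroupv2, the sketch below maps three of them onto the v2 interface files cpu.max, memory.max and pids.max. It is a simplified illustration under the assumption of a 100 ms CPU period, not the actual code of any runtime; the function names are hypothetical and --blkio-weight is left out.

```python
def cpus_to_cpu_max(cpus, period_us=100_000):
    """Convert a fractional CPU count into the 'quota period' cpu.max format."""
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    quota_us = round(cpus * period_us)
    return f"{quota_us} {period_us}"


def limits_to_v2_files(cpus=None, memory_bytes=None, pids_limit=None):
    """Map run-time limits to {cgroup v2 interface file: value to write}."""
    files = {}
    if cpus is not None:
        files["cpu.max"] = cpus_to_cpu_max(cpus)
    if memory_bytes is not None:
        files["memory.max"] = str(memory_bytes)
    if pids_limit is not None:
        files["pids.max"] = str(pids_limit)
    return files


# Roughly what '--cpus 0.5 --memory 128m --pids-limit 100' would translate to.
limits = limits_to_v2_files(cpus=0.5, memory_bytes=128 * 1024 * 1024, pids_limit=100)
print(limits)
```

Without the cpu controller delegated to the user, a rootless runtime cannot apply the cpu.max part of such a mapping.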
Adoption status: high-level runtimes
https://github.com/moby/moby/pull/40657 - https://github.com/moby/moby/pull/40662
Docker 20.10 adds support for cgroupv2.
It is no longer experimental: https://github.com/moby/moby/pull/42263
To install Docker in rootless mode:
Adoption status: high-level runtimes
multi-container networking since 2.1.
To enable resource limits: podman run --cpus | --memory | --blkio-weight | --pids-limit
To use the CPU controller, you need to add it to your configuration:
[Service]
# default: Delegate=pids memory
Delegate=pids memory cpu
Adoption status: high-level runtimes
rootlesskit to launch the daemon.
Adoption status: Kubernetes
a user’s home
https://github.com/rootless-containers/usernetes
k3s
It supports rootless mode using usernetes: k3s server --rootless
https://rancher.com/docs/k3s/latest/en/advanced/#running-k3s-with-rootlesskit-experimental
kubernetes
The PR was merged on 20 May 2021! https://github.com/kubernetes/enhancements/pull/1371
It should be available in 1.22.
Key takeaways
level of containers (from OCI, to runtime, to Kubernetes)
• cgroupv2 enables a new unified hierarchy with naming and development conventions
• rootless containers increase security by allowing containers to run as a non-root user