“The word ‘container’ doesn’t mean anything super precise. Basically there are a few new Linux kernel features (‘namespaces’ and ‘cgroups’) that let you isolate processes from each other. When you use those features, you call it ‘containers’”
Julia Evans
https://jvns.ca/blog/2016/10/10/what-even-is-a-container/
“Containers are processes, born from tarballs, anchored to namespaces, controlled by cgroups”
Alice Goldfuss
https://twitter.com/lucacanducci/status/1011909897640927232
BSD Jails
Built off chroot
• Processes created in the chrooted environment cannot access files or resources outside of it*
• With chroot alone, processes are limited only in the part of the filesystem they can access
• A jail is defined by four key elements:
  • Directory subtree
  • Hostname
  • IP address
  • Command to run
Solaris Zones
Virtualizing operating system services
• A zone is a virtualized operating system environment created within a single instance of the Solaris Operating System
• Global zone – Default zone for the system, used for system-wide administrative tasks
• Non-global zones – Zones for running specific workloads
Solaris Zones: Features
• Security – Processes cannot change zones
• Granularity – A zone can provide isolation at almost any level of granularity
• Isolation – Applications are prevented from monitoring or intercepting each other's network traffic, file system data, or process activity
• Network Isolation – Flexible network segmentation options
• Virtualization – The same application environment can be maintained on different physical machines
https://docs.oracle.com/cd/E19044-01/sol.containers/817-1592/zones.intro-1/index.html
Virtual Machine
Emulation of a computer system
• Hypervisor uses native execution to share & manage hardware
• Multiple environments isolated from each other
• Separate kernel & operating system instances
Containers
• cgroups – Limiting the resources that can be used by a process or set of processes
• Namespaces – Isolating the system resources a process can see (filesystem mounts, PIDs, network, and more)
• Copy on Write – Implicit sharing or shadowing
• Linux Security Modules – Locking down container privileges
Containers in Detail
cgroups
• CPU – Limit CPU bandwidth
• Cpuacct – Account for CPU usage
• Cpuset – Pin processes to specific CPUs and memory nodes
• Memory – Limit userland memory, kernel data structures, and TCP socket buffers
• IO – Control bandwidth or IOPS
• PID – Limit the number of PIDs
• Network – Control bandwidth*
• And more…
* With use of tc/iptables
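The CPU, memory, and PID limits listed above can be sketched against the cgroup v2 interface files `cpu.max`, `memory.max`, and `pids.max`. This is a minimal sketch, not how any particular container runtime does it: the helper names `cpu_max_value` and `write_limits` are hypothetical, and on a real system the directory would live under `/sys/fs/cgroup`, need root, and need the controllers enabled in the parent's `cgroup.subtree_control`.

```python
from pathlib import Path


def cpu_max_value(cpus: float, period_us: int = 100_000) -> str:
    """Format a cgroup v2 cpu.max value: "<quota_us> <period_us>".

    A quota of 50000 over a period of 100000 allows half of one CPU.
    """
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    return f"{round(cpus * period_us)} {period_us}"


def write_limits(group_dir: Path, cpus: float, memory_bytes: int, max_pids: int) -> None:
    """Write CPU, memory, and PID limits into a cgroup directory.

    Assumption: group_dir is a cgroup v2 directory (hypothetical example:
    /sys/fs/cgroup/demo). On cgroupfs, creating the directory creates the
    group and the kernel provides the interface files.
    """
    group_dir.mkdir(parents=True, exist_ok=True)
    (group_dir / "cpu.max").write_text(cpu_max_value(cpus) + "\n")
    (group_dir / "memory.max").write_text(f"{memory_bytes}\n")
    (group_dir / "pids.max").write_text(f"{max_pids}\n")
```

A process is then placed under these limits by writing its PID to the group's `cgroup.procs` file.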
Namespaces
• CGroup – Cgroup root directory
• IPC – System V IPC objects and POSIX message queues
• Network (net) – Network devices, stacks, and ports
• Mount – Mount points can be private or shared
• Process ID (pid) – Only see PIDs in the same PID namespace
• User ID (user) – Mapping of UIDs and GIDs
• UTS – Hostname and domain name, set per namespace
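One way to see which namespaces a process belongs to is to read the symbolic links under `/proc/<pid>/ns`, whose targets look like `pid:[4026531836]`. Two processes are in the same namespace when the inode numbers match. The sketch below assumes a Linux `/proc` layout; the function names and the `proc_root` parameter are hypothetical and exist only for illustration.

```python
import os
import re
from typing import Dict, Tuple

# Link targets under /proc/<pid>/ns have the form "<type>:[<inode>]".
_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target: str) -> Tuple[str, int]:
    """Split a namespace link target into (namespace type, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def namespaces_of(pid: str = "self", proc_root: str = "/proc") -> Dict[str, int]:
    """Map each entry in <proc_root>/<pid>/ns to its namespace inode."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        result[entry] = inode
    return result
```

Comparing `namespaces_of("self")` with the result for a containerized process shows which namespaces the two share and which differ.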
Copy on Write
• Reduces memory footprint
• Helps to reduce container boot times
• Details:
  • A memory “resource” can be shared as long as it is only read
  • The copy of the data is deferred until the first write
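The two details above, sharing while data is only read and deferring the copy until the first write, can be illustrated with a toy model. This is only a sketch of the idea, not how the kernel or a container filesystem implements it; the `CowView` class and the page names are hypothetical.

```python
from typing import Dict, Set


class CowView:
    """A per-container view over shared, read-only pages."""

    def __init__(self, shared_pages: Dict[str, bytes]) -> None:
        self._shared = shared_pages                 # shared by every view, never modified
        self._private: Dict[str, bytearray] = {}    # copies made on first write

    def read(self, name: str) -> bytes:
        """Read the private copy if one exists, else the shared page."""
        if name in self._private:
            return bytes(self._private[name])
        return self._shared[name]

    def write(self, name: str, offset: int, data: bytes) -> None:
        """Copy the page on the first write, then modify only the copy."""
        if name not in self._private:
            self._private[name] = bytearray(self._shared[name])  # deferred copy
        self._private[name][offset:offset + len(data)] = data

    def copied(self) -> Set[str]:
        """Names of pages this view has had to copy."""
        return set(self._private)
```

Two views created over the same pages cost no extra memory until one of them writes, and a write in one view is never visible in the other.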
Other Resources: Zones
• Oracle: System Administration Guide: Oracle Solaris Containers-Resource Management and Oracle Solaris Zones
  https://docs.oracle.com/cd/E19044-01/sol.containers/817-1592/zones.intro-1/index.html
  https://docs.oracle.com/cd/E19253-01/817-1592/zone/index.html
• Brendan Gregg: Documentation: Zones
  http://www.brendangregg.com/zones.html#resource0
Other Resources: Containers
• Jerome Petazzoni: Anatomy of a Container: Namespaces, cgroups & Some Filesystem Magic - LinuxCon
  https://www.slideshare.net/jpetazzo/anatomy-of-a-container-namespaces-cgroups-some-filesystem-magic-linuxcon
• Jessie Frazelle: Containers from User Space (LinuxConfAU 2018)
  https://docs.google.com/presentation/d/1UuHvR_kvZ3BF1pSXyv4mMKX9vmGr7GXm97USx7mzTXY/
• Julia Evans: What even is a container
  https://jvns.ca/blog/2016/10/10/what-even-is-a-container/
• Red Hat: Managing system resources on Red Hat Enterprise Linux 6 & 7
  https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html-single/resource_management_guide/index
  https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/resource_management_guide/index
• Akihiro Suzuki: Real-Time Task Partitioning using Cgroups
  https://elinux.org/images/8/84/Real-Time_Tasks_Partitioning_using_Cgroups.pdf