virtualization method. Tweet by @nixcraft (The Best Linux Blog In the Unixverse), 11:57 PM · Aug 10, 2018: History of containers on unix like system: 1. chroot (1982) 2. FreeBSD jails (2000) 3. Linux vserver (2001) 4. Solaris zones (2004) 5. OpenVZ (2005) 6. LXC (2008) 7. Systemd-nspawn (2010) 8. Docker (2013) #sysadmin #linux #unix #macos #devops
instances. But what if you want to use an alternate kernel? Containers offer no kernel extensibility (latest kernel, out-of-tree modules). What if you wish to avoid a kernel crash caused by your app? Use a VM? ref: https://www.redhat.com/cms/managed-files/virtualization-vs-containers.png
qemu (lightweight version), or Firecracker. OCI runtime: runv. Runs an independent kernel in each container instance (isolation). Runs a (small) guest Linux kernel (compatibility). Still has slight overhead. Compatibility: ++, Portability: ++, Lightweight: -. ref: https://katacontainers.io/
(Windows/macOS). Small Linux VM. Runs (most of) the components in Linux. Goal: transparent usage from the host OS. Useful for development environments. Compatibility: ++, Portability: ++, Lightweight: -. ref: https://docs.docker.com/docker-for-mac/images/docker-for-mac-install.png
a userspace process. Upstream (since kernel 2.2.x?). Supports i386/x86_64 Linux hosts; experimental ppc/Linux and Windows hosts (not maintained). ptrace-based syscall interposition, hence less portable. Compatibility: ++, Portability: -, Lightweight: +/-
(liblkl.{so,a}). Run Linux code in various ways with a reusable library. h/w dependent layer: Linux/Windows/FreeBSD/macOS/Android userspace, unikernel, UEFI, network simulator (ns-3). Code: 2.4 KLoC (h/w independent), 6.6 KLoC (h/w dependent).
it's hard to upstream). Almost 30 years old (since 1991). Still growing at a rapid pace (new features + bug fixes). We don't want to rewrite it from scratch. Reuse instead of rewrite (NetBSD rump kernel).
(containerd/dockerd port (macOS)). Types of images: runu-private image (statically linked LKL application); public image (e.g., alpine:latest) (libc replacement).
Run Docker images without Hypervisor.framework, as Mach-O (user space) programs. Programs other than the container image are Mach-O binaries; syscalls are invoked inside the LKL-ed programs. Benefits: a native experience while doing Linux. Currently only x86_64 works (both the Mac and the container image); Apple Silicon may work with slight effort.
: How it works
0. (Mach-O) Run LKL as the init process
1. (Mach-O) (v)fork/execve the Linux ELF binary
2. (ELF) The interpreter (musl+) loads the (downloaded) ELF program
3. (ELF) Call the main() function
4. (ELF) syscall => LKL syscall (libc replacement)
5. (Mach-O) Handle the LKL syscall from the ELF
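The syscall path in steps 4 and 5 can be pictured with a toy model. This is only a sketch under stated assumptions, not LKL's or runu's actual code: `ToyLibraryKernel`, `ToyLibc`, and `guest_main` are hypothetical names, and the handlers implement a tiny in-memory filesystem purely for illustration. The point it tries to show is that the replaced libc turns each syscall into an ordinary function call into the library kernel instead of a trap into the host kernel.

```python
ENOSYS = 38  # Linux errno: function not implemented
EBADF = 9    # Linux errno: bad file descriptor


class ToyLibraryKernel:
    """Hypothetical stand-in for the library kernel (the Mach-O side)."""

    def __init__(self):
        self.files = {}    # path -> bytearray, a toy in-memory filesystem
        self.fds = {}      # file descriptor -> path
        self.next_fd = 3   # 0-2 are conventionally stdin/stdout/stderr
        self.handlers = {
            "open": self._sys_open,
            "write": self._sys_write,
            "getpid": self._sys_getpid,
        }

    def syscall(self, name, *args):
        """Step 5: handle a syscall arriving from the guest as a plain call."""
        handler = self.handlers.get(name)
        if handler is None:
            return -ENOSYS
        return handler(*args)

    def _sys_open(self, path):
        self.files.setdefault(path, bytearray())
        fd = self.next_fd
        self.next_fd += 1
        self.fds[fd] = path
        return fd

    def _sys_write(self, fd, data):
        path = self.fds.get(fd)
        if path is None:
            return -EBADF
        self.files[path].extend(data)
        return len(data)

    def _sys_getpid(self):
        return 1  # step 0: the library kernel runs as the init process


class ToyLibc:
    """Hypothetical libc replacement (the ELF side): no trap, just a call."""

    def __init__(self, kernel):
        self._kernel = kernel

    def open(self, path):
        return self._kernel.syscall("open", path)   # step 4

    def write(self, fd, data):
        return self._kernel.syscall("write", fd, data)

    def getpid(self):
        return self._kernel.syscall("getpid")


def guest_main(libc):
    """Step 3: the guest program's main(), unaware of where syscalls go."""
    fd = libc.open("/hello.txt")
    return libc.write(fd, b"hello from the guest\n")


kernel = ToyLibraryKernel()              # step 0
written = guest_main(ToyLibc(kernel))    # steps 1-3, collapsed into one call
```

After this runs, `kernel.files["/hello.txt"]` holds the bytes the guest wrote, and a syscall the toy kernel does not implement, such as `kernel.syscall("fork")`, returns `-ENOSYS`.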
process until children exit. No glibc-based image support (will work on). libc replacement doesn't work with static binaries. No x86_64 (will work on).
exit) socket(2)/listen(2), to be ready to accept HTTP requests. Measured with: time docker run --runtime=XXX python-hello. Ordering: native < µKontainer < docker/runc < gvisor < nabla < kata
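A measurement like the one above could be scripted roughly as follows. This is a minimal sketch, not the harness used for the talk: `time_command` and `docker_run_argv` are hypothetical helpers, and the runtime name passed to `--runtime` (here `runu`) is an assumption about how a runtime is registered with dockerd. It measures wall-clock time until the command exits, which is only one of the possible end points for such a benchmark.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Return the wall-clock seconds argv takes to run to completion."""
    start = time.perf_counter()
    subprocess.run(
        argv,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def docker_run_argv(runtime, image="python-hello"):
    """Build `docker run --runtime=<runtime> <image>`; None keeps the default."""
    argv = ["docker", "run"]
    if runtime is not None:
        argv.append(f"--runtime={runtime}")
    argv.append(image)
    return argv


# Timing the local interpreter stands in for the "native" baseline here,
# so the sketch runs even on a machine without Docker installed.
native_seconds = time_command([sys.executable, "-c", "pass"])
runu_argv = docker_run_argv("runu")  # "runu" as a registered name is assumed
```

On a host with Docker and the runtimes installed, `time_command(docker_run_argv(name))` could be repeated for each runtime name and the medians compared.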
(p-t-p): native (host kernel) == runu. Factors for better performance of runu: low syscall overhead (µKontainer); help from offload features (TSO, checksum).
on LKML (2015). Recently restarted (Oct. 2019) as a mode of UML (UMMODE=library). 1st step: eliminate duplicated features (devices); still ongoing. Latest: v5 patch (July 2020).