What are linux containers?

zeynel
February 21, 2019

What Linux containers are and how to build one from scratch using Go


Transcript

  1. Agenda
    • Virtualization and Containers
    • Brief History of Containers
    • Linux Container Internals (namespaces, cgroups, capabilities)
    • Union Filesystem
    • Demo
  2. What is Virtualization?
    A layer of abstraction for emulating or simulating various resources.
    Why? Isolation, scalability, utilization, reducing costs, compatibility…
  3. • Hardware virtualization (virtual machine, virtual memory?)
    • Application virtualization (JVM)
    • Operating system level virtualization (Containers)
    • Network virtualization
    • …
  4. What is a Container?
    A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.
  5. [Diagram]
    • Infrastructure / Hardware
    • Operating System (kernel, libraries, system programs, configurations, etc.)
    • Processes: CHROME (PID, env, address space…), SLACK, CONTAINER, CONTAINER
    Containers are just regular processes with some isolation and security features.
  6. Brief history of containers
    • 1979: chroot
    • 2000: FreeBSD Jails
    • 2001: Linux VServer
    • 2004: Solaris Zones
    • OpenVZ
    • 2008: LXC
    • 2013: Docker
    • rkt
    • 2018: Kata Containers
  7. Containers are combinations of many different technologies:
    Linux namespaces, cgroups, Linux capabilities, bridge networks, union filesystems…
  8. 1) Namespaces
    They allow for isolation of global system resources between independent processes.
    History: Linux namespaces originated in 2002 in the 2.4.19 kernel with work on the mount namespace.
    “What happens in a namespace stays in a namespace”
  9. Types of Namespaces
    • Mount: isolates the set of filesystem mount points seen by a group of processes
    • UTS: isolates the domain name and host name
    • IPC: isolates certain interprocess communication resources (semaphores, queues…)
    • PID: isolates the PID number space
    • Network: isolates network-related system resources (network devices, IP addresses, ports…)
    • User: isolates the user and group ID number spaces
    • Cgroup: hides the identity of the control group of which the process is a member
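
The namespace types above can be inspected for a running process through the symlinks under /proc/self/ns. The following is a minimal sketch rather than anything from the deck: it assumes a Linux host with procfs mounted, and the helper name parseNSLink is made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseNSLink (hypothetical helper) splits a /proc/<pid>/ns symlink target
// such as "pid:[4026531836]" into the namespace type and its inode number.
func parseNSLink(target string) (nsType string, inode uint64, err error) {
	open := strings.Index(target, ":[")
	if open < 0 || !strings.HasSuffix(target, "]") {
		return "", 0, fmt.Errorf("unexpected namespace link %q", target)
	}
	inode, err = strconv.ParseUint(target[open+2:len(target)-1], 10, 64)
	if err != nil {
		return "", 0, err
	}
	return target[:open], inode, nil
}

func main() {
	// One entry per namespace type from the list above (Linux only).
	for _, name := range []string{"mnt", "uts", "ipc", "pid", "net", "user", "cgroup"} {
		target, err := os.Readlink("/proc/self/ns/" + name)
		if err != nil {
			fmt.Println(name, "unavailable:", err)
			continue
		}
		nsType, inode, err := parseNSLink(target)
		if err != nil {
			fmt.Println(name, "error:", err)
			continue
		}
		fmt.Printf("%-6s inode %d\n", nsType, inode)
	}
}
```

Two processes that report the same inode for a given type share that namespace; a different inode means they are isolated from each other for that resource.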
  10. • When a Linux kernel boots up, it creates a default namespace for each type, used by all processes.
    • Processes can create additional namespaces with the unshare command or by passing CLONE_NEW* flags to the clone syscall.
    • The nsenter command can be used to enter an existing namespace.
    P.S. Google Chrome makes use of namespaces to isolate its own processes, which are at risk of attack from the internet.
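
In Go, the clone flags mentioned above can be passed through syscall.SysProcAttr when starting a child process. This is a hedged sketch, not the deck's demo code: the choice of three namespaces, the constant isolationFlags, and the helper namespacedCommand are assumptions made for illustration, and the program only builds on Linux.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// isolationFlags (illustrative choice) asks clone(2) for new UTS, PID and
// mount namespaces.
const isolationFlags = syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS

// namespacedCommand (hypothetical helper) prepares, but does not start, a
// command whose child process will be created in fresh namespaces.
func namespacedCommand(name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{Cloneflags: isolationFlags}
	return cmd
}

func main() {
	// The shell prints its own PID as seen from inside the new PID namespace.
	cmd := namespacedCommand("/bin/sh", "-c", "echo pid inside: $$")
	if err := cmd.Run(); err != nil {
		fmt.Println("could not start (root or CAP_SYS_ADMIN is usually required):", err)
	}
}
```

Creating these namespaces normally needs root or CAP_SYS_ADMIN, so an unprivileged run is expected to end in the error branch.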
  11. [Figure: a parent PID namespace containing a child PID namespace, showing the PID tree as seen from the parent and the PID tree as seen from the child PID namespace]
  12. User namespace
    Users on Host OS
      UID   GID
      0     0
      1     1
      …     …
      503   20
      509   509
    Users on Container
      UID   GID
      0     0
      …     …
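
The relationship between the two tables can be sketched as an ID translation. The kernel describes it in /proc/<pid>/uid_map as lines of "inside outside length"; Go exposes the same idea through the UidMappings field of syscall.SysProcAttr. The type idMapping, the function toHostID, and the mapping of container UID 0 to host UID 503 below are hypothetical, chosen only to illustrate the lookup.

```go
package main

import "fmt"

// idMapping mirrors one line of /proc/<pid>/uid_map: IDs in
// [Inside, Inside+Length) inside the namespace correspond to
// [Outside, Outside+Length) in the parent namespace.
type idMapping struct {
	Inside, Outside, Length int
}

// toHostID translates an ID seen inside the user namespace into the ID the
// parent namespace sees. ok is false when the ID has no mapping.
func toHostID(mappings []idMapping, id int) (hostID int, ok bool) {
	for _, m := range mappings {
		if id >= m.Inside && id < m.Inside+m.Length {
			return m.Outside + (id - m.Inside), true
		}
	}
	return 0, false
}

func main() {
	// Hypothetical mapping: root in the container is UID 503 on the host.
	mappings := []idMapping{{Inside: 0, Outside: 503, Length: 1}}
	for _, id := range []int{0, 1} {
		if host, ok := toHostID(mappings, id); ok {
			fmt.Printf("container UID %d -> host UID %d\n", id, host)
		} else {
			fmt.Printf("container UID %d is unmapped\n", id)
		}
	}
}
```

With a mapping like this, a process can be root inside the container while remaining an ordinary user on the host.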
  13. 2) Cgroups
    Control groups are a Linux kernel feature that allows processes to be organized into hierarchical groups whose usage of various types of resources can then be limited and monitored.
    History: cgroups were originally developed by Google and merged into the Linux kernel in 2008.
  14. Cgroups allow you to allocate resources (such as CPU time, system memory, network bandwidth, storage I/O, or combinations of these) among processes (or threads) running on a system.
    In other words:
  15. • Resource limiting: limit the memory usage of a process to 100 MB
    • Prioritisation: some groups may get a larger share of CPU utilization
    • Accounting: measure a group's resource usage
    • Control: stop, freeze or restart a group of processes
    Group Profile 1
      Group Profile
        • 60% CPU
        • 5 GB Memory
        • 90% Network
        • 70% blkio
      Applications
        • NGINX, postgresql, httpd…
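
The 100 MB resource limit can be sketched by writing to the cgroup filesystem. This example assumes a cgroup v1 host with the memory controller mounted at /sys/fs/cgroup/memory (cgroup v2 uses a different layout and file names); the group name "demo" and the helpers limitBytes and limitMemory are hypothetical. It only touches the system when started with the argument "apply", and that path needs root.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// Assumed cgroup v1 memory controller mount point.
const memoryRoot = "/sys/fs/cgroup/memory"

// limitBytes converts a limit given in MB (treated as MiB here) to bytes.
func limitBytes(mb int64) int64 { return mb * 1024 * 1024 }

// limitMemory creates the group, sets its memory limit, and moves pid into it.
func limitMemory(root, group string, mb int64, pid int) error {
	dir := filepath.Join(root, group)
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	limit := strconv.FormatInt(limitBytes(mb), 10)
	if err := os.WriteFile(filepath.Join(dir, "memory.limit_in_bytes"), []byte(limit), 0o644); err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(dir, "cgroup.procs"), []byte(strconv.Itoa(pid)), 0o644)
}

func main() {
	fmt.Println("100 MB =", limitBytes(100), "bytes")
	if len(os.Args) > 1 && os.Args[1] == "apply" {
		// "demo" is a hypothetical group name.
		if err := limitMemory(memoryRoot, "demo", 100, os.Getpid()); err != nil {
			fmt.Println("could not apply limit:", err)
		}
	}
}
```

Once a PID is written to cgroup.procs, the kernel accounts that process's memory against the group's limit.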
  16. cpu
      ├── 1
      ├── 100
      ├── 2
      ├── browsers
      │   ├── 44
      │   ├── 45
      │   └── 47
      └── important
          ├── 60
          ├── 61
          ├── containers
          │   ├── 200
          │   ├── 202
          │   └── 204
          └── scripts
              ├── 604
              └── 800
    • The process with PID 1 belongs to the root cpu cgroup
    • PID 60 belongs to the important cgroup
    • PID 202 belongs to the important/containers cgroup
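
A process can look up its own place in such a hierarchy by reading /proc/self/cgroup, where each line has the form "hierarchy-ID:controllers:path". The sketch below assumes a Linux host; the helper cgroupPath is made up for illustration, and on a cgroup v2 host the controller field is empty, so the lookup reports no match.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// cgroupPath (hypothetical helper) scans /proc/<pid>/cgroup content and
// returns the cgroup path recorded for the given controller.
func cgroupPath(procCgroup, controller string) (string, bool) {
	for _, line := range strings.Split(strings.TrimSpace(procCgroup), "\n") {
		fields := strings.SplitN(line, ":", 3)
		if len(fields) != 3 {
			continue
		}
		// The middle field may list several controllers, e.g. "cpu,cpuacct".
		for _, c := range strings.Split(fields[1], ",") {
			if c == controller {
				return fields[2], true
			}
		}
	}
	return "", false
}

func main() {
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		fmt.Println("cannot read cgroup membership:", err)
		return
	}
	if path, ok := cgroupPath(string(data), "cpu"); ok {
		fmt.Println("cpu cgroup:", path)
	} else {
		fmt.Println("no cpu controller entry (possibly a cgroup v2 host)")
	}
}
```

For the process with PID 202 in the tree above, the cpu line would be expected to end in /important/containers.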
  17. Container Security
    • Traditional UNIX has a very simple permission check
    • Processes are either privileged (root) or unprivileged (non-root users)
    • The root user (UID 0) is too powerful and therefore dangerous
    • Other users have very restricted access (can't open a raw socket, load a kernel module, etc.)
  18. 3) Linux Capabilities
    • Break up root privileges into distinct units, known as capabilities
    • They can be assigned to processes independently
    • Parent processes might pass capabilities to their children
    • There are around 40 capabilities in the current Linux kernel
  19. Examples of Linux Capabilities
    • CAP_CHOWN: Make arbitrary changes to file UIDs and GIDs
    • CAP_KILL: Bypass permission checks for sending signals
    • CAP_NET_RAW: Use RAW and PACKET sockets
    • CAP_SYS_BOOT: Use reboot
    P.S. The child process created by clone() with the CLONE_NEWUSER flag starts out with a complete set of capabilities in the new user namespace.
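
Whether a process holds the capabilities listed above can be checked from the CapEff bitmask in /proc/self/status. This is a minimal sketch assuming a Linux host; the bit numbers follow linux/capability.h, while the helpers hasCap and effectiveCaps are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Capability bit numbers as defined in <linux/capability.h>.
const (
	capChown   = 0
	capKill    = 5
	capNetRaw  = 13
	capSysBoot = 22
)

// hasCap reports whether capability bit n is set in the mask.
func hasCap(mask uint64, n uint) bool { return mask&(1<<n) != 0 }

// effectiveCaps extracts the CapEff hex mask from /proc/<pid>/status text.
func effectiveCaps(status string) (uint64, error) {
	for _, line := range strings.Split(status, "\n") {
		if strings.HasPrefix(line, "CapEff:") {
			hex := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
			return strconv.ParseUint(hex, 16, 64)
		}
	}
	return 0, fmt.Errorf("CapEff line not found")
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read process status:", err)
		return
	}
	mask, err := effectiveCaps(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	caps := []struct {
		name string
		bit  uint
	}{{"CAP_CHOWN", capChown}, {"CAP_KILL", capKill}, {"CAP_NET_RAW", capNetRaw}, {"CAP_SYS_BOOT", capSysBoot}}
	for _, c := range caps {
		fmt.Printf("%-12s %v\n", c.name, hasCap(mask, c.bit))
	}
}
```

An ordinary unprivileged process typically has an empty effective set, while root outside a container usually holds every bit.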
  20. Other security features
    • Seccomp
    • AppArmor/SELinux
    • TOMOYO
    • Nested Containers
    • Hardware assisted containerization
    • …
  21. 4) Union File System
    • Combines multiple file systems together to create a single unified filesystem
    • Docker uses it to layer images

      $ docker pull python
      Using default tag: latest
      latest: Pulling from library/python
      b6f892c0043b: Pull complete
      55010f332b04: Pull complete
      2955fb827c94: Pull complete
      3deef3fcbd30: Pull complete
      cf9722e506aa: Pull complete
      Digest: sha256:382452f82a8bbd34443b2c727650af46aced0f94a44463c62a9848133ecb1aa8
      Status: Downloaded newer image for python:latest
  22. • Base Layer (debian:jessie): /bin /lib /dev /etc …
    • Python Layer: /bin/python /bin/pip /lib/libc.so.6 /lib/… …
    • Resulting File System: /bin /lib /dev /etc /usr /tmp …
  23. [Figure: FILE1 to FILE5 spread across Layer 1, Layer 2 and Layer 3, merged into the single filesystem view seen by the container]
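
The layering idea can be illustrated with a toy model in which each layer is a map from path to content and upper layers shadow lower ones. This is only a sketch of the lookup rule: real union filesystems such as OverlayFS also handle deletions and copy-on-write, which are not modeled here, and the layer contents below are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// layer maps a file path to a label describing where the file came from.
type layer map[string]string

// union merges layers given bottom-first: when a path exists in several
// layers, the entry from the topmost layer wins.
func union(layers ...layer) layer {
	merged := layer{}
	for _, l := range layers {
		for path, content := range l {
			merged[path] = content
		}
	}
	return merged
}

func main() {
	// Hypothetical layer contents, loosely following the base/python example.
	base := layer{"/bin/sh": "base", "/etc/hostname": "base"}
	python := layer{"/bin/python": "python", "/etc/hostname": "python"}

	view := union(base, python)
	paths := make([]string, 0, len(view))
	for p := range view {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Printf("%s (from %s layer)\n", p, view[p])
	}
}
```

The container sees one merged tree, while each layer stays unchanged and can be shared between images.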