Containers from Scratch: what are they made from?

Talk from the Docker meetup in Jakarta, June 2018. Presents and demonstrates the Linux kernel features that enable container runtimes: chroot, namespaces, cgroups, and capabilities.

Giri Kuncoro

June 28, 2018

Transcript

  1. Ingredients: Raspberry Pi A+ 256 MB, Adafruit FONA - Mini GSM Breakout, GSM Antenna, Electret Microphone, 1200 mAh Lithium Ion Battery
  2. Ingredient #1: Container Image
     Buildroot: http://www.buildroot.org/
     Debootstrap: https://wiki.debian.org/Debootstrap
     YUM / DNF
     Gentoo: https://www.gentoo.org/downloads/
     Buildah: https://github.com/projectatomic/buildah
  3. $ mkdir rootfs
     $ sudo dnf -y \
         --installroot=$PWD/rootfs \
         --releasever=24 install \
         @development-tools \
         procps-ng \
         python3 \
         which \
         iproute \
         net-tools
     $ ls rootfs
  4. Ingredient #2: chroot
     Execute a process in our container filesystem
     chroot(2): http://man7.org/linux/man-pages/man2/chroot.2.html
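
As a rough sketch of this step, the shell functions below check that a directory looks like a usable root filesystem and then start a shell inside it with chroot(1). The names `rootfs_ready` and `enter_rootfs` are made up for illustration, and the sketch assumes the `rootfs` directory built with dnf on the previous slide contains `/bin/bash`.

```shell
# Hypothetical helpers (names are illustrative, not from the talk).

# Succeed only if DIR contains an executable /bin/bash to run after chroot.
rootfs_ready() {
    [ -d "$1" ] && [ -x "$1/bin/bash" ]
}

# Start an interactive shell whose "/" is DIR. chroot needs root
# privileges (CAP_SYS_CHROOT), hence sudo.
enter_rootfs() {
    if ! rootfs_ready "$1"; then
        echo "no usable /bin/bash under $1" >&2
        return 1
    fi
    sudo chroot "$1" /bin/bash
}
```

A possible invocation is `enter_rootfs "$PWD/rootfs"`. Inside the resulting shell, `ls /` lists only what was installed under `rootfs`, but the process still sees the host's process tree and network, which is what the namespace slides address next.
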
  5. Ingredient #3: namespaces
     Limit the “view” of a container:
     Process namespace (pid)
     Network namespace (net)
     Mount namespace (mnt)
     https://en.wikipedia.org/wiki/Linux_namespaces
  6. Ingredient #3: namespaces
     chroot of other systems:
     Process trees
     Network interfaces
     Mount volumes
     clone(2): http://man7.org/linux/man-pages/man2/clone.2.html
     unshare(2): http://man7.org/linux/man-pages/man2/unshare.2.html
  7. Ingredient #4: enter namespaces
     Namespaces are composable
     Example: Kubernetes pod
     setns(2): http://man7.org/linux/man-pages/man2/setns.2.html
  8. # PID=321
     # ls /proc/$PID/ns
     cgroup ipc mnt net pid user uts
     # nsenter \
         --pid=/proc/$PID/ns/pid \
         --mnt=/proc/$PID/ns/mnt \
         chroot $PWD/rootfs /bin/bash
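
To make the `/proc/$PID/ns` listing more concrete: each entry there is a symbolic link whose target looks like `pid:[4026531836]`, and two processes are in the same namespace when the bracketed inode numbers match. The sketch below compares them. `ns_inode` and `same_ns` are hypothetical helper names, and the inode in the comment is only an example value.

```shell
# Hypothetical helpers (names are illustrative, not from the talk).

# Extract the inode number from a namespace link target,
# e.g. "pid:[4026531836]" becomes "4026531836".
ns_inode() {
    link=$1
    inode=${link#*\[}      # drop everything up to and including "["
    inode=${inode%\]}      # drop the trailing "]"
    printf '%s\n' "$inode"
}

# same_ns PID1 PID2 TYPE
# Succeed if both processes are in the same namespace of the given
# TYPE (pid, net, mnt, ...). Returns 2 if a link cannot be read,
# which happens without sufficient privileges over the target process.
same_ns() {
    a=$(readlink "/proc/$1/ns/$3") || return 2
    b=$(readlink "/proc/$2/ns/$3") || return 2
    [ "$(ns_inode "$a")" = "$(ns_inode "$b")" ]
}
```

For example, running `same_ns $$ "$PID" pid` as root before and after the `nsenter` command above is one way to check whether the shell has actually joined the target's PID namespace.
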
  9. Ingredient #5: volume mounts
     Inject files into our chroot
     $ docker run -d \
         --name=nginxtest \
         -v nginx-vol:/usr/share/nginx/html \
         nginx:latest
  10. apiVersion: v1
      kind: Pod
      metadata:
        name: test-pd
      spec:
        containers:
        - image: k8s.gcr.io/test-webserver
          name: test-container
          volumeMounts:
          - mountPath: /test-pd
            name: test-volume
        volumes:
        - name: test-volume
          hostPath:
            path: /data
  11. # ls /sys/fs/cgroup
      # mkdir /sys/fs/cgroup/memory/demo
      # echo $$ > /sys/fs/cgroup/memory/demo/tasks
      # cat /proc/self/cgroup
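
Each line of `/proc/self/cgroup` has the form `hierarchy-ID:controller-list:path`. As a minimal sketch, assuming that cgroup v1 layout, a helper (hypothetically named `cgroup_path`) can pull out the path for one controller, which is a quick way to confirm that writing the shell's PID into `tasks` took effect.

```shell
# Hypothetical helper (name is illustrative, not from the talk).
# Reads /proc/<pid>/cgroup content on stdin and prints the cgroup path
# of the first hierarchy that lists CONTROLLER.
#   cgroup_path CONTROLLER < /proc/self/cgroup
cgroup_path() {
    awk -F: -v want="$1" '
        {
            # Field 2 is a comma-separated controller list, e.g. "cpu,cpuacct".
            n = split($2, ctrl, ",")
            for (i = 1; i <= n; i++) {
                if (ctrl[i] == want) {
                    # Print everything after the second colon (the path).
                    sub(/^[^:]*:[^:]*:/, "")
                    print
                    exit
                }
            }
        }'
}
```

After the commands on the slide, `cgroup_path memory < /proc/self/cgroup` would be expected to report the `demo` group for the memory controller, while other controllers still report their original paths.
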
  12. Ingredient #7: cgroup namespace
      Q: How do you restrict a process from reassigning its cgroup?
      A: More namespaces!
  13. # (how to remove cgroups: reassign)
      # echo $$ > /sys/fs/cgroup/memory/tasks
      # rmdir /sys/fs/cgroup/memory/demo
  14. Ingredient #8: capabilities
      “Docker is about running random code downloaded from Internet and running it as root” - Dan Walsh (Red Hat)
  15. Ingredient #8: capabilities
      SELinux, seccomp, AppArmor should’ve been covered
      Show Linux capabilities instead
      http://man7.org/linux/man-pages/man7/capabilities.7.html
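
As a small illustration of how capabilities are represented, the kernel reports a process's effective set as a hexadecimal bitmask on the `CapEff:` line of `/proc/<pid>/status`, with one bit per capability. The sketch below tests a single bit. `has_cap` and `current_caps` are hypothetical helper names, the bit numbers come from `<linux/capability.h>` (for example CAP_SYS_CHROOT is 18 and CAP_SYS_ADMIN is 21), and the sample mask in the note that follows is an illustrative value, not output from the talk's demo.

```shell
# Hypothetical helpers (names are illustrative, not from the talk).

# has_cap MASK BIT
# Succeed if capability number BIT is set in the hex bitmask MASK
# (MASK is given without a leading "0x", as it appears in /proc).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Print the effective capability mask of the current shell.
current_caps() {
    awk '/^CapEff:/ { print $2 }' "/proc/$$/status"
}
```

With a mask such as `00000000a80425fb`, `has_cap 00000000a80425fb 18` succeeds (chroot is permitted) while `has_cap 00000000a80425fb 21` fails (no CAP_SYS_ADMIN). To check the live process instead, something like `has_cap "$(current_caps)" 21` can be used.
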
  16. Ingredient #9: network namespace
      Huge topic, will do simple demo for now
      For the impatient, probably next talk: https://github.com/girikuncoro/netns-demo
  17. $ sudo ip link add veth0 type veth peer name veth1
      $ sudo ip link set veth1 netns $PID
      $ sudo ip address add 10.1.1.2/24 dev veth0
      $ sudo ip link set dev veth0 up
      # (inside namespace)
      # ip address add 10.1.1.3/24 dev veth1
      # ip link set dev veth1 up
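
The sequence above can also be wrapped so that the PID and addresses become parameters. The sketch below only prints the commands (a dry run) so they can be reviewed before being run as root. `veth_plan` is a hypothetical name, and using `nsenter --target PID --net` for the in-namespace half is an assumption, since the slide does not show how the namespace was entered.

```shell
# Hypothetical helper (name is illustrative, not from the talk).
# veth_plan PID HOST_CIDR NS_CIDR
# Print the commands that create a veth pair, move one end into the
# network namespace of PID, and assign an address to each end.
veth_plan() {
    pid=$1
    host_cidr=$2
    ns_cidr=$3
    cat <<EOF
ip link add veth0 type veth peer name veth1
ip link set veth1 netns $pid
ip address add $host_cidr dev veth0
ip link set dev veth0 up
nsenter --target $pid --net ip address add $ns_cidr dev veth1
nsenter --target $pid --net ip link set dev veth1 up
EOF
}
```

For the values on the slide this would be `veth_plan "$PID" 10.1.1.2/24 10.1.1.3/24`, and the printed plan could then be piped to `sudo sh -e` once it looks right.
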
  18. Conclusion
      Containers are a combination of Linux kernel features
      Docker, rkt, lxc (container runtimes) are just opinionated wrappers around these
  19. References
      Containers from scratch, Eric Chiang https://ericchiang.github.io/post/containers-from-scratch/
      Building minimal containers, Brian Redbeard https://github.com/brianredbeard/minimal_containers
      Namespaces in operation, Michael Kerrisk https://lwn.net/Articles/531114/
      cgroups v1, Paul Menage https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
      Bocker, Docker implemented in 100 lines of bash https://github.com/p8952/bocker