Namespace.go

Cgroups and namespaces are the shoes and shorts of the container race, in no particular order. They have been around for a while, but few people appreciate how they are used or how powerful they are. This talk is a collection of Go cookbook recipes that show how to work your way up to a container using these constructs.

Piyush Verma

February 16, 2018

Transcript

  1. Namespace.go

    Piyush Verma, Oogway Consulting
    @meson10 http://oogway.in

  2. Whoami

    xps:~$ whoami
    meson10

  3. What’s a namespace?

    Namespaces are a fundamental aspect of containers on Linux.
  4. Containers

  5. Types of namespace

    • Mount
    • UTS
    • IPC
    • PID
    • Network
    • User
    • Cgroup
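Each type in the list above is requested with its own clone(2) flag. A minimal sketch of that mapping follows. The names `nsFlags`, `cloneFlags` and `cloneNewCgroup` are hypothetical helpers for illustration and do not come from the talk; `cloneNewCgroup` is declared locally with the value from the Linux headers, on the assumption that the standard `syscall` package may not export it.

```go
package main

import (
	"fmt"
	"syscall"
)

// cloneNewCgroup mirrors CLONE_NEWCGROUP from <linux/sched.h>. It is declared
// locally so the sketch does not depend on the syscall package exporting it.
const cloneNewCgroup = 0x02000000

// nsFlags maps short namespace names to the clone(2) flag that creates one.
var nsFlags = map[string]uintptr{
	"mnt":    syscall.CLONE_NEWNS,
	"uts":    syscall.CLONE_NEWUTS,
	"ipc":    syscall.CLONE_NEWIPC,
	"pid":    syscall.CLONE_NEWPID,
	"net":    syscall.CLONE_NEWNET,
	"user":   syscall.CLONE_NEWUSER,
	"cgroup": cloneNewCgroup,
}

// cloneFlags ORs together the flags for the requested namespaces.
func cloneFlags(names ...string) (uintptr, error) {
	var flags uintptr
	for _, n := range names {
		f, ok := nsFlags[n]
		if !ok {
			return 0, fmt.Errorf("unknown namespace %q", n)
		}
		flags |= f
	}
	return flags, nil
}

func main() {
	flags, err := cloneFlags("uts", "pid", "mnt")
	if err != nil {
		panic(err)
	}
	fmt.Printf("Cloneflags: %#x\n", flags)
}
```

The result is the kind of value that the later slides assign to `SysProcAttr.Cloneflags`.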
  6. Where are they?

    meson10@xps:$ ls -al /proc/10504/ns/
    total 0
    lrwxrwxrwx 1 cgroup -> cgroup:[4026531835]
    lrwxrwxrwx 1 ipc -> ipc:[4026531839]
    lrwxrwxrwx 1 mnt -> mnt:[4026531840]
    lrwxrwxrwx 1 net -> net:[4026532009]
    lrwxrwxrwx 1 pid -> pid:[4026531836]
    lrwxrwxrwx 1 user -> user:[4026531837]
    lrwxrwxrwx 1 uts -> uts:[4026531838]

    meson10@xps:$ ls -al /proc/3330/ns/
    total 0
    lrwxrwxrwx 1 cgroup -> cgroup:[4026531835]
    lrwxrwxrwx 1 ipc -> ipc:[4026531839]
    lrwxrwxrwx 1 mnt -> mnt:[4026531840]
    lrwxrwxrwx 1 net -> net:[4026532009]
    lrwxrwxrwx 1 pid -> pid:[4026531836]
    lrwxrwxrwx 1 user -> user:[4026531837]
    lrwxrwxrwx 1 uts -> uts:[4026531838]
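The two listings above show identical inode numbers, which is how you can tell that both processes live in the same namespaces. The same check can be sketched in Go by reading the symlinks under `/proc/<pid>/ns`. The helper names `parseNSLink` and `nsInode` are hypothetical, and reading another process's links may fail without sufficient permissions, so the example only reports such errors.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseNSLink splits a link target such as "net:[4026532009]" into the
// namespace type and its inode number.
func parseNSLink(target string) (string, uint64, error) {
	parts := strings.SplitN(target, ":[", 2)
	if len(parts) != 2 || !strings.HasSuffix(parts[1], "]") {
		return "", 0, fmt.Errorf("unexpected namespace link %q", target)
	}
	inode, err := strconv.ParseUint(strings.TrimSuffix(parts[1], "]"), 10, 64)
	if err != nil {
		return "", 0, err
	}
	return parts[0], inode, nil
}

// nsInode reads /proc/<pid>/ns/<kind> and returns the namespace inode.
func nsInode(pid int, kind string) (uint64, error) {
	target, err := os.Readlink(fmt.Sprintf("/proc/%d/ns/%s", pid, kind))
	if err != nil {
		return 0, err
	}
	_, inode, err := parseNSLink(target)
	return inode, err
}

func main() {
	// Compare this process with its parent; equal inodes mean a shared namespace.
	for _, pid := range []int{os.Getpid(), os.Getppid()} {
		inode, err := nsInode(pid, "uts")
		if err != nil {
			fmt.Println("pid", pid, "error:", err)
			continue
		}
		fmt.Println("pid", pid, "uts namespace inode", inode)
	}
}
```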
  7. Syscall?

    package main

    import (
        "fmt"
        "syscall"
    )

    func main() {
        pid, _, _ := syscall.Syscall(syscall.SYS_GETPID, 0, 0, 0)
        fmt.Println("process id: ", pid)
    }
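The slide discards the second and third return values of `syscall.Syscall`. The third one is a `syscall.Errno`, and for calls that can fail it is the only error signal you get. A small sketch that surfaces it is shown here; `rawGetpid` is a hypothetical wrapper name, and getpid(2) itself never fails, so the check is there to illustrate the pattern.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// rawGetpid issues getpid(2) directly and converts a non-zero errno
// into a Go error instead of discarding it.
func rawGetpid() (int, error) {
	pid, _, errno := syscall.Syscall(syscall.SYS_GETPID, 0, 0, 0)
	if errno != 0 {
		return 0, errno
	}
	return int(pid), nil
}

func main() {
	pid, err := rawGetpid()
	if err != nil {
		fmt.Fprintln(os.Stderr, "getpid failed:", err)
		os.Exit(1)
	}
	fmt.Println("raw syscall:", pid, "os.Getpid:", os.Getpid())
}
```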
  8. UTS namespace

  9. UTS namespace

  10. UTS namespace

    func main() {
        cmd := exec.Command("/bin/sh")
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUTS,
        }
        cmd.Run()
    }
  11. UTS namespace

    meson10@xps$ sudo ./main
    [meson10]$ hostname hello
    [meson10]$ hostname
    hello
    [meson10]$ exit
    meson10@xps$ hostname
    xps.piyushverma.net
  12. User namespace

  13. User namespace

    func main() {
        cmd := exec.Command("/bin/sh")
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUSER,
        }
        cmd.Run()
    }
  14. User namespace

    meson10@xps:$ go run user_ns.go
    [meson10]$ whoami
    nobody

  15. Level UP

  16. UID/GID

    meson10@xps:$ echo $UID
    1000
    meson10@xps:$ go run user_ns.go
    [meson10]$ echo $UID
    [meson10]$ whoami
    nobody
  17. UID/GID

    cmd := exec.Command("/bin/sh")
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUSER,
        UidMappings: []syscall.SysProcIDMap{
            {
                ContainerID: 109,
                HostID:      os.Getuid(),
                Size:        1,
            },
        },
        GidMappings: []syscall.SysProcIDMap{
            {
                ContainerID: 114,
                HostID:      os.Getgid(),
                Size:        1,
            },
        },
    }
    cmd.Run()
  18. UID/GID

    meson10@xps:$ go run user_ns.go
    [meson10]$ whoami
    grafana
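`whoami` reports whatever name the password database assigns to the mapped UID, so the result above suggests that UID 109 belongs to `grafana` on the speaker's machine. A common variation, sketched below as an assumption rather than something the talk shows, maps the invoking user to ID 0 so that the process appears as root inside the namespace. `rootMapping` is a hypothetical helper name, and the example only reports an error if user namespaces are unavailable.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// rootMapping returns attributes that create a user namespace in which the
// given host UID and GID appear as 0. Mapping to root is an illustrative
// choice; the slide maps to 109 and 114 instead.
func rootMapping(hostUID, hostGID int) *syscall.SysProcAttr {
	return &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUSER,
		UidMappings: []syscall.SysProcIDMap{
			{ContainerID: 0, HostID: hostUID, Size: 1},
		},
		GidMappings: []syscall.SysProcIDMap{
			{ContainerID: 0, HostID: hostGID, Size: 1},
		},
	}
}

func main() {
	cmd := exec.Command("id")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	cmd.SysProcAttr = rootMapping(os.Getuid(), os.Getgid())
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "run failed (user namespaces may be disabled):", err)
	}
}
```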

  19. Helper functions

    helper.go

    // Attaches stdin, stdout, stderr to Cmd.
    func makeCmd(cmd *exec.Cmd) *exec.Cmd {
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        return cmd
    }
  20. Problem

    func main() {
        cmd := makeCmd(exec.Command("/bin/sh"))
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUTS,
        }
        // Problem: this runs in the parent, before the child is cloned,
        // so it renames the host rather than the new UTS namespace.
        syscall.Sethostname([]byte("inner-system"))
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }
  21. Level UP

  22. /proc/self/exe

  23. Helper functions

    helper.go

    // Attaches stdin, stdout, stderr to Cmd.
    func makeCmd(cmd *exec.Cmd) *exec.Cmd {
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        return cmd
    }

    // Is it an inside matter now?
    func isRupa() bool {
        i := flag.Bool("rupa", false, "child")
        flag.Parse()
        return *i
    }

    main.go

    func main() {
        var err error
        if isRupa() {
            err = inner()
        } else {
            err = run()
        }
        if err != nil {
            panic(err)
        }
    }
  24. UTS namespace

    func run() error {
        cmd := makeCmd(exec.Command("/proc/self/exe", "-rupa"))
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUTS,
        }
        return cmd.Run()
    }

    func inner() error {
        syscall.Sethostname([]byte("inner-system"))
        cmd := makeCmd(exec.Command("/bin/sh"))
        return cmd.Run()
    }
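The re-exec pattern hinges on the child recognising the `-rupa` marker. A sketch of that check using a private `flag.FlagSet` instead of the global `flag.Parse` is shown below, which makes the decision easy to exercise on its own. The function name `isChild` is hypothetical, and treating unknown flags as "not the child" is an assumption of this sketch, not behaviour taken from the slides.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// isChild reports whether args carry the -rupa marker that run() passes
// when it re-executes /proc/self/exe.
func isChild(args []string) bool {
	fs := flag.NewFlagSet("main", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors off stderr
	child := fs.Bool("rupa", false, "child")
	if err := fs.Parse(args); err != nil {
		return false
	}
	return *child
}

func main() {
	if isChild(os.Args[1:]) {
		fmt.Println("inside the new namespaces: would call inner()")
		return
	}
	fmt.Println("on the host: would call run()")
}
```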
  25. UTS namespace

    meson10@xps:~/workspace/gophercon$ go build main.go util.go uts.go
    meson10@xps:~/workspace/gophercon$ sudo ./main
    [meson10]$ hostname
    inner-system
  26. Reexec https://github.com/moby/moby/tree/master/pkg/reexec

  27. PID namespace

  28. PID namespace

    Container:

    [meson10]$ sleep 100 &
    [meson10]$ ps ax
      PID TTY   STAT TIME COMMAND
        1 pts/0 Sl   0:00 /proc/self/exe -inner
        6 pts/0 S    0:00 /bin/sh
        9 pts/0 S    0:00 sleep 100

    Host:

    12884 pts/0 S    0:00 | \_ sudo ./main
    12885 pts/0 Sl   0:00 | \_ ./main
    12890 pts/0 Sl   0:00 | \_ /proc/self/exe -inner
    12895 pts/0 S+   0:00 | \_ /bin/sh
    12920 pts/0 S    0:00 | \_ sleep 100
  29. PID namespace

    func run() error {
        cmd := makeCmd(exec.Command("/proc/self/exe", "-rupa"))
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWPID,
        }
        return cmd.Run()
    }

    func inner() error {
        fmt.Println("Inner code PID", os.Getpid())
        cmd := makeCmd(exec.Command("/bin/sh"))
        return cmd.Run()
    }
  30. PID namespace

    meson10@xps:~/workspace/gophercon$ go build main.go util.go pid.go
    meson10@xps:~/workspace/gophercon$ sudo ./main
    Inner code PID 1
    [meson10]$
  31. Problem

    [meson10]$ ps ax
      PID TTY STAT TIME COMMAND
        1 ?   Ss   0:03 /sbin/init splash
        2 ?   S    0:00 [kthreadd]
        4 ?   S<   0:00 [kworker/0:0H]
        6 ?   S<   0:00 [mm_percpu_wq]
        7 ?   S    0:01 [ksoftirqd/0]
        8 ?   S    1:32 [rcu_sched]
        9 ?   S    0:00 [rcu_bh]
       10 ?   S    0:00 [migration/0]
       11 ?   S    0:00 [watchdog/0]
       12 ?   S    0:00 [cpuhp/0]
       13 ?   S    0:00 [cpuhp/1]
       14 ?   S    0:00 [watchdog/1]
       15 ?   S    0:00 [migration/1]
       16 ?   S    0:01 [ksoftirqd/1]
  32. Level UP

  33. Mnt namespace

  34. Perspective

  35. Mnt namespace

    func run() error {
        cmd := makeCmd(exec.Command("/proc/self/exe", "-inner"))
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWNS,
        }
        return cmd.Run()
    }

    func inner() error {
        cmd := makeCmd(exec.Command("/bin/sh"))
        return cmd.Run()
    }
  36. Mnt namespace

    meson10@xps:~/workspace/gophercon$ sudo ./main
    Inner code PID 1
    [meson10]$ cd /home/meson10/workspace/gophercon
    [meson10]$ mount --bind tmp_sys ok
    [meson10]$
    [meson10]$ findmnt -o+PROPAGATION
    |-/home /dev/nvme0n1p8 ext4
    | `-/home/meson10/workspace/gophercon/ok /dev/nvme0n1p8[/meson10/workspace/gophercon/tmp_sys] ext4

    meson10@xps:~/workspace/gophercon$ cat /proc/mounts
    /dev/nvme0n1p8 /home/meson10/workspace/gophercon/ok ext4 rw,noatime,data=ordered 0 0
  37. Level UP

  38. Mnt problem

    meson10@xps:~/workspace/gophercon$ sudo ./main
    Inner code PID 1
    [meson10]$ findmnt -o+PROPAGATION
    TARGET                    OPTIONS  PROPAGATION
    /                         rw,noatime,errors=remount-ro,data=ordered  shared
    |-/dev                    rw,nosuid,relatime,size=7622836k,nr_inodes=1905709,mode=755  shared
    | |-/dev/pts              rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000  shared
    | |-/dev/shm              rw,nosuid,nodev  shared
    | |-/dev/mqueue           rw,relatime  shared
    | `-/dev/hugepages        rw,relatime,pagesize=2M  shared
    |-/run                    rw,nosuid,noexec,relatime,size=1530088k,mode=755  shared
    | |-/run/lock             rw,nosuid,nodev,noexec,relatime,size=5120k  shared
    | `-/run/user/1000        rw,nosuid,nodev,relatime,size=1530084k,mode=700,uid=1000,gid=1000  shared
    |-/sys                    rw,nosuid,nodev,noexec,relatime  shared
    | |-/sys/kernel/security  rw,nosuid,nodev,noexec,relatime  shared
  39. Mnt problem

    meson10@xps:~/workspace/gophercon$ sudo ./main
    Inner code PID 1
    [meson10]$ findmnt -o+PROPAGATION
    TARGET                    OPTIONS  PROPAGATION
    /                         rw,noatime,errors=remount-ro,data=ordered  private
    |-/dev                    rw,nosuid,relatime,size=7622836k,nr_inodes=1905709,mode=755  private
    | |-/dev/pts              rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000  private
    | |-/dev/shm              rw,nosuid,nodev  private
    | |-/dev/mqueue           rw,relatime  private
    | `-/dev/hugepages        rw,relatime,pagesize=2M  private
    |-/run                    rw,nosuid,noexec,relatime,size=1530088k,mode=755  private
    | |-/run/lock             rw,nosuid,nodev,noexec,relatime,size=5120k  private
    | `-/run/user/1000        rw,nosuid,nodev,relatime,size=1530084k,mode=700,uid=1000,gid=1000  private
    |-/sys                    rw,nosuid,nodev,noexec,relatime  private
    | |-/sys/kernel/security  rw,nosuid,nodev,noexec,relatime  private
  40. Mount Propagation

    - Per process filesystem namespace
    - Too restrictive
    - Mount propagation in 2006.
      - Slave
      - Shared
      - Private
      - Unbindable
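The four propagation types listed above are what `findmnt -o+PROPAGATION` reports. The same information can be read from the optional fields of `/proc/<pid>/mountinfo`, where `shared:N` marks a shared mount, `master:N` a slave and `unbindable` an unbindable one; a mount with none of these tags is private. The sketch below is a minimal parser under that reading of the file format, and `propagation` is a hypothetical helper name.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// propagation returns the mount point and propagation type described by one
// mountinfo line. Optional fields start at index 6 and end at the "-" separator.
func propagation(line string) (mountPoint, prop string, err error) {
	fields := strings.Fields(line)
	if len(fields) < 7 {
		return "", "", fmt.Errorf("short mountinfo line %q", line)
	}
	var tags []string
	for _, f := range fields[6:] {
		if f == "-" {
			break
		}
		switch {
		case strings.HasPrefix(f, "shared:"):
			tags = append(tags, "shared")
		case strings.HasPrefix(f, "master:"):
			tags = append(tags, "slave")
		case f == "unbindable":
			tags = append(tags, "unbindable")
		}
	}
	if len(tags) == 0 {
		return fields[4], "private", nil
	}
	return fields[4], strings.Join(tags, ","), nil
}

func main() {
	f, err := os.Open("/proc/self/mountinfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read mountinfo:", err)
		return
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		if mp, prop, err := propagation(scanner.Text()); err == nil {
			fmt.Printf("%-40s %s\n", mp, prop)
		}
	}
}
```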
  41. Unshareflags

    func run() error {
        cmd := makeCmd(exec.Command("/proc/self/exe", "-inner"))
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags:   syscall.CLONE_NEWNS | syscall.CLONE_NEWPID,
            Unshareflags: syscall.CLONE_NEWNS,
        }
        return cmd.Run()
    }

    func inner() error {
        os.Chdir("/")
        syscall.Mount("proc", "proc", "proc", uintptr(0), "")
        cmd := makeCmd(exec.Command("/bin/sh"))
        return cmd.Run()
    }
  42. PID namespace

    meson10@xps:~/workspace/gophercon$ go build main.go util.go pid.go
    meson10@xps:~/workspace/gophercon$ sudo ./main
    [meson10]$ ps -aexf
      PID TTY   STAT TIME COMMAND
        1 pts/0 Sl   0:00 /proc/self/exe -inner PS1=[meson10]$
        6 pts/0 S    0:00 /bin/sh PS1=[meson10]$
        7 pts/0 R+   0:00 \_ ps -aexf PS1=[meson10]$ PWD=/
  43. Added in Go 1.9

    https://github.com/golang/go/issues/19661

    "It turns out that the systemd developers decided to override the kernel's default setting of 'private' to their own default setting of 'shared'. This means that on Linux machines with systemd, the default is shared, while on Linux machines without systemd, the default is private. Essentially, systemd decided to make it so that there is no default that end programs can rely on. All programs must instead mark the root filesystem as private if they want private namespaces, or as shared if they want shared namespaces if they want to work across all Linux distributions. I'm pretty sure this was done to frustrate as many people as possible."
  44. Other namespaces

    - IPC
    - Net
    - CGroup
    - …

    To be continued
  45. @meson10 http://oogway.in