
Namespace.go


Cgroups and namespaces are the shoes and shorts of the container race, in no particular order. They have been around for a while, but few people appreciate how useful and powerful they are. This talk is a collection of Go recipes that show how to get to a container using these constructs.

Piyush Verma

February 16, 2018


Transcript

  1. Types of namespace

    • Mount
    • UTS
    • IPC
    • PID
    • Network
    • User
    • Cgroup
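As a rough companion to the list above: each namespace type corresponds to one clone(2) flag. The sketch below spells out the flag values from the Linux headers so that it builds on any platform. The `cloneFlags` map and the `flagsFor` helper are hypothetical names introduced for illustration, not part of the talk.

```go
package main

import (
	"fmt"
	"sort"
)

// Flag values as defined in <linux/sched.h>. They are written out here
// (rather than taken from package syscall) so the sketch builds anywhere.
var cloneFlags = map[string]uintptr{
	"mnt":    0x00020000, // CLONE_NEWNS
	"cgroup": 0x02000000, // CLONE_NEWCGROUP
	"uts":    0x04000000, // CLONE_NEWUTS
	"ipc":    0x08000000, // CLONE_NEWIPC
	"user":   0x10000000, // CLONE_NEWUSER
	"pid":    0x20000000, // CLONE_NEWPID
	"net":    0x40000000, // CLONE_NEWNET
}

// flagsFor ORs together the clone flags for the named namespaces.
func flagsFor(names ...string) (uintptr, error) {
	var flags uintptr
	for _, n := range names {
		f, ok := cloneFlags[n]
		if !ok {
			return 0, fmt.Errorf("unknown namespace %q", n)
		}
		flags |= f
	}
	return flags, nil
}

func main() {
	names := make([]string, 0, len(cloneFlags))
	for n := range cloneFlags {
		names = append(names, n)
	}
	sort.Strings(names)
	for _, n := range names {
		fmt.Printf("%-6s %#010x\n", n, cloneFlags[n])
	}
}
```

The short names match the entries under /proc/PID/ns, so the result of `flagsFor("uts", "pid")` is the kind of value that would go into a `Cloneflags` field.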
  2. Where are they?

         meson10@xps:$ ls -al /proc/10504/ns/
         total 0
         lrwxrwxrwx 1 cgroup -> cgroup:[4026531835]
         lrwxrwxrwx 1 ipc -> ipc:[4026531839]
         lrwxrwxrwx 1 mnt -> mnt:[4026531840]
         lrwxrwxrwx 1 net -> net:[4026532009]
         lrwxrwxrwx 1 pid -> pid:[4026531836]
         lrwxrwxrwx 1 user -> user:[4026531837]
         lrwxrwxrwx 1 uts -> uts:[4026531838]

         meson10@xps:$ ls -al /proc/3330/ns/
         total 0
         lrwxrwxrwx 1 cgroup -> cgroup:[4026531835]
         lrwxrwxrwx 1 ipc -> ipc:[4026531839]
         lrwxrwxrwx 1 mnt -> mnt:[4026531840]
         lrwxrwxrwx 1 net -> net:[4026532009]
         lrwxrwxrwx 1 pid -> pid:[4026531836]
         lrwxrwxrwx 1 user -> user:[4026531837]
         lrwxrwxrwx 1 uts -> uts:[4026531838]
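The same information can be read from Go. This is a minimal sketch, assuming a Linux /proc; the helper name `parseNsLink` is hypothetical. It reads the symlinks under /proc/self/ns and splits each target into the namespace type and its inode number. Two processes are in the same namespace when these numbers match, as they do for both PIDs in the listing above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseNsLink splits a link target such as "uts:[4026531838]" into the
// namespace type and its inode number.
func parseNsLink(target string) (string, uint64, error) {
	kind, rest, ok := strings.Cut(target, ":[")
	if !ok || !strings.HasSuffix(rest, "]") {
		return "", 0, fmt.Errorf("unexpected link target %q", target)
	}
	inode, err := strconv.ParseUint(strings.TrimSuffix(rest, "]"), 10, 64)
	if err != nil {
		return "", 0, err
	}
	return kind, inode, nil
}

func main() {
	links, err := filepath.Glob("/proc/self/ns/*")
	if err != nil || len(links) == 0 {
		fmt.Println("no /proc/self/ns entries found (is this Linux?)")
		return
	}
	for _, l := range links {
		target, err := os.Readlink(l)
		if err != nil {
			fmt.Println(l, "->", err)
			continue
		}
		kind, inode, err := parseNsLink(target)
		if err != nil {
			fmt.Println(l, "->", err)
			continue
		}
		fmt.Printf("%-8s %d\n", kind, inode)
	}
}
```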
  3. Syscall?

         package main

         import (
             "fmt"
             "syscall"
         )

         func main() {
             // Raw getpid(2) system call; the result matches os.Getpid().
             pid, _, _ := syscall.Syscall(syscall.SYS_GETPID, 0, 0, 0)
             fmt.Println("process id:", pid)
         }
  4. UTS namespace

         package main

         import (
             "os"
             "os/exec"
             "syscall"
         )

         func main() {
             cmd := exec.Command("/bin/sh")
             cmd.Stdin = os.Stdin
             cmd.Stdout = os.Stdout
             cmd.Stderr = os.Stderr
             // Start the shell in a new UTS (hostname) namespace.
             cmd.SysProcAttr = &syscall.SysProcAttr{
                 Cloneflags: syscall.CLONE_NEWUTS,
             }
             cmd.Run()
         }
  5. UTS namespace

         meson10@xps$ sudo ./main
         [meson10]$ hostname hello
         [meson10]$ hostname
         hello
         [meson10]$ exit
         meson10@xps$ hostname
         xps.piyushverma.net
  6. User namespace

         func main() {
             cmd := exec.Command("/bin/sh")
             cmd.Stdin = os.Stdin
             cmd.Stdout = os.Stdout
             cmd.Stderr = os.Stderr
             cmd.Env = []string{"PS1=[meson10]$ "}
             cmd.SysProcAttr = &syscall.SysProcAttr{
                 Cloneflags: syscall.CLONE_NEWUSER,
             }
             cmd.Run()
         }
  7. UID/GID

         meson10@xps:$ echo $UID
         1000
         meson10@xps:$ go run user_ns.go
         [meson10]$ echo $UID

         [meson10]$ whoami
         nobody
  8. UID/GID

         cmd := exec.Command("/bin/sh")
         cmd.SysProcAttr = &syscall.SysProcAttr{
             Cloneflags: syscall.CLONE_NEWUSER,
             UidMappings: []syscall.SysProcIDMap{
                 {
                     ContainerID: 109,
                     HostID:      os.Getuid(),
                     Size:        1,
                 },
             },
             GidMappings: []syscall.SysProcIDMap{
                 {
                     ContainerID: 114,
                     HostID:      os.Getgid(),
                     Size:        1,
                 },
             },
         }
         cmd.Run()
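To make the mapping above concrete, here is a small sketch of how one mapping entry translates IDs. It assumes the host UID is 1000 (as on the earlier slide) and that the kernel's overflow ID is the usual default of 65534, which is why an unmapped user shows up as `nobody`. The `idMap` type mirrors the fields of `syscall.SysProcIDMap` so the code builds anywhere. `procMapText` and `toContainerID` are hypothetical helpers, not Go or kernel APIs.

```go
package main

import (
	"fmt"
	"strings"
)

// idMap mirrors the fields of syscall.SysProcIDMap.
type idMap struct {
	ContainerID int
	HostID      int
	Size        int
}

// overflowID is the usual default of /proc/sys/kernel/overflowuid
// (an assumption; the value is configurable). Unmapped IDs appear as it.
const overflowID = 65534

// procMapText renders mappings in the "container host size" line format
// used by /proc/PID/uid_map and /proc/PID/gid_map.
func procMapText(maps []idMap) string {
	var b strings.Builder
	for _, m := range maps {
		fmt.Fprintf(&b, "%d %d %d\n", m.ContainerID, m.HostID, m.Size)
	}
	return b.String()
}

// toContainerID translates a host ID into the ID seen inside the namespace.
func toContainerID(hostID int, maps []idMap) int {
	for _, m := range maps {
		if hostID >= m.HostID && hostID < m.HostID+m.Size {
			return m.ContainerID + (hostID - m.HostID)
		}
	}
	return overflowID
}

func main() {
	maps := []idMap{{ContainerID: 109, HostID: 1000, Size: 1}}
	fmt.Print(procMapText(maps))           // the uid_map line for this mapping
	fmt.Println(toContainerID(1000, maps)) // mapped host user
	fmt.Println(toContainerID(1001, maps)) // outside the mapped range
	fmt.Println(toContainerID(1000, nil))  // no mapping at all
}
```

With no mappings, every host ID falls through to the overflow ID, which matches the `whoami` output of `nobody` seen without `UidMappings`.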
  9. Helper functions

    helper.go

         // Attaches stdin, stdout, stderr to Cmd.
         func makeCmd(cmd *exec.Cmd) *exec.Cmd {
             cmd.Stdin = os.Stdin
             cmd.Stdout = os.Stdout
             cmd.Stderr = os.Stderr
             return cmd
         }
  10. Problem

          func main() {
              cmd := makeCmd(exec.Command("/bin/sh"))
              cmd.Env = []string{"PS1=[meson10]$ "}
              cmd.SysProcAttr = &syscall.SysProcAttr{
                  Cloneflags: syscall.CLONE_NEWUTS,
              }
              // Problem: this runs in the parent, before the child and its
              // new UTS namespace exist, so it renames the host itself.
              syscall.Sethostname([]byte("inner-system"))
              if err := cmd.Run(); err != nil {
                  panic(err)
              }
          }
  11. Helper functions

    helper.go

          // Attaches stdin, stdout, stderr to Cmd.
          func makeCmd(cmd *exec.Cmd) *exec.Cmd {
              cmd.Stdin = os.Stdin
              cmd.Stdout = os.Stdout
              cmd.Stderr = os.Stderr
              return cmd
          }

          // Is it an inside matter now?
          func isRupa() bool {
              i := flag.Bool("rupa", false, "child")
              flag.Parse()
              return *i
          }

    main.go

          func main() {
              var err error
              if isRupa() {
                  err = inner()
              } else {
                  err = run()
              }
              if err != nil {
                  panic(err)
              }
          }
  12. UTS namespace

          func run() error {
              // Re-execute this binary as the child, inside a new UTS namespace.
              cmd := makeCmd(exec.Command("/proc/self/exe", "-rupa"))
              cmd.Env = []string{"PS1=[meson10]$ "}
              cmd.SysProcAttr = &syscall.SysProcAttr{
                  Cloneflags: syscall.CLONE_NEWUTS,
              }
              return cmd.Run()
          }

          func inner() error {
              // Runs inside the new namespace, so the host keeps its hostname.
              if err := syscall.Sethostname([]byte("inner-system")); err != nil {
                  return err
              }
              cmd := makeCmd(exec.Command("/bin/sh"))
              return cmd.Run()
          }
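The re-exec pattern used above does not itself depend on namespaces, so it can be tried without root. The following is a minimal sketch under a few assumptions: it uses `os.Executable()` instead of `/proc/self/exe` so that it also runs outside Linux, and the `child` marker argument and the `isChild` helper are hypothetical names. The parent starts a second copy of itself, and that copy takes the "inner" branch, which is where namespace-specific setup such as `Sethostname` would go.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// childMarker is a hypothetical argument that tells the re-executed
// binary it is the inner process.
const childMarker = "child"

// isChild reports whether the argument list carries the child marker.
func isChild(args []string) bool {
	return len(args) > 1 && args[1] == childMarker
}

// run re-executes the current binary with the child marker.
func run() error {
	self, err := os.Executable()
	if err != nil {
		return err
	}
	cmd := exec.Command(self, childMarker)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	// A real container runtime would set cmd.SysProcAttr here.
	return cmd.Run()
}

// inner is where per-namespace setup would happen before starting a shell.
func inner() error {
	fmt.Println("inner process, pid", os.Getpid())
	return nil
}

func main() {
	var err error
	if isChild(os.Args) {
		err = inner()
	} else {
		fmt.Println("outer process, pid", os.Getpid())
		err = run()
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}
```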
  13. PID namespace

    Container:

          [meson10]$ sleep 100 &
          [meson10]$ ps ax
          PID TTY   STAT TIME COMMAND
            1 pts/0 Sl   0:00 /proc/self/exe -inner
            6 pts/0 S    0:00 /bin/sh
            9 pts/0 S    0:00 sleep 100

    Host:

          12884 pts/0 S  0:00 | \_ sudo ./main
          12885 pts/0 Sl 0:00 | \_ ./main
          12890 pts/0 Sl 0:00 | \_ /proc/self/exe -inner
          12895 pts/0 S+ 0:00 | \_ /bin/sh
          12920 pts/0 S  0:00 | \_ sleep 100
  14. PID namespace

          func run() error {
              cmd := makeCmd(exec.Command("/proc/self/exe", "-rupa"))
              cmd.Env = []string{"PS1=[meson10]$ "}
              cmd.SysProcAttr = &syscall.SysProcAttr{
                  Cloneflags: syscall.CLONE_NEWPID,
              }
              return cmd.Run()
          }

          func inner() error {
              fmt.Println("Inner code PID", os.Getpid())
              cmd := makeCmd(exec.Command("/bin/sh"))
              return cmd.Run()
          }
  15. Problem

          [meson10]$ ps ax
          PID TTY STAT TIME COMMAND
            1 ?   Ss   0:03 /sbin/init splash
            2 ?   S    0:00 [kthreadd]
            4 ?   S<   0:00 [kworker/0:0H]
            6 ?   S<   0:00 [mm_percpu_wq]
            7 ?   S    0:01 [ksoftirqd/0]
            8 ?   S    1:32 [rcu_sched]
            9 ?   S    0:00 [rcu_bh]
           10 ?   S    0:00 [migration/0]
           11 ?   S    0:00 [watchdog/0]
           12 ?   S    0:00 [cpuhp/0]
           13 ?   S    0:00 [cpuhp/1]
           14 ?   S    0:00 [watchdog/1]
           15 ?   S    0:00 [migration/1]
           16 ?   S    0:01 [ksoftirqd/1]
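The listing above still shows host processes because `ps` builds its table from the numeric directories under /proc, and the /proc visible to the shell is still the one mounted in the host's PID namespace. A minimal sketch of that lookup, with the hypothetical helpers `isPIDDir` and `listPIDs`:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// isPIDDir reports whether a /proc entry name consists only of digits,
// which is how per-process directories are named.
func isPIDDir(name string) bool {
	if name == "" {
		return false
	}
	for _, r := range name {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// listPIDs returns the process IDs visible through the procfs at root.
func listPIDs(root string) ([]int, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return nil, err
	}
	var pids []int
	for _, e := range entries {
		if !e.IsDir() || !isPIDDir(e.Name()) {
			continue
		}
		pid, err := strconv.Atoi(e.Name())
		if err != nil {
			continue
		}
		pids = append(pids, pid)
	}
	return pids, nil
}

func main() {
	pids, err := listPIDs("/proc")
	if err != nil {
		fmt.Println("cannot read /proc:", err)
		return
	}
	fmt.Printf("%d processes visible through /proc\n", len(pids))
}
```

Run inside a new PID namespace without a fresh procfs mount, this would still count every host process, which is the problem the slide points at.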
  16. Mnt namespace

          func run() error {
              cmd := makeCmd(exec.Command("/proc/self/exe", "-inner"))
              cmd.Env = []string{"PS1=[meson10]$ "}
              cmd.SysProcAttr = &syscall.SysProcAttr{
                  Cloneflags: syscall.CLONE_NEWNS,
              }
              return cmd.Run()
          }

          func inner() error {
              cmd := makeCmd(exec.Command("/bin/sh"))
              return cmd.Run()
          }
  17. Mnt namespace

          meson10@xps:~/workspace/gophercon$ sudo ./main
          Inner code PID 1
          [meson10]$ cd /home/meson10/workspace/gophercon
          [meson10]$ mount --bind tmp_sys ok
          [meson10]$
          [meson10]$ findmnt -o+PROPAGATION
          |-/home /dev/nvme0n1p8 ext4
          | `-/home/meson10/workspace/gophercon/ok /dev/nvme0n1p8[/meson10/workspace/gophercon/tmp_sys] ext4

          meson10@xps:~/workspace/gophercon$ cat /proc/mounts
          /dev/nvme0n1p8 /home/meson10/workspace/gophercon/ok ext4 rw,noatime,data=ordered 0 0
  18. Mnt problem

          meson10@xps:~/workspace/gophercon$ sudo ./main
          Inner code PID 1
          [meson10]$ findmnt -o+PROPAGATION
          TARGET OPTIONS PROPAGATION
          / rw,noatime,errors=remount-ro,data=ordered shared
          |-/dev rw,nosuid,relatime,size=7622836k,nr_inodes=1905709,mode=755 shared
          | |-/dev/pts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 shared
          | |-/dev/shm rw,nosuid,nodev shared
          | |-/dev/mqueue rw,relatime shared
          | `-/dev/hugepages rw,relatime,pagesize=2M shared
          |-/run rw,nosuid,noexec,relatime,size=1530088k,mode=755 shared
          | |-/run/lock rw,nosuid,nodev,noexec,relatime,size=5120k shared
          | `-/run/user/1000 rw,nosuid,nodev,relatime,size=1530084k,mode=700,uid=1000,gid=1000 shared
          |-/sys rw,nosuid,nodev,noexec,relatime shared
          | |-/sys/kernel/security rw,nosuid,nodev,noexec,relatime shared
  19. Mnt problem

          meson10@xps:~/workspace/gophercon$ sudo ./main
          Inner code PID 1
          [meson10]$ findmnt -o+PROPAGATION
          TARGET OPTIONS PROPAGATION
          / rw,noatime,errors=remount-ro,data=ordered private
          |-/dev rw,nosuid,relatime,size=7622836k,nr_inodes=1905709,mode=755 private
          | |-/dev/pts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 private
          | |-/dev/shm rw,nosuid,nodev private
          | |-/dev/mqueue rw,relatime private
          | `-/dev/hugepages rw,relatime,pagesize=2M private
          |-/run rw,nosuid,noexec,relatime,size=1530088k,mode=755 private
          | |-/run/lock rw,nosuid,nodev,noexec,relatime,size=5120k private
          | `-/run/user/1000 rw,nosuid,nodev,relatime,size=1530084k,mode=700,uid=1000,gid=1000 private
          |-/sys rw,nosuid,nodev,noexec,relatime private
          | |-/sys/kernel/security rw,nosuid,nodev,noexec,relatime private
  20. Mount Propagation

    - Per process filesystem namespace
    - Too restrictive
    - Mount propagation in 2006.
      - Slave
      - Shared
      - Private
      - Unbindable
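The propagation type of each mount can be read from /proc/self/mountinfo, which is where tools like `findmnt` get it. Below is a minimal sketch of decoding one line: the optional fields between the mount options and the `-` separator carry `shared:N`, `master:N` (a slave mount) or `unbindable`, and a mount with none of them is private. The `propagation` helper is a hypothetical name, and the sample lines in `main` use made-up mount IDs and device numbers.

```go
package main

import (
	"fmt"
	"strings"
)

// propagation returns the mount point and propagation type encoded in
// one line of /proc/self/mountinfo.
func propagation(line string) (string, string, error) {
	fields := strings.Fields(line)
	if len(fields) < 7 {
		return "", "", fmt.Errorf("short mountinfo line: %q", line)
	}
	var kinds []string
	foundSep := false
	// Optional fields start at index 6 and end at the "-" separator.
	for _, f := range fields[6:] {
		if f == "-" {
			foundSep = true
			break
		}
		switch {
		case strings.HasPrefix(f, "shared:"):
			kinds = append(kinds, "shared")
		case strings.HasPrefix(f, "master:"):
			kinds = append(kinds, "slave")
		case f == "unbindable":
			kinds = append(kinds, "unbindable")
		}
	}
	if !foundSep {
		return "", "", fmt.Errorf("no separator in mountinfo line: %q", line)
	}
	if len(kinds) == 0 {
		kinds = []string{"private"}
	}
	return fields[4], strings.Join(kinds, ","), nil
}

func main() {
	// Illustrative lines only; IDs and device numbers are invented.
	samples := []string{
		"25 0 259:8 / / rw,noatime shared:1 - ext4 /dev/nvme0n1p8 rw,data=ordered",
		"36 25 0:5 / /dev rw,nosuid,relatime master:2 - devtmpfs udev rw",
		"40 25 0:20 / /run rw,nosuid,noexec - tmpfs tmpfs rw",
	}
	for _, s := range samples {
		mp, prop, err := propagation(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-6s %s\n", mp, prop)
	}
}
```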
  21. Unshareflags

          func run() error {
              cmd := makeCmd(exec.Command("/proc/self/exe", "-inner"))
              cmd.Env = []string{"PS1=[meson10]$ "}
              cmd.SysProcAttr = &syscall.SysProcAttr{
                  Cloneflags:   syscall.CLONE_NEWNS | syscall.CLONE_NEWPID,
                  Unshareflags: syscall.CLONE_NEWNS,
              }
              return cmd.Run()
          }

          func inner() error {
              if err := os.Chdir("/"); err != nil {
                  return err
              }
              // Mount a fresh procfs on /proc so ps sees only this PID namespace.
              if err := syscall.Mount("proc", "proc", "proc", uintptr(0), ""); err != nil {
                  return err
              }
              cmd := makeCmd(exec.Command("/bin/sh"))
              return cmd.Run()
          }
  22. PID namespace

          meson10@xps:~/workspace/gophercon$ go build main.go util.go pid.go
          meson10@xps:~/workspace/gophercon$ sudo ./main
          [meson10]$ ps -aexf
          PID TTY   STAT TIME COMMAND
            1 pts/0 Sl   0:00 /proc/self/exe -inner PS1=[meson10]$
            6 pts/0 S    0:00 /bin/sh PS1=[meson10]$
            7 pts/0 R+   0:00 \_ ps -aexf PS1=[meson10]$ PWD=/
  23. Added in Go 1.9

    https://github.com/golang/go/issues/19661

    "It turns out that the systemd developers decided to override the kernel's default setting of 'private' to their own default setting of 'shared'. This means that on Linux machines with systemd, the default is shared, while on Linux machines without systemd, the default is private. Essentially, systemd decided to make it so that there is no default that end programs can rely on. All programs must instead mark the root filesystem as private if they want private namespaces, or as shared if they want shared namespaces if they want to work across all Linux distributions. I'm pretty sure this was done to frustrate as many people as possible."
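For programs that cannot rely on that behaviour, marking the root filesystem private by hand is a single mount(2) call, the equivalent of `mount --make-rprivate /`. The sketch below is Linux-only and assumes it would be called from the re-executed child, inside its own mount namespace and with sufficient privileges. `makeRootPrivate` and `privateFlags` are hypothetical names. `main` deliberately only prints the flags, so running the example does not alter any mounts.

```go
package main

import (
	"fmt"
	"syscall"
)

// privateFlags changes propagation to private, recursively for every
// mount below the target.
const privateFlags = syscall.MS_REC | syscall.MS_PRIVATE

// makeRootPrivate marks / and everything under it as private, so mounts
// made afterwards in this mount namespace do not propagate to the host.
// Call it only inside a new mount namespace; it needs CAP_SYS_ADMIN.
func makeRootPrivate() error {
	if err := syscall.Mount("", "/", "", privateFlags, ""); err != nil {
		return fmt.Errorf("make / private: %w", err)
	}
	return nil
}

func main() {
	// Not calling makeRootPrivate here on purpose: doing so outside a
	// fresh mount namespace would change the host's mount propagation.
	fmt.Printf("mount flags: %#x\n", privateFlags)
}
```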