Namespace.go

Cgroups and namespaces are the shoes and shorts of the container race, in no particular order. They have been around for a while, but few people appreciate how they are used or how powerful they are. The talk is a collection of Go recipes that show how to build up to a container using these constructs.

Piyush Verma

February 16, 2018

Transcript

  1. Namespace.go
    @meson10
    http://oogway.in
    Piyush Verma
    Oogway Consulting

  2. Whoami
    xps:~$ whoami
    meson10

  3. What’s a namespace
    Namespaces are a fundamental aspect of containers on Linux.

  4. Containers

  5. Types of namespace
    ● Mount
    ● UTS
    ● IPC
    ● PID
    ● Network
    ● User
    ● Cgroup

  6. Where are they?
    meson10@xps:$ ls -al /proc/10504/ns/
    total 0
    lrwxrwxrwx 1 cgroup -> cgroup:[4026531835]
    lrwxrwxrwx 1 ipc -> ipc:[4026531839]
    lrwxrwxrwx 1 mnt -> mnt:[4026531840]
    lrwxrwxrwx 1 net -> net:[4026532009]
    lrwxrwxrwx 1 pid -> pid:[4026531836]
    lrwxrwxrwx 1 user -> user:[4026531837]
    lrwxrwxrwx 1 uts -> uts:[4026531838]
    meson10@xps:$ ls -al /proc/3330/ns/
    total 0
    lrwxrwxrwx 1 cgroup -> cgroup:[4026531835]
    lrwxrwxrwx 1 ipc -> ipc:[4026531839]
    lrwxrwxrwx 1 mnt -> mnt:[4026531840]
    lrwxrwxrwx 1 net -> net:[4026532009]
    lrwxrwxrwx 1 pid -> pid:[4026531836]
    lrwxrwxrwx 1 user -> user:[4026531837]
    lrwxrwxrwx 1 uts -> uts:[4026531838]
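
Both listings above show identical IDs, which means the two processes share every namespace. A minimal sketch, assuming a Linux /proc, of reading the same links from Go follows. parseNsLink and namespaces are hypothetical helper names, not part of the talk.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseNsLink splits a link target such as "uts:[4026531838]" into the
// namespace type and its inode number.
func parseNsLink(target string) (string, uint64, error) {
	kind, rest, ok := strings.Cut(target, ":[")
	if !ok || !strings.HasSuffix(rest, "]") {
		return "", 0, fmt.Errorf("unexpected namespace link %q", target)
	}
	inode, err := strconv.ParseUint(strings.TrimSuffix(rest, "]"), 10, 64)
	if err != nil {
		return "", 0, err
	}
	return kind, inode, nil
}

// namespaces resolves every symlink under /proc/<pid>/ns and returns the
// inode numbers keyed by link name.
func namespaces(pid string) (map[string]uint64, error) {
	dir := filepath.Join("/proc", pid, "ns")
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	out := make(map[string]uint64)
	for _, e := range entries {
		target, err := os.Readlink(filepath.Join(dir, e.Name()))
		if err != nil {
			return nil, err
		}
		_, inode, err := parseNsLink(target)
		if err != nil {
			return nil, err
		}
		out[e.Name()] = inode
	}
	return out, nil
}

func main() {
	ns, err := namespaces("self")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	for name, inode := range ns {
		fmt.Printf("%s -> %d\n", name, inode)
	}
}
```

Comparing the maps returned for two PIDs tells you which namespaces they have in common.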

  7. Syscall?
    package main

    import (
        "fmt"
        "syscall"
    )

    func main() {
        // Invoke getpid(2) directly through the raw syscall interface.
        pid, _, _ := syscall.Syscall(syscall.SYS_GETPID, 0, 0, 0)
        fmt.Println("process id: ", pid)
    }

  8. UTS namespace

  9. UTS namespace

  10. UTS namespace
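    package main

    import (
        "os"
        "os/exec"
        "syscall"
    )

    func main() {
        cmd := exec.Command("/bin/sh")
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        // Start the shell in a new UTS namespace (needs root).
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUTS,
        }
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }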

  11. UTS namespace
    meson10@xps$ sudo ./main
    [meson10]$ hostname hello
    [meson10]$ hostname
    hello
    [meson10]$ exit
    meson10@xps$ hostname
    xps.piyushverma.net

  12. User namespace

  13. User namespace
    func main() {
        cmd := exec.Command("/bin/sh")
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        cmd.Env = []string{"PS1=[meson10]$ "}
        // No sudo needed on kernels that allow unprivileged user namespaces.
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUSER,
        }
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }

  14. User namespace
    meson10@xps:$ go run user_ns.go
    [meson10]$ whoami
    nobody

  15. Level UP

  16. UID/GID
    meson10@xps:$ echo $UID
    1000
    meson10@xps:$ go run user_ns.go
    [meson10]$ echo $UID
    [meson10]$ whoami
    nobody

  17. UID/GID
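    cmd := exec.Command("/bin/sh")
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUSER,
        // Map the invoking host user to UID 109 inside the namespace.
        UidMappings: []syscall.SysProcIDMap{
            {
                ContainerID: 109,
                HostID:      os.Getuid(),
                Size:        1,
            },
        },
        // Map the invoking host group to GID 114 inside the namespace.
        GidMappings: []syscall.SysProcIDMap{
            {
                ContainerID: 114,
                HostID:      os.Getgid(),
                Size:        1,
            },
        },
    }
    if err := cmd.Run(); err != nil {
        panic(err)
    }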

  18. UID/GID
    meson10@xps:$ go run user_ns.go
    [meson10]$ whoami
    grafana

  19. Helper functions
    helper.go
    // Attaches stdin, stdout, stderr to Cmd.
    func makeCmd(cmd *exec.Cmd) *exec.Cmd {
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        return cmd
    }

  20. Problem
    func main() {
        cmd := makeCmd(exec.Command("/bin/sh"))
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUTS,
        }
        // Problem: this runs in the parent, before the new namespace
        // exists, so it renames the host instead of the container.
        syscall.Sethostname([]byte("inner-system"))
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }

  21. Level UP

  22. /proc/self/exe

  23. Helper functions
    helper.go
    // Attaches stdin, stdout, stderr to Cmd.
    func makeCmd(cmd *exec.Cmd) *exec.Cmd {
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        return cmd
    }

    // Reports whether this process is the re-executed child.
    func isRupa() bool {
        i := flag.Bool("rupa", false, "child")
        flag.Parse()
        return *i
    }

    main.go
    func main() {
        var err error
        if isRupa() {
            err = inner()
        } else {
            err = run()
        }
        if err != nil {
            panic(err)
        }
    }

  24. UTS namespace
    func run() error {
        cmd := makeCmd(exec.Command("/proc/self/exe", "-rupa"))
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUTS,
        }
        return cmd.Run()
    }

    func inner() error {
        // Already inside the new UTS namespace, so the host name is untouched.
        if err := syscall.Sethostname([]byte("inner-system")); err != nil {
            return err
        }
        cmd := makeCmd(exec.Command("/bin/sh"))
        return cmd.Run()
    }

  25. UTS namespace
    meson10@xps:~/workspace/gophercon$ go build main.go util.go uts.go
    meson10@xps:~/workspace/gophercon$ sudo ./main
    [meson10]$ hostname
    inner-system

  26. Reexec
    https://github.com/moby/moby/tree/master/pkg/reexec
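
The re-exec trick can be generalised into a small registry that picks an entry point based on how the binary was invoked. The sketch below uses only the standard library and illustrates the idea rather than the reexec package's actual API. register, dispatch and the "inner" name are assumptions made for this example.

```go
package main

import (
	"fmt"
	"os"
)

// registry maps an invocation name to the function that should run when
// the binary is re-executed under that name.
var registry = map[string]func(){}

// register associates fn with name and refuses duplicate registrations.
func register(name string, fn func()) {
	if _, dup := registry[name]; dup {
		panic("already registered: " + name)
	}
	registry[name] = fn
}

// dispatch runs the function registered for argv0 and reports whether
// one was found.
func dispatch(argv0 string) bool {
	fn, ok := registry[argv0]
	if ok {
		fn()
	}
	return ok
}

func main() {
	register("inner", func() {
		fmt.Println("running as the re-executed child")
	})
	// When started as the child, os.Args[0] carries the registered name.
	if dispatch(os.Args[0]) {
		return
	}
	fmt.Println("running as the parent")
}
```

The parent would start the child by running /proc/self/exe with Args[0] set to the registered name, which avoids parsing a dedicated command-line flag.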

  27. PID namespace

  28. PID namespace
    Container:
    [meson10]$ sleep 100 &
    [meson10]$ ps ax
    PID TTY STAT TIME COMMAND
    1 pts/0 Sl 0:00 /proc/self/exe -inner
    6 pts/0 S 0:00 /bin/sh
    9 pts/0 S 0:00 sleep 100
    Host:
    12884 pts/0 S 0:00 | \_ sudo ./main
    12885 pts/0 Sl 0:00 | \_ ./main
    12890 pts/0 Sl 0:00 | \_ /proc/self/exe -inner
    12895 pts/0 S+ 0:00 | \_ /bin/sh
    12920 pts/0 S 0:00 | \_ sleep 100

  29. PID namespace
    func run() error {
        cmd := makeCmd(exec.Command("/proc/self/exe", "-rupa"))
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWPID,
        }
        return cmd.Run()
    }

    func inner() error {
        fmt.Println("Inner code PID", os.Getpid())
        cmd := makeCmd(exec.Command("/bin/sh"))
        return cmd.Run()
    }

  30. PID namespace
    meson10@xps:~/workspace/gophercon$ go build main.go util.go pid.go
    meson10@xps:~/workspace/gophercon$ sudo ./main
    Inner code PID 1
    [meson10]$

  31. Problem
    [meson10]$ ps ax
    PID TTY STAT TIME COMMAND
    1 ? Ss 0:03 /sbin/init splash
    2 ? S 0:00 [kthreadd]
    4 ? S< 0:00 [kworker/0:0H]
    6 ? S< 0:00 [mm_percpu_wq]
    7 ? S 0:01 [ksoftirqd/0]
    8 ? S 1:32 [rcu_sched]
    9 ? S 0:00 [rcu_bh]
    10 ? S 0:00 [migration/0]
    11 ? S 0:00 [watchdog/0]
    12 ? S 0:00 [cpuhp/0]
    13 ? S 0:00 [cpuhp/1]
    14 ? S 0:00 [watchdog/1]
    15 ? S 0:00 [migration/1]
    16 ? S 0:01 [ksoftirqd/1]

  32. Level UP

  33. Mnt namespace

  34. Perspective

  35. Mnt namespace
    func run() error {
        // The flag passed here must be the one helper.go parses.
        cmd := makeCmd(exec.Command("/proc/self/exe", "-inner"))
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWNS,
        }
        return cmd.Run()
    }

    func inner() error {
        cmd := makeCmd(exec.Command("/bin/sh"))
        return cmd.Run()
    }

  36. Mnt namespace
    meson10@xps:~/workspace/gophercon$ sudo ./main
    Inner code PID 1
    [meson10]$ cd /home/meson10/workspace/gophercon
    [meson10]$ mount --bind tmp_sys ok
    [meson10]$
    [meson10]$ findmnt -o+PROPAGATION
    |-/home /dev/nvme0n1p8 ext4
    | `-/home/meson10/workspace/gophercon/ok /dev/nvme0n1p8[/meson10/workspace/gophercon/tmp_sys] ext4
    meson10@xps:~/workspace/gophercon$ cat /proc/mounts
    /dev/nvme0n1p8 /home/meson10/workspace/gophercon/ok ext4 rw,noatime,data=ordered 0 0

  37. Level UP

  38. Mnt problem
    meson10@xps:~/workspace/gophercon$ sudo ./main
    Inner code PID 1
    [meson10]$ findmnt -o+PROPAGATION
    TARGET OPTIONS PROPAGATION
    / rw,noatime,errors=remount-ro,data=ordered shared
    |-/dev rw,nosuid,relatime,size=7622836k,nr_inodes=1905709,mode=755 shared
    | |-/dev/pts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 shared
    | |-/dev/shm rw,nosuid,nodev shared
    | |-/dev/mqueue rw,relatime shared
    | `-/dev/hugepages rw,relatime,pagesize=2M shared
    |-/run rw,nosuid,noexec,relatime,size=1530088k,mode=755 shared
    | |-/run/lock rw,nosuid,nodev,noexec,relatime,size=5120k shared
    | `-/run/user/1000 rw,nosuid,nodev,relatime,size=1530084k,mode=700,uid=1000,gid=1000 shared
    |-/sys rw,nosuid,nodev,noexec,relatime shared
    | |-/sys/kernel/security rw,nosuid,nodev,noexec,relatime shared

  39. Mnt problem
    meson10@xps:~/workspace/gophercon$ sudo ./main
    Inner code PID 1
    [meson10]$ findmnt -o+PROPAGATION
    TARGET OPTIONS PROPAGATION
    / rw,noatime,errors=remount-ro,data=ordered private
    |-/dev rw,nosuid,relatime,size=7622836k,nr_inodes=1905709,mode=755 private
    | |-/dev/pts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 private
    | |-/dev/shm rw,nosuid,nodev private
    | |-/dev/mqueue rw,relatime private
    | `-/dev/hugepages rw,relatime,pagesize=2M private
    |-/run rw,nosuid,noexec,relatime,size=1530088k,mode=755 private
    | |-/run/lock rw,nosuid,nodev,noexec,relatime,size=5120k private
    | `-/run/user/1000 rw,nosuid,nodev,relatime,size=1530084k,mode=700,uid=1000,gid=1000 private
    |-/sys rw,nosuid,nodev,noexec,relatime private
    | |-/sys/kernel/security rw,nosuid,nodev,noexec,relatime private
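
The PROPAGATION column can also be derived from /proc/self/mountinfo, where the optional fields before the "-" separator carry tags such as shared:N and master:N, and a mount with no tag is private. A hedged sketch of that parsing follows. The propagation function is an illustrative name, and the field layout is taken from the proc(5) description of mountinfo.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// propagation extracts the mount point and propagation type from one
// line of /proc/<pid>/mountinfo.
func propagation(line string) (mountPoint, prop string, err error) {
	fields := strings.Fields(line)
	if len(fields) < 7 {
		return "", "", fmt.Errorf("short mountinfo line %q", line)
	}
	var kinds []string
	// Optional fields start at index 6 and end at the "-" separator.
	for _, f := range fields[6:] {
		if f == "-" {
			break
		}
		switch {
		case strings.HasPrefix(f, "shared:"):
			kinds = append(kinds, "shared")
		case strings.HasPrefix(f, "master:"):
			kinds = append(kinds, "slave")
		case f == "unbindable":
			kinds = append(kinds, "unbindable")
		}
	}
	if len(kinds) == 0 {
		kinds = []string{"private"}
	}
	return fields[4], strings.Join(kinds, ","), nil
}

func main() {
	f, err := os.Open("/proc/self/mountinfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		mp, prop, err := propagation(sc.Text())
		if err != nil {
			continue
		}
		fmt.Printf("%-40s %s\n", mp, prop)
	}
}
```

Running this inside and outside the container gives the same shared versus private contrast as the two findmnt listings.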

  40. Mount Propagation
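    - Originally: a per-process filesystem namespace, fully isolated
    - That proved too restrictive
    - Mount propagation (shared subtrees) arrived in 2006, in Linux 2.6.15
    - Propagation types:
        - Slave
        - Shared
        - Private
        - Unbindable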

  41. Unshareflags
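    func run() error {
        // The flag passed here must be the one helper.go parses.
        cmd := makeCmd(exec.Command("/proc/self/exe", "-inner"))
        cmd.Env = []string{"PS1=[meson10]$ "}
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWNS | syscall.CLONE_NEWPID,
            // With this set, Go (1.9+) remounts / as private in the child.
            Unshareflags: syscall.CLONE_NEWNS,
        }
        return cmd.Run()
    }

    func inner() error {
        if err := os.Chdir("/"); err != nil {
            return err
        }
        // Mount a fresh /proc so ps only sees this PID namespace.
        if err := syscall.Mount("proc", "proc", "proc", 0, ""); err != nil {
            return err
        }
        cmd := makeCmd(exec.Command("/bin/sh"))
        return cmd.Run()
    }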

  42. PID namespace
    meson10@xps:~/workspace/gophercon$ go build main.go util.go pid.go
    meson10@xps:~/workspace/gophercon$ sudo ./main
    [meson10]$ ps -aexf
    PID TTY STAT TIME COMMAND
    1 pts/0 Sl 0:00 /proc/self/exe -inner PS1=[meson10]$
    6 pts/0 S 0:00 /bin/sh PS1=[meson10]$
    7 pts/0 R+ 0:00 \_ ps -aexf PS1=[meson10]$ PWD=/

  43. Added in Go 1.9
    https://github.com/golang/go/issues/19661
    "It turns out that the systemd developers decided to override the kernel's default setting of 'private' to their
    own default setting of 'shared'. This means that on Linux machines with systemd, the default is shared, while
    on Linux machines without systemd, the default is private. Essentially, systemd decided to make it so that there
    is no default that end programs can rely on. All programs must instead mark the root filesystem as private if
    they want private namespaces, or as shared if they want shared namespaces if they want to work across all Linux
    distributions. I'm pretty sure this was done to frustrate as many people as possible."
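
Marking a mount tree as private, as the quote describes, comes down to a single mount(2) call with the MS_REC and MS_PRIVATE flags. A minimal sketch follows, assuming it runs as root inside a freshly created mount namespace. makePrivate is a hypothetical helper name, and the mount point is taken from the command line so that nothing changes unless one is given explicitly.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// makePrivate recursively marks the mount at target as private, so mount
// events stop propagating between it and its peers.
func makePrivate(target string) error {
	// Source, filesystem type and data are ignored for a propagation change.
	err := syscall.Mount("none", target, "", syscall.MS_REC|syscall.MS_PRIVATE, "")
	if err != nil {
		return fmt.Errorf("make %s private: %w", target, err)
	}
	return nil
}

func main() {
	// Do nothing unless a mount point is named explicitly.
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: makeprivate <mountpoint>")
		return
	}
	if err := makePrivate(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("propagation set to private for", os.Args[1])
}
```

Setting Unshareflags to CLONE_NEWNS makes the Go runtime perform the equivalent step for / in the child, which is why the deck does not need to call it by hand.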

  44. Other namespaces
    - IPC
    - Net
    - CGroup
    - … to be continued

  45. @meson10 http://oogway.in
