The Enemy Within: Running untrusted code with gVisor

Containers are a great way to deploy applications and isolate their resources, but they can fall short when it comes to security isolation. How do you improve the security of a container while keeping its flexible, dynamic resource usage? There are many options for sandboxing containers, but which is right for you?

In this talk we will explore the gVisor sandbox runtime in depth. gVisor is a unique open-source sandbox runtime that lets you run unmodified applications in containers with a higher level of isolation and low overhead. It implements the OCI runtime specification and integrates well with containerd and Kubernetes. I will cover the container security model and use cases for sandboxed pods, then discuss the various sandboxing approaches and their tradeoffs before diving into the architecture of gVisor and how it differs from virtual-machine-based sandboxes.

Ian Lewis

June 25, 2019

Transcript

  1. Ian Lewis
    Developer Advocate, Google Cloud Platform
    The Enemy Within
    Running Untrusted Code
    with gVisor

  2. 2
    gVisor
    Ian Lewis (@IanMLewis)
    Developer Advocate, Google

  3. 3
    gVisor
    ● Running untrusted code
    ● User uploaded code
    ● Third-party code
    ● Complex code/Complex user input
    ● Code you wrote but you don't trust yourself...
    So you want to run some code...

  4. 4
    gVisor
    ● SaaS/Serverless
    ● Video/Image transcoding
    ● Machine learning
    Use Cases

  5. 5
    gVisor
    Too much privileged code
    Application
    Host Kernel

  6. 6
    gVisor
    Too much privileged code
    Application
    Host Kernel
    open("/path/to/file", O_RDWR)

  7. 7
    gVisor
    Too much privileged code
    Application
    Host Kernel

  8. 8
    gVisor
    Too much privileged code
    Application
    Host Kernel
    file descriptor
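
The round trip drawn on slides 5 to 8 (the application calls `open`, the host kernel hands back a file descriptor) can be reproduced in a few lines. This is only an illustrative sketch: the helper names `open_read_write` and `demo` are made up for this example, and a temporary file stands in for `/path/to/file`.

```python
import os
import tempfile


def open_read_write(path: str) -> int:
    """Ask the kernel to open `path` read-write and return the file descriptor."""
    # os.open is a thin wrapper around the open(2) system call.
    return os.open(path, os.O_RDWR)


def demo() -> bytes:
    # Create a scratch file so the example does not depend on a real path.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name
    fd = open_read_write(path)  # the kernel returns a small integer handle
    try:
        os.write(fd, b"hello")          # every later call goes through that handle
        os.lseek(fd, 0, os.SEEK_SET)    # rewind to the start of the file
        return os.read(fd, 5)
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

In an ordinary container each of these calls is served directly by the host kernel, which is the "too much privileged code" concern the slides raise.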

  9. 9
    gVisor
    ● Prevents attackers from escaping the runtime environment
    ● Code running in the sandbox is untrusted
    Container Sandboxes

  10. 10
    gVisor
    ● Goal of the sandbox is to reduce execution of trusted, privileged code (e.g. kernel code)
    ● Achieved through abstraction/virtualization of host.
    ● Don't want to expose the system to risk of any single bug
    ○ Need two layers of isolation
    Sandbox Isolation

  11. 11
    gVisor
    OS-Level Virtualization (containers)
    Application
    Host Kernel
    Namespace

  12. 12
    gVisor
    Unikernels
    Application
    Host Kernel
    Guest OS
    Hypervisor

  13. 13
    gVisor
    ● Containers
    ○ They aren't good security isolation boundaries
    ○ Only one layer of isolation
    ○ Any one bug in the host kernel could lead to a full host compromise
    ● Unikernels
    ○ Can't bring your own container (must be specially crafted)
    Containers & Unikernels are cool but...

  14. 14
    gVisor
    Virtual Machines
    Application
    Host Kernel
    Guest OS
    Hypervisor
    Hardware

  15. 15
    gVisor
    (Type 2) Virtual Machines
    Application
    Host Kernel
    Guest OS
    Hypervisor
    Hardware

  16. 16
    gVisor
    (Type 1) Virtual Machines
    Application
    Host Kernel
    Guest OS
    Hypervisor
    Hardware

  17. 17
    gVisor
    ● We want more container-like properties
    ○ Flexible resource usage
    ■ Don't want to assign full sets of memory or CPU to the sandbox
    ■ Want to be able to reclaim memory if possible
    ○ Quick startup time (X0 ms, i.e. tens of milliseconds)
    ■ Don't want to have a lot of guest OS boot time.
    ○ Easier maintenance and integration into container infrastructure
    VMs are cool but...

  18. 18
    gVisor

  19. 19
    gVisor
    Virtual Machines
    Application
    OS
    Virtualized Hardware

  20. 20
    gVisor
    gVisor Virtualization
    Application
    Virtualized OS

  21. 21
    gVisor
    gVisor: Two Layers of Isolation
    Application
    Guest OS (Sentry)
    Host Kernel
    Namespace

  23. 23
    gVisor
    ● Two layers of isolation
    ● Uses the same principle of virtualization as VMs
    ○ Virtualization at the OS level: the Linux syscall layer
    ● Reduces the host attack surface
    ○ Calls to the host OS are controlled by the Sentry
    ○ Most syscall logic handled by Sentry
    ○ No syscalls are "passed through": applications cannot pass arbitrary arguments to the host kernel.
    gVisor
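
The "no pass-through" idea can be pictured with a toy model. The sketch below is not gVisor's code (the Sentry is written in Go and is far more involved). `HostKernel`, `ToySentry`, the tiny allow-list, and the `sys_*` handlers are all hypothetical names invented for illustration. What it shows is the shape of the design: application syscalls are answered from the sandbox's own state, and the few host calls that do happen are chosen by the sandbox, never forwarded verbatim.

```python
import errno


class HostKernel:
    """Stand-in for the host; records which host calls were actually made."""

    ALLOWED = {"read", "write"}  # hypothetical, deliberately tiny allow-list

    def __init__(self):
        self.calls = []

    def call(self, name, *args):
        if name not in self.ALLOWED:
            raise PermissionError(f"host call {name!r} blocked by filter")
        self.calls.append(name)
        return 0


class ToySentry:
    """Handles application syscalls itself instead of forwarding them."""

    def __init__(self, host):
        self.host = host
        self.files = {}   # in-sandbox file table: fd -> path
        self.next_fd = 3  # 0-2 are conventionally stdin/stdout/stderr

    def syscall(self, name, *args):
        handler = getattr(self, f"sys_{name}", None)
        if handler is None:
            # Unimplemented syscalls fail inside the sandbox; nothing is forwarded.
            return -errno.ENOSYS
        return handler(*args)

    def sys_getpid(self):
        return 1  # answered entirely from sandbox state, no host call

    def sys_open(self, path, flags):
        fd = self.next_fd
        self.next_fd += 1
        self.files[fd] = path
        return fd

    def sys_write(self, fd, data):
        if fd not in self.files:
            return -errno.EBADF
        # The sandbox decides which host call to make and with which arguments.
        self.host.call("write", len(data))
        return len(data)
```

In this model an application asking for `reboot` simply gets an error from the sandbox, and the host never sees the request.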

  24. 24
    gVisor
    gVisor Architecture
    Host Linux Kernel
    User
    Kernel
    runsc
    OCI
    Kubernetes

  25. 25
    gVisor
    gVisor Architecture
    KVM/ptrace
    Gofer
    Host Linux Kernel
    Sentry
    Sandbox
    User
    Kernel
    9P
    runsc
    OCI
    Kubernetes

  26. 26
    gVisor
    gVisor Architecture
    KVM/ptrace
    Gofer
    Host Linux Kernel
    Sentry
    Sandbox
    User
    Kernel
    9P
    runsc
    OCI
    Kubernetes
    seccomp + ns
    seccomp + ns

  27. 27
    gVisor
    gVisor Architecture
    KVM/ptrace
    Gofer
    Host Linux Kernel
    Container Sentry
    Sandbox
    User
    Kernel
    9P
    runsc
    OCI
    Kubernetes
    seccomp + ns
    seccomp + ns

  28. 28
    gVisor
    gVisor Architecture
    KVM/ptrace
    Gofer
    Gofer
    Gofers
    Containers
    Containers
    Host Linux Kernel
    Containers Sentry
    Sandbox
    User
    Kernel
    9P
    runsc
    OCI
    Kubernetes
    seccomp + ns
    seccomp + ns

  29. 29
    gVisor
    ● Two security layers
    ● Minimal access to host
    ○ No syscall is passed through to the host
    ○ Limited host syscalls allowed
    ○ User mode
    ● Pure Go
    ○ No cgo allowed
    ● Unsafe code is carefully reviewed
    ● Statically linked, few external dependencies
    ● Trust nobody
    Design Principles

  36. 36
    gVisor
    ● Sentry is first layer of defense
    ○ Assume it will be compromised
    ○ User mode
    ● Pod cgroup
    ● Namespaces
    ● Terminal chroot
    ● uid/gid: nobody
    ○ Drop all capabilities
    ● Seccomp
    ○ # of syscalls is the wrong metric
    Defense in Depth: Sandbox
    cgroup
    namespace
    chroot
    user / group / capabilities
    seccomp
    Sandbox
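
Several of these layers (user ID, dropped capabilities, seccomp) are visible from inside any Linux process through `/proc/self/status`. The sketch below parses that file's `Uid`, `CapEff`, and `Seccomp` fields. The function names and the sample text in the usage note are made up for illustration, and nothing here is specific to gVisor.

```python
from pathlib import Path

# Values the kernel reports in the "Seccomp:" field of /proc/<pid>/status.
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def parse_status(text: str) -> dict:
    """Turn 'Key:<tab>value' lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields: dict) -> dict:
    """Pick out the fields that correspond to the layers on the slide."""
    uid = fields.get("Uid", "").split()
    return {
        "real_uid": int(uid[0]) if uid else None,
        # CapEff is a hexadecimal bitmask; 0 means no effective capabilities.
        "effective_caps": int(fields.get("CapEff", "0"), 16),
        "seccomp": SECCOMP_MODES.get(int(fields.get("Seccomp", "0")), "unknown"),
    }


if __name__ == "__main__":
    status = Path("/proc/self/status")
    if status.exists():  # Linux only
        print(summarize(parse_status(status.read_text())))
```

For a process running as `nobody` (commonly UID 65534) with all capabilities dropped and a seccomp filter installed, a status file containing `Uid: 65534 ...`, `CapEff: 0000000000000000`, and `Seccomp: 2` would summarize to a real UID of 65534, an empty capability mask, and seccomp mode `filter`.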

  43. 43
    gVisor
    ● Isolated from user code
    ● Pod cgroup
    ● Caller’s user namespace
    ● Chroot to rootfs
    ○ Bind mounts
    ● Runs as root
    ○ Similar to “docker run” as root
    ○ Drop non-FS capabilities
    ● seccomp
    Defense In Depth: Gofer
    cgroup
    reduced namespaces
    chroot
    reduced capabilities
    seccomp
    Gofer
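
The intent behind "chroot to rootfs" is that a file proxy should only ever serve paths under one directory. The real Gofer relies on kernel mechanisms (chroot, namespaces, seccomp) and speaks 9P to the Sentry. The sketch below is only a user-space analogy, with a hypothetical `ToyGofer` class that refuses any path resolving outside its root.

```python
import os
import tempfile


class ToyGofer:
    """Serves file reads, but only for paths that stay inside `root`."""

    def __init__(self, root: str):
        # Resolve symlinks once so later comparisons use canonical paths.
        self.root = os.path.realpath(root)

    def resolve(self, requested: str) -> str:
        # Treat the request as relative to the root, then canonicalize it
        # so that ".." components and symlinks are taken into account.
        candidate = os.path.realpath(
            os.path.join(self.root, requested.lstrip("/"))
        )
        if os.path.commonpath([self.root, candidate]) != self.root:
            raise PermissionError(f"{requested!r} escapes the root")
        return candidate

    def read(self, requested: str) -> bytes:
        with open(self.resolve(requested), "rb") as f:
            return f.read()


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    with open(os.path.join(root, "hello.txt"), "wb") as f:
        f.write(b"hi")
    gofer = ToyGofer(root)
    print(gofer.read("/hello.txt"))
```

A check like this in application code is weaker than a chroot enforced by the kernel, which is presumably why the slides stack several independent mechanisms rather than trusting any single one.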

  44. 44
    gVisor
    ● Be aware of defaults
    ○ K8s is optimized for ease-of-use, not security
    ○ CPU/Memory/Disk limits
    ● Network/Disk isolation
    ○ Network access: Use NetworkPolicy
    ○ Arbitrary packet injection: Sentry provides isolation
    ○ File writes/permissions: Use read-only filesystems
    ○ No throttling mechanism: use cgroups
    What's not protected?
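
The mitigations listed above map onto ordinary Kubernetes fields. As a hedged sketch, the snippet below builds a Pod with CPU, memory, and disk limits and a read-only root filesystem, plus a NetworkPolicy that denies all traffic to and from it. The names, image, and limit values are placeholders chosen for illustration. Only the field names come from the Kubernetes API.

```python
import json


def hardened_pod(name: str, image: str) -> dict:
    """Pod manifest with resource limits and a read-only root filesystem."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Placeholder limits; tune for the actual workload.
                    "resources": {
                        "limits": {
                            "cpu": "500m",
                            "memory": "256Mi",
                            "ephemeral-storage": "1Gi",
                        }
                    },
                    "securityContext": {"readOnlyRootFilesystem": True},
                }
            ]
        },
    }


def deny_all_policy(app: str) -> dict:
    """NetworkPolicy selecting the pod and allowing no ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"{app}-deny-all"},
        "spec": {
            "podSelector": {"matchLabels": {"app": app}},
            # Listing both types with no rules denies traffic in both directions.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    for manifest in (hardened_pod("untrusted", "example.com/untrusted:latest"),
                     deny_all_policy("untrusted")):
        print(json.dumps(manifest, indent=2))
```

Kubernetes accepts JSON manifests as well as YAML, so output like this can be applied directly. A NetworkPolicy only takes effect if the cluster's network plugin enforces it.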

  45. 45
    gVisor
    ● Integrated with RuntimeClass
    ○ runtimeClassName: gvisor
    ● Minikube
    ○ minikube addons enable gvisor
    ○ github.com/kubernetes/minikube/tree/master/deploy/addons/gvisor
    ● GKE Sandbox (beta)
    ○ cloud.google.com/kubernetes-engine/sandbox
    ● gvisor-containerd-shim
    ○ github.com/google/gvisor-containerd-shim
    gVisor &
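
A minimal sketch of the RuntimeClass integration: one RuntimeClass object named `gvisor`, and a Pod that opts in through `runtimeClassName`. The handler name `runsc` is an assumption here, since it has to match whatever handler the node's container runtime is configured with. The `apiVersion` also depends on the cluster: `node.k8s.io/v1` on current releases, a beta version on clusters from around the time of this talk. The pod name and image are placeholders.

```python
import json


def runtime_class(name: str = "gvisor", handler: str = "runsc") -> dict:
    """RuntimeClass mapping a name to a node-level runtime handler."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        # Assumed handler name; must match the CRI runtime configuration.
        "handler": handler,
    }


def sandboxed_pod(pod_name: str, image: str,
                  runtime_class_name: str = "gvisor") -> dict:
    """Pod that asks to be run by the given RuntimeClass."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": pod_name, "image": image}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(runtime_class(), indent=2))
    print(json.dumps(sandboxed_pod("sandboxed-nginx", "nginx"), indent=2))
```

Pods that omit `runtimeClassName` keep using the cluster's default runtime, so sandboxing can be enabled per workload.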

  46. 46
    gVisor
    • https://gvisor.dev/
    • https://github.com/google/gvisor
    • Gitter: https://gitter.im/gvisor/community
    • Mailing lists: gvisor-users, gvisor-dev
    gVisor is Open Source & Thanks!
