
A modern services SDK for LinuxKit

A very early talk about the SDK work for building modern privilege-separated applications on LinuxKit, given at Moby Security SIG #2: https://forums.mobyproject.org/t/2017-06-07-linuxkit-security-sig-meeting/58

Anil Madhavapeddy

June 07, 2017

Transcript

  1. A modern services
    SDK for LinuxKit
    Type safety, container-native daemons, minimum
    privilege, easy development, unikernel protection, ..
    hacked to you by Thomas Gazagnaire, Thomas Leonard,
    Martin Lucina, Anil Madhavapeddy, Mindy Preston
    7th June 2017 - Moby Security SIG #2


  2. A modern services
    SDK for LinuxKit
    Type safety, container-native daemons, minimum
    privilege, easy development, unikernel protection, ..
    disclaimer: this is active work in progress,
    and we're showing this early to the Moby
    Security SIG community to get feedback on
    the work.
    interruptions and feedback are welcome.
    and patches :)


  3. Motivation
    • Base daemons in LinuxKit are typically wrapped versions of
    existing system software (e.g. dhcpcd, ntpd).
    • Often written in C, different configuration mechanisms, no
    structured logging, require lots of privilege for system
    operations.
    • Want to make these less monolithic and more container-
    native, and fit with LinuxKit philosophy of a lean, secure
    container runtime.
    • This project provides us with a vehicle to deploy more
    advanced security protections in LinuxKit in a practical way.


  4. Approach
    • LinuxKit has a single build-time YAML file and everything
    except init runs in a container namespace.
    • We build privilege-separated applications that use this
    architecture to avoid common security vulnerabilities by:
    1. Specifying the process layout for an application in YAML
    2. Enforcing isolated, minimal privileges per process
    3. Separating every process in a container namespace
    4. Coordinating the containers with standard RPC tooling
    • Developer experience matters: containerisation complexity
    is hidden inside the SDK tooling and not the application.


  5. Approach
    • First daemon being developed is a DHCP client.
    • This is a difficult daemon to privilege-separate due to
    the deep (and non-portable) system hooks required
    for handling IP and routing tables (e.g. netlink).
    • Implementation flushes out a lot of architectural
    questions and makes subsequent protocol
    implementations such as HTTPS or NTP more
    straightforward.
    https://github.com/linuxkit/linuxkit/tree/master/projects/miragesdk
    https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=dhcp


  6. [Diagram: dhcp-network, dhcp-engine and dhcp-actuator processes, with eth0 and kernel/netlink]
    • Three processes, each with very
    minimal privileges.
    • dhcp-network can only access
    eth0 for networking
    • dhcp-engine can see nothing
    except the other two processes
    • dhcp-actuator can manipulate
    routing tables but cannot see
    the network.
    • Each process can be written in the safe
    language best suited to the task
    (OCaml, Rust in this case).


  7. - name: dhcp-network
       capabilities:
         - CAP_NET_ADMIN # bring eth0 up
         - CAP_NET_RAW # read /dev/eth0
    [Diagram: dhcp-network, dhcp-engine and dhcp-actuator processes, with eth0 and kernel/netlink]


  8. - name: dhcp-network
       capabilities:
         - CAP_NET_ADMIN # bring eth0 up
         - CAP_NET_RAW # read /dev/eth0
     - name: dhcp-actuator
       image:
       capabilities:
         - CAP_NET_ADMIN # for netlink
       binds:
         - /state # to write resolv.conf
    [Diagram: dhcp-network, dhcp-engine and dhcp-actuator processes, with eth0 and kernel/netlink]


  9. - name: dhcp-network
       capabilities:
         - CAP_NET_ADMIN # bring eth0 up
         - CAP_NET_RAW # read /dev/eth0
     - name: dhcp-actuator
       image:
       capabilities:
         - CAP_NET_ADMIN # for netlink
       binds:
         - /state # to write resolv.conf
     - name: dhcp-engine
       image:
       rpc:
         - dhcp-network
         - dhcp-actuator
    [Diagram: dhcp-network, dhcp-engine and dhcp-actuator processes, with eth0 and kernel/netlink]


  10. - name: dhcp-network
        capabilities:
          - CAP_NET_ADMIN # bring eth0 up
          - CAP_NET_RAW # read /dev/eth0
      - name: dhcp-actuator
        image:
        capabilities:
          - CAP_NET_ADMIN # for netlink
        binds:
          - /state # to write resolv.conf
      - name: dhcp-engine
        image:
        rpc:
          - dhcp-network
          - dhcp-actuator
      - name: dhcp-init
        image:
        files:
          - path: /var/run/dhcp-client/README
            contents: 'data for dhcp-client'
    [Diagram: dhcp-network, dhcp-engine and dhcp-actuator processes, with eth0 and kernel/netlink]
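
    A minimal sketch of how a process layout like the one on this slide could be checked for least privilege. This is not the SDK's own tooling: `ALLOWED_CAPS`, `LAYOUT` and `check_layout` are hypothetical names, and the layout is written as Python dictionaries that mirror the YAML fields (`name`, `capabilities`, `binds`, `rpc`) rather than parsed from a file.

```python
# Hypothetical least-privilege check for a service layout.
# Field names mirror the YAML on the slide; nothing here is LinuxKit SDK API.

# Assumed policy: the most each service may ask for.
ALLOWED_CAPS = {
    "dhcp-network": {"CAP_NET_ADMIN", "CAP_NET_RAW"},
    "dhcp-actuator": {"CAP_NET_ADMIN"},
    "dhcp-engine": set(),   # the engine should hold no capabilities at all
    "dhcp-init": set(),
}

# The layout from the slide, transcribed by hand.
LAYOUT = [
    {"name": "dhcp-network", "capabilities": ["CAP_NET_ADMIN", "CAP_NET_RAW"]},
    {"name": "dhcp-actuator", "capabilities": ["CAP_NET_ADMIN"], "binds": ["/state"]},
    {"name": "dhcp-engine", "rpc": ["dhcp-network", "dhcp-actuator"]},
    {"name": "dhcp-init"},
]


def check_layout(layout, allowed_caps):
    """Return a list of problems; an empty list means the layout passes."""
    problems = []
    names = {svc["name"] for svc in layout}
    for svc in layout:
        name = svc["name"]
        # Any capability beyond the policy is reported.
        extra = set(svc.get("capabilities", [])) - allowed_caps.get(name, set())
        if extra:
            problems.append(f"{name}: unexpected capabilities {sorted(extra)}")
        # Every RPC peer must be a service declared in the same layout.
        for peer in svc.get("rpc", []):
            if peer not in names:
                problems.append(f"{name}: unknown rpc peer {peer}")
    return problems


print(check_layout(LAYOUT, ALLOWED_CAPS))  # prints []
```

    Giving dhcp-engine any capability, or pointing its `rpc` list at an undeclared service, would make `check_layout` return a non-empty list.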


  11. [Diagram: dhcp-network, dhcp-engine and dhcp-actuator processes, with eth0 and kernel/netlink]
    @0xb224be3ea8450819;
    struct DhcpNetworkRequest {
      id @0 :Int32;
      path @1 :List(Text);
      union {
        write @2 :Data;
        read @3 :Void;
        delete @4 :Void;
      }
    }
    struct DhcpNetworkResponse {
      id @0 :Int32;
      union {
        ok @1 :Data;
        error @2 :Data;
      }
    }
    struct DhcpActuatorRequest {
      id @0 :Int32;
      interface @1 :Text;
      ipv4Addr @2 :List(Text);
      resolvConf @3 :List(Text);
    }
    struct DhcpActuatorResponse {
      id @0 :Int32;
      union {
        ok @1 :Data;
        error @2 :Data;
      }
    }
    • RPC via the Capnp transport layer.
    • Capnp provides RPC and makes it easy to generate
    bindings for many languages.
    https://github.com/capnproto
    • LinuxKit SDK takes care of starting the containers with an
    initial config and connecting the file descriptors.
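
    A rough sketch of the request/response pattern between dhcp-engine and dhcp-actuator over a pre-connected pair of file descriptors. It makes several assumptions: JSON with a 4-byte length prefix stands in for the Capnp encoding, a thread stands in for the separate container, and `send_msg`, `recv_msg`, `actuator` and `engine_call` are hypothetical helpers. Only the message fields (`id`, `interface`, `ipv4Addr`, `resolvConf`, `ok`, `error`) come from the schema on the slide.

```python
import json
import socket
import struct
import threading


def send_msg(sock, obj):
    """Send one message: 4-byte big-endian length, then a JSON body."""
    payload = json.dumps(obj).encode()
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, or fail if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the channel")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length))


def actuator(sock):
    """Stand-in for dhcp-actuator: answer one request, echoing its id."""
    req = recv_msg(sock)
    if req.get("interface") == "eth0":
        send_msg(sock, {"id": req["id"], "ok": "applied"})
    else:
        send_msg(sock, {"id": req["id"], "error": "unknown interface"})


def engine_call(request):
    """Stand-in for dhcp-engine: send one request over a connected pair."""
    # The SDK would hand each process its end of the channel; here we make both.
    engine_end, actuator_end = socket.socketpair()
    worker = threading.Thread(target=actuator, args=(actuator_end,))
    worker.start()
    try:
        send_msg(engine_end, request)
        return recv_msg(engine_end)
    finally:
        engine_end.close()
        worker.join()
        actuator_end.close()


reply = engine_call({
    "id": 1,
    "interface": "eth0",
    "ipv4Addr": ["192.0.2.10/24"],
    "resolvConf": ["nameserver 192.0.2.1"],
})
print(reply)
```

    The engine never touches netlink or `/state` itself; it only sees the descriptor it was handed, which is the property the layout on the earlier slides is meant to enforce.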


  12. Demo: Capnp RPC
    • Capnp has an interface file and stub code generator for
    many languages.
    • Very simple binary format to parse (e.g. no HTTP2
    dependency), so it is a viably small attack surface to
    depend on for privileged components.
    • The CLI checks your interface specs and makes it
    relatively easy to glue pieces together.
    • Here is an example of an HTTPS server built like this:

    https://github.com/talex5/linuxkit/tree/https-unikernel/projects/https-unikernel

    (see https://github.com/linuxkit/linuxkit/pull/1981)


  13. Going deeper for security
    • Need protections at all levels of the stack for defence in depth:
    • application level: static type safety when parsing network
    traffic (via OCaml, Rust logic) and secure RPC (via Capnp)
    • protocol state machine: fuzz testing for rapid state space
    exploration (via American Fuzzy Lop aka AFL)
    • runtime process: container namespacing and KVM hardware
    protection if available (via unikernel Solo5).
    • kernel interface: eBPF sandboxing for fine-grained access to
    syscalls.
    • implementation diversity: the container/rpc approach lets
    many languages/runtimes work together without tight coupling.
    • What else? LinuxKit lets us patch the kernel and use the new facility
    directly in the base daemons, just like a BSD distro. SGX, TrustZone, etc.


  14. Demo: ukvm service
    • For service isolation, we can further protect
    processes against exploits by using /dev/kvm.
    • This is a unikernel (standalone specialised VM)
    running as a normal Linux process in a container.
    • External channel setup will be handled by the RPC
    layer, but for now is just a tap device.
    • Demo: here is a DNS service running as a KVM
    process on Linux and serving network traffic.


  15. Demo: fuzz testing
    • Fuzz testing: throw a lot of random input at a program,
    see where it breaks, fix it, repeat.
    • AFL is helpful as it can figure out an effective fuzz path
    quickly, and minimise test cases. Comes with a CLI
    afl-fuzz: http://lcamtuf.coredump.cx
    • We are writing adapters from AFL to the LinuxKit SDK (which
    uses file descriptors) to make fuzzing easier to start.
    • Demo: afl-fuzz working on the DHCP state machine.

    Details at 

    Asciicast: https://asciinema.org/a/3ljccmn19m25uj02kve678xp6
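
    To make the "throw random input, see where it breaks" loop concrete, here is a naive sketch. It is not AFL: AFL is coverage-guided and minimises test cases, while this only feeds seeded random bytes to a parser and records unexpected exceptions. `parse_options` is a hypothetical toy parser for DHCP-style code/length/value options (0 as padding, 255 as end marker), not the SDK's DHCP state machine.

```python
import random


def parse_options(data: bytes) -> dict:
    """Toy parser for code/length/value options; rejects bad input with ValueError."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        i += 1
        if code == 0:        # pad option: no length byte
            continue
        if code == 255:      # end option: stop parsing
            break
        if i >= len(data):
            raise ValueError("missing length byte")
        length = data[i]
        i += 1
        if i + length > len(data):
            raise ValueError("truncated option value")
        options[code] = data[i:i + length]
        i += length
    return options


def fuzz(parser, iterations=2000, max_len=64, seed=0):
    """Feed random inputs to parser; return inputs that raised anything but ValueError."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            parser(data)
        except ValueError:
            pass             # a clean rejection is the expected failure mode
        except Exception:
            crashes.append(data)
    return crashes


print(len(fuzz(parse_options)))  # prints 0
```

    Removing either bounds check from `parse_options` is a quick way to see the harness report inputs, which is the "see where it breaks, fix it, repeat" cycle in miniature.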


  16. Putting it all together
    • WIP: wanted to explain the architecture early to the Security SIG
    community. Another update at the Moby Summit in a few weeks to
    show the frontend tooling.
    • DHCP, DNS, HTTPS are our first targets to have safe system services
    by default. Anything else to focus on?
    • Config interface is as similar to existing daemons as possible so they
    can be swapped easily.
    • @mato is working on integrating Solo5 so that isolated services (e.g.
    dhcp-engine) can be unikernel-protected if hardware virt is available,
    and fall back to eBPF/seccomp sandboxing if not.
    • @talex5 is working on the RPC substrate.
    • @samoht @avsm are building the system daemons and CLI frontend.
    • @yomimono is hacking on fuzz testing all the things with AFL.
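
    The fallback described above (unikernel protection when hardware virtualisation is available, sandboxing otherwise) could be decided with a check along these lines. This is only a sketch: `choose_isolation` and the returned labels are hypothetical, and testing for read/write access to `/dev/kvm` is an assumed heuristic, not how Solo5 or the SDK actually probe for KVM.

```python
import os


def choose_isolation(kvm_path="/dev/kvm"):
    """Pick an isolation backend for a service such as dhcp-engine."""
    # Assumption: a readable and writable KVM device means we can run a unikernel.
    if os.access(kvm_path, os.R_OK | os.W_OK):
        return "unikernel"
    # Otherwise fall back to eBPF/seccomp-style sandboxing.
    return "sandbox"


print(choose_isolation())
```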


  17. Where it's going
    • Initially it is very LinuxKit specific since we depend on a
    specific containerd featureset, but everything is intended to
    be portable (including to FreeBSD jails, OpenBSD pledge, ...)
    • The Moby CLI should be able to package services up as debs or
    rpms though, so they can be deployed more widely.
    • We want to take a structured approach to classifying CVEs for
    common system services to determine what to fuzz on.
    Memory safety, logic bugs, container breakouts, ...
    • Support more languages, build an ecosystem for practical
    correct-by-construction services.
    • https://github.com/linuxkit/linuxkit (see projects/miragesdk)
