A modern services SDK for LinuxKit

A very early talk about the SDK work for building modern privilege-separated applications on LinuxKit. For the Moby Security SIG #2: https://forums.mobyproject.org/t/2017-06-07-linuxkit-security-sig-meeting/58

Anil Madhavapeddy

June 07, 2017

Transcript

  1. A modern services SDK for LinuxKit

    Type safety, container-native daemons, minimum privilege, easy development, unikernel protection, ...

    hacked to you by Thomas Gazagnaire, Thomas Leonard, Martin Lucina, Anil Madhavapeddy, Mindy Preston

    7th June 2017 - Moby Security SIG #2
  2. A modern services SDK for LinuxKit

    Disclaimer: this is active work in progress, and we're showing it early to the Moby Security SIG community to get feedback on the work. Interruptions and feedback are welcome. And patches :)
  3. Motivation

    • Base daemons in LinuxKit are typically wrapped versions of existing system software (e.g. dhcpcd, ntpd).
    • Often written in C, different configuration mechanisms, no structured logging, require lots of privilege for system operations.
    • Want to make these less monolithic and more container-native, and fit with LinuxKit philosophy of a lean, secure container runtime.
    • This project provides us with a vehicle to deploy more advanced security protections in LinuxKit in a practical way.
  4. Approach

    • LinuxKit has a single build-time yaml file and everything except init runs in a container namespace.
    • We build privilege separated applications that use this architecture to avoid common security vulnerabilities by:
      1. Specifying the process layout for an application in yaml
      2. Enforcing isolated, minimal privileges per process
      3. Separating every process in a container namespace
      4. Coordinating the containers with standard RPC tooling

    Developer experience matters: containerisation complexity is hidden inside the SDK tooling and not the application.
  5. Approach

    • First daemon being developed is a DHCP client.
    • This is a difficult daemon to privilege separate due to the deep (and non-portable) system hooks required for handling IP and routing tables (e.g. netlink).
    • Implementation flushes out a lot of architectural questions and makes subsequent protocol implementations such as HTTPS or NTP more straightforward.

    https://github.com/linuxkit/linuxkit/tree/master/projects/miragesdk
    https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=dhcp
  6. [Diagram: dhcp-network, dhcp-actuator, dhcp-engine, eth0, kernel/netlink]

    • Three processes, each with very minimal privileges.
    • dhcp-network can only access eth0 for networking.
    • dhcp-engine can see nothing except the other two processes.
    • dhcp-actuator can manipulate routing tables but cannot see the network.
    • Each process can be written in the safe language best suited to the task (OCaml, Rust in this case).
  7. [Diagram: dhcp-network, dhcp-actuator, dhcp-engine, eth0, kernel/netlink]

        - name: dhcp-network
          capabilities:
            - CAP_NET_ADMIN # bring eth0 up
            - CAP_NET_RAW # read /dev/eth0
  8. [Diagram: dhcp-network, dhcp-actuator, dhcp-engine, eth0, kernel/netlink]

        - name: dhcp-network
          capabilities:
            - CAP_NET_ADMIN # bring eth0 up
            - CAP_NET_RAW # read /dev/eth0
        - name: dhcp-actuator
          image: <image>
          capabilities:
            - CAP_NET_ADMIN # for netlink
          binds:
            - /state # to write resolv.conf
  9. [Diagram: dhcp-network, dhcp-actuator, dhcp-engine, eth0, kernel/netlink]

        - name: dhcp-network
          capabilities:
            - CAP_NET_ADMIN # bring eth0 up
            - CAP_NET_RAW # read /dev/eth0
        - name: dhcp-actuator
          image: <image>
          capabilities:
            - CAP_NET_ADMIN # for netlink
          binds:
            - /state # to write resolv.conf
        - name: dhcp-engine
          image: <image>
          rpc:
            - dhcp-network
            - dhcp-actuator
  10. [Diagram: dhcp-network, dhcp-actuator, dhcp-engine, eth0, kernel/netlink]

        - name: dhcp-network
          capabilities:
            - CAP_NET_ADMIN # bring eth0 up
            - CAP_NET_RAW # read /dev/eth0
        - name: dhcp-actuator
          image: <image>
          capabilities:
            - CAP_NET_ADMIN # for netlink
          binds:
            - /state # to write resolv.conf
        - name: dhcp-engine
          image: <image>
          rpc:
            - dhcp-network
            - dhcp-actuator
        - name: dhcp-init
          image: <image>
          files:
            - path: /var/run/dhcp-client/README
              contents: 'data for dhcp-client'
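
The process layout on slides 7 to 10 lends itself to mechanical checks before anything is started. The following is a minimal sketch, not the SDK's actual tooling: it assumes the yaml has already been loaded into plain Python dictionaries (the `LAYOUT` literal below is a hand-copied subset that leaves out dhcp-init), and the helper names `unprivileged` and `dangling_rpc_peers` are hypothetical.

```python
# Hypothetical in-memory copy of the layout shown on the slides
# (dhcp-init is left out to keep the example short).
LAYOUT = [
    {"name": "dhcp-network", "capabilities": ["CAP_NET_ADMIN", "CAP_NET_RAW"]},
    {"name": "dhcp-actuator", "capabilities": ["CAP_NET_ADMIN"], "binds": ["/state"]},
    {"name": "dhcp-engine", "rpc": ["dhcp-network", "dhcp-actuator"]},
]


def unprivileged(layout):
    """Names of processes that hold no capabilities and no bind mounts."""
    return [p["name"] for p in layout
            if not p.get("capabilities") and not p.get("binds")]


def dangling_rpc_peers(layout):
    """RPC peers that are referenced but never declared in the layout."""
    names = {p["name"] for p in layout}
    return sorted({peer
                   for p in layout
                   for peer in p.get("rpc", [])
                   if peer not in names})


print(unprivileged(LAYOUT))        # ['dhcp-engine']
print(dangling_rpc_peers(LAYOUT))  # []
```

The first check confirms that the protocol engine, the component that parses untrusted network input, is the one process with no capabilities or binds; the second catches a misspelt peer name in an `rpc:` list.
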
  11. [Diagram: dhcp-network, dhcp-actuator, dhcp-engine, eth0, kernel/netlink]

        @0xb224be3ea8450819;

        struct DhcpNetworkRequest {
          id @0 :Int32;
          path @1 :List(Text);
          union {
            write @2 :Data;
            read @3 :Void;
            delete @4 :Void;
          }
        }

        struct DhcpNetworkResponse {
          id @0 :Int32;
          union {
            ok @1 :Data;
            error @2 :Data;
          }
        }

        struct DhcpActuatorRequest {
          id @0 :Int32;
          interface @1 :Text;
          ipv4Addr @2 :List(Text);
          resolvConf @3 :List(Text);
        }

        struct DhcpActuatorResponse {
          id @0 :Int32;
          union {
            ok @1 :Data;
            error @2 :Data;
          }
        }

    • RPC via Capnp transport layer.
    • Capnp provides the RPC and makes it easy to generate bindings for many languages. https://github.com/capnproto
    • LinuxKit SDK takes care of starting the containers with an initial config and connecting the file descriptors.
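
To make the "connected file descriptors" idea concrete, here is a minimal sketch of a request/response exchange between an engine and an actuator over a pre-connected socket pair. It is an illustration only: length-prefixed JSON stands in for the Cap'n Proto encoding, `socket.socketpair()` stands in for whatever the SDK does to wire the containers together, and the function names and example addresses are hypothetical. The message fields mirror `DhcpActuatorRequest` and `DhcpActuatorResponse` from the schema.

```python
import json
import socket
import struct


def send_msg(sock, obj):
    """Send one length-prefixed JSON message (stand-in for Cap'n Proto framing)."""
    payload = json.dumps(obj).encode()
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the channel")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length))


def actuator_handle(request):
    """Toy actuator: answer a request shaped like DhcpActuatorRequest."""
    if not request.get("interface"):
        return {"id": request["id"], "error": "missing interface"}
    return {"id": request["id"], "ok": "configured " + request["interface"]}


# Each process would be handed an already-connected descriptor;
# socketpair() plays that role in this single-process demo.
engine_end, actuator_end = socket.socketpair()
send_msg(engine_end, {"id": 1, "interface": "eth0",
                      "ipv4Addr": ["192.0.2.10"],
                      "resolvConf": ["nameserver 192.0.2.1"]})
send_msg(actuator_end, actuator_handle(recv_msg(actuator_end)))
reply = recv_msg(engine_end)
print(reply)
engine_end.close()
actuator_end.close()
```

Because each side only ever sees its own descriptor, neither process needs to open sockets or name its peer, which is what lets the engine run with no privileges of its own.
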
  12. Demo: Capnp RPC

    • Capnp has an interface file and stub code generator for many languages.
    • Very simple binary format to parse (e.g. no HTTP2 dependency) so is a viable small attack surface to depend on for privileged components.
    • The CLI checks your interface specs and makes it relatively easy to glue pieces together.
    • Here is an example of an HTTPS server built like this: https://github.com/talex5/linuxkit/tree/https-unikernel/projects/https-unikernel (see https://github.com/linuxkit/linuxkit/pull/1981)
  13. Going deeper for security

    • Need protections at all levels of the stack for defence in depth:
      • application level: static type safety when parsing network traffic (via OCaml, Rust logic) and secure RPC (via Capnp)
      • protocol state machine: fuzz testing for rapid state space exploration (via American Fuzzy Lop aka AFL)
      • runtime process: container namespacing and KVM hardware protection if available (via unikernel Solo5)
      • kernel interface: eBPF sandboxing for fine-grained access to syscalls
      • implementation diversity: the container/rpc approach lets many languages/runtimes work together without tight coupling
    • What else? LinuxKit lets us patch the kernel and use the new facility directly in the base daemons, just like a BSD distro. SGX, TrustZone, etc...
  14. Demo: ukvm service

    • For service isolation, we can further protect processes against exploit by using /dev/kvm.
    • This is a unikernel (standalone specialised VM) running as a normal Linux process in a container.
    • External channel setup will be handled by the RPC layer, but for now is just a tap device.
    • Demo: here is a DNS service running as a KVM process on Linux and serving network traffic.
  15. Demo: fuzz testing

    • Fuzz testing: throw a lot of random input at a program, see where it breaks, fix it, repeat.
    • AFL is helpful as it can figure out an effective fuzz path quickly, and minimise test cases. Comes with a CLI afl-fuzz: http://lcamtuf.coredump.cx
    • Writing adapters for AFL to the LinuxKit SDK (which uses file descriptors) to make fuzzing easier to start.
    • Demo: afl-fuzz working on the DHCP state machine.

    Details at <https://somerandomidiot.com/blog/2017/04/26/crowbar-dhcp/>
    Asciicast: https://asciinema.org/a/3ljccmn19m25uj02kve678xp6
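
The "throw random input, see where it breaks" loop can be sketched in a few lines. This is plain random fuzzing, not AFL's coverage-guided search, and it targets a toy parser written for this example rather than the real DHCP state machine. The parser follows the DHCP option layout (code byte, length byte, value; code 0 is padding and 255 ends the list); the names `parse_options` and `fuzz` are hypothetical.

```python
import random


def parse_options(data):
    """Parse DHCP-style options: code, length, value; 0 pads, 255 ends."""
    options, i = {}, 0
    while i < len(data):
        code = data[i]
        i += 1
        if code == 0:          # padding byte, no length or value
            continue
        if code == 255:        # end marker
            return options
        if i >= len(data):
            raise ValueError("truncated option header")
        length = data[i]
        i += 1
        if i + length > len(data):
            raise ValueError("truncated option value")
        options[code] = data[i:i + length]
        i += length
    raise ValueError("missing end option")


def fuzz(parser, rounds=2000, seed=0):
    """Feed random byte strings to parser; return the inputs that crash it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(rounds):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
        try:
            parser(data)
        except ValueError:
            pass                   # clean rejection of malformed input
        except Exception:          # anything else is a bug worth keeping
            crashes.append(data)
    return crashes


print(len(fuzz(parse_options)))
```

A parser that rejects bad input with a well-defined error survives the loop with no recorded crashes; one that indexes past the end of the buffer would show up in the returned list, ready to be turned into a regression test.
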
  16. Putting it all together

    • WIP: wanted to explain the architecture early to the Security SIG community. Another update at the Moby Summit in a few weeks to show the frontend tooling.
    • DHCP, DNS, HTTPS are our first targets to have safe system services by default. Anything else to focus on?
    • Config interface is as similar to existing daemons as possible so they can be swapped easily.
    • @mato is working on integrating Solo5 so that isolated services (e.g. dhcp-engine) can be unikernel-protected if hardware virt is available, and fall back to eBPF/seccomp sandboxing if not.
    • @talex5 is working on the RPC substrate.
    • @samoht @avsm are building the system daemons and CLI frontend.
    • @yomimono is hacking on fuzz testing all the things with AFL.
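
The unikernel-or-sandbox fallback for isolated services amounts to a start-up decision. Here is a minimal sketch of that decision, assuming that a readable and writable `/dev/kvm` is a good enough signal that hardware virtualisation is usable; the function name and the two return labels are hypothetical and not part of the SDK.

```python
import os


def choose_isolation(kvm_path="/dev/kvm"):
    """Return 'unikernel' if the KVM device is usable, else 'sandbox'."""
    if os.access(kvm_path, os.R_OK | os.W_OK):
        return "unikernel"   # run the service under Solo5 with KVM protection
    return "sandbox"         # fall back to eBPF/seccomp style sandboxing


print(choose_isolation())
```

Keeping the choice in one place means the service code itself (e.g. dhcp-engine) does not need to know which protection it ended up with.
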
  17. Where it's going

    • Initially it is very LinuxKit specific since we depend on a specific containerd featureset, but everything is intended to be portable (including to FreeBSD jails, OpenBSD pledge, ...)
    • The Moby CLI should be able to package up as debs or rpms though, so it can be deployable more widely.
    • We want to take a structured approach to classifying CVEs for common system services to determine what to fuzz on. Memory safety, logic bugs, container breakouts, ...
    • Support more languages, build an ecosystem for practical correct-by-construction services.
    • https://github.com/linuxkit/linuxkit projects/miragesdk