Docker Security: A Deep Dive

Presentation on Docker security, held as part of the Seneca workshop: senecaproject.github.io/workshop-cloud/

Antonis Kalipetis

June 23, 2016

Transcript

  1. Antonis Kalipetis
    • CTO @ SourceLair
    • Docker Captain and big fan
    • Python enthusiast
    • Coffee lover
    • @akalipetis
  2. Agenda
    • Docker internals
      ◦ What is a container?
      ◦ Possible Docker attack vectors
      ◦ Controlling resources
      ◦ Isolating processes
    • Custom security policies
      ◦ Docker metrics
      ◦ Authentication/Authorization plugins
      ◦ Examples
      ◦ Other tools
  3. Containers use a collection of kernel tools and features to jail and limit a process according to our needs and wants.
  4. cgroups (abbreviated from control groups) is a Linux kernel feature that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes.
  5. Useful cgroups
    • memory
    • cpu/cpuset
    • devices
    • blkio
    • network*
    *network is not a real cgroup, though it can still be used for metering
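
To check which of these cgroups a given process belongs to, the kernel exposes `/proc/<pid>/cgroup`, where each line has the form `hierarchy-ID:controller-list:cgroup-path` (cgroup v1 layout). The following is a minimal sketch that parses that text; the sample content and the `/docker/abc123` path are made up for illustration.

```python
def parse_proc_cgroup(text):
    """Map each cgroup controller name to the cgroup path of the process.

    `text` is the content of /proc/<pid>/cgroup in the cgroup v1 layout.
    """
    membership = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split into at most 3 fields, since the path itself may contain ':'.
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            if controller:
                membership[controller] = path
    return membership


# Hypothetical sample, shaped like what a containerized process might show.
SAMPLE = """\
4:memory:/docker/abc123
3:cpu,cpuacct:/docker/abc123
2:devices:/docker/abc123
1:blkio:/
"""
```

On a real host the input would come from something like `open("/proc/self/cgroup").read()`; a process whose paths are all `/` is not confined to a dedicated cgroup for that controller.
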
  6. memory cgroup
    • Tracks memory pages used by processes
    • Soft limit
      ◦ Reclaim under high memory usage
    • Hard limit
      ◦ OOM killed
  7. cpu/cpuset cgroups
    • cpuset limits processes to specific cores
    • cpu tracks CPU time used by processes
      ◦ Imposes weights, not limits
      ◦ A process can consume all the available CPU, if no other process uses it
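
The "weights, not limits" behavior can be illustrated with a little arithmetic. This is a simplified model, assuming a single saturated CPU and ignoring scheduler details: each busy container gets CPU time in proportion to its weight among the containers that are currently competing. The container names are hypothetical; 1024 is the default weight Docker assigns via `--cpu-shares`.

```python
def cpu_fractions(shares, busy):
    """Return the fraction of CPU each container gets under contention.

    shares: dict of container name -> CPU weight (e.g. from --cpu-shares)
    busy:   set of container names that currently want CPU time
    Idle containers get 0.0; their weight is not reserved for them.
    """
    total = sum(shares[name] for name in busy)
    fractions = {}
    for name in shares:
        if name in busy and total > 0:
            fractions[name] = shares[name] / total
        else:
            fractions[name] = 0.0
    return fractions


# Hypothetical containers: "web" has the default weight, "batch" half of it.
SHARES = {"web": 1024, "batch": 512}
```

With both containers busy, `web` gets two thirds of the CPU and `batch` one third; if only `batch` is busy it can take all of it, which is why weights alone do not cap a noisy container.
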
  8. devices cgroup
    • Allows read/write/mknod to certain devices
    • Typically defaults to only allow tty, zero, random, null
    • Can give access to other devices, if this is required by the application
  9. blkio and network cgroups
    • They meter network and I/O usage
    • Can be used for throttling usage, or identifying malicious containers
  10. Other cgroups
    • cpuacct - reports CPU usage
    • freezer - suspends or resumes tasks
      ◦ Process or container migration to another node, using memory dumps
    • perf_event - allows monitoring using perf
    • hugetlb - controls the amount of large pages
  11. A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource.
  12. NET Namespace
    • Each container gets its own network stack
      ◦ Docker creates veth pairs, with one end acting as eth0 inside the container
      ◦ Each container gets its own IP address
      ◦ All veths are bridged to docker0
    • Docker handles routing
      ◦ Container-to-container links are possible through Docker
  13. MNT Namespace
    • Each container gets its own root filesystem
    • Host directories are bound privately to the container
    • Together with CoW filesystems, this allows for ultra-fast boots
      ◦ AUFS, overlay
      ◦ Device mapper
      ◦ BTRFS, ZFS
  14. USER namespace
    • Allows remapping of users and groups from host to container
      ◦ Container’s “root” is not actually root
      ◦ 0-10000 in a container can be 10000-20000 in host
    • Landed in Docker 1.10
      ◦ https://integratedcode.us/2016/02/05/docker-1-10-security-userns/
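
The remapping above is plain offset arithmetic. The kernel describes it in `/proc/<pid>/uid_map`, where each line holds three numbers: the first ID inside the namespace, the first ID outside it, and the length of the range. A minimal sketch of the translation, using the 0-10000 to 10000-20000 example from the slide (the function names are illustrative, not part of any Docker API):

```python
def parse_uid_map(text):
    """Parse uid_map content into (inside_start, outside_start, length) tuples."""
    entries = []
    for line in text.splitlines():
        if line.strip():
            inside_start, outside_start, length = (int(f) for f in line.split())
            entries.append((inside_start, outside_start, length))
    return entries


def container_to_host_uid(uid, uid_map):
    """Translate a UID seen inside the container to the UID the host sees.

    Returns None when the UID falls outside every mapped range.
    """
    for inside_start, outside_start, length in uid_map:
        if inside_start <= uid < inside_start + length:
            return outside_start + (uid - inside_start)
    return None


# Container UIDs 0..9999 map to host UIDs 10000..19999.
EXAMPLE_MAP = parse_uid_map("0 10000 10000\n")
```

Here the container’s root (UID 0) is host UID 10000, an unprivileged user, so a process that escapes the container does not land on the host as root.
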
  15. PID namespace
    • Every container has its own “PID 1”
      ◦ If PID 1 dies, all other processes get killed
    • Container PID 1 is mapped to another PID in the host
      ◦ Host can see all processes running inside containers
    • PID namespaces can be nested
      ◦ There’s a PID-ception
    • Shared namespaces supported in Docker 1.12
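
The PID mapping can be observed from the host: on kernels that provide it (4.1 and later), `/proc/<pid>/status` contains an `NSpid` line listing the process ID in each nested PID namespace, from the outermost to the innermost. A small sketch that extracts it; the sample status text and the PID 4242 are invented for illustration.

```python
def parse_nspid(status_text):
    """Return the PIDs of a process in each nested PID namespace.

    The first entry is the PID in the namespace of the reader (the host),
    the last one the PID inside the innermost namespace (the container).
    Returns an empty list if the kernel does not report NSpid.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []


# Hypothetical excerpt of /proc/4242/status for a container's init process.
SAMPLE_STATUS = "Name:\tnginx\nPid:\t4242\nNSpid:\t4242\t1\n"
```

For this sample the host sees PID 4242 while the container sees the same process as PID 1; with nested namespaces the list simply grows by one entry per level.
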
  16. Other Namespaces
    • uts namespace - allows for custom domain/hostname
      ◦ sethostname/gethostname
    • ipc namespace - allows interprocess communication
      ◦ semaphores, message queues, shared memory
  17. Intrinsic kernel security
    • Namespaces and cgroups support
    • Zero-day vulnerabilities
    • Vulnerabilities of cgroups / namespaces
    Solutions
    • Make use of recent kernels
    • Be informed
    • Take additional measures
  18. The Docker daemon
    • The daemon runs as root in your host
      ◦ Can do pretty much anything if compromised
    Solutions
    • Restrict access to the daemon to those who really need it (users, processes, etc.)
    • Don’t expose the daemon to the outside world
      ◦ If you do, make sure you put it behind a secure proxy, like NGINX
    • Don’t make it easy to SSH in as users that have access to the daemon
  19. Loopholes in container config
    • Containers might have elevated privileges, allowing container escapes
    • Containers might have access to system resources they shouldn’t
      ◦ e.g. broad volume mounts
    Solutions
    • Mount only the volumes you need
      ◦ Try to mount them read-only if the container should not write
    • Don’t use the --privileged flag
      ◦ Use the --cap-add flag, only for the capabilities that you really need
    • If you can, don’t run containers as root
      ◦ Or use user remapping
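
The solutions above can be combined into a single `docker run` invocation. The helper below is a sketch that only builds the argument list, so it can be inspected before anything is executed. Dropping all capabilities first with `--cap-drop ALL` is an addition of this sketch, not something the slide prescribes, and the image name, paths, and the `1000:1000` user are hypothetical.

```python
def hardened_run_args(image, command=(), caps=(), ro_volumes=(), tmpfs=(), user=None):
    """Build a `docker run` argument list that avoids the loopholes above.

    caps:       capabilities to add back after dropping all of them
    ro_volumes: (host_path, container_path) pairs, mounted read-only
    tmpfs:      container paths that must stay writable
    user:       optional "uid:gid" so the process does not run as root
    """
    # Read-only root filesystem, no capabilities by default, never --privileged.
    args = ["docker", "run", "--rm", "--read-only", "--cap-drop", "ALL"]
    if user:
        args += ["--user", user]
    for cap in caps:
        args += ["--cap-add", cap]
    for host_path, container_path in ro_volumes:
        args += ["-v", f"{host_path}:{container_path}:ro"]
    for path in tmpfs:
        args += ["--tmpfs", path]
    return args + [image, *command]


# Hypothetical service: non-root user, one read-only mount, writable /tmp.
EXAMPLE_ARGS = hardened_run_args(
    "example/app:latest",
    ro_volumes=[("/srv/config", "/etc/app")],
    tmpfs=["/tmp"],
    user="1000:1000",
)
```

The resulting list can be handed to `subprocess.run(EXAMPLE_ARGS, check=True)` on a host with Docker installed; keeping it as a list avoids shell quoting problems.
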
  20. Using Docker metrics
    • CPU
    • Memory
    • Network
    • Disk I/O
    • Process names and trees
      ◦ Container processes are visible from the host, or from containers that share the host PID namespace
  21. Using and combining Docker metrics allows you to create profiles for containers and spot malicious ones. Also, having information that spans multiple containers of the same origin can enhance your tracking mechanisms.
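
One very simple way to turn metrics into a profile is to keep a history of a metric per container and flag samples that stray far from it. The sketch below uses a mean and standard deviation threshold; that choice, the factor `k=3.0`, and the sample numbers are assumptions made for illustration, not a method taken from the talk.

```python
from statistics import mean, pstdev


def is_anomalous(history, sample, k=3.0):
    """Flag a metric sample that deviates from a container's usual profile.

    history: past values of one metric (e.g. CPU percent) for a container
    sample:  the newly observed value
    k:       how many standard deviations count as unusual
    """
    if len(history) < 2:
        return False  # not enough data to build a profile yet
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return sample != mu
    return abs(sample - mu) > k * sigma


# Hypothetical CPU percentages collected for one container.
CPU_HISTORY = [10, 11, 9, 10, 10]
```

A container that normally sits around 10% CPU and suddenly reports 50% would be flagged, while 10.5% would not. Pooling histories across containers started from the same image gives a broader baseline, as the slide suggests.
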
  22. Authentication and authorization plugins
    • Authenticate requests to the Docker daemon
      ◦ Reject unauthenticated requests
      ◦ Identify the user that is making the request
    • Authorize requests for users
      ◦ Check if the user is allowed to make the given request
      ◦ Reject requests that don’t comply with the user’s allowance
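
The decision logic of such a plugin can be sketched as a pure function from (user, HTTP method, request URI) to an allow/deny verdict. Everything below is illustrative: the users and the policy table are hypothetical, and the `Allow` and `Msg` keys are modeled on the response shape of Docker's authorization plugins, which should be verified against the plugin documentation before relying on it. The HTTP plumbing that a real plugin needs is left out.

```python
import re

# Hypothetical policy: user -> list of (HTTP method or "*", URI regex).
POLICY = {
    "admin": [("*", r".*")],
    "auditor": [("GET", r"^(/v[0-9.]+)?/(containers|images)/json")],
}


def authorize(user, method, uri):
    """Decide whether `user` may send `method uri` to the Docker daemon."""
    if not user:
        # Unauthenticated requests are rejected outright.
        return {"Allow": False, "Msg": "unauthenticated request"}
    for allowed_method, pattern in POLICY.get(user, []):
        if allowed_method in ("*", method) and re.match(pattern, uri):
            return {"Allow": True, "Msg": ""}
    return {"Allow": False, "Msg": f"{user} may not {method} {uri}"}
```

With this table, `auditor` can list containers and images but cannot create or remove anything, and any user missing from the table is denied by default.
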
  23. Image scanning
    • You can scan your images for known vulnerabilities
    • There are tools for that, like Docker Nautilus and CoreOS Clair
    • Find known vulnerable binaries
  24. Use case: Impose network limits
    • Externally imposed limits
      ◦ Listen for new containers being spawned
      ◦ Switch to the container network namespace (using setns)
      ◦ Use tc and iptables to impose limits
    • Distributed network initialization
      ◦ Create a network initialization container
      ◦ Initialize the network stack using imposed limits
      ◦ Make other containers use the same network stack
      ◦ CAP_NET_ADMIN to the rescue, since such action is not allowed by default
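
The externally imposed variant can be sketched from the host side. The helper below builds a command that uses `nsenter` (which calls setns under the hood) to enter the container's network namespace and attach a token bucket filter with `tc`. It only returns the argument list; the rate, burst, and latency values and the PID 4242 are placeholder assumptions, and running the command requires root on the host.

```python
def rate_limit_command(container_pid, rate="1mbit", burst="32kbit",
                       latency="400ms", dev="eth0"):
    """Build a command that rate-limits egress traffic of one container.

    container_pid: host PID of a process inside the container, e.g. from
                   `docker inspect --format '{{.State.Pid}}' <container>`
    """
    # Enter only the network namespace of the target process.
    enter_netns = ["nsenter", "--target", str(container_pid), "--net"]
    # Token bucket filter on the container's interface.
    shape = ["tc", "qdisc", "add", "dev", dev, "root", "tbf",
             "rate", rate, "burst", burst, "latency", latency]
    return enter_netns + shape


# Hypothetical container whose init process has host PID 4242.
EXAMPLE_COMMAND = rate_limit_command(4242)
```

A watcher listening for container start events could run this once per new container, for example with `subprocess.run(EXAMPLE_COMMAND, check=True)`.
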
  25. Use case: Impose storage limits (RootFS)
    • Create a watcher and watch the size of each container
    • Use device mapper and ensure max size of a container
    • Make the root filesystem read-only
      ◦ In combination with the --tmpfs flag, available in Docker 1.10, for paths that the container should write to
  26. Use case: Impose storage limits (Volumes)
    • Create a watcher and watch the size of each directory
    • Apply filesystem quotas to the mounted directories
    • Use loopback devices from sparse files
    • Use a logical volume manager (LVM)
      ◦ ZFS, etc
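
The watcher approach is the simplest of these to sketch: periodically measure each volume directory and report the ones above a limit. This is a minimal illustration, assuming apparent file sizes are good enough (sparse files and filesystem overhead are ignored); what to do with an offender, such as stopping its container, is left to the caller.

```python
import os


def directory_size(path):
    """Sum the apparent sizes, in bytes, of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue  # do not follow or count symlinks
            try:
                total += os.path.getsize(file_path)
            except OSError:
                pass  # file vanished between listing and stat
    return total


def over_limit(sizes, limit_bytes):
    """Return the paths whose measured size exceeds the limit, sorted."""
    return sorted(path for path, size in sizes.items() if size > limit_bytes)
```

A watcher loop would call `directory_size` for every mounted volume on a timer and feed the results to `over_limit`. Unlike quotas or fixed-size loopback devices, this only detects a violation after the fact.
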
  27. Other tools
    • AppArmor
      ◦ AppArmor is a Mandatory Access Control (MAC) system which is a kernel (LSM) enhancement to confine programs to a limited set of resources. AppArmor's security model is to bind access control attributes to programs rather than to users.
      ◦ Bane to the rescue: https://github.com/jfrazelle/bane
    • SELinux
      ◦ Security-Enhanced Linux (SELinux) is a Linux kernel security module that provides a mechanism for supporting access control security policies, including United States Department of Defense-style mandatory access controls (MAC).
  28. Some resources
    • Jérôme Petazzoni - http://www.slideshare.net/jpetazzo/cgroups-namespaces-and-beyond-what-are-containers-made-from-dockercon-europe-2015
    • Dan Walsh - http://www.projectatomic.io/blog/2015/12/making-docker-images-write-only-in-production/
    • Linux Advanced Routing & Traffic Control - http://lartc.org/howto/lartc.ratelimit.single.html
    • Michael Crosby - http://crosbymichael.com/creating-containers-part-1.html
    • Jessie Frazelle - https://blog.jessfraz.com/post/getting-towards-real-sandbox-containers/