
Experimental Docker Checkpoint and Restore with CRIU

Saied Kazemi
September 18, 2014

Slide deck presented at Docker Meetup Mountain View on September 17, 2014.

Transcript

  1. Experimental Docker Checkpoint and Restore with CRIU

    Docker Meetup, September 17, 2014
    Saied Kazemi (saied@)
  2. Ultimate Goal

    • Native container Checkpoint and Restore (C/R) support in Docker to facilitate container migration
        host_a$ docker checkpoint <container_id>
        host_b$ docker restore <container_id>
    • Actual C/R to be done with the Checkpoint Restore In Userspace (CRIU) utility
  3. What is CRIU?

    • An open source software tool for Linux (http://criu.org)
      ◦ Freeze a running application
      ◦ Checkpoint to a collection of “image” files
      ◦ Restore later from image files
      ◦ Application resumes execution from the point it was frozen
    • Implemented mainly in userspace in C
      ◦ Presented by Pavel Emelianov, OpenVZ team leader, in July 2011
      ◦ Version 1.0 released in November 2013
      ◦ Version 1.3 with Docker container support released in August 2014
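The freeze, checkpoint, and restore steps listed above correspond to the `criu dump` and `criu restore` subcommands. The following is a minimal sketch that only builds the two command lines (it does not run CRIU, which needs root). The PID and image directory in the usage note are hypothetical values chosen for illustration; `--tree` and `--images-dir` are the long forms of CRIU's `-t` and `-D` options.

```python
from typing import List


def criu_dump_cmd(pid: int, images_dir: str) -> List[str]:
    """Command line that freezes the process tree rooted at `pid`
    and writes its image files into `images_dir`."""
    return ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir]


def criu_restore_cmd(images_dir: str) -> List[str]:
    """Command line that recreates the process tree from the image
    files previously written to `images_dir`."""
    return ["criu", "restore", "--images-dir", images_dir]


if __name__ == "__main__":
    # Hypothetical PID and directory, for illustration only.
    print(" ".join(criu_dump_cmd(1234, "/tmp/ckpt")))
    print(" ".join(criu_restore_cmd("/tmp/ckpt")))
```

The lists can be passed to `subprocess.run` as-is; a real Docker container needs the additional options discussed in the backup slides.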
  4. CRIU Usage Scenarios

    • A set of ideas listed at http://criu.org/Usage_scenarios
      ◦ Container live migration
      ◦ Slow-boot services speed up
      ◦ Reboot-less kernel upgrade
      ◦ Networking load balancing
      ◦ HPC issues
      ◦ Desktop environment suspend/resume
      ◦ Freeze for inspection and/or debugging
      ◦ ...
    • Container live migration was the main use case for CRIU
  5. CRIU and Docker

    • There were a number of issues C/R’ing Docker containers
      ◦ see backup slides for details
    • Excellent support from upstream CRIU developers and community
    • With CRIU 1.3, now possible to C/R
      ◦ Works with AUFS (default) as well as VFS and UnionFS
      ◦ Device Mapper not tested
    • No native support in Docker yet
    • No container migration yet
  6. Docker and its Containers

    [Diagram labels: docker run ...; docker -d; init; grandchild; Container 1; Container 2; Global namespace; external bind mount]
  7. Live Demo

    • Demo using docker_cr.sh helper script
    • Demo using nsinit binary
  8. Docker C/R Options

    • There are two options to checkpoint and restore:
        A) The Docker daemon and (all) its containers
        B) An individual container (without the Docker daemon)
    • Option A isn’t currently possible with CRIU due to nested namespaces
      ◦ Option B is possible on the same machine
      ◦ Will look into adding migration support
  9. Native C/R Support

    • Why do we need native C/R support in Docker?
      ◦ Container state
        ▪ After checkpoint, Docker thinks the container has finished and exited
        ▪ After restore, Docker doesn’t know container has resumed
      ◦ Process tree ownership
        ▪ Restored process tree is a child of init, not Docker daemon
      ◦ Other uncovered issue...
  10. Next Steps

    • Add C/R support to libcontainer
      ◦ use nsinit for validation
    • Add C/R support to Docker
    • Add migration support
  11. Backup Slides

  12. [Diagram labels: container; docker; exec driver; nsinit; libcontainer; criu]

  13. Issues and Solutions

    • Issue: nested PID namespaces
      ◦ two ways to start a container: interactive ($ docker run -i ...) or detached ($ docker run -d ...)
      ◦ in both cases the process is a child of the docker daemon (not the docker client) running in global PID namespace
      ◦ CRIU does not support nested PID namespaces
    • Solution: C/R is done on process tree without Docker
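The nesting described above can be observed from the host: on Linux, `/proc/<pid>/ns/pid` is a symbolic link whose target identifies the process's PID namespace. The sketch below compares two such links. The function names and the `proc_root` parameter (useful for testing against a fake tree) are illustrative choices, not part of Docker or CRIU, and reading another user's namespace links may require root.

```python
import os


def pid_namespace(pid, proc_root="/proc"):
    """Return the PID-namespace identifier of `pid`, i.e. the target of
    the /proc/<pid>/ns/pid symbolic link (a string such as 'pid:[...]')."""
    return os.readlink(os.path.join(proc_root, str(pid), "ns", "pid"))


def in_same_pid_namespace(pid_a, pid_b, proc_root="/proc"):
    """True if both processes live in the same PID namespace."""
    return pid_namespace(pid_a, proc_root) == pid_namespace(pid_b, proc_root)
```

Comparing the Docker daemon's PID with the host PID of a container's init process would be expected to return `False`, since the container's process tree runs in its own PID namespace below the daemon's.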
  14. Issues and Solutions

    • Issue: external bind mounts
      ◦ /etc/{hosts,hostname} from container’s config dir
      ◦ /etc/resolv.conf from container’s config dir (or /etc/resolv.conf in older versions)
      ◦ /.dockerinit from Docker’s init dir in older versions
      ◦ bind mount paths for files in /etc can be obtained with docker inspect, but not for /.dockerinit
    • Solution: external bind mount support with --ext-mount-map
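CRIU's `--ext-mount-map` option takes a `KEY:VAL` pair and is given once per external bind mount. On dump, KEY is the mount point inside the container and VAL is a label stored in the image; on restore, KEY is that label and VAL is the host path to bind-mount. A minimal sketch of building these flags follows; the helper name is hypothetical, and the host paths in the usage note are placeholders that would come from `docker inspect` in practice.

```python
from typing import Dict, List


def ext_mount_flags(mounts: Dict[str, str]) -> List[str]:
    """Turn a {KEY: VAL} mapping into repeated --ext-mount-map flags."""
    flags: List[str] = []
    for key, val in mounts.items():
        flags += ["--ext-mount-map", f"{key}:{val}"]
    return flags


if __name__ == "__main__":
    # Dump side: use the in-container mount point as its own label.
    dump = ext_mount_flags({"/etc/hosts": "/etc/hosts"})
    # Restore side: map the label to a host path (placeholder path).
    restore = ext_mount_flags({"/etc/hosts": "/path/to/container/config/hosts"})
    print(dump, restore)
```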
  15. Issues and Solutions

    • Issue: /dev/null bind mount over /proc/kcore
      ◦ appeared in Docker 0.10.0, caused dump failure
    • Solution: patch 494c044
    • Issue: dumpable flag
      ◦ appeared in Docker 0.11.1 (libcontainer dropping all capabilities, keeping those specified in config)
      ◦ value is set to 2, which cannot be restored
    • Solution: patch 8870aa1
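The dumpable problem above can be reproduced in isolation. On Linux, `prctl(PR_SET_DUMPABLE, ...)` accepts only 0 or 1, so a process whose flag is 2 cannot get that value back through the same call. The sketch below is Linux-only, calls `prctl` through `ctypes`, and is an illustration of the kernel behaviour, not of how the CRIU patch works.

```python
import ctypes
import errno

# Option numbers from <linux/prctl.h>.
PR_GET_DUMPABLE = 3
PR_SET_DUMPABLE = 4

libc = ctypes.CDLL(None, use_errno=True)


def _prctl(option, arg2=0):
    # prctl() is variadic and expects unsigned long arguments.
    ul = ctypes.c_ulong
    return libc.prctl(option, ul(arg2), ul(0), ul(0), ul(0))


def get_dumpable():
    """Return the calling process's dumpable flag (0, 1, or 2)."""
    return _prctl(PR_GET_DUMPABLE)


def try_set_dumpable(value):
    """Try to set the dumpable flag; return 0 on success, else errno."""
    ctypes.set_errno(0)
    if _prctl(PR_SET_DUMPABLE, value) == 0:
        return 0
    return ctypes.get_errno()


if __name__ == "__main__":
    print("dumpable:", get_dumpable())
    print("set to 2 rejected:", try_set_dumpable(2) == errno.EINVAL)
```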
  16. Issues and Solutions

    • Issue: restoring cgroups subdirs and properties
      ◦ after checkpointing, Docker daemon would remove container’s cgroups subdirs (because the container has “exited”)
      ◦ after restoring subdirs, properties were not restored
    • Solution: cgroups restoration support with --manage-cgroups
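To see which cgroup subdirectories a container process belongs to (and which would therefore have to be recreated on restore), one can read `/proc/<pid>/cgroup`, whose lines have the form `hierarchy-ID:controller-list:path`. The parser below is a small sketch; the function name and the sample paths in the usage note are made up for illustration.

```python
from typing import Dict


def parse_proc_cgroup(text: str) -> Dict[str, str]:
    """Map each controller name to its cgroup path, given the contents
    of a /proc/<pid>/cgroup file."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two colons only; the path may contain ':'.
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            entries[controller] = path
    return entries


if __name__ == "__main__":
    # Illustrative sample, not captured from a real container.
    sample = "4:memory:/docker/abc123\n3:cpu,cpuacct:/docker/abc123\n"
    print(parse_proc_cgroup(sample))
```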
  17. Issues and Solutions

    • Issue: stdin in detached mode
      ◦ container’s stdin set to the global /dev/null in detached mode
          $ docker run -d …
    • Solution: fixed in Docker; use --evasive-devices for older Docker versions
  18. Issues and Solutions

    • Issue: AUFS
      ◦ /proc/<pid>/map_files symbolic link paths point inside AUFS branches
      ◦ CRIU gets confused seeing the same file in its physical location (in the branch) and its logical location (from the root of mount namespace)
      ◦ fixing the kernel is the right solution but time-consuming to roll out
    • Solution:
      ◦ fixed in AUFS (but will take time to be available in all distros)
      ◦ in the meantime, CRIU patch d8b41b6 will compensate for the problem
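The AUFS symptom can be inspected directly. Each entry in `/proc/<pid>/map_files` is a symbolic link named after an address range and pointing at the mapped file. The sketch below lists those links and picks out the ones that resolve under a given directory prefix. Function names and the `proc_root` parameter are illustrative; `/var/lib/docker/aufs` in the usage note is assumed to be Docker's default AUFS directory; reading `map_files` usually requires root. This only shows the mismatch and is not how the CRIU patch handles it.

```python
import os
from typing import Dict, List


def mapped_files(pid, proc_root: str = "/proc") -> Dict[str, str]:
    """Return {address-range: target path} for /proc/<pid>/map_files."""
    directory = os.path.join(proc_root, str(pid), "map_files")
    return {
        name: os.readlink(os.path.join(directory, name))
        for name in sorted(os.listdir(directory))
    }


def paths_under(links: Dict[str, str], prefix: str) -> List[str]:
    """Link targets that lie below `prefix` (e.g. an AUFS branch dir)."""
    prefix = prefix.rstrip("/") + "/"
    return sorted(p for p in links.values() if p.startswith(prefix))
```

For a container process on AUFS, `paths_under(mapped_files(pid), "/var/lib/docker/aufs")` would list the physical branch paths that differ from the logical paths seen from the container's root.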