Experimental Docker Checkpoint and Restore with CRIU

Saied Kazemi
September 18, 2014

Slide deck presented at Docker Meetup Mountain View on September 17, 2014.

Transcript

  1. Docker Meetup 9/17/14: Ultimate Goal
    • Native container Checkpoint and Restore (C/R) support in Docker to facilitate container migration
        host_a$ docker checkpoint <container_id>
        host_b$ docker restore <container_id>
    • Actual C/R to be done with the Checkpoint Restore In Userspace (CRIU) utility
  2. What is CRIU?
    • An open source software tool for Linux (http://criu.org)
      ◦ Freeze a running application
      ◦ Checkpoint to a collection of “image” files
      ◦ Restore later from image files
      ◦ Application resumes execution from the point it was frozen
    • Implemented mainly in userspace in C
      ◦ Presented by Pavel Emelianov, OpenVZ team leader, in July 2011
      ◦ Version 1.0 released in November 2013
      ◦ Version 1.3 with Docker container support released in August 2014
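
The freeze / checkpoint / restore cycle above can be sketched as a pair of command lines. The Python helpers below only build the argument lists and never run CRIU; the PID and image directory are made-up values, and a real Docker container needs more options than shown here (see the issues slides).

```python
import shlex


def criu_dump_argv(pid, images_dir):
    """Build (but do not run) a `criu dump` command for the process tree rooted at pid."""
    # --tree selects the root of the process tree to checkpoint,
    # --images-dir is where the image files are written.
    return ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir]


def criu_restore_argv(images_dir):
    """Build (but do not run) the matching `criu restore` command."""
    return ["criu", "restore", "--images-dir", images_dir]


if __name__ == "__main__":
    # Hypothetical PID and directory, for illustration only.
    print(shlex.join(criu_dump_argv(1234, "/tmp/ckpt")))
    print(shlex.join(criu_restore_argv("/tmp/ckpt")))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` later without shell quoting problems.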
  3. CRIU Usage Scenarios
    • A set of ideas listed at http://criu.org/Usage_scenarios
      ◦ Container live migration
      ◦ Slow-boot services speed up
      ◦ Reboot-less kernel upgrade
      ◦ Networking load balancing
      ◦ HPC issues
      ◦ Desktop environment suspend/resume
      ◦ Freeze for inspection and/or debugging
      ◦ ...
    • Container live migration was the main use case for CRIU
  4. CRIU and Docker
    • There were a number of issues C/R’ing Docker containers
      ◦ see backup slides for details
    • Excellent support from upstream CRIU developers and community
    • With CRIU 1.3, now possible to C/R
      ◦ Works with AUFS (default) as well as VFS and UnionFS
      ◦ Device Mapper not tested
    • No native support in Docker yet
    • No container migration yet
  5. Docker and its Containers
    [Diagram. Labels: docker run ..., docker -d, init, grandchild, Container 1, Container 2, Global namespace, external bind mount]
  6. Docker C/R Options
    • There are two options to checkpoint and restore:
        A) The Docker daemon and (all) its containers
        B) An individual container (without the Docker daemon)
    • Option A isn’t currently possible with CRIU due to nested namespaces
      ◦ Option B is possible on the same machine
      ◦ Will look into adding migration support
  7. Native C/R Support
    • Why do we need native C/R support in Docker?
      ◦ Container state
        ▪ After checkpoint, Docker thinks the container has finished and exited
        ▪ After restore, Docker doesn’t know container has resumed
      ◦ Process tree ownership
        ▪ Restored process tree is a child of init, not Docker daemon
      ◦ Other uncovered issue...
  8. Next Steps
    • Add C/R support to libcontainer
      ◦ use nsinit for validation
    • Add C/R support to Docker
    • Add migration support
  9. Issues and Solutions
    • Issue: nested PID namespaces
      ◦ two ways to start a container: interactive ($ docker run -i ...) or detached ($ docker run -d ...)
      ◦ in both cases the process is a child of the docker daemon (not the docker client) running in global PID namespace
      ◦ CRIU does not support nested PID namespaces
    • Solution: C/R is done on process tree without Docker
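
To checkpoint a container's process tree without the daemon, one first needs the host PID of the container's init process. A minimal sketch, assuming the JSON printed by `docker inspect <container_id>` is a list whose first element carries `State.Pid`; the sample document below is invented and heavily trimmed.

```python
import json


def container_init_pid(inspect_json):
    """Return the host PID of a container's init process.

    inspect_json is the text printed by `docker inspect <container_id>`,
    assumed to be a JSON list with one object per inspected container.
    """
    containers = json.loads(inspect_json)
    pid = int(containers[0]["State"]["Pid"])
    if pid <= 0:
        # Docker reports Pid 0 for a container that is not running.
        raise ValueError("container is not running")
    return pid


# Made-up, trimmed sample output for illustration only.
SAMPLE = '[{"State": {"Running": true, "Pid": 4321}}]'

if __name__ == "__main__":
    print(container_init_pid(SAMPLE))
```

The returned PID is the root of the tree that would be handed to `criu dump`, leaving the Docker daemon itself out of the checkpoint.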
  10. Issues and Solutions
    • Issue: external bind mounts
      ◦ /etc/{hosts,hostname} from container’s config dir
      ◦ /etc/resolv.conf from container’s config dir (or /etc/resolv.conf in older versions)
      ◦ /.dockerinit from Docker’s init dir in older versions
      ◦ bind mount paths for files in /etc can be obtained with docker inspect, but not for /.dockerinit
    • Solution: external bind mount support with --ext-mount-map
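
As a rough sketch of how `--ext-mount-map` is used: on dump each option maps a mount point inside the container to a key stored in the image, and on restore it maps that key to the host path to bind-mount. The helpers below only build the option lists. Using the mount point itself as the key is a choice made for this example, and the config directory path is a hypothetical placeholder.

```python
def dump_ext_mount_args(mountpoints):
    """Options for `criu dump`: container mount point -> key saved in the image."""
    args = []
    for mountpoint in mountpoints:
        # Assumption for this sketch: reuse the mount point as its own key.
        args += ["--ext-mount-map", f"{mountpoint}:{mountpoint}"]
    return args


def restore_ext_mount_args(key_to_host_path):
    """Options for `criu restore`: key from the image -> host path to bind-mount."""
    args = []
    for key, host_path in key_to_host_path.items():
        args += ["--ext-mount-map", f"{key}:{host_path}"]
    return args


if __name__ == "__main__":
    # Hypothetical config dir; the real paths for files in /etc
    # come from `docker inspect`.
    config_dir = "/var/lib/docker/containers/<container_id>"
    print(dump_ext_mount_args(["/etc/hosts", "/etc/hostname", "/etc/resolv.conf"]))
    print(restore_ext_mount_args({
        "/etc/hosts": config_dir + "/hosts",
        "/etc/hostname": config_dir + "/hostname",
        "/etc/resolv.conf": config_dir + "/resolv.conf",
    }))
```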
  11. Issues and Solutions
    • Issue: /dev/null bind mount over /proc/kcore
      ◦ appeared in Docker 0.10.0, caused dump failure
    • Solution: patch 494c044
    • Issue: dumpable flag
      ◦ appeared in Docker 0.11.1 (libcontainer dropping all capabilities, keeping those specified in config)
      ◦ value is set to 2, which cannot be restored
    • Solution: patch 8870aa1
  12. Issues and Solutions
    • Issue: restoring cgroups subdirs and properties
      ◦ after checkpointing, Docker daemon would remove container’s cgroups subdirs (because the container has “exited”)
      ◦ after restoring subdirs, properties were not restored
    • Solution: cgroups restoration support with --manage-cgroups
  13. Issues and Solutions
    • Issue: stdin in detached mode
      ◦ container’s stdin set to the global /dev/null in detached mode
          $ docker run -d …
    • Solution: fixed in Docker; use --evasive-devices for older Docker versions
  14. Issues and Solutions
    • Issue: AUFS
      ◦ /proc/<pid>/map_files symbolic link paths point inside AUFS branches
      ◦ CRIU gets confused seeing the same file in its physical location (in the branch) and its logical location (from the root of mount namespace)
      ◦ fixing the kernel is the right solution but time-consuming to roll out
    • Solution:
      ◦ fixed in AUFS (but will take time to be available in all distros)
      ◦ in the meantime, CRIU patch d8b41b6 will compensate for the problem
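
The physical-versus-logical path mismatch can be illustrated with a small helper that maps a path inside an AUFS branch back to the path seen from the root of the mount namespace. This is only an illustration of the idea, not the code in patch d8b41b6; the branch directory and file names are invented.

```python
def logical_path(physical_path, branch_dirs):
    """Map a path inside an AUFS branch to the path seen from the container's root.

    Paths that are not under any of the given branch directories are
    returned unchanged.
    """
    for branch in branch_dirs:
        prefix = branch.rstrip("/") + "/"
        if physical_path.startswith(prefix):
            # Drop the branch prefix, keep a leading slash.
            return "/" + physical_path[len(prefix):]
    return physical_path


if __name__ == "__main__":
    # Hypothetical branch directory and file, for illustration only.
    branches = ["/var/lib/docker/aufs/diff/abc123"]
    print(logical_path("/var/lib/docker/aufs/diff/abc123/lib/libc.so.6", branches))
    print(logical_path("/lib/libc.so.6", branches))
```

Both calls refer to the same file once the branch prefix is stripped, which is the equivalence a checkpoint tool has to recognize.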