checkpoint/restore (C/R) functionality for Linux
• VMs can generally be dumped and restored.
• CRIU provides the same functionality for processes and containers.
• https://www.criu.org/Main_Page
• ex. crtools
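As a rough illustration of what "dump and restore a process" looks like from the command line, here is a minimal sketch that only builds `criu dump` / `criu restore` command lines. The helper names, the PID and the image directory are made up for this example; `--shell-job` is only needed when the target was started from a terminal, and actually running the commands requires root and an installed `criu`.

```python
import shlex
from typing import List


def criu_dump_cmd(pid: int, images_dir: str, leave_running: bool = False) -> List[str]:
    """Build a `criu dump` command for the process tree rooted at `pid`."""
    cmd = [
        "criu", "dump",
        "--tree", str(pid),          # root of the process tree to checkpoint
        "--images-dir", images_dir,  # where the image files are written
        "--shell-job",               # assumption: the target was started from a shell
    ]
    if leave_running:
        # Keep the process alive after dumping instead of killing it.
        cmd.append("--leave-running")
    return cmd


def criu_restore_cmd(images_dir: str) -> List[str]:
    """Build a `criu restore` command that reads images from `images_dir`."""
    return ["criu", "restore", "--images-dir", images_dir, "--shell-job"]


if __name__ == "__main__":
    # Hypothetical PID and directory, printed rather than executed.
    print(shlex.join(criu_dump_cmd(1234, "/tmp/ckpt")))
    print(shlex.join(criu_restore_cmd("/tmp/ckpt")))
```

The lists can be passed to `subprocess.run(cmd, check=True)` on a host where CRIU is available.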
• https://github.com/checkpoint-restore/p.haul/blob/master/test/mtouch/HOWTO
• There is also an example for Docker 1.9.0, but it cannot be reproduced now.
• https://github.com/checkpoint-restore/p.haul/blob/master/test/docker/HOWTO
JVM, ...
• These applications cannot fully benefit from the lightweight nature of containers.
• e.g. a small Rails project takes 2,500 ms or more to become ready.
• A Jenkins project takes 5,000 ms or more to start listening on port 8080...
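Figures like "time until the app listens on 8080" can be measured by polling the port. The following is a minimal sketch of one way to do that, not the method used to obtain the numbers above; the function name, the timeout and the polling interval are assumptions for this example.

```python
import socket
import time


def wait_until_listening(host: str, port: int,
                         timeout: float = 30.0, interval: float = 0.05) -> float:
    """Poll host:port until a TCP connect succeeds.

    Returns the elapsed time in milliseconds, or raises TimeoutError if the
    port does not accept connections within `timeout` seconds.
    """
    start = time.monotonic()
    deadline = start + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return (time.monotonic() - start) * 1000.0
        except OSError:
            # Refused or timed out: give up after the deadline, else retry.
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{host}:{port} not listening after {timeout}s")
            time.sleep(interval)
```

Calling `wait_until_listening("127.0.0.1", 8080)` right after launching the container gives the readiness latency as seen by a client.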
container will be bootstrapped on the first request, and automatically shut down after some minutes.
• This means containers are restarted repeatedly, which forces them to stay refreshed and clean.
• cf. “Phoenix Server” in the book “Infrastructure as Code”
• Used in our PaaS service: https://mc.lolipop.jp
• See @matsumotory’s paper/slides: https://speakerdeck.com/matsumoto_r/fastcontainer-shi-xing-huan-jing-falsebian-hua-nisu-zao-kushi-ying-dekiruheng-chang-xing-wochi-tusisutemuakitekutiya
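The boot-on-first-request, stop-after-a-while lifecycle can be sketched as below. This is only an illustration of the idea, not the real implementation: `LazyContainerPool`, the `start`/`stop` callbacks and the 300-second default are hypothetical, and the sketch assumes the lifetime is counted from boot time.

```python
import time
from typing import Callable, Dict, List


class LazyContainerPool:
    """Boot a container on its first request and stop it after a fixed lifetime."""

    def __init__(self, start: Callable[[str], None], stop: Callable[[str], None],
                 lifetime_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._start = start
        self._stop = stop
        self._lifetime = lifetime_seconds
        self._clock = clock
        self._started_at: Dict[str, float] = {}  # container name -> boot time

    def handle_request(self, name: str) -> None:
        # First request for a container that is not running: bootstrap it now.
        if name not in self._started_at:
            self._start(name)
            self._started_at[name] = self._clock()

    def reap_expired(self) -> List[str]:
        """Stop every container whose lifetime has elapsed; return their names."""
        now = self._clock()
        expired = [n for n, t in self._started_at.items() if now - t >= self._lifetime]
        for name in expired:
            self._stop(name)
            del self._started_at[name]
        return expired
```

Because an expired container is simply forgotten, the next request boots a fresh one, which is what keeps containers "refreshed and clean".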
Not OCI-compatible for now (I am planning...)
• Implemented basic container features:
  • Linux namespaces, cgroups, chroot/pivot_root, capabilities/uid/gid, rlimit, seccomp, AppArmor...
• Implemented some “hooks”:
  • Lifetime hooks, async timeout/interval hooks, signal handlers
invoked as the checkpointing or restoration is processed: Action Script.
• e.g. post-dump, post-restore, setup-namespaces...
• Haconiwa uses this action script to change the container’s IP address from the dumped one, as written in a new DSL.
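For readers unfamiliar with action scripts: CRIU runs the program given via `--action-script` at each stage and passes the stage name in the `CRTOOLS_SCRIPT_ACTION` environment variable. Below is a minimal sketch of such a script in Python. It is not how Haconiwa does it (Haconiwa drives this from its own DSL); the handler bodies and messages are placeholders for this example.

```python
import os
import sys
from typing import Mapping, Optional


def handle_action(action: str, env: Mapping[str, str]) -> Optional[str]:
    """Return a log message for the stages we care about, None for the rest."""
    if action == "post-dump":
        return f"dump finished; images are in {env.get('CRTOOLS_IMAGE_DIR', '?')}"
    if action == "setup-namespaces":
        # Restore side: the namespaces exist, so this is where the network
        # (e.g. the container's IP address) could be reconfigured.
        return f"reconfigure network for init pid {env.get('CRTOOLS_INIT_PID', '?')}"
    if action == "post-restore":
        return "restore finished"
    return None  # ignore every other stage


def main(environ: Mapping[str, str] = os.environ) -> None:
    # CRIU tells the script which stage it is in through this variable.
    message = handle_action(environ.get("CRTOOLS_SCRIPT_ACTION", ""), environ)
    if message is not None:
        print(message, file=sys.stderr)


if __name__ == "__main__":
    main()
```

It would be wired in with something like `criu restore --images-dir DIR --action-script /path/to/script`, where the script path is whatever location the file is saved to.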
restore process should also restore these hooks by DSL.
• This is outside the scope of CRIU’s features.
• Hooks are implemented in the “container supervisor” rather than in the container itself.
• So I implemented a way to attach a “supervisor for restored containers” to a restored container, and the hooks are invoked in the supervisor (SV).
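The idea of "hooks live in the supervisor, so a restored container needs a supervisor re-attached" can be sketched as follows. This is a toy model under stated assumptions, not Haconiwa's code: `RestoredSupervisor`, `attach_supervisor` and the dictionary standing in for the DSL definition are all hypothetical.

```python
import signal
from typing import Any, Callable, Dict, List, Optional

Hook = Callable[[int], str]  # a hook receives the container's init PID


class RestoredSupervisor:
    """Holds hooks for a container that is already running (e.g. after restore)."""

    def __init__(self, pid: int) -> None:
        self.pid = pid
        self._lifetime: Dict[str, List[Hook]] = {}
        self._signals: Dict[int, Hook] = {}

    def add_lifetime_hook(self, event: str, hook: Hook) -> None:
        self._lifetime.setdefault(event, []).append(hook)

    def add_signal_handler(self, signum: int, hook: Hook) -> None:
        self._signals[signum] = hook

    def fire(self, event: str) -> List[str]:
        # Hooks run in the supervisor, not inside the container.
        return [hook(self.pid) for hook in self._lifetime.get(event, [])]

    def deliver(self, signum: int) -> Optional[str]:
        hook = self._signals.get(signum)
        return hook(self.pid) if hook else None


def attach_supervisor(pid: int, definition: Dict[str, Any]) -> RestoredSupervisor:
    """Re-register the hooks from `definition` on a supervisor for `pid`."""
    sv = RestoredSupervisor(pid)
    for event, hooks in definition.get("hooks", {}).items():
        for hook in hooks:
            sv.add_lifetime_hook(event, hook)
    for signum, hook in definition.get("signals", {}).items():
        sv.add_signal_handler(signum, hook)
    return sv
```

The point of the design is that the same definition drives both a fresh boot and a restore, so the hooks survive even though CRIU itself knows nothing about them.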