inside of CRIU
• How can we use CRIU?
• Migration
• Reduction of bootstrap time
• How to combine CRIU into a runtime?
For developers/operators using containers
For runtime developers
checkpoint/restore (C/R) functionality for Linux
• In general, VMs can be dumped and restored.
• CRIU provides the same functionality for processes/containers
• https://www.criu.org/Main_Page
• ex. crtools
project of Virtuozzo
• https://www.virtuozzo.com/
• CRIU is currently used by OpenVZ (https://openvz.org/), LXC/LXD and Docker.
• You can also use the criu command on its own.
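As a minimal sketch of using the criu command on its own, the helpers below only build the argument lists for `criu dump` and `criu restore`; they do not run anything. The PID `1234` and the directory `/tmp/ckpt` are hypothetical, and the helper names are made up for illustration. Running the resulting commands normally requires root and a CRIU-capable kernel.

```python
import shlex


def criu_dump_cmd(pid, images_dir, shell_job=False, leave_running=False):
    """Build argv for `criu dump` of the process tree rooted at `pid`."""
    cmd = ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir]
    if shell_job:
        # Needed when the target was started from an interactive shell.
        cmd.append("--shell-job")
    if leave_running:
        # By default the dumped processes are killed after the dump.
        cmd.append("--leave-running")
    return cmd


def criu_restore_cmd(images_dir, shell_job=False, detached=True):
    """Build argv for `criu restore` from the images in `images_dir`."""
    cmd = ["criu", "restore", "--images-dir", images_dir]
    if shell_job:
        cmd.append("--shell-job")
    if detached:
        # Let criu exit once the restored processes are running.
        cmd.append("--restore-detached")
    return cmd


if __name__ == "__main__":
    # Hypothetical PID and image directory, printed as shell commands.
    print(shlex.join(criu_dump_cmd(1234, "/tmp/ckpt", shell_job=True)))
    print(shlex.join(criu_restore_cmd("/tmp/ckpt", shell_job=True)))
```

The lists can be passed to `subprocess.run(cmd, check=True)` once the PID and directory are real.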
memory
• e.g.
• Then create a checkpoint:
• And restart with the --checkpoint option
• Thus, the count is rolled back to the checkpoint!
• (Without --checkpoint, the count restarts from 0)
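A minimal sketch of the two steps above, assuming Docker's experimental checkpoint feature is enabled and CRIU is installed on the host. The container name `counter` and the checkpoint name `cp1` are hypothetical; the helper only builds the command lines.

```python
import shlex


def docker_checkpoint_cmds(container, checkpoint):
    """Return (create, resume) argv lists for a Docker checkpoint round trip."""
    # Dump the running container's state; by default this also stops it.
    create = ["docker", "checkpoint", "create", container, checkpoint]
    # Start the stopped container again from the saved state.
    resume = ["docker", "start", "--checkpoint", checkpoint, container]
    return create, resume


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in docker_checkpoint_cmds("counter", "cp1"):
        print(shlex.join(cmd))
```

Starting the same container with a plain `docker start` instead runs its entrypoint from scratch, which is why the count restarts from 0 in that case.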
• Clients can access this service via a socket, using protobuf
• CRIU provides several protobuf wrappers:
• C wrapper (called <libcriu>)
• Python wrapper
• Go wrapper (experimental)
- normally processes will be killed at this time.
[Figure: CRIU, the target process, and the kernel. CRIU collects the memory dump, network conf, file descriptors, cgroup params, process attrs, ... using syscalls, /proc files, ...]
• https://github.com/checkpoint-restore/p.haul/blob/master/test/mtouch/HOWTO
• There is also an example for Docker 1.9.0... but I cannot reproduce it now
• https://github.com/checkpoint-restore/p.haul/blob/master/test/docker/HOWTO
JVM, ...
• These applications cannot fully benefit from the lightweight nature of containers.
• e.g. A small Rails project takes 2,500 ms or more to become ready.
• A Jenkins project takes 5,000 ms or more to start listening on port 8080...
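One way to obtain numbers like these for your own application is to measure the time until its port accepts TCP connections. The sketch below is an assumption about how such a measurement could be done, not the method used for the figures above; the function name and its defaults are hypothetical.

```python
import socket
import time


def wait_until_listening(host, port, timeout=30.0, interval=0.05):
    """Return the seconds elapsed until a TCP connect to host:port succeeds.

    Raises TimeoutError if the port is still not accepting connections
    once `timeout` seconds have passed.
    """
    start = time.monotonic()
    deadline = start + timeout
    while True:
        try:
            # A successful connect means the application is listening.
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{host}:{port} not ready after {timeout}s")
            time.sleep(interval)
```

Call it right after starting the container, for example `wait_until_listening("127.0.0.1", 8080)`, and compare a cold boot against a restore from a checkpoint.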
container will be bootstrapped on first request, and automatically shut down after some minutes.
• This means containers are restarted repeatedly, which forces them to stay refreshed and clean.
• cf. “Phoenix Server” in the book “Infrastructure as Code”
• Used in our PaaS service: https://mc.lolipop.jp
• See @matsumotory’s paper/slides: https://speakerdeck.com/matsumoto_r/fastcontainer-shi-xing-huan-jing-falsebian-hua-nisu-zao-kushi-ying-dekiruheng-chang-xing-wochi-tusisutemuakitekutiya
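The lifecycle described above (boot on first request, shut down after a fixed time) can be sketched as a toy model. This is not the FastContainer implementation: the class name is hypothetical, the lifetime is assumed to be counted from boot, and "starting" a container is reduced to bumping a counter.

```python
import time


class LazyContainer:
    """Toy model: boot on first request, stop `lifetime` seconds after boot."""

    def __init__(self, lifetime, clock=time.monotonic):
        self.lifetime = lifetime
        self.clock = clock        # injectable clock, handy for testing
        self.started_at = None    # None means "not running"
        self.boots = 0            # how many times the container was booted

    def running(self):
        return self.started_at is not None

    def reap(self):
        """Stop the container if its lifetime has expired."""
        if self.running() and self.clock() - self.started_at >= self.lifetime:
            self.started_at = None

    def handle_request(self):
        """Boot the container if needed and return the boot count so far."""
        self.reap()
        if not self.running():
            # A real runtime would boot (or restore) the container here.
            self.started_at = self.clock()
            self.boots += 1
        return self.boots
```

Every boot in this model pays the full bootstrap cost, which is where restoring from a checkpoint instead of booting from scratch becomes attractive.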
want more info)
• A project in a UBC class refers to this:
• https://www.cs.ubc.ca/~bestchai/teaching/cs416_2017w2/project2/project_m6r8_s8u8_v5v8_y6x8_proposal.pdf
Not OCI-compatible for now (I am planning...)
• Implemented basic container features:
• Linux namespaces, cgroup, chroot/pivot_root, capabilities/uid/gid, rlimit, seccomp, AppArmor...
• Implemented some “hooks”:
• Lifetime hooks, async timeout/interval hooks, signal handlers
Haconiwa
• haconiwa checkpoint:
• Creates a checkpoint from a running container
• haconiwa restore:
• Creates a restored container, with some spec changes
invoked as the checkpointing or restoration is processed: Action Script.
• e.g. post-dump, post-restore, setup-namespaces...
• Haconiwa uses this action script to change the container’s IP from the dumped one, as specified in the new DSL.
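As a rough sketch of what an action script looks like: CRIU runs the script passed via `--action-script` and tells it which stage is in progress through the `CRTOOLS_SCRIPT_ACTION` environment variable. The handler below only returns messages; it does not show Haconiwa's actual script, and the place where an IP change would happen is marked as an assumption.

```python
#!/usr/bin/env python3
import os
import sys


def handle_action(env):
    """Return a log message for the current CRIU stage, or None to ignore it."""
    action = env.get("CRTOOLS_SCRIPT_ACTION", "")
    if action == "post-dump":
        return "dump finished; images in " + env.get("CRTOOLS_IMAGE_DIR", "?")
    if action == "setup-namespaces":
        # Assumption: a runtime would reconfigure the network (e.g. the IP)
        # at this point, once the restored namespaces exist.
        return "namespaces created; reconfigure network here"
    if action == "post-restore":
        return "restore finished"
    return None


def main():
    message = handle_action(os.environ)
    if message:
        print(message, file=sys.stderr)


if __name__ == "__main__":
    main()
```

The script would be registered with something like `criu restore --action-script /path/to/script ...`, where the path is whatever your runtime generates.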
option, which decides its cgroup name.
• When you want the restored container to have a different name from the dumped one, you must also change the new container’s cgroup name.
• CRIU’s --cgroup-root option solves this
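A minimal sketch of passing a new cgroup root at restore time. `--cgroup-root` accepts either a bare path or a `controller:path` pair; the helper name, the image directory and the cgroup paths used here are hypothetical, and the helper only builds the argument list.

```python
def criu_restore_with_cgroup_root(images_dir, new_root, controller=None):
    """Build argv for `criu restore` that re-roots the restored cgroups."""
    # With a controller, only that controller's root is changed.
    value = f"{controller}:{new_root}" if controller else new_root
    return ["criu", "restore", "--images-dir", images_dir, "--cgroup-root", value]


if __name__ == "__main__":
    # Hypothetical paths: restore under a cgroup named after the new container.
    print(criu_restore_with_cgroup_root("/tmp/ckpt", "/restored-container"))
    print(criu_restore_with_cgroup_root("/tmp/ckpt", "/restored-container", "cpu"))
```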
restore process should also restore these hooks from the DSL.
• This is outside the scope of CRIU’s features
• Hooks are implemented in the “container supervisor”, rather than in the container itself
• So I implemented attaching a “supervisor for restored containers” to a restored container, and hooks are invoked in the supervisor (SV)
• I introduced 2 use cases:
• Migration
• Reduction of bootstrap cost
• There is no Kubernetes integration yet, but maybe soon?
• I have been developing CRIU integration for my container runtime :)