Introduction to CRIU

KONDO Uchio
December 05, 2018

- Let’s take a glance at the future of containers!

@ JapanContainerDays 2018/12/05

Transcript

  1. Introduction to CRIU: Let’s take a glance at the future of containers!
    Uchio Kondo / GMO Pepabo, Inc.
    2018.12.05 JapanContainerDays v18.12
  2. Uchio Kondo, Señor-Principal Engineer @ GMO Pepabo, Inc.
    https://blog.udzura.jp/ @udzura
    Technical department, Dev Productivity/R&D Team
    RubyKaigi 2019 at Fukuoka, Local Organizer
    Chair of CNDJ at Fukuoka, 2019.04
  3. Scope of Today’s Talk
    • What is CRIU?
    • Dive into the inside of CRIU
    • How can we use CRIU?
      • Migration
      • Reduction of bootstrap cost
    • How to combine CRIU into a runtime?
  4. Scope of Today’s Talk
    • What is CRIU?
    • Dive into the inside of CRIU
    • How can we use CRIU?
      • Migration
      • Reduction of bootstrap time
    • How to combine CRIU into a runtime?
    (Annotations: For Developers/Operators Using Containers; For RUNTIME Developers)
  5. CRIU is: C/R In Userspace
    • A project to implement checkpoint/restore (C/R) functionality for Linux
    • Generally, VMs can be dumped and restored.
    • CRIU provides this functionality for processes/containers
    • https://www.criu.org/Main_Page
    • Formerly: crtools
  6. What CRIU is for
    • CRIU is developed as a project of Virtuozzo
      • https://www.virtuozzo.com/
    • CRIU is currently used by OpenVZ (https://openvz.org/), LXC/LXD and Docker. You can also use the criu command on its own.
  7. Containers are PROCESSES
    • So CRIU can create checkpoints for containers!
    • CRIU has many features for checkpointing containers, e.g. network, namespace, cgroup...
  8. Enable docker checkpoint
    • Follow the instructions in https://github.com/docker/cli/blob/master/experimental/checkpoint-restore.md
    • Preparation: Install CRIU by yourself (and Docker v17.06 :)
      • (Ubuntu Bionic has criu package v3.6)
    • Add the --experimental flag to the dockerd startup command, then restart
  9. Checkpoint/Restore demo
    • Run a simple container that counts a number in memory
      • e.g.
    • Then create a checkpoint:
    • And restart with the --checkpoint option
    • Thus, the count is rolled back to the checkpoint!
      • (Without --checkpoint, the count restarts from 0)
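The commands on this slide were shown as images and are missing from the transcript. As a rough sketch of the flow described above, the Python below only builds the three Docker CLI invocations (run, `docker checkpoint create`, `docker start --checkpoint`). The container name `cr`, the checkpoint name `checkpoint1`, the `busybox` image and the counter loop are assumptions made for illustration, not the speaker's actual demo.

```python
import shlex

# Hypothetical names for illustration; the slide does not show the real ones.
CONTAINER = "cr"
CHECKPOINT = "checkpoint1"
# A shell loop that keeps a counter in memory and prints it every second.
COUNTER_LOOP = "i=0; while true; do echo $i; i=$((i+1)); sleep 1; done"


def demo_commands(container=CONTAINER, checkpoint=CHECKPOINT):
    """Return the three docker invocations of the demo as argv lists."""
    return [
        # 1. Run a container that counts in memory.
        ["docker", "run", "-d", "--name", container,
         "busybox", "/bin/sh", "-c", COUNTER_LOOP],
        # 2. Create a checkpoint (by default this stops the container).
        ["docker", "checkpoint", "create", container, checkpoint],
        # 3. Start the same container again from that checkpoint.
        ["docker", "start", "--checkpoint", checkpoint, container],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so no Docker is needed.
    for argv in demo_commands():
        print(shlex.join(argv))
```

Running the printed commands requires a dockerd started with --experimental and CRIU installed, as on the previous slide.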
  10. Resources about CRIU
    • Slide from the OpenVZ team:
      • https://www.slideshare.net/openvz/criu-13dusseldorf
    • One of the most reliable articles written in Japanese:
      • https://gihyo.jp/admin/serial/01/linux_containers/0032
  11. How can we invoke CRIU?
    • There are 2 modes:
      • Via CLI: the criu command. Normally we use this
      • Via API: server/client model
  12. Server/client model
    • CRIU can run as a service: criu service
    • Clients access this service via a socket, using protobuf
    • CRIU provides some protobuf wrappers:
      • C wrapper (called libcriu)
      • Python wrapper
      • Go wrapper (experimental)
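For the CLI mode, a minimal sketch of what an invocation might look like is given below. It only assembles `criu dump` and `criu restore` command lines using the `-t` (process tree), `-D` (images directory), `--leave-running` and `--shell-job` options; the helper names and the example PID and directory are hypothetical, and nothing here covers the protobuf API mode.

```python
import shlex


def criu_dump_argv(pid, images_dir, leave_running=False, shell_job=False):
    """Build a `criu dump` command line for the process tree rooted at pid."""
    argv = ["criu", "dump", "-t", str(pid), "-D", images_dir]
    if leave_running:
        # Keep the target alive after the dump instead of killing it.
        argv.append("--leave-running")
    if shell_job:
        # Needed when the target was started from an interactive shell.
        argv.append("--shell-job")
    return argv


def criu_restore_argv(images_dir, shell_job=False):
    """Build the matching `criu restore` command line."""
    argv = ["criu", "restore", "-D", images_dir]
    if shell_job:
        argv.append("--shell-job")
    return argv


if __name__ == "__main__":
    # Hypothetical PID and directory, printed rather than executed.
    print(shlex.join(criu_dump_argv(12345, "/tmp/ckpt", shell_job=True)))
    print(shlex.join(criu_restore_argv("/tmp/ckpt", shell_job=True)))
```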
  13. Server/client model
    (Diagram: a program using libcriu talks protobuf over a UNIX domain socket to the CRIU service; the CRIU service works on the target process through syscalls, /proc files ... provided by the kernel.)
  14. Assigned cgroup
    (Diagram: process tree dockerd \_ docker-containerd \_ containerd-shim \_ container’s process, annotated with the systemd-managed cgroup (docker.service) and each container’s own cgroup.)
  15. How CRIU makes images
    (Diagram: CRIU, target process, kernel; syscalls, /proc files ...)
    • CRIU gets information about the process via syscalls, /proc files, iproute2 utilities...
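To make the "/proc files" part concrete, here is a small sketch of reading the kind of per-process facts a checkpoint needs (memory mappings and open file descriptors). It is an illustration of the idea only, not CRIU's actual code; `parse_maps_line` and `inspect_process` are hypothetical helper names, and the second one works only on Linux.

```python
import os


def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a dict."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": fields[1],
        "offset": int(fields[2], 16),
        # Anonymous mappings have no path column.
        "path": fields[5].strip() if len(fields) > 5 else "",
    }


def inspect_process(pid):
    """Collect memory mappings and open fds of a process from /proc."""
    proc = f"/proc/{pid}"
    with open(f"{proc}/maps") as f:
        mappings = [parse_maps_line(line) for line in f if line.strip()]
    fds = {}
    for fd in os.listdir(f"{proc}/fd"):
        try:
            fds[int(fd)] = os.readlink(f"{proc}/fd/{fd}")
        except OSError:
            # The fd was closed between listing and reading; skip it.
            continue
    return {"mappings": mappings, "fds": fds}


if __name__ == "__main__" and os.path.isdir("/proc/self"):
    info = inspect_process(os.getpid())
    print(len(info["mappings"]), "mappings,", len(info["fds"]), "open fds")
```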
  16. How CRIU makes images
    • Then CRIU dumps them into images; normally the processes are killed at this point.
    (Diagram: CRIU, target process, kernel; syscalls, /proc files ...; images: memory dump, network conf, file descriptors, cgroup params, process attrs, ......)
  17. How CRIU restores images
    • CRIU uses these images on restore
    (Diagram: CRIU, restored process, kernel; images: memory dump, network conf, file descriptors, cgroup params, process attrs, ......)
  18. crit: image utility
    • CRIU is bundled with the crit command, which can decode images in CRIU format.
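A minimal sketch of driving crit from a script follows. It assumes the `crit decode -i <file> --pretty` form of the command and that the decoded output is JSON; the helper names and the image file name are hypothetical.

```python
import json
import shlex
import subprocess


def crit_decode_argv(image_path, pretty=True):
    """Build a `crit decode` command line for one image file."""
    argv = ["crit", "decode", "-i", image_path]
    if pretty:
        # Ask crit for human-readable, indented output.
        argv.append("--pretty")
    return argv


def decode_image(image_path):
    """Run crit and parse what it prints; needs crit on PATH."""
    result = subprocess.run(crit_decode_argv(image_path, pretty=False),
                            check=True, capture_output=True, text=True)
    # Assumption: crit prints the decoded image as a JSON document.
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Hypothetical image path, printed rather than executed.
    print(shlex.join(crit_decode_argv("/tmp/ckpt/pstree.img")))
```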
  19. P.Haul Project https://criu.org/P.Haul
    • Extension to make live migration with CRIU possible.
    • Super experimental
    • Not so active
  20. P.Haul works?
    • Example of node-to-node migration using a sample process
      • https://github.com/checkpoint-restore/p.haul/blob/master/test/mtouch/HOWTO
    • There is also an example for Docker 1.9.0... which cannot be reproduced now
      • https://github.com/checkpoint-restore/p.haul/blob/master/test/docker/HOWTO
  21. Migration demo
    P.Haul looks too inactive! So I implemented it using my own container... I’ll show it later!
  22. Containers with slow bootstrap
    • Especially big applications: legacy Rails, JVM, ...
    • These applications cannot fully enjoy the benefits of containers being lightweight.
      • e.g. A small Rails project takes 2,500 ms or more to become ready.
      • A Jenkins project takes 5,000 ms or more to start listening on 8080...
  23. FYI: “FastContainer”
    • An architecture to handle containers
    • A container is bootstrapped on the first request and automatically shut down after some minutes.
    • This means containers are restarted repeatedly, which forces them to stay refreshed and clean.
      • cf. “Phoenix Server” in the book “Infrastructure as Code”
    • Used in our PaaS service: https://mc.lolipop.jp
    • See @matsumotory’s paper/slide https://speakerdeck.com/matsumoto_r/fastcontainer-shi-xing-huan-jing-falsebian-hua-nisu-zao-kushi-ying-dekiruheng-chang-xing-wochi-tusisutemuakitekutiya
  24. FYI: “FastContainer”
    (Diagram labels: Web Proxy; Web Request; Dispatcher; FastContainer; Runtime; CMDB; ❌ FastContainer Killed. Steps: 1. Check, 2. Boot, 3. Forward, 4. Terminate.)
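The Check / Boot / Forward / Terminate cycle can be pictured with a toy model like the one below. This is only a sketch of the idea, not the FastContainer implementation: the class name, the fixed lifetime counted from boot time, and the use of plain numbers as timestamps are all assumptions.

```python
class FastContainerSketch:
    """Toy model of the Check / Boot / Forward / Terminate cycle."""

    def __init__(self, lifetime_sec):
        self.lifetime_sec = lifetime_sec
        self.booted_at = None  # None means "no container is running"
        self.boots = 0

    def handle_request(self, now):
        # 1. Check: is a container already running?
        if self.booted_at is None:
            # 2. Boot: start a container on the first request.
            self.booted_at = now
            self.boots += 1
        # 3. Forward: hand the request to the running container.
        return "forwarded"

    def reap(self, now):
        # 4. Terminate: stop the container once its lifetime has passed.
        if self.booted_at is not None and now - self.booted_at >= self.lifetime_sec:
            self.booted_at = None
            return True
        return False


if __name__ == "__main__":
    fc = FastContainerSketch(lifetime_sec=300)
    fc.handle_request(now=0)    # boots a container
    fc.handle_request(now=10)   # reuses it
    fc.reap(now=300)            # lifetime over, container is terminated
    fc.handle_request(now=301)  # boots a fresh one
    print("boots:", fc.boots)
```

Every boot after a termination pays the application's full startup cost, which is why slow-booting applications are the bottleneck discussed next.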
  25. Experiment codes
    Benchmarker:
      ab -g bench-rails.tsv \
        -s 120 -c 1 -t 90 -n 1000000 -k -l http://192.168.199.10/
    Script for visualization:
      import numpy as np
      import matplotlib.pyplot as plt
      # Columns 1 and 4 of ab's -g output: seconds (epoch) and ttime (ms)
      data = np.loadtxt("/path/to/bench-rails.tsv", delimiter="\t", skiprows=1, usecols=(1,4), dtype=int)
      # Sort by time, then rotate so that data[0] is time and data[1] is ttime
      data = np.rot90(sorted(data, key=lambda x:x[0]), k=-1)
      plt.plot(data[0], data[1], linewidth=1, color="orange")
      plt.ylim(0, 2700)
      plt.show()
  26. Needs fast boot up
    • One of the bottlenecks of this architecture is “slow boot” apps
    • Comparison of Apache HTTPD vs a Rails application:
    (Plot: ms/r over unixtime; series: Apache (phpinfo), RoR (no bootsnap).)
  27. Lifecycle with CRIU
    (Diagram labels: ngx_mruby; Haconiwa; Containers; Image; haconiwa restore. Notes: restore on the next request; make the image just before stop, in an async process.)
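One way to read this lifecycle: on the next request, restore from an image if one was made before the last stop, otherwise fall back to a normal boot. The sketch below shows only that decision; the function name and the "any file in the directory counts as an image" rule are assumptions, and the real integration is done inside Haconiwa rather than in Python.

```python
import os
import tempfile


def boot_plan(image_dir):
    """Return 'restore' if a checkpoint image directory has files, else 'cold-boot'."""
    if os.path.isdir(image_dir) and os.listdir(image_dir):
        return "restore"
    return "cold-boot"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(boot_plan(d))  # empty directory: no image yet
        # Pretend an image was written just before the container stopped.
        open(os.path.join(d, "pstree.img"), "w").close()
        print(boot_plan(d))
```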
  28. Using CRIU to make boot fast
    • Comparison of hot-start Rails application and cold-start (from criu image) Rails:
    (Plot series: RoR (no bootsnap / from CRIU image), RoR (no bootsnap).)
  29. Kubernetes integration?
    • There seems to be no plan yet... (I want more info)
    • A project in a UBC class refers to this:
      • https://www.cs.ubc.ca/~bestchai/teaching/cs416_2017w2/project2/project_m6r8_s8u8_v5v8_y6x8_proposal.pdf
  30. Haconiwa
    • Highly configurable container runtime written in mruby
    • Not OCI-compatible for now (I am planning...)
    • Implemented basic container features:
      • Linux namespace, cgroup, chroot/pivot_root, capability/uid/gid, rlimit, seccomp, apparmor...
    • Implemented some “hooks”:
      • Lifetime hooks, async timeout/interval hooks, sighandlers
  31. What I’m working on now
    • Bundling CRIU features into Haconiwa
    • haconiwa checkpoint:
      • To create a checkpoint from a running container
    • haconiwa restore:
      • To make a restored container, with some spec changes
  32. CRIU deep features:
    • These are what I used in haconiwa development:
      • Restoration process hooks (action script)
      • Change cgroup name on restore
      • Replace supervisor program by --exec-cmd
  33. Restoration process hooks
    • CRIU has hooks that are invoked as checkpointing or restoration proceeds: action scripts.
      • e.g. post-dump, post-restore, setup-namespaces...
    • Haconiwa uses this action script to change a container’s IP from the dumped one, as written in a new DSL.
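As an illustration of the mechanism, an action script could look roughly like the sketch below. It assumes CRIU passes the stage name in the `CRTOOLS_SCRIPT_ACTION` environment variable, with `CRTOOLS_IMAGE_DIR` and (for some stages) `CRTOOLS_INIT_PID` alongside it; the handler bodies are placeholders and do not reproduce what Haconiwa actually does to change the IP.

```python
#!/usr/bin/env python3
import os
import sys


def handle_action(env):
    """Dispatch on the action name CRIU passes in the environment."""
    action = env.get("CRTOOLS_SCRIPT_ACTION", "")
    if action == "post-dump":
        return f"images written to {env.get('CRTOOLS_IMAGE_DIR', '?')}"
    if action == "setup-namespaces":
        # Hypothetical spot for network changes such as assigning a new IP.
        return f"configure network for init pid {env.get('CRTOOLS_INIT_PID', '?')}"
    if action == "post-restore":
        return "restore finished"
    # Stages this sketch does not care about.
    return f"ignored {action}"


if __name__ == "__main__":
    print(handle_action(os.environ), file=sys.stderr)
    # Exit 0 so that the script does not signal a failure to CRIU.
    sys.exit(0)
```

Such a script would be registered with the `--action-script` option of `criu dump` or `criu restore`.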
  34. Change cgroup name on restore
    • Haconiwa’s container has a name option, which decides its cgroup name.
    • When you want to change the name between the dumped and restored containers, you must also change the new one’s cgroup name.
    • CRIU’s --cgroup-root option solves this
  35. Replace supervisor program
    • Haconiwa has its own hooks, and the restore process should also restore these hooks by DSL.
      • This is outside CRIU’s features
      • Hooks are implemented in the “container supervisor”, rather than in the container itself
    • So I implemented a way to set a “supervisor for restored containers” over a restored container, and hooks are invoked in the supervisor (SV)
  36. Replacement process
    (Diagram: before, Haconiwa sv \- criu restore \- Container; after exec(), Haconiwa sv \- haconiwa _restored \- Container, with wait() in the new program. Restore done!)
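Putting the three features together, a restore invocation might be assembled as in the sketch below. It uses the `--cgroup-root`, `--action-script` and `--exec-cmd` options of `criu restore`; the helper name, the paths and the supervisor command line are hypothetical, and the exact arguments Haconiwa passes are not shown in the slides.

```python
import shlex


def restore_argv(images_dir, cgroup_root=None, action_script=None, exec_cmd=None):
    """Build a `criu restore` command line with the optional features above."""
    argv = ["criu", "restore", "-D", images_dir]
    if cgroup_root:
        # Restore into a different cgroup root than the dumped one.
        argv += ["--cgroup-root", cgroup_root]
    if action_script:
        # Script called at the restore stages (e.g. setup-namespaces).
        argv += ["--action-script", action_script]
    if exec_cmd:
        # Everything after "--" is exec()ed by criu once the restore
        # succeeds, so it becomes the parent of the restored process.
        argv += ["--exec-cmd", "--", *exec_cmd]
    return argv


if __name__ == "__main__":
    # All values below are made up for illustration.
    print(shlex.join(restore_argv(
        "/var/lib/images/app",
        cgroup_root="/new-container-name",
        action_script="/usr/local/bin/restore-hook",
        exec_cmd=["/usr/local/bin/supervisor", "--wait"],
    )))
```

Keeping `--exec-cmd` as the last option matters here, because criu treats everything after `--` as the command to run.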
  37. Demo Overview
    (Diagram labels: Load Balancer; Victim container (Victim Host); Restored container (Dest Host); Image on shared storage; http://Mac:10080; http://Mac:11080; Nonstop!)
    https://github.com/udzura/nginx-haconiwa/tree/haconiwa-migration
  38. Conclusion
    • CRIU can create checkpoints for containers and restore them.
    • I introduced 2 use cases:
      • Migration
      • Reduction of bootstrap cost
    • There is no Kubernetes integration yet, but maybe soon?
    • I have been developing CRIU integration for my container runtime :)