Slide 1

Slide 1 text

Deep Dive into Runtime Shim
Container Runtime Meetup #2, August 22, 2020
by @_moricho_

Slide 2

Slide 2 text

Morito Ikeda
Twitter: @_moricho_
GitHub: moricho

Slide 3

Slide 3 text

What you will learn
・An overview of high level and low level runtimes
・What a runtime shim is
・How a container's stdout/stderr is managed
・How container processes are managed

Slide 4

Slide 4 text

1. Introduction
A review of high level and low level runtimes

Slide 5

Slide 5 text

High level runtime (CRI runtime)
・Called by the kubelet through the CRI (Container Runtime Interface)
・A gRPC service that manages images (pull, rm, …) and kicks off every container operation
  https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto
・Uses a low level runtime (described later) for the actual container operations
・Typical examples: containerd, cri-o

Slide 6

Slide 6 text

High level runtime (CRI runtime)
Called by the kubelet through the CRI (Container Runtime Interface)

Slide 7

Slide 7 text

Low level runtime (OCI runtime)
・The part that actually runs the container process, on instruction from the high level runtime
・Typical examples: runc, runsc (gVisor)
・Just a binary
・Provides the operations state, create, start, kill, delete
・See runtime.md in opencontainers/runtime-spec
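The five operations can be read as transitions between the container statuses that runtime.md describes (created, running, stopped). The following is a minimal in-memory sketch of that lifecycle, not how runc or runsc is implemented. The class name `ToyRuntime` and its dictionary storage are hypothetical, and `kill` is simplified to assume the process exits as soon as it is signalled.

```python
class ToyRuntime:
    """In-memory stand-in for an OCI runtime; no real process is started."""

    def __init__(self):
        # container id -> "created" | "running" | "stopped"
        self._status = {}

    def state(self, cid):
        # Report the current status of a known container.
        if cid not in self._status:
            raise KeyError(f"container {cid!r} does not exist")
        return {"id": cid, "status": self._status[cid]}

    def create(self, cid):
        if cid in self._status:
            raise ValueError(f"container {cid!r} already exists")
        self._status[cid] = "created"

    def start(self, cid):
        # start is only valid for a container that has been created.
        if self.state(cid)["status"] != "created":
            raise ValueError("start requires a created container")
        self._status[cid] = "running"

    def kill(self, cid):
        # kill is only valid for a created or running container.
        if self.state(cid)["status"] not in ("created", "running"):
            raise ValueError("kill requires a created or running container")
        # Simplification: assume the signal terminates the process at once.
        self._status[cid] = "stopped"

    def delete(self, cid):
        # delete is only valid once the container has stopped.
        if self.state(cid)["status"] != "stopped":
            raise ValueError("delete requires a stopped container")
        del self._status[cid]
```

A high level runtime would drive these in order: create, start, then kill and delete when the container is no longer needed, polling state in between.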

Slide 8

Slide 8 text

Low level runtime (OCI runtime)
・At create time it is given a config.json describing what is needed to run the container: capabilities, hostname, mounts, and so on
・For details, see config.md in opencontainers/runtime-spec
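To make the shape of config.json concrete, here is a small sketch that builds a subset of it as a Python dict and serializes it. The field names follow config.md, but the helper `minimal_config` and every value in it (version string, capability list, mount, paths) are illustrative assumptions, not a complete or recommended configuration.

```python
import json


def minimal_config(hostname, args):
    """Build a small, illustrative subset of an OCI config.json."""
    caps = ["CAP_KILL", "CAP_NET_BIND_SERVICE"]  # example capabilities only
    return {
        "ociVersion": "1.0.2",
        "process": {
            "args": args,  # command to run inside the container
            "cwd": "/",
            "capabilities": {"bounding": caps, "effective": caps},
        },
        "root": {"path": "rootfs", "readonly": True},
        "hostname": hostname,
        "mounts": [
            {"destination": "/proc", "type": "proc", "source": "proc"},
        ],
    }


def render_config(hostname, args):
    """Serialize the dict the way it would be written into the bundle."""
    return json.dumps(minimal_config(hostname, args), indent=2)
```

For example, `render_config("demo", ["sh"])` yields JSON text that could be saved as config.json next to a rootfs directory.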

Slide 9

Slide 9 text

2. Runtime Shim
Today's main topic

Slide 10

Slide 10 text

What is a runtime shim?
・An API that mediates communication between the container process and the high level runtime (containerd, etc.)
・A daemon that looks after the container

Slide 11

Slide 11 text

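Where did the low level runtime go?
・Example: starting a container with runc in detached mode (figure on the right)
・The low level runtime exits as soon as it has started the container
・By default the container is then reparented to the host's init process (or to the high level runtime, if that is what kicked it off)
・=> When the container process (now an orphan) dies, it cannot be tracked, and restarting or stopping the high level runtime kills the container too
Figure (labels: runc, container): https://iximiuz.com/en/posts/implementing-container-runtime-shim/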

Slide 12

Slide 12 text

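The role of the runtime shim
・The shim kicks off the low level runtime
・It keeps looking after the container even after the low level runtime has exited:
  ・error handling and status reporting at container create time
  ・streaming the container's stdout/stderr to a log file
  ・tracking the exit code
・It shares all of this with the high level runtime
Figure (labels: runc, shim): https://iximiuz.com/en/posts/implementing-container-runtime-shim/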
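The exit-code part of that list can be sketched in a few lines: a long-lived process waits on the workload and leaves the result on disk, where another process can pick it up later. This is only an illustration of the idea, not containerd-shim's actual mechanism. The file names `pid` and `exit_code`, the helper names, and the use of a plain Python child in place of a container are all assumptions.

```python
import pathlib
import subprocess
import sys
import tempfile


def run_and_record(command, state_dir):
    """Run `command`, wait for it, and persist its exit code on disk."""
    state_dir = pathlib.Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)
    proc = subprocess.Popen(command)
    (state_dir / "pid").write_text(str(proc.pid))
    exit_code = proc.wait()  # the shim stays alive until the workload ends
    (state_dir / "exit_code").write_text(str(exit_code))
    return exit_code


def read_exit_code(state_dir):
    """What a (possibly restarted) high level runtime could do later."""
    path = pathlib.Path(state_dir) / "exit_code"
    return int(path.read_text()) if path.exists() else None


def demo():
    # A child that exits with status 3 stands in for the container.
    with tempfile.TemporaryDirectory() as tmp:
        run_and_record([sys.executable, "-c", "import sys; sys.exit(3)"], tmp)
        return read_exit_code(tmp)
```

Because the result lives in a file rather than in the memory of the high level runtime, the reader does not need to have been running when the workload exited.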

Slide 13

Slide 13 text

The role of the runtime shim
For example, in containerd, a component called containerd-shim is implemented in the same repository

Slide 14

Slide 14 text

Aside
・There is also a shim for working with runsc (gVisor), maintained on the gVisor side
・It used to live in a separate repository called "gvisor-containerd-shim", but the two were merged a little while ago

Slide 15

Slide 15 text

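Aside: an issue in gvisor-containerd-shim
・A container appeared to be dying from OOM, but no log showed up on the Kubernetes side
・Investigation showed that the gVisor shim was missing the epoll implementation used to propagate OOM events
・=> The shim is responsible for correctly conveying container state and logs to the layers above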

Slide 16

Slide 16 text

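The role of the runtime shim: subreaper
・Problem: when the low level runtime exits, the container process is reparented to the host's init process
・Making the shim process a subreaper means the container is reparented to the shim instead of init
・=> The shim tracks the container's exit and writes it to a file or similar, which the high level runtime reads later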

Slide 17

Slide 17 text

The role of the runtime shim: subreaper
・A child process forks again, creating a grandchild process, and the child then dies
・=> The grandchild becomes an orphan and is automatically reparented to PID 1

Slide 18

Slide 18 text

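The role of the runtime shim: subreaper
With a subreaper:
・The original process calls prctl(2) with "PR_SET_CHILD_SUBREAPER" as the argument
・All of that process's children and their descendants are then marked as having a subreaper
・When an orphaned process dies, "SIGCHLD" is sent to its nearest ancestor subreaper, which uses wait to learn the exit status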
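The sequence above can be tried out on Linux with a short script. This is a sketch under assumptions, not shim code: it calls prctl(2) through ctypes, uses the constant value 36 for PR_SET_CHILD_SUBREAPER from <linux/prctl.h>, and the function names `become_subreaper` and `adopt_orphan` are made up for the example. The first fork stands in for the low level runtime and the second for the container.

```python
import ctypes
import os
import struct
import time

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>


def become_subreaper():
    """Mark the calling process as a child subreaper (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1),
                    ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0))
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def adopt_orphan(exit_code=7):
    """Return (parent pid seen by the orphan, exit code reaped here)."""
    become_subreaper()
    read_fd, write_fd = os.pipe()
    runtime_pid = os.fork()
    if runtime_pid == 0:
        try:
            # Stand-in for the low level runtime: start the workload, exit.
            os.close(read_fd)
            my_pid = os.getpid()
            if os.fork() == 0:
                # Stand-in for the container: wait until the parent is gone.
                for _ in range(500):
                    if os.getppid() != my_pid:
                        break
                    time.sleep(0.01)
                os.write(write_fd, struct.pack("ii", os.getppid(), os.getpid()))
                os._exit(exit_code)
            os._exit(0)
        finally:
            os._exit(1)  # only reached if something above raised
    os.close(write_fd)
    os.waitpid(runtime_pid, 0)  # reap the short-lived "runtime"
    new_parent, orphan_pid = struct.unpack("ii", os.read(read_fd, 8))
    os.close(read_fd)
    _, status = os.waitpid(orphan_pid, 0)  # possible because we adopted it
    return new_parent, os.WEXITSTATUS(status)
```

Calling `adopt_orphan()` should report the caller's own PID as the orphan's new parent, and the caller can collect the orphan's exit status with an ordinary wait, which is what lets a shim learn how the container ended.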

Slide 19

Slide 19 text

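The role of the runtime shim: preserving the container's stdout/stderr
・Even if the high level runtime restarts or stops, the shim keeps streaming the container's stdout/stderr to a designated file
・This is what docker logs and kubectl logs rely on
[Diagram labels: Container, Shim, log]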
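As a rough illustration of the streaming described here, the sketch below runs a child process and appends each stdout/stderr line to a log file, tagged with the stream it came from. The line format, the function names, and the use of threads are assumptions made for the example; real shims and the log format read by kubectl logs differ in detail.

```python
import os
import subprocess
import sys
import tempfile
import threading


def stream_to_log(command, log_path):
    """Run `command` and append every stdout/stderr line to `log_path`."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, text=True)
    lock = threading.Lock()

    def pump(stream, name, log):
        # Copy one stream line by line, prefixing each line with its name.
        for line in stream:
            with lock:
                log.write(f"{name} {line.rstrip(chr(10))}\n")
                log.flush()
        stream.close()

    with open(log_path, "a") as log:
        threads = [
            threading.Thread(target=pump, args=(proc.stdout, "stdout", log)),
            threading.Thread(target=pump, args=(proc.stderr, "stderr", log)),
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()  # both streams reach EOF once the child has exited
    return proc.wait()


def demo():
    code = "import sys; print('hello'); print('oops', file=sys.stderr)"
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "container.log")
        stream_to_log([sys.executable, "-c", code], log_path)
        with open(log_path) as log:
            return log.read().splitlines()
```

Because the output ends up in a file, a log reader only needs the file path and does not depend on the process that originally started the container.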

Slide 20

Slide 20 text

The role of the runtime shim: preserving the container's stdout/stderr
[Diagram labels: Container, Shim, log, kubectl logs -c hoge]

Slide 21

Slide 21 text

The role of the runtime shim: preserving the container's stdout/stderr
[Diagram labels: Container, Shim, log, kubectl logs -c hoge, kubelet]

Slide 22

Slide 22 text

The role of the runtime shim: preserving the container's stdout/stderr
[Diagram labels: Container, Shim, log, kubectl logs -c hoge, kubelet, High level]

Slide 23

Slide 23 text

The role of the runtime shim: preserving the container's stdout/stderr
[Diagram labels: Container, Shim, log, kubectl logs -c hoge, kubelet, High level]

Slide 24

Slide 24 text

The role of the runtime shim
・The shim takes over the various tasks around container management
・=> The high level runtime can focus on kicking off containers and managing images

Slide 25

Slide 25 text

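Wrap Up: Runtime Shim
・High/low level runtimes tend to get the attention, but the shim is an essential component
・The low level runtime exits right after creating the container => the shim looks after it
・The shim shares information about the container with the high level runtime
・There is not much information about it available in Japanese

Recommended article (in English): https://iximiuz.com/en/posts/implementing-container-runtime-shim/
It implements a minimal runtime shim in Rust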

Slide 26

Slide 26 text

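Announcements
・I am publishing an e-book on building your own container with Impress
  - co-authored with @gorilla0513
・I will be speaking about gVisor at CNDT2020, so please come along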

Slide 27

Slide 27 text

References
・Implementing Container Runtime Shim: runc
  https://iximiuz.com/en/posts/implementing-container-runtime-shim/
・Don't Fear the Subreaper
  https://medium.com/@william.la.martin/dont-fear-the-subreaper-19c8127c031e
・Dealing with process termination in Linux (with Rust examples)
  https://iximiuz.com/en/posts/dealing-with-processes-termination-in-Linux/#awaiting-a-grandchild-process-termination
・prctl(2) - Linux manual page
  https://man7.org/linux/man-pages/man2/prctl.2.html