Slide 1

Slide 1 text

SAKURA Internet Inc. (C) Copyright 1996-2019 SAKURA Internet Inc
SAKURA Internet Research Center
Architecture of Pods and Container Runtimes
2019/03/25
Senior Researcher Ryosuke Matsumoto
Hosting Casual Talk #5 @ SAKURA Internet Fukuoka Office

Slide 2

Slide 2 text

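Ryosuke Matsumoto / Matsumotory / @matsumotory
• Senior Researcher, SAKURA Internet Research Center
• Technical Advisor, Forkwell, Grooves Inc.
• Visiting Researcher and Research Advisor, Pepabo Research and Development Institute
• Lecturer, Security Camp
• Committee member, IPSJ Special Interest Group on Internet and Operation Technology
• Ph.D. in Informatics, Kyoto University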

Slide 3

Slide 3 text

Contents
1. Classifying container runtimes
2. Pod and container runtime architecture
3. Summary

Slide 4

Slide 4 text

1. Classifying container runtimes

Slide 5

Slide 5 text

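A layered model of the container runtime
Common definition: CRI → container runtime → runtime
• It is often defined this way, but saying that a runtime such as runc sits inside the container runtime is a little hard to follow.
Layered by runtime role: CRI → CRI runtime → OCI → OCI runtime
• The two layers are defined as the CRI runtime and the OCI runtime (*1). Together, these two runtimes are called the container runtime.
CRI: Container Runtime Interface
OCI: Open Container Initiative Runtime/Image Format Specification
*1 Ian Lewis of Google Cloud defines the CRI runtime as the High-Level Runtime and the OCI runtime as the Low-Level Runtime.
https://www.ianlewis.org/en/container-runtimes-part-1-introduction-container-r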

Slide 6

Slide 6 text

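Basic layer model around containers
Layers, top to bottom: orchestration → CRI → CRI runtime → OCI → OCI runtime → Pod and its containers
• CRI runtime (cri-o, containerd, etc.): receives container configuration via CRI as directed by the orchestration layer, and manages Pods and container images.
• OCI runtime (runC, runsc, runnc, runV, kata-runtime, cc-runtime, etc.): takes the container configuration, image, and so on, performs resource allocation and privilege separation for the container, and starts it.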

Slide 7

Slide 7 text

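Example: basic layer model around containers
Layers, top to bottom: kubelet → CRI → containerd → OCI → runC → Pod and its containers
• CRI runtime (cri-o, containerd, etc.): receives container configuration via CRI as directed by the orchestration layer, and manages Pods and container images.
• OCI runtime (runC, runsc, runnc, runV, kata-runtime, cc-runtime, etc.): takes the container configuration, image, and so on, performs resource allocation and privilege separation for the container, and starts it.
• As long as the components conform to CRI and OCI, you can keep Kubernetes as the orchestration layer and freely swap the CRI runtime and the OCI runtime.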
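The swap-ability described on this slide can be sketched in a few lines of Go. This is only a toy model: the `OCIRuntime` interface, the `CRIRuntime` and `Kubelet` structs, and their methods are hypothetical names invented for illustration and are not the real CRI or OCI APIs. The point is that each layer only knows the interface of the layer directly below it.

```go
package main

import "fmt"

// OCIRuntime is a hypothetical stand-in for a low-level runtime (runC, runsc, ...).
type OCIRuntime interface {
	Name() string
	Create(containerID string) string
}

// simpleOCI is a toy OCI runtime that only reports what it would do.
type simpleOCI struct{ name string }

func (s simpleOCI) Name() string { return s.name }
func (s simpleOCI) Create(id string) string {
	return fmt.Sprintf("%s created %s", s.name, id)
}

// CRIRuntime is a hypothetical stand-in for a high-level runtime (containerd, cri-o, ...).
type CRIRuntime struct {
	name string
	oci  OCIRuntime
}

// RunContainer passes the request down one layer to the OCI runtime.
func (c CRIRuntime) RunContainer(id string) string {
	return fmt.Sprintf("%s -> %s", c.name, c.oci.Create(id))
}

// Kubelet is a hypothetical orchestration-side client that only talks to the CRI layer.
type Kubelet struct{ cri CRIRuntime }

func (k Kubelet) Schedule(id string) string {
	return "kubelet -> " + k.cri.RunContainer(id)
}

func main() {
	// Same orchestration layer, two different runtime stacks underneath.
	a := Kubelet{cri: CRIRuntime{name: "containerd", oci: simpleOCI{name: "runC"}}}
	b := Kubelet{cri: CRIRuntime{name: "cri-o", oci: simpleOCI{name: "runsc"}}}
	fmt.Println(a.Schedule("c1"))
	fmt.Println(b.Schedule("c1"))
}
```

Because `Kubelet` never names a concrete OCI runtime, replacing runC with runsc changes only the value wired in at construction time.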

Slide 8

Slide 8 text

2. Pod and container runtime architecture

Slide 9

Slide 9 text

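Pods and containers
• Kubernetes is being standardized as an orchestration tool under the CNCF
• A Pod groups multiple containers that are connected to one another
  • The Pod, which is a sandbox, is built with cgroups and unshare()
  • In code, a Pod is almost always named Sandbox
• Requirements placed on Pods
  • Security, performance, server consolidation efficiency, operational techniques, and so on
  • Pods are becoming extremely important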

Slide 10

Slide 10 text

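The growing importance of, and attention to, Pods
• Security, performance, consolidation efficiency, and operations all depend heavily on how the Pod is built
• Companies have started implementing and releasing a variety of Pod-related software
  • Google's gVisor (isolates containers with access control in userland)
  • Nabla Containers (isolates containers with a userland unikernel)
  • AWS's Firecracker (isolates containers with a MicroVM)
  • Kata Containers (isolates containers with a VM)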

Slide 11

Slide 11 text

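Basics of Pods and the CRI / OCI runtimes
• Pods are basically created by the CRI runtime
• The Pod-related API is specified in the CRI spec
  • https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/cri/runtime/v1alpha2/api.proto
• The CRI runtime creates a Pod in response to, for example, `crictl runp`
  • In the CRI spec and in containerd's Go code this is `RunPodSandbox()`
• After the Pod is created, containers are started inside the Pod by the OCI runtime (runc)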
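The two-step order on this slide (create the Pod sandbox first, then start containers inside it) can be modeled with a small in-memory sketch. Only the name `RunPodSandbox` comes from the slide; the `toyCRI` type, the `Sandbox` struct, and `StartContainer` are hypothetical and do not reproduce containerd's implementation or the real CRI message types.

```go
package main

import (
	"errors"
	"fmt"
)

// Sandbox is a toy model of a Pod: an ID plus the containers started in it.
type Sandbox struct {
	ID         string
	Containers []string
}

// toyCRI is a hypothetical in-memory CRI runtime used only for illustration.
type toyCRI struct {
	sandboxes map[string]*Sandbox
}

func newToyCRI() *toyCRI {
	return &toyCRI{sandboxes: map[string]*Sandbox{}}
}

// RunPodSandbox borrows the CRI call name; the body just records a sandbox.
func (c *toyCRI) RunPodSandbox(name string) string {
	id := "sandbox-" + name
	c.sandboxes[id] = &Sandbox{ID: id}
	return id
}

// StartContainer stands in for handing the container to an OCI runtime such as runc.
// It fails if the sandbox was not created first.
func (c *toyCRI) StartContainer(sandboxID, container string) error {
	sb, ok := c.sandboxes[sandboxID]
	if !ok {
		return errors.New("no such sandbox: " + sandboxID)
	}
	sb.Containers = append(sb.Containers, container)
	return nil
}

func main() {
	cri := newToyCRI()
	id := cri.RunPodSandbox("web")
	if err := cri.StartContainer(id, "nginx"); err != nil {
		panic(err)
	}
	fmt.Println(id, cri.sandboxes[id].Containers)

	// Starting a container without a sandbox is rejected.
	fmt.Println(cri.StartContainer("missing", "nginx"))
}
```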

Slide 12

Slide 12 text

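Pods and the OCI runtime
• To begin with, the OCI spec does not mention Pods
  • https://www.opencontainers.org/release-notices/v1-0-0
• The OCI spec defines things such as the parameters and handlers needed to start a container
  • runtime-spec and image-spec
• Pods and containers are closely related
• When you want to change how a Pod behaves, should the extra implementation go into the CRI runtime?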

Slide 13

Slide 13 text

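Looking at containerd's RunPodSandbox
• containerd is a representative CRI runtime implementation
• Inside RunPodSandbox(), getSandboxRuntime() is used to call the runtime-specific implementation
• The runtime-specific implementation, the Runtime Handler, is invoked from the OCI runtime
  • `ociRuntime, err := c.getSandboxRuntime(config, r.GetRuntimeHandler())`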

Slide 14

Slide 14 text

Looking at containerd's RunPodSandbox
• Inside getSandboxRuntime(), the workload is checked and the runtime is selected
    if untrustedWorkload(config) {
        // An untrusted workload must not also name a different runtime handler.
        if runtimeHandler != "" && runtimeHandler != criconfig.RuntimeUntrusted {
            return criconfig.Runtime{}, errors.New("untrusted workload with explicit runtime handler is not allowed")
        }
        // An untrusted workload must not access the host.
        if hostAccessingSandbox(config) {
            return criconfig.Runtime{}, errors.New("untrusted workload with host access is not allowed")
        }
        // Prefer the dedicated untrusted-workload runtime when one is configured.
        if c.config.ContainerdConfig.UntrustedWorkloadRuntime.Type != "" {
            return c.config.ContainerdConfig.UntrustedWorkloadRuntime, nil
        }
        runtimeHandler = criconfig.RuntimeUntrusted
    }
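The excerpt above depends on containerd's internal types, so it cannot be run on its own. The following is a self-contained sketch of the same selection logic with simplified stand-ins: `Runtime`, `config`, `selectRuntime`, and the value of `runtimeUntrusted` are assumptions made for illustration, and the fallback after the untrusted branch (default runtime, then a lookup in a handler table) is an assumed continuation that the slide does not show.

```go
package main

import (
	"errors"
	"fmt"
)

// Runtime is a simplified stand-in for criconfig.Runtime.
type Runtime struct{ Type string }

// runtimeUntrusted is an assumed value standing in for criconfig.RuntimeUntrusted.
const runtimeUntrusted = "untrusted"

// config is a hypothetical, flattened version of the containerd CRI config.
type config struct {
	defaultRuntime   Runtime
	untrustedRuntime Runtime
	handlers         map[string]Runtime
}

// selectRuntime mirrors the checks shown on the slide for untrusted workloads.
// The code after the untrusted branch is an assumption, not taken from the slide.
func selectRuntime(c config, untrusted, hostAccess bool, handler string) (Runtime, error) {
	if untrusted {
		if handler != "" && handler != runtimeUntrusted {
			return Runtime{}, errors.New("untrusted workload with explicit runtime handler is not allowed")
		}
		if hostAccess {
			return Runtime{}, errors.New("untrusted workload with host access is not allowed")
		}
		if c.untrustedRuntime.Type != "" {
			return c.untrustedRuntime, nil
		}
		handler = runtimeUntrusted
	}
	if handler == "" {
		return c.defaultRuntime, nil
	}
	r, ok := c.handlers[handler]
	if !ok {
		return Runtime{}, fmt.Errorf("no runtime for %q is configured", handler)
	}
	return r, nil
}

func main() {
	c := config{
		defaultRuntime:   Runtime{Type: "runc"},
		untrustedRuntime: Runtime{Type: "runsc"},
		handlers:         map[string]Runtime{"kata": {Type: "kata-runtime"}},
	}
	r, err := selectRuntime(c, true, false, "")
	fmt.Println(r.Type, err)

	_, err = selectRuntime(c, true, true, "")
	fmt.Println(err)
}
```

Passing the two booleans explicitly keeps the sketch independent of the Pod config type; in the excerpt they are derived from `config` by `untrustedWorkload()` and `hostAccessingSandbox()`.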

Slide 15

Slide 15 text

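A setting exists to bypass Pod handling to the OCI runtime
• The `untrusted-workload` setting bypasses Pod handling
  • Instead of re-implementing the Pod in the CRI runtime, the OCI runtime's implementation is used
  • `untrusted-workload` hands Pod handling over to the OCI runtime
  • This also lets the sandbox feature be offered when docker or an OCI runtime is used on its own
    apiVersion: v1
    kind: Pod
    metadata:
      name: container-untrusted
      annotations:
        io.kubernetes.cri.untrusted-workload: "true"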
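How a runtime might act on this annotation can be sketched as a plain map lookup. The annotation key is the one shown in the manifest above; the `PodMetadata` struct and the `isUntrusted` function are hypothetical names for illustration and are not containerd's actual types.

```go
package main

import "fmt"

// untrustedWorkloadAnnotation is the annotation key from the Pod manifest on the slide.
const untrustedWorkloadAnnotation = "io.kubernetes.cri.untrusted-workload"

// PodMetadata is a hypothetical, minimal view of a Pod's metadata.
type PodMetadata struct {
	Name        string
	Annotations map[string]string
}

// isUntrusted reports whether the Pod opted in via the annotation.
// Reading from a nil map is safe in Go and yields the empty string.
func isUntrusted(m PodMetadata) bool {
	return m.Annotations[untrustedWorkloadAnnotation] == "true"
}

func main() {
	untrusted := PodMetadata{
		Name:        "container-untrusted",
		Annotations: map[string]string{untrustedWorkloadAnnotation: "true"},
	}
	plain := PodMetadata{Name: "container-plain"}

	fmt.Println(untrusted.Name, isUntrusted(untrusted))
	fmt.Println(plain.Name, isUntrusted(plain))
}
```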

Slide 16

Slide 16 text

Implementing Pod handling on the OCI runtime side
• gVisor, for example, creates a userland kernel as the Pod
  • `crictl runp --runtime=runsc pod-config.json`
• Pod handling is bypassed to runsc, gVisor's OCI runtime
  • gvisor-containerd-shim hooks into containerd's runtime handler
• runsc receives the Pod request and creates a sandbox equivalent to a Pod
  • `createSandboxProcess()` in `gvisor/runsc/sandbox/sandbox.go`

Slide 17

Slide 17 text

The extension method differs by containerd version
1. The Untrusted Workload CRI extension of containerd v1.1 and later is deprecated
2. containerd v1.2 and later: bypass to OCI with a CRI Runtime handler
  • https://github.com/google/gvisor-containerd-shim/blob/master/docs/runtime-handler-quickstart.md
3. containerd v1.2 and later: bypass with a CRI Runtime handler that uses shim v2
  • https://github.com/google/gvisor-containerd-shim/blob/master/docs/runtime-handler-shim-v2-quickstart.md
  • Runtime v2
    • https://github.com/containerd/containerd/tree/master/runtime/v2
  • With containerd-shim-runsc-v1 the configuration also becomes simpler

Slide 18

Slide 18 text

The same applies to Kata Containers
• Pod management (CreateSandbox() in code) is bypassed from the CRI runtime
• kata-runtime also takes over Pod management from cri-o, the CRI runtime
• CRI → RunPodSandbox() → cri-o → create subcommand → kata-runtime → CreateSandbox() → virtcontainers → various VM settings → hypervisor → start proxy → start shim-pod → start agent inside the VM → kata-runtime → Pod start complete → completion notified to cri-o
  • https://github.com/kata-containers/runtime/blob/master/cli/create.go#L89
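The "bypass" idea, where the CRI runtime hands Pod creation to an OCI runtime that knows how to build its own sandbox, can be sketched with an optional Go interface. Everything here is a toy: `OCIRuntime`, `SandboxCapable`, `runPodSandbox`, and the two runtime types are hypothetical names, and only `CreateSandbox` echoes a function name mentioned on the slide. It is not how cri-o or kata-runtime are actually wired together.

```go
package main

import "fmt"

// OCIRuntime is a hypothetical minimal interface every low-level runtime satisfies.
type OCIRuntime interface {
	Name() string
}

// SandboxCapable is a hypothetical optional interface for runtimes that
// create the Pod sandbox themselves (for example, as a VM).
type SandboxCapable interface {
	CreateSandbox(pod string) string
}

// runc stands in for a runtime that leaves Pod creation to the CRI runtime.
type runc struct{}

func (runc) Name() string { return "runc" }

// kataRuntime stands in for a runtime that takes over Pod creation.
type kataRuntime struct{}

func (kataRuntime) Name() string { return "kata-runtime" }
func (kataRuntime) CreateSandbox(pod string) string {
	return "VM-backed sandbox for " + pod
}

// runPodSandbox bypasses Pod creation to the OCI runtime when it supports it,
// and otherwise falls back to a sandbox built by the CRI runtime itself.
func runPodSandbox(rt OCIRuntime, pod string) string {
	if sc, ok := rt.(SandboxCapable); ok {
		return rt.Name() + ": " + sc.CreateSandbox(pod)
	}
	return "cri: namespace/cgroup sandbox for " + pod
}

func main() {
	fmt.Println(runPodSandbox(kataRuntime{}, "web"))
	fmt.Println(runPodSandbox(runc{}, "web"))
}
```

The type assertion keeps the CRI side unaware of any particular runtime: a new sandboxing runtime only has to implement the optional interface.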

Slide 19

Slide 19 text

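Starting a container from Docker
• When run from the Docker command, too, the VM is started first and then the container
• The OCI runtime's `Create` command starts both the VM and the container
  • CreateSandbox() is called, and then the container is started
• The VM is started by inserting processing into the hooks that the OCI spec defines for container startup
  • Specifically, the `pre-start` hook performs the processing needed to start the VM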
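The ordering described here (run the hook that prepares the VM, then create the container) can be illustrated with a toy lifecycle. This is an assumption-heavy sketch: the OCI runtime-spec describes hooks as external programs declared in the container configuration, whereas the `Hook` function type, `toyRuntime`, and its `Create` method below are hypothetical in-process stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// Hook is a hypothetical in-process hook; real OCI hooks are external programs.
type Hook func(containerID string) error

// errNoHypervisor is an illustrative error used to show a failing hook.
var errNoHypervisor = errors.New("no hypervisor available")

// toyRuntime is a hypothetical runtime that records what it did, in order.
type toyRuntime struct {
	prestart []Hook
	log      []string
}

// Create runs every pre-start hook first and only then records the container
// as created. A failing hook aborts the creation.
func (r *toyRuntime) Create(id string) error {
	for _, h := range r.prestart {
		if err := h(id); err != nil {
			return fmt.Errorf("pre-start hook failed: %w", err)
		}
	}
	r.log = append(r.log, "container "+id+" created")
	return nil
}

func main() {
	r := &toyRuntime{}
	// The hook stands in for the processing needed to boot the VM.
	r.prestart = append(r.prestart, func(id string) error {
		r.log = append(r.log, "VM booted for "+id)
		return nil
	})
	if err := r.Create("c1"); err != nil {
		panic(err)
	}
	for _, line := range r.log {
		fmt.Println(line)
	}

	// A failing hook prevents the container from being created.
	broken := &toyRuntime{prestart: []Hook{func(string) error { return errNoHypervisor }}}
	fmt.Println(broken.Create("c2"))
}
```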

Slide 20

Slide 20 text

With shim v2, the Pod bypass also becomes clean
ref: https://github.com/kata-containers/documentation/blob/master/architecture.md

Slide 21

Slide 21 text

3. Summary

Slide 22

Slide 22 text

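Architecture of Pods and container runtimes
• The CRI runtime in k8s and containerd controls the Pod
• For untrusted workloads, Pod management is bypassed to the OCI runtime
  • `crictl runp` is bypassed to the OCI runtime, and the Pod is created and started by `CreateSandbox()`, `StartSandbox()`, and similar functions on the OCI runtime
• The Pod is not specified in the OCI spec, yet every implementation realizes it in the OCI runtime
  • gVisor, Kata Containers, Firecracker, Nabla Containers, and others
  • Presumably this is so that sandbox functionality, not only Pods, can also be provided by the OCI runtime on its own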