Architecture of Pods and Container Runtimes


Hosting Casual Talk #5 @ SAKURA internet Fukuoka Office

2019/03/22
SAKURA internet Inc.
SAKURA internet Research Center

MATSUMOTO Ryosuke (松本亮介 / まつもとりー) / @matsumotory


MATSUMOTO Ryosuke

March 22, 2019

Transcript

  1. 2.

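    MATSUMOTO Ryosuke (松本亮介 / まつもとりー) / @matsumotory

    • Senior Researcher, SAKURA internet Research Center
    • Technical Advisor, Forkwell (Grooves Inc.)
    • Visiting Researcher and Research Advisor, Pepabo Research and Development Institute
    • Lecturer, Security Camp
    • Committee member, Special Interest Group on Internet and Operation Technology, Information Processing Society of Japan
    • Ph.D. in Informatics, Kyoto University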
  2. 5.

    Layering the container runtime

    Common definition: CRI → container runtime → runtime. It is often drawn this way, but saying that a runtime such as runc lives inside the "container runtime" is a little hard to follow.

    Layered by role: CRI → CRI runtime → OCI → OCI runtime. Here the two layers are defined as the CRI runtime and the OCI runtime (※1), and the two together are called the container runtime.

    CRI: Container Runtime Interface
    OCI: Open Container Initiative Runtime/Image Format Specification

    ※1 Ian Lewis of Google Cloud defines the CRI runtime as the High-Level Runtime and the OCI runtime as the Low-Level Runtime. https://www.ianlewis.org/en/container-runtimes-part-1-introduction-container-r
  3. 6.

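    Basic layer model around containers

    Orchestration → CRI → CRI runtime → OCI → OCI runtime → Pod and its containers

    • CRI runtime (cri-o, containerd, etc.): receives container configuration from the orchestrator via CRI and manages Pods and container images.
    • OCI runtime (runC, runsc, runnc, runV, kata-runtime, cc-runtime, etc.): starts containers from the container configuration and image, applying resource allocation and privilege separation.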
  4. 7.

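    Example: basic layer model around containers

    kubelet → CRI → containerd → OCI → runC → Pod and its containers

    • CRI runtime (cri-o, containerd, etc.): receives container configuration from the orchestrator via CRI and manages Pods and container images.
    • OCI runtime (runC, runsc, runnc, runV, kata-runtime, cc-runtime, etc.): starts containers from the container configuration and image, applying resource allocation and privilege separation.
    • As long as each layer conforms to CRI and OCI, you can keep Kubernetes as the orchestration layer and freely swap the CRI runtime and the OCI runtime.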
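The swappability described on this slide can be sketched in Go. This is only an illustrative model, not code from containerd or Kubernetes: the `OCIRuntime` interface, the `CRIRuntime` struct, and the `namedRuntime` type are hypothetical names introduced here.

```go
package main

import "fmt"

// OCIRuntime is a hypothetical stand-in for the low-level layer
// (runC, runsc, kata-runtime, ...): it starts a single container.
type OCIRuntime interface {
	Run(containerID string) string
}

// CRIRuntime is a hypothetical stand-in for the high-level layer
// (containerd, cri-o, ...): it delegates container start to an OCI runtime.
type CRIRuntime struct {
	Name string
	OCI  OCIRuntime
}

// StartContainer hands the actual container start down to the OCI layer.
func (c CRIRuntime) StartContainer(id string) string {
	return fmt.Sprintf("%s -> %s", c.Name, c.OCI.Run(id))
}

// namedRuntime is a toy OCI runtime identified only by its name.
type namedRuntime string

func (n namedRuntime) Run(id string) string {
	return fmt.Sprintf("%s starts %s", string(n), id)
}

func main() {
	// The orchestration-side call stays the same while the OCI runtime changes.
	for _, oci := range []OCIRuntime{namedRuntime("runC"), namedRuntime("runsc")} {
		cri := CRIRuntime{Name: "containerd", OCI: oci}
		fmt.Println("kubelet -> " + cri.StartContainer("c1"))
	}
}
```

The point of the sketch is that the caller depends only on the interface between layers, which is the role CRI and OCI play in the real stack.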
  5. 11.

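    Basics of Pods and CRI / OCI runtimes

    • A Pod is basically created by the CRI runtime.
    • The API for Pods is specified in the CRI specification.
        • https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/cri/runtime/v1alpha2/api.proto
    • The CRI runtime creates a Pod in response to, for example, `crictl runp`.
        • In the CRI spec and in containerd's Go code this is `RunPodSandbox()`.
    • After the Pod is created, the OCI runtime (runc) starts containers inside the Pod's environment.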
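The ordering above (Pod sandbox first, containers second) can be sketched with a toy in-memory model. The method names `RunPodSandbox` and `CreateContainer` mirror CRI RPC names, but the `criRuntime` type and the simplified signatures are assumptions made for illustration and do not match the real gRPC API.

```go
package main

import (
	"errors"
	"fmt"
)

// criRuntime is a toy in-memory model of a CRI runtime (hypothetical type).
type criRuntime struct {
	sandboxes map[string][]string // sandbox ID -> container names
	nextID    int
}

func newCRIRuntime() *criRuntime {
	return &criRuntime{sandboxes: map[string][]string{}}
}

// RunPodSandbox creates the Pod-level environment and returns its ID.
func (r *criRuntime) RunPodSandbox(podName string) string {
	r.nextID++
	id := fmt.Sprintf("%s-%d", podName, r.nextID)
	r.sandboxes[id] = nil
	return id
}

// CreateContainer places a container inside an existing sandbox. In a real
// stack this is the point where the OCI runtime (e.g. runc) gets invoked.
func (r *criRuntime) CreateContainer(sandboxID, name string) error {
	if _, ok := r.sandboxes[sandboxID]; !ok {
		return errors.New("no such pod sandbox: " + sandboxID)
	}
	r.sandboxes[sandboxID] = append(r.sandboxes[sandboxID], name)
	return nil
}

func main() {
	r := newCRIRuntime()
	id := r.RunPodSandbox("web")
	if err := r.CreateContainer(id, "nginx"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id, r.sandboxes[id])
	// A container cannot be created without a sandbox to hold it.
	fmt.Println(r.CreateContainer("missing", "nginx"))
}
```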
  6. 12.

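    Pods and OCI runtimes

    • The OCI spec does not mention Pods at all.
        • https://www.opencontainers.org/release-notices/v1-0-0
    • The OCI spec defines things such as the parameters and handlers needed to start a container.
        • runtime-spec and image-spec
    • Pods and containers are closely related.
    • When you want to change how a Pod behaves, should the change be implemented in the CRI runtime?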
  7. 14.

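    Looking at RunPodSandbox in containerd

    • getSandboxRuntime() checks the workload and selects the runtime to call (excerpt):

        if untrustedWorkload(config) {
            // An untrusted workload must not name a different runtime handler.
            if runtimeHandler != "" && runtimeHandler != criconfig.RuntimeUntrusted {
                return criconfig.Runtime{}, errors.New("untrusted workload with explicit runtime handler is not allowed")
            }
            // An untrusted workload must not be given access to the host.
            if hostAccessingSandbox(config) {
                return criconfig.Runtime{}, errors.New("untrusted workload with host access is not allowed")
            }
            // Prefer the dedicated untrusted-workload runtime when one is configured.
            if c.config.ContainerdConfig.UntrustedWorkloadRuntime.Type != "" {
                return c.config.ContainerdConfig.UntrustedWorkloadRuntime, nil
            }
            // Otherwise fall through to the runtime handler named "untrusted".
            runtimeHandler = criconfig.RuntimeUntrusted
        }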
  8. 15.

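    There is a setting for bypassing Pod handling to the OCI runtime

    • The `untrusted-workload` setting exists to bypass Pod handling.
    • Instead of re-implementing Pod support in the CRI runtime, the OCI runtime's implementation is used.
    • `untrusted-workload` hands Pod handling over to the OCI runtime.
    • This way the sandbox feature is also available when the OCI runtime is used from docker or on its own.

        apiVersion: v1
        kind: Pod
        metadata:
          name: container-untrusted
          annotations:
            io.kubernetes.cri.untrusted-workload: "true"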
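How the annotation could drive runtime selection is sketched below as a self-contained Go program. It follows the checks in the `getSandboxRuntime()` excerpt, but everything else is an assumption: the `config` struct, the `selectRuntime` function, the handler name `"untrusted"`, and the fallback after the untrusted branch (default runtime or handler lookup) are simplified stand-ins, not containerd's actual types.

```go
package main

import (
	"errors"
	"fmt"
)

// Annotation key taken from the Pod manifest on the slide.
const untrustedAnnotation = "io.kubernetes.cri.untrusted-workload"

// runtimeUntrusted is an assumed handler name used for illustration.
const runtimeUntrusted = "untrusted"

// config is a simplified, hypothetical stand-in for the CRI plugin config.
type config struct {
	defaultRuntime   string
	untrustedRuntime string            // dedicated untrusted-workload runtime, if any
	runtimes         map[string]string // handler name -> OCI runtime
}

func selectRuntime(c config, annotations map[string]string, hostAccess bool, handler string) (string, error) {
	if annotations[untrustedAnnotation] == "true" {
		if handler != "" && handler != runtimeUntrusted {
			return "", errors.New("untrusted workload with explicit runtime handler is not allowed")
		}
		if hostAccess {
			return "", errors.New("untrusted workload with host access is not allowed")
		}
		if c.untrustedRuntime != "" {
			return c.untrustedRuntime, nil
		}
		handler = runtimeUntrusted
	}
	// Assumed continuation: use the default runtime or look the handler up.
	if handler == "" {
		return c.defaultRuntime, nil
	}
	rt, ok := c.runtimes[handler]
	if !ok {
		return "", fmt.Errorf("no runtime for %q is configured", handler)
	}
	return rt, nil
}

func main() {
	c := config{defaultRuntime: "runc", runtimes: map[string]string{"untrusted": "runsc"}}
	untrusted := map[string]string{untrustedAnnotation: "true"}
	fmt.Println(selectRuntime(c, nil, false, ""))       // trusted Pod
	fmt.Println(selectRuntime(c, untrusted, false, "")) // untrusted Pod
	fmt.Println(selectRuntime(c, untrusted, true, ""))  // untrusted Pod with host access
}
```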
  9. 16.

    Implementing Pod handling on the OCI runtime side

    • For example, gVisor creates a userland kernel as the Pod.
        • `crictl runp --runtime=runsc pod-config.json`
    • Pod handling is bypassed to runsc, gVisor's OCI runtime.
    • gvisor-containerd-shim is used to hook into containerd's runtime handler.
    • runsc receives the Pod request and creates a sandbox equivalent to the Pod.
        • `createSandboxProcess()` in `gvisor/runsc/sandbox/sandbox.go`
  10. 17.

    The extension mechanism differs between containerd versions

    1. The Untrusted Workload CRI extension, available since containerd v1.1, is deprecated.
    2. containerd v1.2 or later: bypass to the OCI runtime through the CRI runtime handler.
        • https://github.com/google/gvisor-containerd-shim/blob/master/docs/runtime-handler-quickstart.md
    3. containerd v1.2 or later: bypass through the CRI runtime handler using shim v2.
        • https://github.com/google/gvisor-containerd-shim/blob/master/docs/runtime-handler-shim-v2-quickstart.md
        • Runtime v2
            • https://github.com/containerd/containerd/tree/master/runtime/v2
        • Using containerd-shim-runsc-v1 also keeps the configuration simple.
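One reason the shim v2 configuration stays simple is that, under the Runtime v2 naming convention, the shim binary name is derived from the configured runtime type. The sketch below illustrates that convention; the function name `shimBinaryName` and the runtime type string `io.containerd.runsc.v1` are assumptions chosen to match the `containerd-shim-runsc-v1` binary mentioned on the slide, and this is not containerd's own code.

```go
package main

import (
	"fmt"
	"strings"
)

// shimBinaryName maps a runtime type of the form "<prefix>.<name>.<version>"
// to "containerd-shim-<name>-<version>". It returns "" for malformed input.
func shimBinaryName(runtimeType string) string {
	parts := strings.Split(runtimeType, ".")
	if len(parts) < 2 {
		return ""
	}
	name, version := parts[len(parts)-2], parts[len(parts)-1]
	if name == "" || version == "" {
		return ""
	}
	return fmt.Sprintf("containerd-shim-%s-%s", name, version)
}

func main() {
	fmt.Println(shimBinaryName("io.containerd.runsc.v1"))
	fmt.Println(shimBinaryName("io.containerd.runc.v1"))
}
```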
  11. 18.

    The same applies to Kata-Containers

    • Pod management (`CreateSandbox()` in the code) is bypassed from the CRI runtime.
    • kata-runtime also takes over Pod management from cri-o, the CRI runtime.
    • CRI → RunPodSandbox() → cri-o → create subcommand → kata-runtime → CreateSandbox() → virtcontainers → various VM setup → hypervisor → proxy start → shim-pod start → agent start inside the VM → kata-runtime → Pod startup complete → completion reported to cri-o
        • https://github.com/kata-containers/runtime/blob/master/cli/create.go#L89
  12. 19.

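    Starting a container from Docker

    • When run through the Docker command, the VM is also started first and the container afterwards.
    • The OCI runtime's `Create` command starts both the VM and the container.
        • The container is started after CreateSandbox().
    • VM startup is inserted through the hooks that the OCI spec defines for container startup.
        • Concretely, the processing needed to start the VM is done in the `pre-start` hook.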
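The difference between the CRI path and the Docker path can be sketched as a dispatch inside the OCI `create` step. This is a hypothetical model, not Kata's implementation: the `vmRuntime` type, the string-valued `kind` argument, and the rule that an empty kind means "no CRI sandbox was requested, so create both" are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// vmRuntime is a hypothetical model of a VM-based OCI runtime.
type vmRuntime struct {
	vmUp bool
	log  []string
}

// createSandbox boots the VM that plays the role of the Pod.
func (r *vmRuntime) createSandbox(id string) {
	r.vmUp = true
	r.log = append(r.log, "boot VM for sandbox "+id)
}

// createContainer needs a running VM to place the container in.
func (r *vmRuntime) createContainer(id string) error {
	if !r.vmUp {
		return errors.New("no sandbox VM for container " + id)
	}
	r.log = append(r.log, "create container "+id+" inside VM")
	return nil
}

// create models the OCI create step. kind is "sandbox" or "container" when a
// CRI runtime drives the runtime, and "" when called without Pod information.
func (r *vmRuntime) create(id, kind string) error {
	switch kind {
	case "sandbox":
		r.createSandbox(id)
		return nil
	case "container":
		return r.createContainer(id)
	case "":
		// Docker-style call: start the VM and the container in one step.
		r.createSandbox(id)
		return r.createContainer(id)
	default:
		return fmt.Errorf("unknown container type %q", kind)
	}
}

func main() {
	docker := &vmRuntime{}
	if err := docker.create("c1", ""); err != nil {
		fmt.Println("error:", err)
	}
	for _, line := range docker.log {
		fmt.Println(line)
	}
}
```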
  13. 21.
  14. 22.

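    Architecture of Pods and container runtimes

    • k8s and containerd's CRI runtime control the Pod.
    • For untrusted workloads, Pod management is bypassed to the OCI runtime.
        • `crictl runp` is bypassed to the OCI runtime, which creates and starts the Pod with functions such as `CreateSandbox()` and `StartSandbox()`.
    • The Pod is not part of the OCI spec, yet every one of these implementations realizes it in the OCI runtime.
        • gVisor, Kata-Containers, Firecracker, Nabla-Containers, etc.
    • My reading is that this is done so that the OCI runtime can offer sandboxing on its own, not only as a Pod.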