

Sohan Kunkerkar
April 11, 2024

CRI-O Odyssey: Exploring New Frontiers in Container Runtimes

No journey is ever really done; it only continues differently. CRI-O's effort to be the best container runtime made specifically for Kubernetes is no different, even after graduating within the CNCF. In this talk, join the CRI-O developers as they walk you through the new frontiers of container runtimes, such as integration with WebAssembly (WASM), secured and simplified Podman-in-Kubernetes, and the present and future of Confidential Computing. This session will also cover initiatives CRI-O is following within SIG-Node, such as CRI stats and separate image file systems. It caters to both newcomers and seasoned users, offering insights into CRI-O's new features and the journey beyond.


Transcript

  1. Contents
    • Introduction and recognition of CRI-O’s graduation within the CNCF
    • Commitment to ongoing evolution and improvement
    • Overview of recent updates and improvements to CRI-O
    • Exploration of the present and future landscape of Confidential Containers
    • Initiatives within SIG-Node
      ◦ CRI stats
      ◦ Split Image Filesystem
    • New Frontiers in Container Runtimes
      ◦ Unlocking the potential with WASM
      ◦ Podman-in-Kubernetes: Simple and Secured
    • Future work
  2. What’s new in CRI-O v1.30?
    • Support for OCI artifact seccomp profiles
      ◦ https://kubernetes.io/blog/2024/03/07/cri-o-seccomp-oci-artifacts/
    • s390x architecture support
    • Support for split image filesystem
    • Support for specifying the timezone for pods/containers
    • Automatic OpenTelemetry instrumentation of ttrpc calls to NRI plugins
  3. Confidential Containers support in CRI-O: What is Confidential Containers?
    Goal: Run sensitive workloads (Pods) in a Trusted Execution Environment (TEE)
    Execute the container in a VM (using kata-containers)
    Key concept: don’t trust the host!
    https://www.cncf.io/projects/confidential-containers/
  4. [Architecture diagram. Labels shown: Relying Party; Key Broker Service (KBS) + Attestation Service; Confidential VM; VMM (Cloud-Hypervisor/QEMU) with hardware support (TDX/SEV-SNP/etc); CRI-O; containerd; kata-shim-v2; kata-agent; Container Image 🔑; image management; Restricted API via vsock; Confidential Containers Attestation Agent; firmware; kubelet; Container Image Registry; image-rs; Worker node]
  5. Confidential Containers support in CRI-O: What is needed from CRI-O?
    • Provide the image ID to the VM
    • First attempt: relay the “PullImageRequest”
      But too invasive for both containerd and CRI-O
    • Now: relies on optional data added to the “CreateContainerRequest”
    • In containerd: implementation is made in the Nydus Snapshotter
      {
        "volume_type":"image_guest_pull",
        "source":"quay.io/myrepo/image:tag",
        "fs_type":"overlayfs",
        "image_pull":{ "metadata":{ } }
      }
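The optional data shown on slide 5 is a small JSON document. As a minimal sketch, it could be assembled like this; the helper name `guest_pull_volume` is hypothetical, and how the runtime attaches the payload to the CreateContainerRequest is not shown in the slides and is not modelled here.

```python
import json


def guest_pull_volume(image_ref: str) -> dict:
    """Build the guest-pull descriptor from the slide for one image reference.

    Hypothetical helper: only the field names and values come from the slide.
    """
    return {
        "volume_type": "image_guest_pull",
        "source": image_ref,
        "fs_type": "overlayfs",
        "image_pull": {"metadata": {}},
    }


if __name__ == "__main__":
    # Serialize the descriptor for the image reference used on the slide.
    print(json.dumps(guest_pull_volume("quay.io/myrepo/image:tag"), indent=2))
```

The point of the design on the slide is that only an image reference travels to the guest; the image itself is pulled inside the VM.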
  6. Confidential Containers support in CRI-O: What we have done
    • No “snapshotter” plugin in CRI-O
    • Modify the CreateContainerRequest while processing it
    • Less invasive, because it only touches the kata-specific code in CRI-O
    • Need to prevent CRI-O from pulling the image
  7. Confidential Containers support in CRI-O: What remains
    • Pull in host with shared volume
      Benefit: download the image once, and share it to all VMs that run the same container
    • Possible KEP to simplify image management from the runtime?
      https://github.com/containerd/containerd/issues/9377
  8. CRI Stats/Metrics Update: Goals
    • Improve performance and reduce confusion on metrics collection in the Kubelet.
      => single component for pod level metrics in Summary API.
    • Eliminate dependencies on container runtime clients used by cAdvisor.
    • Enhance CRI implementations to provide metrics analogous to the existing metrics provided by /metrics/cadvisor.
  9. Stats/Metrics Today
    Kubelet exposes the cAdvisor metrics via
    • /metrics/cadvisor (direct prometheus)
      kubectl get --raw "/api/v1/nodes/kind-worker/proxy/metrics/cadvisor"
    • /stats/summary (json)
      kubectl get --raw "/api/v1/nodes/kind-worker/proxy/stats/summary"
    • /metrics/resource (metrics server)
    Kubelet also depends on cAdvisor for:
    • Gathering node level stats
    • Eviction Manager
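The endpoints on slide 9 are reached through the API server's node proxy, and /stats/summary returns JSON. The sketch below builds the proxy path and reads per-pod memory working sets from a Summary API document. The function names and the sample data are made up for illustration; the field names (`pods`, `podRef`, `memory.workingSetBytes`) follow the Kubelet Summary API.

```python
def proxy_path(node: str, endpoint: str) -> str:
    """Return the API-server proxy path for a kubelet endpoint on a node."""
    return f"/api/v1/nodes/{node}/proxy/{endpoint.lstrip('/')}"


def pod_working_sets(summary: dict) -> dict:
    """Map "namespace/name" to memory working set bytes for each pod."""
    result = {}
    for pod in summary.get("pods", []):
        ref = pod.get("podRef", {})
        key = f"{ref.get('namespace', '')}/{ref.get('name', '')}"
        # workingSetBytes may be absent, in which case None is recorded.
        result[key] = pod.get("memory", {}).get("workingSetBytes")
    return result


# Illustrative sample in the shape of /stats/summary output (values invented).
SAMPLE_SUMMARY = {
    "node": {"nodeName": "kind-worker"},
    "pods": [
        {
            "podRef": {"name": "web-0", "namespace": "default"},
            "memory": {"workingSetBytes": 1048576},
        },
        {
            "podRef": {"name": "coredns-abc", "namespace": "kube-system"},
            "memory": {"workingSetBytes": 2097152},
        },
    ],
}

if __name__ == "__main__":
    print(proxy_path("kind-worker", "/stats/summary"))
    print(pod_working_sets(SAMPLE_SUMMARY))
```

In practice the input would come from the `kubectl get --raw` command on the slide rather than from an inline sample.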
  10. Stats/Metrics Tomorrow
    Kubelet exposes the CRI metrics via
    • /metrics/cadvisor
      Interpreted from Metrics object of CRI
    • /stats/summary, /metrics/resource
      Interpreted from Stats object of CRI
    Kubelet still depends on cAdvisor for:
    • Gathering node level stats
    • Eviction Manager
  11. CRI Stats/Metrics Update
    • Aiming to support CRI Metrics in 1.30.0
    • Advocating for containerd support
    • KEP to Beta after that
  12. Split Image Filesystem Update
    • Currently in Alpha state in Kubernetes 1.30
    • Separates read-only image layers from writable container data
    • Allows a separate filesystem for read-only layers (images)
    • Writable layers and ephemeral storage stay on the same filesystem
    • Current limitation: doesn’t support separating writable layers from the nodefs
  13. Split Image Filesystem Update
    • Container runtime filesystem
      ◦ read-only + writable layer = imagefs
    • Configure /etc/containers/storage.conf for the temporary and the primary storage location
    • Future work:
      ◦ Improved eviction policies
      ◦ Clearer ephemeral storage reporting
      ◦ More runtime configuration options
    • Blog link: https://kubernetes.io/blog/2024/01/23/kubernetes-separate-image-filesystem/
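Slide 13 points at /etc/containers/storage.conf without showing its contents. The sketch below renders a fragment on the assumption that the relevant keys are `runroot` (temporary storage), `graphroot` (primary storage) and `imagestore` (a separate location for read-only image layers) from containers-storage.conf. The function name and all paths are hypothetical.

```python
def render_storage_conf(graphroot: str, runroot: str, imagestore: str = "") -> str:
    """Render a minimal [storage] section; imagestore is added only when set."""
    lines = [
        "[storage]",
        'driver = "overlay"',
        f'runroot = "{runroot}"',      # temporary storage location
        f'graphroot = "{graphroot}"',  # primary storage location
    ]
    if imagestore:
        # Assumed key for placing read-only image layers on another filesystem.
        lines.append(f'imagestore = "{imagestore}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical paths: images live on a separate mount at /mnt/images.
    print(render_storage_conf(
        graphroot="/var/lib/containers/storage",
        runroot="/run/containers/storage",
        imagestore="/mnt/images",
    ))
```

Leaving `imagestore` empty renders the single-filesystem layout, where image layers live under the primary location.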
  14. Unlocking the Potential with WASM
    • Edge Computing Agility
      ◦ Enables cross-architecture, lightweight deployments
    • Dynamic Scaling
      ◦ Low disk footprint and rapid startup time optimize resource efficiency
    • Security-Enhanced Microservices
      ◦ Provides module signing and runtime security controls
    • Polyglot Microservices Architecture
      ◦ Enables polyglot programming for language flexibility
  15. Implementation
    • Treat images with the wasi/wasm platform as WASM by default
    • Introduction of platform_runtime_paths to the RuntimeConfig
    • https://github.com/cri-o/cri-o/pull/7180
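The idea behind platform_runtime_paths can be sketched as a lookup from an image's platform to a runtime binary, falling back to the handler's default path. This is an illustrative model only, not CRI-O's implementation: the class, the method name and the binary paths are hypothetical, and only the `platform_runtime_paths` name and the wasi/wasm platform come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class RuntimeHandler:
    """Hypothetical model of a runtime handler entry."""

    runtime_path: str  # default binary for this handler
    platform_runtime_paths: dict = field(default_factory=dict)

    def path_for(self, os_name: str, arch: str) -> str:
        """Pick the binary registered for "os/arch", else the default."""
        return self.platform_runtime_paths.get(f"{os_name}/{arch}", self.runtime_path)


if __name__ == "__main__":
    handler = RuntimeHandler(
        runtime_path="/usr/bin/crun",  # hypothetical path
        platform_runtime_paths={"wasi/wasm": "/usr/bin/crun-wasm"},  # hypothetical path
    )
    print(handler.path_for("wasi", "wasm"))    # WASM-capable binary
    print(handler.path_for("linux", "amd64"))  # default binary
```

With a mapping like this, a WASM image needs no separate runtime handler: the platform recorded in the image selects the binary.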
  16. Podman-in-Kubernetes: Simple and Secured
    • Add ProcMount option
      ◦ https://github.com/kubernetes/enhancements/issues/4265
        ▪ Current runtime practice of masking /proc paths
        ▪ Need for unmasking in nested unprivileged containers
    • Support User Namespaces in Pods
      ◦ https://github.com/kubernetes/enhancements/issues/127
        ▪ Isolating security identifiers for enhanced security
        ▪ Running privileged processes in pods as unprivileged on the host
        ▪ Mitigating security vulnerabilities if containers break out
  17. User Namespaced Pod
    • Requires
      ◦ Newest container-selinux package
      ◦ Kernel with idmapped mount support
      ◦ Kubernetes and CRI-O >= 1.29.0
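The two features from slides 16 and 17 surface in the Pod spec as `hostUsers: false` (run the pod in its own user namespace) and `securityContext.procMount: Unmasked` (do not mask /proc paths). A minimal sketch of such a manifest follows; the helper name, pod name and image reference are illustrative, and the feature gates a cluster needs are not covered here.

```python
import json


def userns_pod(name: str, image: str, unmask_proc: bool = False) -> dict:
    """Build a Pod manifest that opts out of the host user namespace."""
    container = {"name": name, "image": image}
    if unmask_proc:
        # Ask the runtime not to mask /proc paths for this container.
        container["securityContext"] = {"procMount": "Unmasked"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "hostUsers": False,  # run the pod in a separate user namespace
            "containers": [container],
        },
    }


if __name__ == "__main__":
    # Illustrative values; JSON output can be applied as a manifest.
    manifest = userns_pod("nested-podman", "quay.io/podman/stable", unmask_proc=True)
    print(json.dumps(manifest, indent=2))
```

Combining the two settings is what makes nested, unprivileged container engines workable: root inside the pod maps to an unprivileged ID on the host, while the nested engine still sees an unmasked /proc.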
  18. Future work
    Explore and contribute to CRI-O's feature roadmap: https://github.com/orgs/cri-o/projects/1
    Upcoming highlights:
    • Automatic reloading of the registry mirror configuration
    • Implementation of a Rust-based NRI framework
    • WASM plugins loaded directly into CRI-O (instead of NRI)
    • Support for FreeBSD