

Sohan Kunkerkar
December 12, 2024

Reimagining Kubernetes Pods: Nested Containers with CRI-O

With user namespaces reaching beta in Kubernetes and new developments in CRI-O, we’re closer to making nested containers within pods more flexible and powerful. Traditionally limited by masked /proc and restricted user namespaces, this approach now offers capabilities similar to Podman. In this talk, we will explore how Kubernetes’ security features—privileged mode, rootless containers, and network isolation—can enable running containers inside pods. We’ll examine the support matrix for various configurations and discuss upcoming work to bring VM-like flexibility to Kubernetes pods for more secure and dynamic container orchestration.


Transcript

  1. About the Speaker

    Sohan Kunkerkar, Senior Software Engineer - Red Hat
    • CRI-O maintainer
    • Member of SIG-Node
    • Love playing the flute
    • Enjoy trekking and outdoor activities
  2. Agenda

    • Introduction
    • Real-world demands and aspirations for Pods
    • Exploring Nested Containers
    • Examples
    • Limitations in the Current Pod Model
    • Upstream Kubernetes Features
    • Demo
    • Future Directions
    • Q&A session
  3. Introduction

    • What is a Pod?
      ◦ Smallest deployable unit in Kubernetes.
      ◦ Fundamental building block for deploying applications.
    • Pod as a Logical Host:
      ◦ Multiple containers that share the same network namespace and storage volumes.
    • Primary Use Cases:
      ◦ Microservices needing tight communication.
      ◦ Sharing volumes for inter-container file access.
    • Security and Isolation in Pods:
      ◦ Pods are isolated from other Pods by Kubernetes’ network and storage policies, but containers within the same Pod share certain resources.
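The "Pod as a Logical Host" idea can be sketched as a manifest in which two containers mount the same volume. This is a minimal illustration, not taken from the talk: the container names, the `busybox` image, the commands, and the `/data` mount path are all assumptions. The script prints JSON, which `kubectl apply -f -` accepts.

```python
import json


def shared_volume_pod(name: str) -> dict:
    """Build a Pod manifest whose two containers share one emptyDir volume."""
    # Both containers use the same mount, so files written by one
    # are visible to the other (hypothetical names and paths).
    mount = {"name": "shared", "mountPath": "/data"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "volumes": [{"name": "shared", "emptyDir": {}}],
            "containers": [
                {
                    "name": "writer",
                    "image": "busybox",
                    "command": ["sh", "-c", "echo hello > /data/msg; sleep 3600"],
                    "volumeMounts": [mount],
                },
                {
                    "name": "reader",
                    "image": "busybox",
                    "command": ["sh", "-c", "sleep 5; cat /data/msg; sleep 3600"],
                    "volumeMounts": [mount],
                },
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(shared_volume_pod("logical-host-demo"), indent=2))
```

Because both containers also share the Pod's network namespace, they could reach each other over `localhost` without any extra configuration.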
  4. Real-world Demands and Aspirations for Pods

    • Running Containerized Toolchains
      ◦ Enable tools like Podman/Docker or Buildah to run seamlessly inside Pods.
    • Virtual Machine-like Isolation
      ◦ Secured and isolated environments for untrusted workloads.
    • Flexible Development Environments
      ◦ Support nested development setups within Pods.
    • Enhanced Security without Privileges
      ◦ Run workloads securely using rootless or restricted containers.
  5. Limitations with the Current Pod Model

    • Shared Resources
      ◦ Containers share namespaces like /proc, limiting certain workloads.
      ◦ No per-container user namespaces for fine-grained isolation.
    • Security Limitations
      ◦ Privileged containers expose Pod environment, increasing risks.
      ◦ Rootless containers are constrained by Pod-level resource limits.
    • Need for Flexibility
      ◦ Running tools or build systems inside Pods.
      ◦ Enhanced isolation for multi-tenant workloads.

    [Diagram labels: Pod, Pod Security Standards, Pod Security Admission, Pod Security Context]
  6. User Namespaces Support

    • https://github.com/kubernetes/enhancements/issues/127 (Beta in v1.31)
    • Maps container root to a non-root user on the host.
    • Offers stronger isolation, especially in multi-tenant clusters.
    • How does Kubernetes use it?
      ◦ Maps container user IDs to different host IDs to reduce privilege escalation risks.
      ◦ Enables running containers as root inside the container while being non-root on the host.
    • Why is it needed?
      ◦ Isolating security identifiers for enhanced security.
      ◦ Running privileged processes in pods as unprivileged on the host.
      ◦ Mitigating security vulnerabilities if containers break out.
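The ID mapping described on this slide can be illustrated with a small sketch. It assumes the Linux `uid_map` format (three columns: first ID inside the namespace, first ID outside, range length) and uses a made-up mapping of `0 65536 65536`; the real host range is chosen by the kubelet and will differ. The manifest helper sets `hostUsers: false`, the Pod spec field that requests a user namespace; the pod name and image are hypothetical.

```python
import json
from typing import Optional


def to_host_id(container_id: int, id_map: str) -> Optional[int]:
    """Translate an ID inside the user namespace to the host ID.

    id_map uses the /proc/<pid>/uid_map layout: one range per line,
    "<inside-start> <outside-start> <length>".
    Returns None when the ID is not covered by any range.
    """
    for line in id_map.strip().splitlines():
        inside, outside, length = (int(field) for field in line.split())
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None


def userns_pod(name: str, image: str) -> dict:
    """Build a Pod manifest that opts in to a user namespace."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "hostUsers": False,  # run the Pod in its own user namespace
            "containers": [{"name": "app", "image": image}],
        },
    }


if __name__ == "__main__":
    example_map = "0 65536 65536"  # hypothetical mapping, for illustration only
    print(to_host_id(0, example_map))      # container root on the host
    print(to_host_id(1000, example_map))   # an ordinary container user
    print(json.dumps(userns_pod("userns-demo", "busybox"), indent=2))
```

With this example mapping, root inside the container corresponds to host UID 65536, which has no special privileges on the host. That is the property the slide refers to when it says a breakout is mitigated.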
  7. UserNamespacesPodSecurityStandards

    • Alpha in Kubernetes v1.29
    • Enhances and regulates how user namespaces are used in multi-tenant environments.
    • Key Points:
      ◦ Relaxes Pod Security Standards.
      ◦ Ensures only compliant workloads can leverage user namespaces.
      ◦ Requires enabling the UserNamespacesPodSecurityStandards feature gate and cluster-wide node compatibility.
  8. Add ProcMount Option

    • https://github.com/kubernetes/enhancements/issues/4265 (Beta in v1.31)
    • The default /proc mount exposes sensitive host details, such as PIDs and host kernel configuration, to containers.
    • Provides control over how the /proc filesystem is mounted inside containers.
    • Why is it needed?
      ◦ Security Improvement
      ◦ Controlled Flexibility
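As a rough sketch of how the option is used, the manifest below sets `procMount: Unmasked` in a container's `securityContext` together with `hostUsers: false`, since Kubernetes requires a user namespace for an unmasked /proc. The pod name and the image `registry.example/nested-tools:latest` are placeholders, and `check_proc_mount` is a hypothetical client-side helper that only mirrors that one rule; it is not part of any Kubernetes library.

```python
import json


def nested_container_pod(name: str, image: str) -> dict:
    """Build a Pod manifest with an unmasked /proc inside a user namespace."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "hostUsers": False,  # required alongside procMount: Unmasked
            "containers": [
                {
                    "name": "tools",
                    "image": image,
                    "securityContext": {"procMount": "Unmasked"},
                }
            ],
        },
    }


def check_proc_mount(pod: dict) -> list:
    """Return an error per container that unmasks /proc without a user namespace."""
    errors = []
    spec = pod.get("spec", {})
    for container in spec.get("containers", []):
        proc_mount = container.get("securityContext", {}).get("procMount", "Default")
        # hostUsers defaults to true when the field is omitted.
        if proc_mount == "Unmasked" and spec.get("hostUsers", True) is not False:
            errors.append(
                f"container {container['name']}: procMount Unmasked requires hostUsers: false"
            )
    return errors


if __name__ == "__main__":
    pod = nested_container_pod("nested-demo", "registry.example/nested-tools:latest")
    print(check_proc_mount(pod))
    print(json.dumps(pod, indent=2))
```

On a real cluster the relevant feature gates (`ProcMountType` and `UserNamespacesSupport`) and a compatible runtime would also need to be in place; this sketch does not check for them.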
  9. What’s CRI-O?

    • Supports OCI-based container images, runtimes, and registries
    • Implementation of the Kubernetes Container Runtime Interface, compliant with the Open Container Initiative
    • Balances stability and features
    • Focuses on security
    • Purpose-built for Kubernetes
  10. Future Directions

    • Move Kubernetes Features to GA.
    • Enhanced Security and Isolation.
    • Advanced Use Cases and Real-World Feedback.