Slide 1

Slide 1 text

© 2022, Amazon Web Services, Inc. or its affiliates.
firecracker-containerd and SOCI Snapshotter
Kazuyoshi Kato (he/him)
Sr. Software Development Engineer
Amazon Web Services

Slide 2

Slide 2 text

Linux container primitives
• Namespaces – Visibility restrictions
• Control groups (cgroups) – Resource limits
• Capabilities – Permission rules
• Seccomp – Syscall allow/deny lists

Slide 3

Slide 3 text

Is this secure enough?
[Diagram: three containers (your app, your sidecar, a malicious app) sharing a single Linux kernel]

Slide 4

Slide 4 text

runc CVE-2019-5736
• A malicious actor could overwrite the host runc binary through /proc/self/exe
• https://aws.amazon.com/blogs/compute/anatomy-of-cve-2019-5736-a-runc-container-escape/

Slide 5

Slide 5 text

Firecracker
• “Firecracker is an open source virtualization technology that is purpose-built for creating and managing secure, multi-tenant container and function-based services”
• Open-source virtual machine monitor written in Rust
• Utilizes hardware-assisted virtualization through Linux’s KVM
• Minimalistic design that supports only “serverless” workloads
• Not aware of Linux containers

Slide 6

Slide 6 text

Running containers with Firecracker
[Diagram: the three containers (your app, your sidecar, a malicious app) run on two guest Linux kernels, each inside its own Firecracker microVM, on top of the host Linux kernel]

Slide 7

Slide 7 text

Firecracker + containerd = firecracker-containerd
• Secure isolation through Firecracker’s virtualization
• Convenience and familiarity of containers from containerd
• https://github.com/firecracker-microvm/firecracker-containerd

Slide 8

Slide 8 text

firecracker-containerd implementation
[Diagram: a client (e.g. ctr) talks to the firecracker-containerd daemon, which is containerd with the fc-control plugin. The containerd-shim-aws-firecracker runtime/shim manages a Firecracker microVM, inside which an agent uses runc to run the container]

Slide 9

Slide 9 text

Working with containerd community
• Two maintainers from Amazon Web Services
• OpenTelemetry tracing support
• Device Mapper Snapshotter

Slide 10

Slide 10 text

What are snapshotters?
• A snapshotter converts container images to filesystems
• Called a “graph driver” in Docker Engine
• overlay (default)
• devmapper (used by firecracker-containerd)
• btrfs, aufs, zfs, …
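To make "converts container images to filesystems" concrete, here is a minimal Python sketch. It is not containerd's snapshotter API: it models each layer as an in-memory dictionary and applies the layers bottom to top, honouring OCI-style whiteout entries (file names prefixed with `.wh.`). The function name `flatten_layers` and the sample layers are hypothetical.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."  # OCI image layers mark deletions with this file name prefix


def flatten_layers(layers):
    """Apply layers bottom to top and return the resulting path -> content view.

    Each layer is a dict mapping an absolute path to file content. This is a
    hypothetical in-memory model; real snapshotters work on mounts or block devices.
    """
    view = {}
    for layer in layers:
        for path, content in layer.items():
            directory, name = posixpath.split(path)
            if name.startswith(WHITEOUT_PREFIX):
                # A whiteout removes the named entry inherited from lower layers.
                removed = posixpath.join(directory, name[len(WHITEOUT_PREFIX):])
                view.pop(removed, None)
                # If the entry was a directory, remove everything beneath it too.
                for existing in [p for p in view if p.startswith(removed + "/")]:
                    del view[existing]
            else:
                view[path] = content
    return view


# Hypothetical two-layer image: the upper layer deletes /etc/motd and adds an app.
base = {"/bin/ls": "ls-v1", "/etc/motd": "hello"}
app = {"/etc/.wh.motd": "", "/app/main.py": "print('hi')"}
rootfs = flatten_layers([base, app])
```

Snapshotters such as overlay and devmapper reach the same end result in different ways (union mounts versus thin-provisioned block devices), which is why the choice of snapshotter matters for a microVM runtime.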

Slide 11

Slide 11 text

Lazy-loading snapshotters
• Stargz Snapshotter
• Nydus Snapshotter
• Downloading a container image and assembling the filesystem from the image is time-consuming
• Containers don’t need all the files in the image to start doing useful work

Slide 12

Slide 12 text

Lazy-loading without conversion
• Explicitly converting images and managing them is cumbersome
• Implicitly converting images has negative security implications
• For example, image signing wouldn’t work if AWS implicitly converted images

Slide 13

Slide 13 text

SOCI Snapshotter
• SOCI Snapshotter is a new lazy-loading snapshotter, based on Stargz Snapshotter
• Utilizes FUSE and HTTP’s ranged GET
• No image conversion
• Workload-specific load order optimization
• https://github.com/awslabs/soci-snapshotter
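As an illustration of the ranged GET mentioned above, the following Python sketch builds (but does not send) a request for one byte span of a layer blob. It is not taken from the SOCI Snapshotter code; the helper name `ranged_get`, the registry host, the repository name, and the digest are all hypothetical placeholders.

```python
import urllib.request


def ranged_get(url, offset, length):
    """Build (but do not send) a GET request for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    last = offset + length - 1  # HTTP byte ranges are inclusive on both ends
    return urllib.request.Request(
        url,
        headers={"Range": f"bytes={offset}-{last}"},
        method="GET",
    )


# Hypothetical blob URL and span; a real client would also need registry authentication.
req = ranged_get("https://registry.example.com/v2/myapp/blobs/sha256:0123", 4096, 1024)
range_header = req.get_header("Range")
```

A server that supports range requests answers with `206 Partial Content` and only the requested bytes, which is what lets a lazy-loading snapshotter fetch part of a layer instead of the whole blob.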

Slide 14

Slide 14 text

SOCI: Seekable OCI
[Diagram: an OCI image with Layers 1 to 3, alongside a SOCI index with zTOCs 1 to 3, one zTOC per layer]

Slide 15

Slide 15 text

zTOC
[Diagram: zTOC N describes compressed layer N. The TOC entry for /bin/ls and checkpoint M locate compressed span M, which decompresses to uncompressed span M containing the /bin/ls data]
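The lookup the zTOC slide describes can be sketched roughly as follows. This is a simplified model, not the actual zTOC format: it assumes fixed-size uncompressed spans and treats a checkpoint as nothing more than the compressed offset where a span starts. A real checkpoint would presumably also need to carry decompressor state so that decompression can resume mid-stream. All class, field, and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TocEntry:
    path: str
    offset: int  # offset of the file data in the uncompressed layer
    size: int    # size of the file data in bytes


@dataclass
class Ztoc:
    span_size: int        # uncompressed bytes covered by each span (assumed fixed)
    checkpoints: list     # checkpoints[i] = compressed offset where span i starts
    compressed_size: int  # total size of the compressed layer
    entries: dict         # path -> TocEntry

    def spans_for(self, path):
        """Return the indices of the spans that hold the file's data."""
        entry = self.entries[path]
        first = entry.offset // self.span_size
        last = (entry.offset + max(entry.size, 1) - 1) // self.span_size
        return list(range(first, last + 1))

    def compressed_range(self, span):
        """Return the (start, end) byte range of a span in the compressed layer, end exclusive."""
        start = self.checkpoints[span]
        if span + 1 < len(self.checkpoints):
            end = self.checkpoints[span + 1]
        else:
            end = self.compressed_size
        return start, end


# Hypothetical layer: three spans of 1000 uncompressed bytes each.
ztoc = Ztoc(
    span_size=1000,
    checkpoints=[0, 400, 750],
    compressed_size=1100,
    entries={"/bin/ls": TocEntry("/bin/ls", offset=1800, size=500)},
)
spans = ztoc.spans_for("/bin/ls")
ranges = [ztoc.compressed_range(s) for s in spans]
```

In this toy layout, reading /bin/ls touches only the last two spans, so only those compressed byte ranges would need to be fetched and decompressed.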

Slide 16

Slide 16 text

Workload-specific load order optimization
• The list of files to prefetch would not map 1:1 to container images
• Base images (e.g. Python 3) would have multiple possible prefetch lists, depending on the application layers above them
• The lists could be more dynamic than the container images themselves
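One way to picture why prefetch lists are not 1:1 with images is to key them by workload as well as by image. The sketch below is purely illustrative: the table, the image and workload names, and the file paths are hypothetical, and SOCI's load order format is not defined in these slides.

```python
# Hypothetical load order documents, keyed by (image, workload) rather than by image alone.
LOAD_ORDERS = {
    ("python:3", "ml-training"): [
        "/usr/bin/python3",
        "/usr/lib/python3/site-packages/numpy/__init__.py",
    ],
    ("python:3", "web"): [
        "/usr/bin/python3",
        "/usr/lib/python3/http/server.py",
    ],
}


def prefetch_list(image, workload):
    """Return the files to fetch first; an empty list means pure lazy loading."""
    return LOAD_ORDERS.get((image, workload), [])


ml_files = prefetch_list("python:3", "ml-training")
web_files = prefetch_list("python:3", "web")
unknown_files = prefetch_list("python:3", "batch")
```

Because the lists live outside the image, they can be updated as access patterns change without rebuilding or re-signing the image itself.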

Slide 17

Slide 17 text

Workload-specific load order optimization
[Diagram: two image stacks built on the same Debian Bullseye and Python 3 layers, one topped by an ML training application and the other by a web application, each with its own load order document]

Slide 18

Slide 18 text

What’s next?
• Better code sharing with Stargz Snapshotter
• Support OCI Reference Types instead of ORAS
• Finalize SOCI Index and zTOC format (e.g. getting rid of encoding/gob)
• Load order optimization
• https://github.com/awslabs/soci-snapshotter

Slide 19

Slide 19 text

Thank you!
Kazuyoshi Kato
@kzys
[email protected]