How to get Slim with SOCI and Star(g)s: Minifying Containers with Lazy Image Loading in ContainerD

Minimal Container Images

• Minimal container images are a step towards meeting container best practices
• Different ways to create them
  ◦ DIY
  ◦ …
  ◦ DockerSlim/MinToolkit
    ▪ Performs static and dynamic image analysis
    ▪ Collects runtime container telemetry

ONLY WHAT YOU NEED

• Original way DockerSlim collects runtime telemetry:
  ◦ A container-level sensor is added to a temporary container it creates
• New way with a system-level sensor (WIP):
  ◦ Uses eBPF
• Both telemetry modes have their own use cases and trade-offs
• Both modes try to figure out what you need to have in the container
• Is it possible to get the same data without eBPF or instrumenting the containers?

Lazy (Image) Loading

• Is it possible to use lazy image loading to know what you need in containers?
• What is lazy loading?
  ◦ A way to start containers before their entire image is fetched
  ◦ Loading the image files on demand when you need them instead of loading all of them ahead of time
• If you can observe the lazy loading process you will know what you need in the container
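The idea of observing lazy loading can be illustrated with a toy model. This is only a sketch: `LazyLayer`, `fetch_from_registry`, and the file names are hypothetical stand-ins, not part of any snapshotter. Files are fetched on first read, and the list of fetched paths is the record of what the workload actually needed.

```python
class LazyLayer:
    """Toy model of lazy loading: fetch a file only on first access and record it."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable(path) -> bytes, stands in for a remote fetch
        self._cache = {}         # files already loaded
        self.accessed = []       # observation log: every path that had to be fetched

    def read(self, path):
        if path not in self._cache:
            self._cache[path] = self._fetch(path)
            self.accessed.append(path)
        return self._cache[path]


# Hypothetical "fat" image content; a real image would have thousands of files.
REMOTE_IMAGE = {
    "bin/uname": b"uname-binary",
    "bin/ls": b"ls-binary",
    "usr/share/doc/README": b"docs",
}


def fetch_from_registry(path):
    return REMOTE_IMAGE[path]


layer = LazyLayer(fetch_from_registry)
layer.read("bin/uname")
layer.read("bin/uname")          # second read is served from the cache
print(layer.accessed)            # only the files the workload touched
```

Everything in `REMOTE_IMAGE` that never shows up in `accessed` is a candidate for removal from the minified image.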

ContainerD + Remote Snapshotters

• Lazy loading is not available by default
• Takes advantage of the extensible ContainerD design
• Possible because ContainerD has pluggable container filesystem services called Snapshotters

• Two ContainerD snapshotters (Stargz and SOCI)
  ◦ Enable lazy loading
  ◦ Designed as remote (file system) snapshotters
  ◦ Configured as proxy plugins in ContainerD

Container (Image) Format

• Why don’t we have lazy loading by default?
  ◦ Because the standard container image format is not designed to support it
• The container image format is designed around the container image build flow and NOT the container execution flow

• Container images are like trees
  ◦ They grow over time by adding rings/layers
• The image composition is based on how you build the image and not how you run it
• What makes it worse:
  ◦ Each layer is designed to be an opaque blob… optimized for storage
  ◦ Each layer is a TAR archive (can be GZIP’ed)
  ◦ GZIP and TAR do not allow random I/O
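The random I/O limitation can be seen with a small in-memory layer. This is a minimal sketch using Python's standard `tarfile` module; the file names and sizes are made up. To reach one file in a gzipped TAR stream, everything stored before it has to be decompressed first.

```python
import io
import tarfile


def build_layer(files):
    """Pack a dict of name -> bytes into a gzipped TAR blob, like a compressed image layer."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def bytes_before_file(layer, wanted):
    """Stream through the layer and return how many uncompressed bytes precede the file data."""
    # "r|gz" is tarfile's forward-only streaming mode: no seeking, just sequential reads.
    with tarfile.open(fileobj=io.BytesIO(layer), mode="r|gz") as tar:
        for member in tar:
            if member.name == wanted:
                return member.offset_data
    raise KeyError(wanted)


layer = build_layer({
    "bin/a": b"a" * 10_000,      # hypothetical files the workload never uses
    "bin/b": b"b" * 20_000,
    "bin/uname": b"u" * 100,     # the one file we actually want
})
skipped = bytes_before_file(layer, "bin/uname")
print(f"had to decompress {skipped} bytes to reach 100 bytes of bin/uname")
```

Without an extra index there is no way to jump straight to `bin/uname`, which is the gap the layer indexes in Stargz and SOCI are meant to close.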

(e)Stargz and SOCI: TLDR

• Alternative designs to load partial layer data
• Both require image format enhancements
• Stargz:
  ◦ Repackages/re-encodes layer data
  ◦ Adds extra layer metadata
• SOCI:
  ◦ No changes to target image
  ◦ Extra metadata stored as separate artifacts in container registry

FUSE FS + LAYER INDEX + BLOB RANGE REQUESTS

• Both create an INDEX for each layer
  ◦ Offsets for file data
  ◦ Metadata for files/dirs/symlinks
  ◦ Snapshotter-specific INDEX data
    ▪ SOCI has extra data due to its design
• The INDEX location depends on the snapshotter
  ◦ Appended to layer (Stargz) or
  ◦ Stored separately in the registry (SOCI)
• Partial layer blob data is requested with the RANGE HTTP request header
• The INDEX is loaded before the container starts
• FUSE FS callbacks initiate lazy loading
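A partial blob request of this kind can be sketched with the standard library. The blob path follows the usual registry layout (`/v2/<name>/blobs/<digest>`); the registry host, repository, digest, offsets, and the chunk-alignment rule in `aligned_range` are assumptions made for illustration, not the exact logic of either snapshotter.

```python
import urllib.request


def aligned_range(offset, size, chunk_size):
    """Expand [offset, offset + size) to chunk boundaries; returns inclusive (first, last)."""
    first = (offset // chunk_size) * chunk_size
    last = -(-(offset + size) // chunk_size) * chunk_size - 1   # ceiling division
    return first, last


def blob_range_request(registry, repository, digest, first, last):
    """Build (but do not send) a GET for bytes first..last of a layer blob."""
    url = f"https://{registry}/v2/{repository}/blobs/{digest}"
    req = urllib.request.Request(url, method="GET")
    req.add_header("Range", f"bytes={first}-{last}")
    return req


# Hypothetical index entry: file data at offset 4500, 3000 bytes long, 2000-byte chunks.
first, last = aligned_range(offset=4500, size=3000, chunk_size=2000)
req = blob_range_request(
    "registry.example.com",          # hypothetical registry
    "demo/app",                      # hypothetical repository
    "sha256:" + "0" * 64,            # placeholder digest
    first,
    last,
)
print(req.full_url)
print(req.get_header("Range"))
```

A registry that honors the header answers with `206 Partial Content` and only the requested bytes, which is what makes per-file fetching of a layer blob possible.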

• The original/target image is not changed
• Not changing the container images requires additional index metadata to have random I/O
  ◦ because the layers could be GZ compressed
• Each requestable chunk (aka span) in layer metadata needs GZ decompression state
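Why each span needs saved decompression state can be shown with `zlib`. This is only a sketch of the idea: it keeps in-memory copies of a decompressor as checkpoints, whereas a real zTOC serializes its own checkpoint data. The span size and the random test payload are arbitrary choices.

```python
import gzip
import random
import zlib


def build_checkpoints(compressed, span_size):
    """One sequential pass: save the decompressor state at the start of every span."""
    d = zlib.decompressobj(31)               # 31 = expect a gzip header and trailer
    checkpoints = []                         # (compressed_offset, uncompressed_offset, state)
    out_pos = 0
    for comp_off in range(0, len(compressed), span_size):
        checkpoints.append((comp_off, out_pos, d.copy()))
        out_pos += len(d.decompress(compressed[comp_off:comp_off + span_size]))
    return checkpoints


def read_range(compressed, checkpoints, span_size, start, length):
    """Read uncompressed bytes [start, start + length) touching only the spans that cover them."""
    idx = max(i for i, (_, u, _) in enumerate(checkpoints) if u <= start)
    comp_off, out_pos, saved = checkpoints[idx]
    d = saved.copy()                         # resume from the saved state, not from byte 0
    buf = bytearray()
    fetched = 0                              # compressed bytes we had to "download"
    while out_pos + len(buf) < start + length and comp_off < len(compressed):
        span = compressed[comp_off:comp_off + span_size]
        buf += d.decompress(span)
        comp_off += len(span)
        fetched += len(span)
    skip = start - out_pos
    return bytes(buf[skip:skip + length]), fetched


rng = random.Random(0)
layer = bytes(rng.getrandbits(8) for _ in range(200_000))   # stand-in for layer content
blob = gzip.compress(layer)
SPAN = 4096
checkpoints = build_checkpoints(blob, SPAN)
data, fetched = read_range(blob, checkpoints, SPAN, start=150_000, length=100)
print(f"read 100 bytes by fetching {fetched} of {len(blob)} compressed bytes")
```

Smaller spans mean fewer wasted bytes per read but more checkpoints to store, which is the size trade-off in the index.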

• SOCI Index Manifest
  ◦ Like a regular image manifest + “subject”
  ◦ Empty config
  ◦ Layer records point to layer index metadata (zTOC)
  ◦ zTOC index same as target layer index
  ◦ “com.amazon.soci.image-layer-digest” - layer digest from original/target image
• Generate SOCI index with “soci create” (need to use small(er) minimal layer size and span size parameters for better minification results)
• Layer index data (zTOC) is binary (“soci ztoc”)

(e)Stargz

• No need to track decompression state
  ◦ BUT requires image/layer repackaging
• Each layer file is TAR’ed and GZip’ed
  ◦ TAR’ed/GZip’ed files are combined into one GZIP stream
• Layer index/TOC is saved as a file (“stargz.index.json”) at the end of the layer
• eStargz layer ends with a Footer (has TOC offset)
• Special Landmark files indicate what to pre-fetch
• Only “file” data is lazy loaded
  ◦ not directories or symlinks

• Optional pre-fetched files (loaded/fetched when container starts) are moved to the beginning of each eStargz layer
  ◦ “.prefetch.landmark” file added to show where they end
• Layer index data (TOC) is text/json (“stargz.index.json”) unlike zTOC with SOCI
• “Mount” flow (separate layer blob HTTP requests)
  ◦ Read layer Footer
  ◦ Read layer TOC/stargz.index.json
  ◦ Read all prefetch files
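Because the TOC is plain JSON, it is easy to pull file offsets out of it. The sketch below assumes a trimmed-down TOC with `version`, `entries`, `name`, `type`, `size`, and `offset` fields; the field names follow the eStargz format as I understand it, and the sample entries (apart from the “bin/uname” offset shown in the deck) are invented.

```python
import json

# Trimmed, partly hypothetical sample of a "stargz.index.json" document.
SAMPLE_TOC = """
{
  "version": 1,
  "entries": [
    {"name": "bin/", "type": "dir"},
    {"name": "bin/sh", "type": "symlink"},
    {"name": "bin/busybox", "type": "reg", "size": 1000000, "offset": 4200000},
    {"name": "bin/uname", "type": "reg", "size": 30000, "offset": 5276730}
  ]
}
"""


def lazy_file_offsets(toc_text):
    """Map each regular file to the offset of its data inside the layer blob."""
    toc = json.loads(toc_text)
    # Only regular files ("reg") carry lazily loaded data; dirs and symlinks are metadata only.
    return {e["name"]: e["offset"] for e in toc["entries"] if e["type"] == "reg"}


offsets = lazy_file_offsets(SAMPLE_TOC)
print(offsets)
```

The resulting name-to-offset map is the piece that lets a blob byte range be translated back into a file name.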

• “uname -a” - runs in the target Stargz container
• “bytes=5276000-5291999” range for layer “sha256:80505726a0e2e80ef6336f4976e6a30ed3de572934224603cdc66801bbf195b9” - requested from registry (lower right side log window)
• “bin/uname” record offset (“5276730”) from “stargz.index.json” is matched with the requested range (lower left side window)
• Stargz Snapshotter config (no background fetch and a small data request chunk size, 2k)
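The matching step in this demo can be approximated in a few lines. This is a minimal sketch, not MinToolkit code: it takes a logged range header and a name-to-offset map from the TOC and returns the files whose data starts inside that range. The “bin/uname” offset and the range come from the slide; the neighboring entries are hypothetical.

```python
import bisect
import re


def parse_range(header):
    """Parse a 'bytes=first-last' value into an inclusive (first, last) tuple."""
    m = re.fullmatch(r"bytes=(\d+)-(\d+)", header)
    if m is None:
        raise ValueError(f"unsupported range header: {header!r}")
    return int(m.group(1)), int(m.group(2))


def files_starting_in_range(offsets, header):
    """Return the files whose TOC offset falls inside the requested byte range."""
    first, last = parse_range(header)
    ordered = sorted((off, name) for name, off in offsets.items())
    keys = [off for off, _ in ordered]
    lo = bisect.bisect_left(keys, first)
    hi = bisect.bisect_right(keys, last)
    return [name for _, name in ordered[lo:hi]]


toc_offsets = {
    "bin/true": 5250000,     # hypothetical neighbor
    "bin/uname": 5276730,    # offset shown in the demo
    "bin/uniq": 5300000,     # hypothetical neighbor
}
needed = files_starting_in_range(toc_offsets, "bytes=5276000-5291999")
print(needed)
```

Note that the range starts at 5276000, a multiple of the 2k chunk size, so it also covers a few bytes before “bin/uname”. Larger chunks widen that slack and can pull in offsets of files that were never used, which is one reason small chunks give cleaner results.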

Take Aways

• Yes, it’s possible to use Stargz/SOCI snapshotters to minify images
• BUT it requires custom configs (must disable background fetching)
• AND you need to use/config small chunks/spans (which may result in large SOCI index data because it keeps track of the GZ decompression state)
• Need to host the fat images in your registry that logs HTTP requests (“range” headers and the layer blob path)

Next Steps (MinToolkit Features)

• Ability to convert regular images to Stargz images
• Ability to generate and publish SOCI Index data
• Stargz images as “slim”/“build” command output
• Stargz-based minification mode (Linux only)

https://github.com/mintoolkit/mint

Thank You!

Kyle Quest
• Creator, DockerSlim (aka SlimToolkit/minToolkit)
• Founder, AutonomousPlane
• Founder/CTO, Slim.AI
• https://twitter.com/kcqon
• https://github.com/kcq
