Container Image Architecture

Do you know how a Dockerfile is built into an image?
I'll talk about the OCI image specification and the Docker image architecture.
The talk covers how images are built, how the build cache is probed, and how the Dockerfile is parsed.

Aya (Igarashi) Ozawa

October 31, 2016

Transcript

  1. AGENDA

    01 OCI IMAGE SPEC: Overview, Terminology, Calculate IDs, Image layout, Filesystem layers
    02 DOCKER IMAGE ARCHITECTURE: Directory structure
    03 HOW TO CREATE A DOCKER IMAGE: Overview, Building logs, Parse the Dockerfile, Build images, Probe cache
    04 BEST PRACTICE OF THE DOCKERFILE: Reduce an image size, Use caches well
  2. Overview

    Latest version is v1.0.0-rc1!
    01 HOW TO MANAGE? Development of the spec happens on GitHub.
    02 DEFINED: Basic image layout (constitutive formats), Filesystem layers
    03 NOT YET DEFINED: Security practices, Caching mechanism. Signatures are under discussion now :)
  3. Who defines it? TOP 10 CONTRIBUTORS

    Vincent Batts: ATOMIC, OCI
    Antonio Murdaca: docker, ATOMIC, kubernetes-incubator
    W. Trevor King: numfocus, ipfs, swcarpentry
    Brandon Philips: coreos, systemd, goraft
    Lei Jitang: huawei
    Jonathan Boulle: coreos
    Sergiusz Urbaniak: coreos
    Stephen Day: docker
    Rob Dolin: microsoft
    xiekeyang: huawei
  4. Terminology

    (Diagram, linked by 1 and n cardinality labels:)
    Repository
    Refs: v1, latest, etc.
    Manifest List: amd, ppc64, etc.
    Manifest
    Image Config: ENV, cpu, etc.
    Layer 0, Layer 1, ... Layer n: changesets
    Images
  5. Calculate identifiers

    Hash algorithm is SHA256.
    Manifest List ID, Manifest ID, Image Config ID: TAKE DIGEST OF JSON FILE
    DIFF ID (Layer 0, Layer 1, ... Layer n): take the digest of the unpacked (uncompressed) layer data.
    CHAIN ID: ChainID(layerN) = SHA256hex(ChainID(layerN-1) + " " + DiffID(layerN))
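
The ChainID formula on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the "sha256:" prefix on each ID and the rule that the bottom layer's ChainID equals its DiffID follow the OCI image spec rather than the slide, and the layer contents are hypothetical placeholders for uncompressed layer tar archives.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return a digest string in the form 'sha256:<hex>'."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def chain_ids(diff_ids):
    """Compute the ChainID of every layer from the ordered list of DiffIDs."""
    result = []
    for diff_id in diff_ids:
        if not result:
            # The ChainID of the bottom layer is simply its DiffID.
            result.append(diff_id)
        else:
            # ChainID(layerN) = SHA256hex(ChainID(layerN-1) + " " + DiffID(layerN))
            joined = result[-1] + " " + diff_id
            result.append(sha256_digest(joined.encode("utf-8")))
    return result


# Hypothetical layer contents standing in for uncompressed layer tar archives.
layers = [b"layer 0 tar bytes", b"layer 1 tar bytes", b"layer 2 tar bytes"]
diff_ids = [sha256_digest(layer) for layer in layers]
for index, chain_id in enumerate(chain_ids(diff_ids)):
    print(index, chain_id)
```

Because each ChainID folds in the ChainID below it, two images share a ChainID only when they share the same stack of layers up to that point, not just the same top layer.
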
  6. Image layout (Example)

    $ tree
    blobs/
    └── sha256/
        # Manifest list
        └── e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f
        # Manifest
        └── afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
        # Image config
        └── 5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270
        # Layer (tar stream)
        └── e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f
    refs/
        # tag
        └── v1.0
        └── stable-release

    01 The refs object has the manifest list ID.
    02 The manifest list has the manifest IDs.
    03 The manifest has the image ID and the layer IDs.
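
The lookup chain on this slide (ref, then manifest list, then manifest, then config and layers) can be sketched with an in-memory content-addressed store. This is only an illustration under stated assumptions: the JSON keys ("manifests", "config", "layers", "env") are simplified, hypothetical names and not the schema from the spec, and the blob contents are placeholders.

```python
import hashlib
import json


def digest_of(data: bytes) -> str:
    """Content address of a blob, written as 'sha256:<hex>'."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def blob_path(digest: str) -> str:
    """Map 'sha256:<hex>' to 'blobs/sha256/<hex>' as in the tree above."""
    algorithm, hex_value = digest.split(":", 1)
    return "blobs/" + algorithm + "/" + hex_value


def put_blob(store, data: bytes) -> str:
    """Store raw bytes under their digest and return that digest."""
    digest = digest_of(data)
    store[blob_path(digest)] = data
    return digest


def put_json(store, obj) -> str:
    """Serialize a JSON document and store it like any other blob."""
    return put_blob(store, json.dumps(obj, sort_keys=True).encode("utf-8"))


def get_json(store, digest: str):
    return json.loads(store[blob_path(digest)])


store = {}
layer_id = put_blob(store, b"hypothetical layer tar stream")
config_id = put_json(store, {"env": ["PATH=/usr/bin"]})
manifest_id = put_json(store, {"config": config_id, "layers": [layer_id]})
list_id = put_json(store, {"manifests": [manifest_id]})
refs = {"v1.0": list_id}  # a ref (tag) points at a manifest list ID

# Walk the chain: ref -> manifest list -> manifest -> config and layers.
manifest_list = get_json(store, refs["v1.0"])
manifest = get_json(store, manifest_list["manifests"][0])
print(manifest["config"], manifest["layers"])
```
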
  7. Filesystem layers

    SUPPORTED CHANGE TYPES: Additions, Modifications, Removals
    Additions and modifications are handled the same way.
    MARKER OF DELETED FILES: A whiteout filename consists of the prefix .wh. plus the basename of the path to be deleted.
    (Diagram: file A under ADD, MODIFY and DEL across two R/O layers and one R/W layer of a union filesystem.)
  8. Filesystem layers

    SUPPORTED CHANGE TYPES: Additions, Modifications, Removals
    A writable layer is added when you run a container.
    MARKER OF DELETED FILES: A whiteout filename consists of the prefix .wh. plus the basename of the path to be deleted.
    (Diagram: files A, B and C under ADD, MODIFY and DEL across the R/O layers, a .wh.A whiteout marker, and the R/W layer on top of the union filesystem.)
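
The whiteout rule can be sketched as follows. This is a simplified model, assuming a filesystem represented as a dict from path to content; the function names and the sample paths are hypothetical, and other spec features such as opaque whiteouts are left out.

```python
import posixpath

WHITEOUT_PREFIX = ".wh."


def whiteout_path(deleted_path: str) -> str:
    """Name of the marker file that records the removal of deleted_path."""
    directory, base = posixpath.split(deleted_path)
    return posixpath.join(directory, WHITEOUT_PREFIX + base)


def apply_layer(files: dict, layer: dict) -> dict:
    """Apply one layer (a changeset) on top of the files of the lower layers."""
    result = dict(files)
    # First pass: whiteouts hide paths that exist in the lower layers.
    for path in layer:
        directory, base = posixpath.split(path)
        if base.startswith(WHITEOUT_PREFIX):
            target = posixpath.join(directory, base[len(WHITEOUT_PREFIX):])
            for existing in list(result):
                if existing == target or existing.startswith(target + "/"):
                    del result[existing]
    # Second pass: additions and modifications are the same operation.
    for path, content in layer.items():
        if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
            result[path] = content
    return result


lower = {"etc/a": "old", "etc/b": "keep"}
upper = {whiteout_path("etc/a"): "", "etc/b": "modified", "etc/c": "added"}
print(apply_layer(lower, upper))
```

Note that the lower layer still contains etc/a after the upper layer is applied. The whiteout only hides it in the merged view, which is why deleting a file in a later layer does not make the image smaller.
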
  9. What is the difference?

    Q WHAT IS THE DIFFERENCE BETWEEN DOCKER AND THE OCI-SPEC?
    There is no significant difference now. Details such as file names and configuration file entries differ.
    Q HOW DOES DOCKER SUPPORT THE OCI SPECIFICATION?
    Docker does not seem to plan to adopt the OCI image format directly. Instead, it seems to be developing a format converter.
    https://github.com/docker/docker/issues/25779
    FYI: rkt seems to have a positive attitude toward supporting the OCI-spec. It was originally designed with an architecture that can support multiple image formats.
    https://github.com/coreos/rkt/issues/3204
  10. Directory structure

    Images and layers are arranged in different directories.

    |-- image
    |   `-- overlay
    |       |-- imagedb
    |       |   |-- content
    |       |   |   `-- sha256
    |       |   |       `-- ee4603260daafe1a8c2f3b78fd760922918ab2441cbb28…
    |       |-- layerdb
    |       |   |-- sha256
    |       |   |   `-- 9007f5987db353ec398a223bc5a135c5a9601798ba20a1abba…
    |       |   |       |-- cache-id
    |       |   |       |-- diff
    |       |   |       |-- size
    |       |   |       `-- tar-split.json.gz
    |       `-- repositories.json

    What is the tar-split file? Docker uses the tar-split library. It provides consistent tar archives for the image layer content.
    https://github.com/vbatts/tar-split
  11. Overview

    Docker client: $ docker build -t test ./
    01 Analyse the command arguments and compress the target directory into a tarball. If a dockerignore file exists, the client modifies the context to exclude files and directories that match the patterns in it.
    02, 03 Call the build image Remote API. The default transport is a Unix domain socket.
    Docker daemon:
    04 Create a builder instance
       1. Check the CacheFrom option
       2. Get the image cache (optional): fetch an image as a cache if the CacheFrom flag is true. (4.2)
       3. Parse the Dockerfile: Dockerfile -> Node. Parse and return the AST (nodes). (4.3)
  12. Overview

    05 Build images
       1. Parse LABEL command
       2. Execute all commands (for each Node of the AST)
          1. Probe cache
          2. If no cache is found, start a temporary container
          3. Execute a command (5.2.2)
          4. Commit changes
       3. Check unused ARG
       4. Associate tags to image
    (Diagram, Docker daemon: base image layers 0..n and parent image layers 0..n are read only. The command is executed in a writable layer, which becomes the current layer.)
  13. Building log

    $ docker build -t test ./
    # Call remote API (send tarball)
    Sending build context to Docker daemon 557.1 kB
    Sending build context to Docker daemon 1.114 MB
    Sending build context to Docker daemon 1.425 MB
    # Fetch base image (use parent image)
    Step 1 : FROM python:3.5.2-alpine
     ---> a047e3d0ae2b
    # Show detail
    Step 2 : RUN apk add --no-cache git
    # Run container
     ---> Running in 7785e4cfb01a
    # Run command & commit container
     ---> f413fb825b75
    # Delete temporary container
    Removing intermediate container 7785e4cfb01a
    Successfully built f413fb825b75
  14. Parse the Dockerfile

    1. Create root Node
    2. Detect command
       1. Skip comment or empty line
       2. Send value to the parse dispatcher
       3. Create Nodes
          1. Put Node in the Next
       4. Return Node
    3. Append Node to the Children slice
    4. Return root Node

    Node: Value: string, Next: Node, Children: []Node
    (Diagram: the root Node holds Children[0], Children[1], ... Children[n]. The Value of each child is the command name, and its arguments follow through Next.)

    The LABEL command is parsed separately. The Docker daemon collects the LABEL commands into one Node and appends it to the Children of the root Node.
    https://docs.docker.com/engine/reference/builder/#/label
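
The parsing steps above can be sketched with a small Python version of the Node structure. This is a simplified illustration and not Docker's parser: it splits arguments on whitespace, and it assumes no line continuations, JSON-form arguments, or the special LABEL handling. The helper names parse and parse_line are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    value: str = ""                       # command name or one argument
    next: Optional["Node"] = None         # following argument, if any
    children: List["Node"] = field(default_factory=list)


def parse_line(line: str) -> Node:
    """Build one command Node whose arguments are chained through next."""
    parts = line.split()
    head = Node(value=parts[0].lower())   # the Value is the command name
    current = head
    for argument in parts[1:]:
        current.next = Node(value=argument)
        current = current.next
    return head


def parse(dockerfile: str) -> Node:
    """Return a root Node with one child per Dockerfile command."""
    root = Node()
    for raw in dockerfile.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # skip comment or empty line
        root.children.append(parse_line(line))
    return root


source = """# build tools
FROM python:3.5.2-alpine

RUN apk add --no-cache git
"""
root = parse(source)
for child in root.children:
    print(child.value, "->", child.next.value)
```
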
  15. Probe Cache

    01 Check the NoCache and CacheBusted flags. If either flag is true, return false (the cache does not exist).
    02 Get the images related to the parent image ID from the local image store. If the CacheFrom option is specified, the local image store is not used.
    03 Compare the current image config with the cached image config (do NOT compare the "Image" nor "Hostname" fields).
    04 Choose the latest image as the cache and return true.

    The CacheBusted flag is NOT called from anywhere X(
    If this flag could be applied to a RUN command, we could freely use RUN commands such as the following in the Dockerfile:
    RUN pip install -r requirement
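
The cache probe described on this slide can be sketched as below. This is a hedged illustration, not the daemon's code: the Image class, its fields, and the probe_cache function are hypothetical, the image store is a plain list, and the CacheFrom path is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Fields the slide says are NOT compared.
IGNORED_FIELDS = ("Image", "Hostname")


@dataclass
class Image:
    id: str
    parent: str
    created: float                 # creation time, used to pick the latest
    config: Dict[str, object]


def comparable(config: Dict[str, object]) -> Dict[str, object]:
    """Drop the fields that are excluded from the comparison."""
    return {k: v for k, v in config.items() if k not in IGNORED_FIELDS}


def probe_cache(store: List[Image], parent_id: str, config: Dict[str, object],
                no_cache: bool = False, cache_busted: bool = False) -> Optional[Image]:
    """Return the cached image for this step, or None on a cache miss."""
    if no_cache or cache_busted:
        return None
    candidates = [
        image for image in store
        if image.parent == parent_id and comparable(image.config) == comparable(config)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda image: image.created)


store = [
    Image("old", "base", 1.0, {"Cmd": ["apk add git"], "Hostname": "aaa"}),
    Image("new", "base", 2.0, {"Cmd": ["apk add git"], "Hostname": "bbb"}),
    Image("other", "base", 3.0, {"Cmd": ["apk add curl"], "Hostname": "ccc"}),
]
hit = probe_cache(store, "base", {"Cmd": ["apk add git"], "Hostname": "zzz"})
print(hit.id if hit else "cache miss")
```
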
  16. Reduce an image size

    When you delete a file that was created in an earlier layer, only a whiteout file is created. Therefore, the image size is not reduced.
    Modifications and additions are handled the same way, so modifying a file also increases the image size.
    If you want to reduce the size of an image, combine as much as possible into one command in order to reduce the number of layers. The --virtual option of apk can help with this.
    If the number of layers has already grown, I recommend docker-squash. It is a utility to squash multiple docker layers into one in order to create an image with fewer and smaller layers.
    https://github.com/jwilder/docker-squash
  17. Use caches well

    A RUN command cannot detect a change in the content it fetches, for example with `git clone`. There is a workaround, but ... I recommend using the COPY or ADD command instead.
    Once the content of a command in the Dockerfile changes, the subsequent lines cannot use the cache. Put the commands that change frequently in the second half of the Dockerfile.
    Image size and effective use of the cache are often a trade-off. Maintaining multiple Dockerfiles for different purposes is also an option (e.g. one for CI and one for production).
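
Why the order of commands matters can be shown with a toy model. This sketch assumes each build step is identified only by its instruction text, so it ignores the file contents that the real builder also considers for COPY and ADD. The function name and the instruction strings are hypothetical.

```python
def reusable_steps(previous_build, current_build):
    """Count the leading steps that can still be served from the cache."""
    count = 0
    for old, new in zip(previous_build, current_build):
        if old != new:
            break          # the first changed step invalidates all later steps
        count += 1
    return count


# The frequently changed command comes early: only FROM stays cached.
early_old = ["FROM base", "RUN frequently-changed v1", "RUN rarely-changed"]
early_new = ["FROM base", "RUN frequently-changed v2", "RUN rarely-changed"]
print(reusable_steps(early_old, early_new))  # prints 1

# The frequently changed command comes last: everything before it stays cached.
late_old = ["FROM base", "RUN rarely-changed", "RUN frequently-changed v1"]
late_new = ["FROM base", "RUN rarely-changed", "RUN frequently-changed v2"]
print(reusable_steps(late_old, late_new))    # prints 2
```
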