Slide 1

Slide 1 text

CONTAINER IMAGE ARCHITECTURE
Container-type virtualisation study #10

Slide 2

Slide 2 text

AYA IGARASHI
@Ladicle
NTT Communications, Software Engineer
I'm developing cloud services.

Slide 3

Slide 3 text

Have you ever built an image from a Dockerfile?

Slide 4

Slide 4 text

Do you know its architecture?

Slide 5

Slide 5 text

Do you know OCI? (Open Container Initiative)
  The current specification?
  Docker vs. OCI 3 months ago…

Slide 6

Slide 6 text

AGENDA
01 OCI IMAGE SPEC
   Overview
   Terminology
   Calculate IDs
   Image layout
   Filesystem layers
02 DOCKER IMAGE ARCHITECTURE
   Directory structure
03 HOW TO CREATE A DOCKER IMAGE
   Overview
   Building logs
   Parse the Dockerfile
   Build images
   Probe cache
04 BEST PRACTICE OF THE DOCKERFILE
   Reduce the image size
   Use caches well

Slide 7

Slide 7 text

OCI Image Specification

Slide 8

Slide 8 text

Overview
Latest version is v1.0.0-rc1!
01 HOW TO MANAGE?
   Development of the spec happens on GitHub.
02 DEFINED
   Basic image layout (constitutive formats)
   Filesystem layers
03 NOT YET DEFINED
   Security practices
   Caching mechanism
   Signatures (under discussion now :)

Slide 9

Slide 9 text

Who defines it?
TOP 10 CONTRIBUTORS
   Contributor          Affiliations
   Vincent Batts        ATOMIC, OCI
   Antonio Murdaca      docker, ATOMIC, kubernetes-incubator
   W. Trevor King       numfocus, ipfs, swcarpentry
   Brandon Philips      coreos, systemd, goraft
   Lei Jitang           huawei
   Jonathan Boulle      coreos
   Sergiusz Urbaniak    coreos
   Stephen Day          docker
   Rob Dolin            microsoft
   xiekeyang            huawei

Slide 10

Slide 10 text

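Terminology
[Diagram: how the parts of an image relate, with 1-to-n cardinality labels on the links. Other labels: Changesets, Images]
   Repository
   Refs: v1, latest, etc.
   Manifest List
   Manifest: amd, ppc64, etc.
   Image Config: ENV, cpu, etc.
   Layers: Layer 0, Layer 1, ... Layer n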

Slide 11

Slide 11 text

Calculate identifiers
Hash algorithm is SHA256.
TAKE DIGEST OF JSON FILE
   Manifest List ID
   Manifest ID
   Image Config ID
DIFF ID
   Take the digest of the uncompressed layer data (Layer 0, Layer 1, ... Layer n).
CHAIN ID
   ChainID(layer0) = DiffID(layer0)
   ChainID(layerN) = SHA256hex(ChainID(layerN-1) + " " + DiffID(layerN))
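The Chain ID formula can be sketched in Python as follows. This is a minimal sketch, not the reference implementation: the layer byte strings are hypothetical stand-ins for real uncompressed layer tar archives, and the helper names `sha256_digest` and `chain_ids` are made up for illustration. It assumes the bottom layer's Chain ID equals its Diff ID and that digests are written as `sha256:<hex>`.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return a digest string in the 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def chain_ids(diff_ids: list[str]) -> list[str]:
    """Compute the Chain ID of every layer, bottom layer first."""
    chain: list[str] = []
    for diff_id in diff_ids:
        if not chain:
            # Base case: ChainID(layer0) = DiffID(layer0)
            chain.append(diff_id)
        else:
            # ChainID(layerN) = SHA256(ChainID(layerN-1) + " " + DiffID(layerN))
            chain.append(sha256_digest(f"{chain[-1]} {diff_id}".encode()))
    return chain


# Hypothetical uncompressed layer contents (assumption, not real tar data).
layers = [b"layer 0 tar bytes", b"layer 1 tar bytes", b"layer 2 tar bytes"]
diff_ids = [sha256_digest(layer) for layer in layers]
ids = chain_ids(diff_ids)

for diff_id, chain_id in zip(diff_ids, ids):
    print(diff_id[:19], "->", chain_id[:19])
```

Because each Chain ID folds in the previous one, it identifies a layer together with everything beneath it, while a Diff ID identifies only the layer's own changeset.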

Slide 12

Slide 12 text

Image layout
Example

   $ tree
   blobs/
   └── sha256/
       # Manifest list
       └── e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f
       # Manifest
       └── afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
       # Image config
       └── 5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270
       # Layer
       └── e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f
   refs/
   # tag
   └── v1.0
   └── stable-release

   tar stream

01 A refs object has the Manifest list ID.
02 The Manifest list has the Manifest IDs.
03 The Manifest has the Image ID and the Layer IDs.
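The chain of references (ref, then manifest list, then manifest, then config and layers) can be illustrated with an in-memory content-addressable store. This is only a sketch under stated assumptions: the `blobs` and `refs` dictionaries stand in for the `blobs/sha256/` and `refs/` directories, and the JSON field names (`manifests`, `config`, `layers`, `env`, `architecture`) are simplified placeholders rather than the real schema of the specification.

```python
import hashlib
import json

blobs: dict[str, bytes] = {}  # stands in for blobs/sha256/<hex>


def put_blob(data: bytes) -> str:
    """Store data under the hex SHA256 of its bytes and return that ID."""
    blob_id = hashlib.sha256(data).hexdigest()
    blobs[blob_id] = data
    return blob_id


def put_json(obj: dict) -> str:
    """Serialize a JSON document and store it; its ID is the digest of the file."""
    return put_blob(json.dumps(obj, sort_keys=True).encode())


# Build a tiny image bottom-up (simplified, hypothetical field names).
layer_id = put_blob(b"hypothetical layer tar stream")
config_id = put_json({"env": ["PATH=/usr/bin"], "architecture": "amd64"})
manifest_id = put_json({"config": config_id, "layers": [layer_id]})
manifest_list_id = put_json({"manifests": [manifest_id]})
refs = {"v1.0": manifest_list_id}  # stands in for refs/v1.0


def resolve(tag: str) -> tuple[dict, list[bytes]]:
    """Follow ref -> manifest list -> manifest -> (config, layers)."""
    manifest_list = json.loads(blobs[refs[tag]])
    manifest = json.loads(blobs[manifest_list["manifests"][0]])
    config = json.loads(blobs[manifest["config"]])
    layers = [blobs[layer] for layer in manifest["layers"]]
    return config, layers


config, layers = resolve("v1.0")
print(config["architecture"], len(layers))
```

Since every object is stored under the digest of its own bytes, changing a layer changes the manifest that names it, which in turn changes the manifest list and the ref.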

Slide 13

Slide 13 text

Filesystem layers
SUPPORTED CHANGE TYPES
   Additions
   Modifications
   Removals
Additions and modifications behave the same way.
MARKER OF DELETED FILES
   A whiteout filename consists of the prefix .wh. plus the basename of the path to be deleted.
[Diagram: read-only (R/O) layers and a read-write (R/W) layer combined by a union filesystem; file A is added (ADD), modified (MODIFY), and deleted (DEL)]

Slide 14

Slide 14 text

Filesystem layers
SUPPORTED CHANGE TYPES
   Additions
   Modifications
   Removals
A writable layer is added when you run a container.
MARKER OF DELETED FILES
   A whiteout filename consists of the prefix .wh. plus the basename of the path to be deleted.
[Diagram: read-only (R/O) layers and a read-write (R/W) layer combined by a union filesystem; files A, B, and C appear in the layers, and the DEL step for file A is recorded as the whiteout file .wh.A]
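How the three change types combine can be sketched by modelling each layer as a dictionary from path to content. This is a simplified model, assuming plain files only: it ignores directories, opaque whiteouts, and file metadata, and the function name `apply_layer` is made up for illustration.

```python
WHITEOUT_PREFIX = ".wh."


def apply_layer(rootfs: dict[str, str], layer: dict[str, str]) -> dict[str, str]:
    """Return a new filesystem view with one layer applied on top of rootfs."""
    merged = dict(rootfs)
    for path, content in layer.items():
        directory, _, basename = path.rpartition("/")
        if basename.startswith(WHITEOUT_PREFIX):
            # Removal: the whiteout marker names the basename to delete.
            target = basename[len(WHITEOUT_PREFIX):]
            merged.pop(f"{directory}/{target}" if directory else target, None)
        else:
            # Addition and modification are handled identically.
            merged[path] = content
    return merged


layers = [
    {"etc/a": "v1"},                 # ADD a
    {"etc/a": "v2", "etc/b": "v1"},  # MODIFY a, ADD b
    {"etc/.wh.a": ""},               # DEL a
]

view: dict[str, str] = {}
for layer in layers:
    view = apply_layer(view, layer)

print(view)
```

Note that the lower layers still contain both versions of `etc/a`; the whiteout only hides the file from the merged view.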

Slide 15

Slide 15 text

Docker Image Architecture

Slide 16

Slide 16 text

What is the difference?
Q. WHAT IS THE DIFFERENCE BETWEEN DOCKER AND THE OCI SPEC?
   There is no significant difference now. Details such as file names and configuration file entries differ.
Q. HOW DOES DOCKER SUPPORT THE OCI SPECIFICATION?
   Docker does not seem to be going to adopt the OCI image format itself; instead, it seems a format converter will be developed.
   https://github.com/docker/docker/issues/25779
FYI
   rkt seems positive about supporting the OCI spec. It was originally designed with an architecture that can support multiple image formats.
   https://github.com/coreos/rkt/issues/3204

Slide 17

Slide 17 text

Directory structure

   |-- image
   |   `-- overlay
   |       |-- imagedb
   |       |   `-- content
   |       |       `-- sha256
   |       |           `-- ee4603260daafe1a8c2f3b78fd760922918ab2441cbb28…
   |       |-- layerdb
   |       |   `-- sha256
   |       |       `-- 9007f5987db353ec398a223bc5a135c5a9601798ba20a1abba…
   |       |           |-- cache-id
   |       |           |-- diff
   |       |           |-- size
   |       |           `-- tar-split.json.gz
   |       `-- repositories.json

Images and layers are stored in different directories.
What is the tar-split file?
   Docker uses the tar-split library. It provides consistent tar archives for the image layer content.
   https://github.com/vbatts/tar-split

Slide 18

Slide 18 text

Create the Docker Image with the Dockerfile

Slide 19

Slide 19 text

Overview
$ docker build -t test ./
Docker client
01 Analyse the command arguments and compress the target directory into a tarball.
02 If a dockerignore file exists, the client modifies the context to exclude files and directories that match the patterns in it.
03 Call the build image Remote API (default: Unix domain socket).
Docker daemon
04 Create builder instance
   1. Check CacheFrom option
   2. Get image cache (optional)
      (4.2) Fetch an image as a cache if the CacheFrom flag is true.
   3. Parse Dockerfile
      (4.3) Parse the Dockerfile and return the AST (nodes).

Slide 20

Slide 20 text

Overview
05 Build images
   1. Parse LABEL command
   2. Execute all commands
      1. Probe cache
      2. Start a temporary container (if no cache is found)
      3. Execute a command
      4. Commit changes
   3. Check unused ARG
   4. Associate tags to image
[Diagram: inside the Docker daemon, each AST Node is executed (5.2.2) in a writable current layer placed on top of the read-only parent image layers (layer 0 ... layer n) and base image layers (layer 0 ... layer n)]
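The per-command loop (probe the cache, otherwise run and commit) can be sketched as below. This is an illustration under loud assumptions, not the daemon's code: the image ID is a made-up hash of the parent ID and the command, the container start and command execution are elided, the cache is a plain dictionary, and `FROM` is treated like any other command.

```python
import hashlib


def commit_id(parent_id: str, command: str) -> str:
    """Hypothetical image ID: a hash of the parent ID and the command."""
    return hashlib.sha256(f"{parent_id}\n{command}".encode()).hexdigest()[:12]


def build(commands: list[str],
          cache: dict[tuple[str, str], str]) -> tuple[str, list[str]]:
    """Run the per-command loop; return the final image ID and a log."""
    log: list[str] = []
    parent_id = ""  # no parent before the first command
    for step, command in enumerate(commands, start=1):
        log.append(f"Step {step} : {command}")
        cached = cache.get((parent_id, command))  # probe cache
        if cached is not None:
            log.append(" ---> Using cache")
            parent_id = cached
            continue
        # Cache miss: start a temporary container and execute the command
        # (both elided here), then commit the writable layer as a new image.
        image_id = commit_id(parent_id, command)
        cache[(parent_id, command)] = image_id
        log.append(f" ---> {image_id}")
        parent_id = image_id
    return parent_id, log


cache: dict[tuple[str, str], str] = {}
dockerfile = ["FROM python:3.5.2-alpine", "RUN apk add --no-cache git"]
first_id, first_log = build(dockerfile, cache)
second_id, second_log = build(dockerfile, cache)
print("\n".join(second_log))
```

The second build reuses every step because each lookup is keyed by the parent image and the command, which is why a cache hit depends on all preceding steps also matching.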

Slide 21

Slide 21 text

Building log

   $ docker build -t test ./
   # Call remote API (send tarball)
   Sending build context to Docker daemon 557.1 kB
   Sending build context to Docker daemon 1.114 MB
   Sending build context to Docker daemon 1.425 MB
   # Fetch base image
   Step 1 : FROM python:3.5.2-alpine
   # Use parent image
    ---> a047e3d0ae2b
   # Show detail
   Step 2 : RUN apk add --no-cache git
   # Run container
    ---> Running in 7785e4cfb01a
   # Run command & commit container
    ---> f413fb825b75
   # Delete temporary container
   Removing intermediate container 7785e4cfb01a
   Successfully built f413fb825b75

Slide 22

Slide 22 text

Parse the Dockerfile
1. Create root Node
2. Detect command
   1. Skip comment or empty line
   2. Send value to the parse dispatcher
   3. Create Nodes
      1. Put Node in the Next
   4. Return Node
3. Append Node to the Children slice
4. Return root Node

Node
   Value: string
   Next: Node
   Children: []Node

[Diagram: the root Node holds Children[0], Children[1], ... Children[n]; the value of each child Node is the command name, and further Nodes are linked through Next]

The LABEL command is parsed separately. The Docker daemon collects the LABEL commands into one Node and appends it to the Children of the root Node.
https://docs.docker.com/engine/reference/builder/#/label
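The Node structure and the parse steps can be sketched in Python. This is a simplified model and not Docker's parser: it splits every line on whitespace instead of using a per-command dispatcher, and it does not handle line continuations, JSON-form arguments, or the separate LABEL handling. The function names `parse_line` and `parse` are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    value: str = ""
    next: Optional["Node"] = None
    children: list["Node"] = field(default_factory=list)


def parse_line(line: str) -> Node:
    """Turn 'CMD arg1 arg2' into a Node chain: CMD -> arg1 -> arg2."""
    words = line.split()
    head = Node(value=words[0])  # value is the command name
    tail = head
    for word in words[1:]:
        tail.next = Node(value=word)  # put the new Node in Next
        tail = tail.next
    return head


def parse(dockerfile: str) -> Node:
    root = Node()  # create root Node
    for line in dockerfile.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip comment or empty line
        root.children.append(parse_line(stripped))  # append to Children
    return root  # return root Node


root = parse("# comment\nFROM python:3.5.2-alpine\n\nRUN apk add --no-cache git\n")
for child in root.children:
    args = []
    node = child.next
    while node is not None:
        args.append(node.value)
        node = node.next
    print(child.value, args)
```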

Slide 23

Slide 23 text

Probe Cache
01 Check the NoCache and CacheBusted flags. If either flag is true, return false (no cache exists).
02 Get the images related to the parent image ID from the local image store. If the CacheFrom option is specified, the local image store is not used.
03 Compare the current image config with each cached image config (the "Image" and "Hostname" fields are NOT compared).
04 Choose the latest image as the cache and return true.

The CacheBusted flag is NOT available everywhere X(
If I could set this flag for a RUN command, I would feel free to use RUN commands such as the following in the Dockerfile:
   RUN pip install -r requirement
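The probe logic can be sketched as follows. This is a minimal sketch, assuming a hypothetical in-memory `Image` record and a list standing in for the local image store; it returns the cached image ID or `None` rather than true/false, and it leaves out the CacheFrom path. Only the rule of ignoring the "Image" and "Hostname" fields comes from the slide.

```python
from dataclasses import dataclass
from typing import Optional

IGNORED_FIELDS = {"Image", "Hostname"}


@dataclass
class Image:
    image_id: str
    parent_id: str
    created: float  # creation time as a Unix timestamp
    config: dict


def same_config(a: dict, b: dict) -> bool:
    """Compare two configs, ignoring the "Image" and "Hostname" fields."""
    def strip(config: dict) -> dict:
        return {k: v for k, v in config.items() if k not in IGNORED_FIELDS}
    return strip(a) == strip(b)


def probe_cache(store: list[Image], parent_id: str, config: dict,
                no_cache: bool = False,
                cache_busted: bool = False) -> Optional[str]:
    # Flags first: either one disables the cache for this step.
    if no_cache or cache_busted:
        return None
    # Candidates share the parent image and have a matching config.
    candidates = [img for img in store
                  if img.parent_id == parent_id
                  and same_config(img.config, config)]
    if not candidates:
        return None
    # Choose the most recently created match.
    return max(candidates, key=lambda img: img.created).image_id


cmd = ["/bin/sh", "-c", "apk add --no-cache git"]
store = [
    Image("aaa", "p1", 1.0, {"Cmd": cmd, "Image": "x", "Hostname": "h1"}),
    Image("bbb", "p1", 2.0, {"Cmd": cmd, "Image": "y", "Hostname": "h2"}),
    Image("ccc", "p1", 3.0, {"Cmd": ["/bin/sh", "-c", "true"]}),
]
wanted = {"Cmd": cmd, "Image": "z", "Hostname": "h3"}
print(probe_cache(store, "p1", wanted))
```

Because the comparison only looks at the config, a RUN step whose command text is unchanged hits the cache even if what it would download has changed.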

Slide 24

Slide 24 text

Best Practice of the Dockerfile

Slide 25

Slide 25 text

Reduce the image size
When you delete a file that was created in an earlier layer, only a whiteout file is created. Therefore, the image size is not reduced.
A modification behaves the same way as an addition, so it also increases the image size.
If you want to reduce the image size, combine as much as possible into one command in order to reduce the number of layers. The --virtual option of apk helps with this.
If the number of layers has already grown, I recommend docker-squash. It is a utility that squashes multiple Docker layers into one in order to create an image with fewer and smaller layers.
https://github.com/jwilder/docker-squash
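Why deleting in a later layer does not shrink the image can be shown with a small size model. This is a sketch with hypothetical paths and file sizes: each layer is a dictionary from path to size in bytes, and the image size is taken to be the sum over all layers, with a whiteout counted as an empty marker file.

```python
def image_size(layers: list[dict[str, int]]) -> int:
    """Total image size: every layer's entries are stored, so sizes add up."""
    return sum(sum(layer.values()) for layer in layers)


# Hypothetical sizes in bytes (assumption, for illustration only).
separate = [
    {"build/cache.tar": 50_000_000, "app/bin": 1_000_000},  # RUN build
    {"build/.wh.cache.tar": 0},                             # RUN rm build/cache.tar
]
combined = [
    {"app/bin": 1_000_000},  # RUN build && rm build/cache.tar
]

print(image_size(separate))
print(image_size(combined))
```

In the two-command version the 50 MB archive stays in the first layer even though the merged filesystem no longer shows it; in the single-command version it never reaches any layer.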

Slide 26

Slide 26 text

Use caches well
The cache for a RUN command cannot detect a change in the content that the command itself fetches, such as with `git clone`. There is a workaround, but ... I recommend using the COPY or ADD command instead.
Once a command in the Dockerfile is changed, the subsequent lines cannot use the cache. Put the commands that change frequently in the second half of the Dockerfile.
Image size and effective use of the cache are often a trade-off. Making multiple Dockerfiles for different purposes (e.g. for CI or production) is also an option.
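The effect of instruction order on cache reuse can be estimated by counting how many leading instructions are unchanged between two builds. This is a rough sketch: the `(source v1)` suffix is a hypothetical way to represent the content of the copied files changing, and the function name `cached_steps` is made up for illustration.

```python
def cached_steps(previous: list[str], current: list[str]) -> int:
    """Count leading instructions that are unchanged and can reuse the cache."""
    count = 0
    for old, new in zip(previous, current):
        if old != new:
            break  # every later step must be rebuilt
        count += 1
    return count


# The frequently changing COPY placed early in the Dockerfile.
copy_first_v1 = ["FROM python:3.5.2-alpine",
                 "COPY . /app (source v1)",
                 "RUN apk add --no-cache git"]
copy_first_v2 = ["FROM python:3.5.2-alpine",
                 "COPY . /app (source v2)",
                 "RUN apk add --no-cache git"]

# The same COPY placed in the second half.
copy_last_v1 = ["FROM python:3.5.2-alpine",
                "RUN apk add --no-cache git",
                "COPY . /app (source v1)"]
copy_last_v2 = ["FROM python:3.5.2-alpine",
                "RUN apk add --no-cache git",
                "COPY . /app (source v2)"]

print(cached_steps(copy_first_v1, copy_first_v2))
print(cached_steps(copy_last_v1, copy_last_v2))
```

With the COPY placed last, the package installation step is still served from the cache after a source change; with the COPY placed first, it has to run again.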

Slide 27

Slide 27 text

Thanks for watching! See You Next Time @ladicle