
Container Image Architecture


Do you know how a Docker image is built from a Dockerfile?
I'll talk about the OCI image specification and the Docker image architecture.
I'll cover how images are built, how the build cache is probed, and how the Dockerfile is parsed.

Aya (Igarashi) Ozawa

October 31, 2016


Transcript

  1. Container type virtualisation study #10
    CONTAINER
    IMAGE
    ARCHITECTURE


  2. AYA IGARASHI
    @Ladicle
    NTT Communications
    Software Engineer
    I'm developing
    cloud services.


  3. Have you ever built an image
    from a Dockerfile?


  4. Do you know
    its architecture?


  5. Do you know OCI?
    (Open Container Initiative)
    The current specification?
    Docker vs. OCI
    3 months ago…


  6. AGENDA
    01 OCI IMAGE SPEC
    Overview
    Terminology
    Calculate IDs
    Image layout
    Filesystem layers
    02 DOCKER IMAGE ARCHITECTURE
    Directory structure
    03 HOW TO CREATE A DOCKER IMAGE
    Overview
    Building Logs
    Parse the Dockerfile
    Build images
    Probe Cache
    04 BEST PRACTICE OF THE DOCKERFILE
    Reduce image size
    Use caches well


  7. OCI
    Image
    Specification


  8. Overview
    Latest version is v1.0.0-rc1!
    01 HOW TO MANAGE?
    Development of the spec happens on GitHub.
    02 DEFINED
    Basic image layout (constitutive formats)
    Filesystem layers
    03 NOT YET DEFINED
    Security practices
    Caching mechanism
    Signatures are under discussion now :)


  9. Who defines it?
    TOP 10 CONTRIBUTORS (name: GitHub organizations)
    Vincent Batts: ATOMIC, OCI
    Antonio Murdaca: docker, ATOMIC, kubernetes-incubator
    W. Trevor King: numfocus, ipfs, swcarpentry
    Brandon Philips: coreos, systemd, goraft
    Lei Jitang: huawei
    Jonathan Boulle: coreos
    Sergiusz Urbaniak: coreos
    Stephen Day: docker
    Rob Dolin: microsoft
    xiekeyang: huawei


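  10. Terminology
    [Diagram: a Repository holds Refs (v1, latest, etc.). A ref points to a Manifest List, which holds Manifests (amd, ppc64, etc.). Each Manifest references one Image Config (ENV, cpu, etc.) and Layers 0 to n (changesets). The edges carry 1 and n cardinality labels.]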


  11. Calculate identifiers
    Hash algorithm is SHA256.
    TAKE DIGEST OF JSON FILE
    Manifest List -> Manifest List ID
    Manifest -> Manifest ID
    Image Config -> Image Config ID
    DIFF ID
    Layer 0 ... Layer n: take the digest of the unpacked (uncompressed) layer data.
    CHAIN ID
    ChainID(layerN) =
    SHA256hex(ChainID(layerN-1) + " " + DiffID(layerN))
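
The CHAIN ID recurrence on this slide can be sketched in Python. This is a minimal sketch, not the Docker or OCI reference implementation: it assumes digests are written as `sha256:<hex>` strings and that the ChainID of the first layer equals its DiffID, as the OCI image spec describes. The function names `diff_id` and `chain_ids` are hypothetical.

```python
import hashlib
from typing import List


def diff_id(uncompressed_layer_tar: bytes) -> str:
    """DiffID: SHA256 digest of the uncompressed layer tar (assumed input)."""
    return "sha256:" + hashlib.sha256(uncompressed_layer_tar).hexdigest()


def chain_ids(diff_ids: List[str]) -> List[str]:
    """Return the ChainID of every layer, bottom layer first."""
    chain: List[str] = []
    for current in diff_ids:
        if not chain:
            # Base case: the first layer's ChainID is its DiffID.
            chain.append(current)
        else:
            # Recurrence: digest of "<parent ChainID> <DiffID>".
            joined = (chain[-1] + " " + current).encode("utf-8")
            chain.append("sha256:" + hashlib.sha256(joined).hexdigest())
    return chain
```

Because each ChainID folds in the parent's ChainID, the same layer content stacked on a different base yields a different ChainID, while its DiffID stays the same.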


  12. Image layout
    Example
    $ tree
    blobs/
    └── sha256/
        # Manifest list
        ├── e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f
        # Manifest
        ├── afff3924849e458c5ef237db5f89539274d5e609db5db935ed3959c90f1f2d51
        # Image config
        ├── 5b0bcabd1ed22e9fb1310cf6c2dec7cdef19f0ad69efa1f392e94a4333501270
        # Layer (tar stream)
        └── e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f
    refs/
    # tag
    ├── v1.0
    └── stable-release
    01 Refs object has Manifest list ID
    02 Manifest list has Manifest IDs
    03 Manifest has ImageID & LayerIDs
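
The three lookups on this slide (ref, then manifest list, then manifest) can be followed with a short Python sketch. It assumes the layout shown here (`refs/<tag>` and `blobs/<algorithm>/<hex>`) and the JSON field names `digest`, `manifests`, `config`, and `layers` as used by the OCI image spec; the helper names `read_blob` and `resolve` are hypothetical, and no digest verification is done.

```python
import json
from pathlib import Path


def read_blob(layout: Path, digest: str) -> bytes:
    """Read a blob addressed as "<algorithm>:<hex>" from blobs/."""
    algorithm, hex_digest = digest.split(":", 1)
    return (layout / "blobs" / algorithm / hex_digest).read_bytes()


def resolve(layout: Path, ref: str) -> dict:
    """Follow ref -> manifest list -> manifests -> config and layer IDs."""
    # 01: the refs object holds the manifest list ID.
    ref_descriptor = json.loads((layout / "refs" / ref).read_text())
    manifest_list = json.loads(read_blob(layout, ref_descriptor["digest"]))

    resolved = {}
    # 02: the manifest list holds manifest IDs (one per platform).
    for entry in manifest_list["manifests"]:
        manifest = json.loads(read_blob(layout, entry["digest"]))
        # 03: each manifest holds the image config ID and the layer IDs.
        resolved[entry["digest"]] = {
            "config": manifest["config"]["digest"],
            "layers": [layer["digest"] for layer in manifest["layers"]],
        }
    return resolved
```

Calling `resolve(Path("."), "v1.0")` inside a layout directory would return one entry per manifest in the list.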


  13. Filesystem layers
    SUPPORTED CHANGE TYPES
    Additions
    Modifications
    Removals
    Additions and modifications are handled the same way.
    MARKER OF DELETED FILES
    A whiteout filename consists of the prefix .wh. plus the basename of the path to be deleted.
    [Diagram: read-only (R/O) layers stacked under a read-write (R/W) layer in a union filesystem, with file A shown for the ADD, MODIFY, and DEL cases.]


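  14. Filesystem layers
    SUPPORTED CHANGE TYPES
    Additions
    Modifications
    Removals
    A writable layer is added when you run a container.
    MARKER OF DELETED FILES
    A whiteout filename consists of the prefix .wh. plus the basename of the path to be deleted.
    [Diagram: read-only (R/O) layers stacked under a read-write (R/W) layer in a union filesystem. The ADD, MODIFY, and DEL columns show files A, B, and C in the layers, with a .wh. whiteout entry masking A in the DEL case.]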
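
The whiteout rule can be illustrated with a small Python sketch that merges layers bottom-up. It is a simplification under stated assumptions: each layer is modelled as a dict of path to content rather than a tar archive, opaque whiteouts are not handled, and `apply_layers` is a hypothetical name.

```python
import posixpath
from typing import Dict, List

WHITEOUT_PREFIX = ".wh."


def apply_layers(layers: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge layers (lowest first) into the view a union filesystem shows."""
    merged: Dict[str, str] = {}
    for layer in layers:
        # Removals: ".wh.<name>" hides <name> (and anything under it)
        # that came from the lower layers.
        for path in layer:
            directory, name = posixpath.split(path)
            if name.startswith(WHITEOUT_PREFIX):
                target = posixpath.join(directory, name[len(WHITEOUT_PREFIX):])
                for existing in list(merged):
                    if existing == target or existing.startswith(target + "/"):
                        del merged[existing]
        # Additions and modifications are the same operation:
        # the entry in the upper layer simply wins.
        for path, content in layer.items():
            if not posixpath.basename(path).startswith(WHITEOUT_PREFIX):
                merged[path] = content
    return merged
```

For example, `apply_layers([{"etc/a": "v1"}, {"etc/a": "v2"}, {"etc/.wh.a": ""}])` leaves no `etc/a` in the merged view, although both versions still exist in the lower layers.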


  15. Docker
    Image
    Architecture


  16. What is the difference?
    Q. WHAT IS THE DIFFERENCE BETWEEN DOCKER AND THE OCI SPEC?
    There is no significant difference now. Only details such as file names
    and configuration file entries differ.
    Q. HOW DOES DOCKER SUPPORT THE OCI SPECIFICATION?
    Docker does not seem to be adopting the OCI image specification directly;
    it seems to be developing a format converter instead.
    https://github.com/docker/docker/issues/25779
    FYI
    rkt appears positive about supporting the OCI spec.
    It was designed from the start with an architecture that can support
    multiple image formats.
    https://github.com/coreos/rkt/issues/3204


  17. Directory structure
    Images and layers are stored in different directories.
    |-- image
    |   `-- overlay
    |       |-- imagedb
    |       |   |-- content
    |       |   |   `-- sha256
    |       |   |       `-- ee4603260daafe1a8c2f3b78fd760922918ab2441cbb28…
    |       |-- layerdb
    |       |   |-- sha256
    |       |   |   `-- 9007f5987db353ec398a223bc5a135c5a9601798ba20a1abba…
    |       |   |       |-- cache-id
    |       |   |       |-- diff
    |       |   |       |-- size
    |       |   |       `-- tar-split.json.gz
    |       `-- repositories.json
    What is the tar-split file? Docker uses the tar-split library.
    It provides consistent tar archives for the image layer content.
    https://github.com/vbatts/tar-split


  18. Create
    The Docker image
    With the Dockerfile


  19. Overview
    Docker client
    $ docker build -t test ./
    01 Analyse command arguments
    and compress the target directory into a tarball.
    02 If a .dockerignore file exists, the client
    modifies the context to exclude files and
    directories that match patterns in it.
    03 Call the build image Remote API
    (default: Unix domain socket)
    Docker daemon
    04 Create builder instance
    1. Check CacheFrom option
    2. Get image cache (optional)
    Fetch an image as a cache if the CacheFrom option is set.
    3. Parse Dockerfile
    Parse the Dockerfile and return the AST (Nodes) to the builder instance.
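
The client-side steps (pack the context, skip what .dockerignore excludes) can be sketched in Python as follows. This is only an approximation under stated assumptions: it matches patterns with `fnmatch`, whose `*` also crosses directory separators, and it ignores `!` exception rules, so it does not reproduce Docker's real matching semantics. The function names are hypothetical.

```python
import fnmatch
import io
import tarfile
from pathlib import Path
from typing import List


def read_ignore_patterns(context: Path) -> List[str]:
    """Return non-empty, non-comment lines of .dockerignore, if present."""
    ignore_file = context / ".dockerignore"
    if not ignore_file.exists():
        return []
    lines = ignore_file.read_text().splitlines()
    return [l.strip() for l in lines if l.strip() and not l.startswith("#")]


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """True if the path matches a pattern or lives under an ignored directory."""
    return any(
        fnmatch.fnmatch(rel_path, p) or rel_path.startswith(p.rstrip("/") + "/")
        for p in patterns
    )


def build_context_tar(context: Path) -> bytes:
    """Pack the build context into an in-memory tar, skipping ignored files."""
    patterns = read_ignore_patterns(context)
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for path in sorted(context.rglob("*")):
            rel = path.relative_to(context).as_posix()
            if path.is_file() and not is_ignored(rel, patterns):
                tar.add(path, arcname=rel)
    return buffer.getvalue()
```

The resulting bytes correspond to the tarball that the client would send to the daemon in step 03.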


  20. Overview
    Docker daemon
    05 Build images
    1. Parse LABEL command
    2. Execute all commands
      1. Probe cache
      2. Start a temporary container
      3. Execute a command
      4. Commit changes
    3. Check unused ARG
    4. Associate tags to image
    [Diagram: for each Node of the AST (5.2.2), if no cache is found, the daemon executes the command in a writable layer on top of the read-only base image layers 0 to n and parent image layers 0 to n, producing the current layer.]


  21. Building log
    $ docker build -t test ./
    # Call remote API (send tarball)
    Sending build context to Docker daemon 557.1 kB
    Sending build context to Docker daemon 1.114 MB
    Sending build context to Docker daemon 1.425 MB
    # Fetch base image
    Step 1 : FROM python:3.5.2-alpine
    ---> a047e3d0ae2b
    # Show detail
    Step 2 : RUN apk add --no-cache git
    # Run container (use parent image)
    ---> Running in 7785e4cfb01a
    # Run command & Commit container
    ---> f413fb825b75
    # Delete temporary container
    Removing intermediate container 7785e4cfb01a
    Successfully built f413fb825b75


  22. Parse the Dockerfile
    1. Create root Node
    2. Detect command
      1. Skip comment or empty line
      2. Send value to the parse dispatcher
      3. Create Nodes
        1. Put Node in the Next
      4. Return Node
    3. Append Node to the Children slice
    4. Return root Node
    Node
      Value: string
      Next: Node
      Children: []Node
    [Diagram: the root Node holds Children[0], Children[1], ..., Children[n]. The first Node of each child holds the command name in Value, and further Nodes are chained through Next.]
    The LABEL command is parsed separately.
    The Docker daemon collects LABEL commands into one Node and
    appends it to the Children of the root Node.
    https://docs.docker.com/engine/reference/builder/#/label
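
The Node structure and the parsing steps can be sketched in Python. This is a simplified sketch, not Docker's parser: it assumes every command's arguments are whitespace-delimited (the real parse dispatcher handles each command differently), it does not handle line continuations, and it skips the separate LABEL handling. `parse_line` and `parse` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    value: str = ""
    next: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def parse_line(line: str) -> Node:
    """Build a chain: command name first, then one Node per argument via next."""
    words = line.split()
    head = Node(value=words[0].lower())
    current = head
    for word in words[1:]:
        current.next = Node(value=word)
        current = current.next
    return head


def parse(dockerfile: str) -> Node:
    """Return a root Node whose children are the parsed commands."""
    root = Node()
    for raw in dockerfile.splitlines():
        line = raw.strip()
        # Skip comments and empty lines.
        if not line or line.startswith("#"):
            continue
        root.children.append(parse_line(line))
    return root
```

Parsing the two-step Dockerfile from the building log would give a root with two children whose values are `from` and `run`.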


  23. Probe Cache
    01 Check the NoCache and CacheBusted flags.
    If either flag is true, return false (no cache exists).
    02 Get images related to the parent image ID from the local image store.
    If the CacheFrom option is specified, the local image store is not used.
    03 Compare the current image config with each cached image config.
    (Do NOT compare the "Image" nor "Hostname" fields)
    04 Choose the latest image as the cache and return true.
    The CacheBusted flag cannot be set from anywhere X(
    If this flag could be applied to a RUN command, we could freely use RUN
    commands such as the following in the Dockerfile.
    RUN pip install -r requirement
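
The four probe steps can be sketched in Python. This is an illustrative model only: the `CachedImage` record, the `probe_cache` function, and the in-memory `store` list are hypothetical, and the sketch returns the cached image ID (or `None`) instead of a boolean. The CacheFrom branch is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Fields that are excluded from the config comparison.
IGNORED_FIELDS = ("Image", "Hostname")


@dataclass
class CachedImage:
    image_id: str
    parent_id: str
    created: float  # creation time, larger means newer
    config: Dict[str, object]


def comparable(config: Dict[str, object]) -> Dict[str, object]:
    """Drop the fields that must not take part in the comparison."""
    return {k: v for k, v in config.items() if k not in IGNORED_FIELDS}


def probe_cache(parent_id: str, config: Dict[str, object],
                store: List[CachedImage],
                no_cache: bool = False,
                cache_busted: bool = False) -> Optional[str]:
    # 01: either flag disables the cache.
    if no_cache or cache_busted:
        return None
    # 02 + 03: children of the parent image whose config matches.
    candidates = [
        image for image in store
        if image.parent_id == parent_id
        and comparable(image.config) == comparable(config)
    ]
    if not candidates:
        return None
    # 04: choose the latest matching image.
    return max(candidates, key=lambda image: image.created).image_id
```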


  24. Best
    Practice
    Of the Dockerfile


  25. Reduce image size
    When you delete a file that was created in an earlier layer, only a
    whiteout file is created. Therefore, the image size is not reduced.
    Modification and addition are the same operation, so a modification
    also increases the image size.
    If you want to reduce the image size, combine as many steps as
    possible into one command in order to reduce the number of layers.
    The --virtual option of apk will help with this.
    If the number of layers has already grown, I recommend
    docker-squash. It is a utility to squash multiple docker layers into
    one in order to create an image with fewer and smaller layers.
    https://github.com/jwilder/docker-squash
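
A rough way to see why deleting in a later layer does not help is to add up layer sizes in Python. This is a toy model with made-up sizes: each layer is a dict of path to size in bytes, a whiteout entry is counted as zero bytes, and `image_size` is a hypothetical helper.

```python
from typing import Dict, List


def image_size(layers: List[Dict[str, int]]) -> int:
    """Image size is modelled as the sum of everything stored in every layer."""
    return sum(sum(layer.values()) for layer in layers)


# Three commands, three layers: install build tools, build, then delete tools.
separate = [
    {"usr/build-tools": 100_000_000},
    {"app/bin": 5_000_000},
    {"usr/.wh.build-tools": 0},  # only a whiteout marker is added
]

# The same work done in one command: the tools never reach a committed layer.
combined = [
    {"app/bin": 5_000_000},
]
```

In this model `image_size(separate)` is 105,000,000 bytes and `image_size(combined)` is 5,000,000 bytes, even though the final filesystem view is the same.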


  26. Use caches well
    A RUN command cannot detect a change in the content it fetches,
    such as with `git clone`.
    There is a workaround, but ...
    I recommend using the COPY or ADD command instead.
    Once a command in the Dockerfile is changed,
    subsequent lines cannot use the cache.
    Put commands that change frequently in the
    second half of the Dockerfile.
    Image size and effective use of the cache are
    often a trade-off. Maintaining multiple
    Dockerfiles for different purposes is also an
    option (e.g. for CI or production).
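
The rule that a changed command invalidates the cache for every later line can be modelled in a few lines of Python. This sketch compares instruction text only, which is an assumption made for brevity; `cached_steps` is a hypothetical name, and the example instructions are illustrative.

```python
from typing import List


def cached_steps(previous: List[str], current: List[str]) -> List[bool]:
    """For each instruction of the new Dockerfile, report whether the cache can be used."""
    hits: List[bool] = []
    still_cached = True
    for index, instruction in enumerate(current):
        unchanged = index < len(previous) and previous[index] == instruction
        # The first changed instruction breaks the cache for all later ones.
        still_cached = still_cached and unchanged
        hits.append(still_cached)
    return hits


old = ["FROM python:3.5.2-alpine", "RUN apk add --no-cache git", "COPY app /app"]
late_change = ["FROM python:3.5.2-alpine", "RUN apk add --no-cache git", "COPY src /app"]
early_change = ["FROM python:3.5.2-alpine", "RUN apk add --no-cache curl", "COPY app /app"]
```

Changing only the last line keeps the first two steps cached, while changing the second line also rebuilds the unchanged COPY step after it.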


  27. Thanks for watching!
    See You Next Time
    @ladicle
