
Efficient Container Image Updating in Low-bandwidth Networks with Delta Encoding

松本直樹
September 26, 2023


Transcript

  1. Efficient Container Image Updating
    in Low-bandwidth Networks with Delta Encoding
    IC2E 2023 Session 1: Containers and micro-services
    September 26, 2023
    Naoki Matsumoto, Daisuke Kotani, Yasuo Okabe
    (Kyoto University, Japan)

  2. Background
    Container: Lightweight isolation technology.
    • A container's processes, rootfs, and network namespaces are isolated from the host.
    • Users can bundle and distribute environments as container images.
    → This makes it easy to provision or update environments.
    • Often used in cloud and edge computing environments.
    Container image: Bundle of a container's rootfs
    • Each container uses it in read-only mode.
    Container image:
    /
    ├ etc
    ├ usr
    │ └ local
    ├ home
    │ ├ ubuntu
    │ └ public
    └ opt


  3. Background
    Containers are increasingly used in network-resource-restricted environments.
    • Bandwidth is low (e.g., cellular: 50 to 300 Mbps [1]).
    To start or update containers, users download and expand container images (pull).
    [Figure: containers on devices behind a cellular link pull a container image from the cloud through the ISP and run with that image.]
    [1] Mobile access bandwidth in practice: measurement, analysis, and implications (Xinlei Yang et al., 2022)


  4. Problems in Container Image Updating
    Large update data causes problems on the low-bandwidth network between the cloud and IoT devices:
    • Cost increases.
    • Congestion occurs.
    • Deployment takes too much time.
    → Lightweight and fast updating is needed!


  5. Current Container Image Updating
    Current container runtimes (e.g., containerd) provide layer-based images.
    Layer-based images cannot provide efficient updates.
    [Figure: Time to update from postgres:13.1 to postgres:13.2. x-axis: network bandwidth (10 Mbps, 50 Mbps, 100 Mbps, 500 Mbps, 1 Gbps, 5 Gbps, 10 Gbps); y-axis: time to pull (sec), split into download and expand. The slide marks the network environments assumed in this work, where download time is dominant.]
    [Figure: an image made of Layer-0, Layer-1, and Layer-2; building with an updated Dockerfile or source code produces an updated layer.]


  6. Related Works
    Lazy-pulling [2][5]
    • Preferentially downloads the files required to start the container.
    Flow between the client, the lazy-pulling plugin, and the container registry:
    1. Request to start a container
    2. Request files to start the container
    3. Files or chunks
    4. A container starts
    5. Request a file when read
    6. Files or chunks
    [2] Slacker: Fast Distribution with Lazy Docker Containers (Tyler Harter et al., 2016)
    [5] stargz-snapshotter (https://github.com/containerd/stargz-snapshotter)


  7. Related Works
    File-by-file delta method[3]
    • Transfers file-by-file deltas between local images and required images.
    • Compares files and transfers each updated file in full.
    [Figure: the Starlight client, which already has some images, requests a new image; the Starlight server compares files between images and sends only new or updated files.]
    [3] Starlight: Fast Container Provisioning on the Edge and over the WAN (Jun Lin Chen et al., 2022)


  8. Problems in Related Works
    These works leave room to reduce update data size.
    Lazy-pulling
    • Data size is not reduced, and a stable low-latency network is required.
    File-by-file delta method
    • Cannot handle partial modifications to files efficiently.
    • Most of the content in some executables and shared libraries is not updated.
    [Figure: only part of the new file differs from the old file, yet the complete file must be transferred.]


  9. Proposed Method
    Reduce the data needed to update images by using delta encoding.
    • Transfer only the partial data required for the update.
    [Figure: Generating deltas (server): deltas between the old image and the new image are generated and packed into an update bundle. The update bundle is distributed over the low-bandwidth network. Applying deltas (client): the deltas are applied to the old image to produce the new image for the containers.]
    Non-layered updating: update data size is reduced!


  10. Results
    Update data size is reduced to 5 to 40% of that of existing methods.
    • Time to update is also reduced.
    • Performance degradation is small except in some cases.

    Time to update image (sec)
    Update               File-by-file delta   Proposed method
    postgres .1 - .2     5.17                 1.69
    postgres .2 - .3     4.95                 0.97
    mysql .29 - .30      12.56                3.23
    mysql .30 - .31      20.64                8.48

    Delta bundle size (MB)
    Update               File-by-file delta   Proposed method
    postgres .1 - .2     28.27                4.46
    postgres .2 - .3     26.57                3.79
    mysql .29 - .30      69.84                16.51
    mysql .30 - .31      118.56               47.26


  11. Challenges in Container Image Updating
    Delta encoding: generating deltas between files.
    → Updating container images has challenges in delta encoding.
    • Generating deltas on the server: the targets are many versioned files.
    • Deltas are generated for and applied to hundreds or thousands of files at a time.
    • The number of delta combinations is O(n²) for n versions.
    → We need to consider the time and load to generate them.
    • Applying deltas on the client: each delta must be applied.
    • Applying deltas takes time and consumes disk I/O and CPU resources.

  12. Overview of Proposed Method
    Our approach uses delta encoding for container image updating.
    • We use bsdiff as the delta encoding algorithm.
    • A server generates and merges deltas, and a client applies deltas to old images.
    [Figure: architecture. The server runs an update bundle server with a delta bundle store and obtains images from a registry. The client runs a container runtime with a snapshotter plugin, Di3FS, and containers. The update bundle is sent from the server to the client.]
    (0) Download image
    (1) Generate delta bundles
    (2) Generate update bundle with DeltaMerging
    (3) Work with runtime
    (4) Mount delta bundle with Di3FS
    (5) Provide container images


  13. Overview of Proposed Method
    This presentation explains the core parts of our method.
    • Delta Bundle Format, Merge-based Delta Generation Strategy
    • Server: DeltaMerging enables the merge-based strategy.
    • Client: Di3FS applies deltas lazily.
    [Figure: the same architecture diagram as on the previous slide, steps (0) to (5), with the server side labeled "Generating deltas quickly" and the client side labeled "Applying deltas lazily".]


  14. Delta Generation
    Deltas are generated for each file and packed into a delta bundle.
    • Delta encoding generates delta files for updated files.
    • New files are compressed.
    The manifest and config for the container are packed with the delta bundle as an update bundle.
    Old image:
    /
    ├ usr
    └ home
      └ ubuntu
        ├ fileA
        └ fileB
    New image:
    /
    ├ usr
    └ home
      └ ubuntu
        ├ fileA (updated)
        ├ fileB
        └ fileC (new)
    Update bundle:
    ・Manifest
    ・Config
    ・Delta bundle
      ・Metadata
        ・Structure of directories
        ・File attributes
      ・fileA.diff (delta file, produced by delta encoding)
      ・fileC (new file, compressed)
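
The classification described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: images are modeled as hypothetical dicts from path to file content, `plan_delta_bundle` and `toy_delta` are invented names, zlib stands in for whatever compression the real tool uses, and the delta encoder is passed in as a parameter (the slides use bsdiff).

```python
import zlib
from typing import Callable, Dict, Tuple

# Entry type names follow the metadata example shown in the slides.
FILE_DIFF = "FILE_DIFF"
FILE_NEW = "FILE_NEW"


def plan_delta_bundle(
    old_image: Dict[str, bytes],
    new_image: Dict[str, bytes],
    make_delta: Callable[[bytes, bytes], bytes],
) -> Dict[str, Tuple[str, bytes]]:
    """Return {path: (entry_type, payload)} for files that need data in the bundle."""
    entries = {}
    for path, new_data in new_image.items():
        old_data = old_image.get(path)
        if old_data is None:
            # New file: there is no base to diff against, so ship it compressed.
            entries[path] = (FILE_NEW, zlib.compress(new_data))
        elif old_data != new_data:
            # Updated file: ship only a delta against the old content.
            entries[path] = (FILE_DIFF, make_delta(old_data, new_data))
        # Unchanged files need no payload (assumption: the client already has them).
    return entries


def toy_delta(old: bytes, new: bytes) -> bytes:
    """Placeholder encoder (assumption): a real system would call bsdiff here."""
    return zlib.compress(new)


old = {"/home/ubuntu/fileA": b"v1", "/home/ubuntu/fileB": b"same"}
new = {
    "/home/ubuntu/fileA": b"v2",
    "/home/ubuntu/fileB": b"same",
    "/home/ubuntu/fileC": b"new",
}
plan = plan_delta_bundle(old, new, toy_delta)
for path, (kind, payload) in sorted(plan.items()):
    print(path, kind, len(payload))
```

In this toy run, fileA becomes a FILE_DIFF entry, fileC becomes a FILE_NEW entry, and fileB contributes no payload, which mirrors the update bundle contents listed on the slide.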


  15. Delta Bundle
    Decompressing image layers (tar.gz) takes a long time, especially on IoT devices.
    → It increases pulling time and consumes CPU resources and disk I/O.
    Delta bundle does not require entire decompression and expansion.
    • Directory structures and file attributes are retained as metadata.
    • Di3FS provides updated image using metadata without applying all deltas.
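    [Figure: a delta bundle holding entries such as FileA and FileD.diff, with metadata entries that locate each entry by offset.]
    Entries locating data in the delta bundle:
    {
      "name": "FileD",
      "type": FILE_DIFF,
      "compressedSize": 74,
      "offset": 78
    }
    {
      "name": "FileA",
      "type": FILE_NEW,
      "compressedSize": 40,
      "offset": 0
    }
    Metadata entry for FileD:
    {
      "name": "FileD",
      "size": 180,
      "mode": 420,
      "uid": 1000,
      "gid": 1000,
      "type": FILE_DIFF,
      "childs": [],
      "compressedSize": 74,
      "offset": 78
    }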
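
The `offset` and `compressedSize` fields are what let a reader pull one entry out of the bundle without expanding the rest. The sketch below illustrates that idea only; the actual on-disk layout of the authors' delta bundle is not given in the slides, so `build_bundle`, `read_entry`, the use of zlib, and the JSON metadata list are all assumptions.

```python
import json
import zlib


def build_bundle(files):
    """Pack {name: (entry_type, raw_bytes)} into (metadata_json, body)."""
    body = bytearray()
    metadata = []
    for name, (kind, raw) in files.items():
        blob = zlib.compress(raw)
        metadata.append({
            "name": name,
            "type": kind,
            "compressedSize": len(blob),
            "offset": len(body),  # where this entry starts in the body
        })
        body += blob
    return json.dumps(metadata), bytes(body)


def read_entry(metadata_json, body, name):
    """Decompress a single entry located via its offset and compressedSize."""
    for entry in json.loads(metadata_json):
        if entry["name"] == name:
            start = entry["offset"]
            end = start + entry["compressedSize"]
            return zlib.decompress(body[start:end])
    raise KeyError(name)


meta, body = build_bundle({
    "FileA": ("FILE_NEW", b"contents of a new file"),
    "FileD": ("FILE_DIFF", b"delta bytes for FileD"),
})
print(read_entry(meta, body, "FileD"))
```

Because each entry is compressed independently, reading FileD touches only its own byte range, which is the property that lets a filesystem layer serve files without a full decompression pass.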

  16. Strategy for Delta Generation
    Each client can have a different old image.
    Three strategies are considered:
    1. Generating deltas for each client on request.
    2. Generating deltas for all combinations in advance.
    3. Combining the best points of 1 and 2.
    [Figure: Client A and Client B both update to v3, one from v2 and the other from v1, so a delta for v2 → v3 and a delta for v1 → v3 are both required.]


  17. Strategy for Delta Generation
    1. Generating deltas for each client on request.
    → Generating deltas takes a long time, so update time increases.
    Sequence: (1) the client sends a request, (2) the server generates deltas, (3) the server responds.
    2. Generating deltas for all patterns in advance.
    → The number of deltas grows as O(n²), which is impractical as the number of versions increases.
    Sequence: (0) the server generates deltas in advance, (1) the client sends a request, (2) the server responds.
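
To make the O(n²) growth concrete, the short sketch below counts how many deltas must be stored for n versions when every older-to-newer pair is pre-generated, next to the count when only consecutive versions are diffed. The function names are illustrative, and counting only upgrade directions is an assumption.

```python
def all_pairs_deltas(n: int) -> int:
    """Deltas needed to cover every (older, newer) version pair."""
    return n * (n - 1) // 2


def adjacent_deltas(n: int) -> int:
    """Deltas needed when only consecutive versions are diffed."""
    return max(n - 1, 0)


for n in (3, 10, 50):
    print(n, all_pairs_deltas(n), adjacent_deltas(n))
```

With 50 versions, the all-pairs strategy stores 1225 deltas per updated file while the consecutive-only strategy stores 49.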


  18. Strategy for Delta Generation
    3. Combining the best points of 1 and 2.
    • We employ an approach that uses pre-generated deltas and merging.
    • Deltas for (V_i, V_{i+1}) are generated in advance and merged on request.
    [Figure: the server holds the pre-generated deltas Δ(V_0, V_1), Δ(V_1, V_2), and Δ(V_2, V_3). Client A requests Δ(V_0, V_1), and the server sends the pre-generated delta. Client B requests Δ(V_1, V_3), and the server merges pre-generated deltas into Δ(V_1, V_3) and sends it as the response.]
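
The server-side lookup for this strategy can be sketched as choosing the chain of consecutive deltas that covers the requested version range. The function name `deltas_to_merge` and the use of integer version indices are assumptions for illustration; how the prototype actually selects deltas is not described in the slides.

```python
from typing import List, Tuple


def deltas_to_merge(src: int, dst: int) -> List[Tuple[int, int]]:
    """List the pre-generated adjacent deltas that together cover src -> dst."""
    if not 0 <= src < dst:
        raise ValueError("expected 0 <= src < dst")
    return [(i, i + 1) for i in range(src, dst)]


print(deltas_to_merge(0, 1))  # Client A: one pre-generated delta, sent as-is
print(deltas_to_merge(1, 3))  # Client B: two pre-generated deltas to merge
```

A chain of length one needs no merging, which corresponds to the "send pre-generated delta" path in the figure; longer chains go through the merge path.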


  19. Faster Delta Generation
    We use bsdiff to generate and apply deltas.
    • It is known as a highly efficient delta generation method.
    • It uses a suffix array to find the Longest Common Subsequence.
    [Figure: a block of operations in a delta file]
    • Offsets: 0x05, 0x02, 0x03 (ADD 5 bytes, INSERT 2 bytes, move +3 bytes)
    • Subsequence to ADD: 0x00, 0x00, 0x02, 0x02, 0x02
    • Subsequence to INSERT: 0xAB, 0xBC
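
How such a block turns an old file into a new one can be sketched as below. This follows the ADD / INSERT / move semantics named on the slide, with ADD interpreted as byte-wise addition modulo 256 as in the classic bsdiff patch format; the in-memory `Block` tuple, the function name, and the second block are assumptions, and the real delta file encodes lengths and offsets differently.

```python
from typing import Iterable, Tuple

# (add_bytes, insert_bytes, seek): a simplified in-memory form of one block.
Block = Tuple[bytes, bytes, int]


def apply_blocks(old: bytes, blocks: Iterable[Block]) -> bytes:
    new = bytearray()
    old_pos = 0
    for add_bytes, insert_bytes, seek in blocks:
        # ADD: each output byte is the old byte plus the stored difference (mod 256).
        for i, diff in enumerate(add_bytes):
            new.append((old[old_pos + i] + diff) % 256)
        old_pos += len(add_bytes)
        # INSERT: bytes with no counterpart in the old file are copied verbatim.
        new += insert_bytes
        # Move the read position in the old file before the next block.
        old_pos += seek
    return bytes(new)


old = bytes([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
blocks = [
    # Block from the slide: ADD 5 bytes, INSERT 2 bytes, move +3 bytes.
    (bytes([0x00, 0x00, 0x02, 0x02, 0x02]), bytes([0xAB, 0xBC]), 3),
    # Hypothetical second block: copy the last two old bytes unchanged.
    (bytes([0x00, 0x00]), b"", 0),
]
print(list(apply_blocks(old, blocks)))
```

Zero bytes in the ADD sequence mean "same as the old file", which is why deltas of mostly unchanged executables compress so well.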


  20. Faster Delta Generation: DeltaMerging
    DeltaMerging merges the delta files generated by bsdiff.
    • ADD and ADD are merged as ADD; all other combinations are merged as INSERT.
    • It only seeks through and merges delta blocks, so it is faster than generating deltas.
    [Figure: the v1→v2 delta file and the v2→v3 delta file, each a sequence of ADD and INSERT blocks, are merged into a v1→v3 delta file that again consists of ADD and INSERT blocks.]
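
The merge rule can be illustrated with a deliberately simplified model in which a delta is a list of per-byte operations rather than bsdiff blocks. Everything here (the tuple encoding, `apply_ops`, `merge_ops`) is an assumption made for the sketch and not the DeltaMerging implementation; it only shows why ADD over ADD stays an ADD while every other combination collapses to an INSERT.

```python
# Per-byte operations producing one output byte each:
#   ("ADD", old_index, diff) -> (old[old_index] + diff) % 256
#   ("INSERT", value)        -> value


def apply_ops(old: bytes, ops) -> bytes:
    out = bytearray()
    for op in ops:
        if op[0] == "ADD":
            _, idx, diff = op
            out.append((old[idx] + diff) % 256)
        else:
            out.append(op[1])
    return bytes(out)


def merge_ops(d12, d23):
    """Merge a v1->v2 delta and a v2->v3 delta into a v1->v3 delta."""
    merged = []
    for op in d23:
        if op[0] == "INSERT":
            # The v3 byte does not depend on v2 at all.
            merged.append(op)
            continue
        _, v2_idx, diff2 = op
        inner = d12[v2_idx]  # the operation that produced this v2 byte
        if inner[0] == "ADD":
            # ADD over ADD: sum the differences and keep referring to v1.
            _, v1_idx, diff1 = inner
            merged.append(("ADD", v1_idx, (diff1 + diff2) % 256))
        else:
            # ADD over INSERT: the byte is fully known, so it becomes an INSERT.
            merged.append(("INSERT", (inner[1] + diff2) % 256))
    return merged


v1 = b"abcd"
d12 = [("ADD", 0, 0), ("ADD", 1, 1), ("INSERT", 0x58), ("ADD", 3, 0)]
d23 = [("ADD", 1, 1), ("ADD", 2, 1), ("INSERT", 0x21), ("ADD", 0, 0)]
v3_direct = apply_ops(apply_ops(v1, d12), d23)
v3_merged = apply_ops(v1, merge_ops(d12, d23))
print(v3_direct, v3_merged)
```

The merge never reads v1, v2, or v3 themselves; it only walks the two deltas, which matches the slide's point that merging is cheaper than regenerating a delta from file contents.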


  21. Lazy Delta Applying: Di3FS
    Applying deltas on-demand when the file is opened → No need to apply all deltas
    This is the same approach as lazy-pulling.
    1. Showing new files with metadata in the delta bundle: ReadDir and GetAttr (e.g., ls -l) are answered by Di3FS with the new file attributes (metadata).
    2. Applying the delta when the file is opened: on Open (e.g., cat new.txt), Di3FS applies the delta file to the old file to produce the new file and returns OK.
    3. Reading data: Read(offset=0, len=4096) returns OK(Data=0xab, 0xbc, …).
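
The three steps above can be mimicked in a few lines. This is a toy model, not Di3FS: the class `LazyDeltaView`, its method names, and the append-only `apply_delta` used in the demo are all hypothetical, and a real implementation would sit behind a FUSE-style filesystem interface.

```python
from typing import Callable, Dict, List


class LazyDeltaView:
    """Toy stand-in for a filesystem layer that applies deltas on first open."""

    def __init__(self, metadata: Dict[str, dict], old_files: Dict[str, bytes],
                 deltas: Dict[str, bytes],
                 apply_delta: Callable[[bytes, bytes], bytes]):
        self.metadata = metadata
        self.old_files = old_files
        self.deltas = deltas
        self.apply_delta = apply_delta
        self.cache: Dict[str, bytes] = {}
        self.applied = 0  # number of deltas actually applied so far

    def read_dir(self) -> List[str]:
        # Answered from metadata alone; no delta is applied.
        return sorted(self.metadata)

    def get_attr(self, name: str) -> dict:
        return self.metadata[name]

    def open(self, name: str) -> None:
        # Apply the delta on first open and keep the result in memory.
        if name not in self.cache:
            self.cache[name] = self.apply_delta(self.old_files[name], self.deltas[name])
            self.applied += 1

    def read(self, name: str, offset: int, length: int) -> bytes:
        self.open(name)
        return self.cache[name][offset:offset + length]


view = LazyDeltaView(
    metadata={"a.txt": {"size": 5}, "b.txt": {"size": 4}},
    old_files={"a.txt": b"abc", "b.txt": b"xy"},
    deltas={"a.txt": b"de", "b.txt": b"zw"},
    apply_delta=lambda old, delta: old + delta,  # toy delta: append bytes
)
print(view.read_dir(), view.applied)
print(view.read("a.txt", 0, 4096), view.applied)
```

Listing the directory applies no deltas, and reading a.txt applies exactly one; b.txt is never reconstructed unless something opens it.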


  22. Implementation and Evaluation
    • Implemented for containerd 1.6.2.
    • Environment: an IoT device as the client on a slow cellular network.
    • Network parameters: throughput 50 Mbps, latency (RTT) 40 ms [1][4].
    • Showing results for postgres(13.1, .2, .3) and mysql(8.0.29, .30, .31).
    Client (Raspberry Pi 4B): CPU 4 cores, memory 8 GB
    Server (virtual machine): CPU 8 cores, memory 32 GB
    Network between client and server: emulated with tc (50 Mbps, 40 ms)
    [4] Revisiting the Arguments for Edge Computing Research (Blesson Varghese et al., 2021)


  23. Delta Size Reduction
    We compared delta size with the approach of Starlight [3] (file-by-file delta).
    → The proposed method reduces delta size to 5 to 40% of that of file-by-file delta.
    The size increase with DeltaMerging is small.
    With pre-generated deltas: delta bundle size (MB)
    Update               File-by-file delta   Proposed method
    postgres .1 - .2     28.27                4.46
    postgres .2 - .3     26.57                3.79
    mysql .29 - .30      69.84                16.51
    mysql .30 - .31      118.56               47.26

    [Figure: With merging deltas on request. Delta bundle size (MB) for postgres .1 - .3 and mysql .29 - .31, comparing file-by-file delta, binary delta encoding, and DeltaMerging. Plotted values in chart order: 31.02, 6.71, 5.29, 1.16, 5.36, 1.15.]
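
As a rough cross-check of these sizes against the 50 Mbps evaluation link, the sketch below computes the size ratio and the ideal transfer time for each update. The pairing of values with the two series follows the legend order of the chart, and the conversion assumes 1 MB = 10^6 bytes with no latency or protocol overhead, so the times are lower bounds rather than measured results.

```python
BANDWIDTH_MBPS = 50  # emulated link throughput from the evaluation setup

# Delta bundle sizes in MB: (file-by-file delta, proposed method)
sizes = {
    "postgres 13.1 -> 13.2": (28.27, 4.46),
    "postgres 13.2 -> 13.3": (26.57, 3.79),
    "mysql 8.0.29 -> 8.0.30": (69.84, 16.51),
    "mysql 8.0.30 -> 8.0.31": (118.56, 47.26),
}


def transfer_seconds(size_mb: float, mbps: float = BANDWIDTH_MBPS) -> float:
    # Assumes 1 MB = 10**6 bytes and ignores latency and protocol overhead.
    return size_mb * 8 / mbps


for update, (baseline, proposed) in sizes.items():
    ratio = proposed / baseline
    print(f"{update}: {ratio:.1%} of baseline, "
          f"{transfer_seconds(baseline):.2f}s -> {transfer_seconds(proposed):.2f}s")
```

For these four updates the ratio stays below 40%, in line with the range quoted on the slide, and the ideal transfer times show how much of the update time is bounded by bundle size at this bandwidth.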


  24. Breakdown of Delta Size Reduction
    Huge size reductions were seen in executables and shared libs.
    • bsdiff is designed for executable files.
    [Figure: Deltas between postgres:13.1 and postgres:13.2. x-axis: compressed new file size (bytes); y-axis: compressed size ratio (lower is better), with one region marked "Proposed method is better" and the other "file-by-file delta is better". Labeled files: /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1, /usr/lib/postgresql/13/lib/bitcode/postgres.index.bc, /usr/lib/x86_64-linux-gnu/libapt-pkg.so.5.0.2, /usr/lib/postgresql/13/bin/postgres.]


  25. Time to Generate Deltas
    • Time to generate deltas increased compared to file-by-file delta.
    • With DeltaMerging, generating deltas is much faster than non-merging generation.
    → Combining pre-generated deltas with merging reduces the time to generate deltas.
    Generating deltas: time to generate deltas (sec)
    Update               File-by-file delta   Delta encoding
    postgres .1 - .2     3.37                 19.99
    postgres .2 - .3     3.38                 18.03
    mysql .29 - .30      4.80                 113.89
    mysql .30 - .31      5.35                 156.94

    Comparison between generating and merging deltas: time to generate deltas (sec)
    Update               File-by-file delta   Delta encoding   DeltaMerging
    postgres .1 - .3     3.38                 22.33            5.87
    mysql .29 - .31      5.32                 167.08           23.96


  26. Time to Update Container Image
    Time to update is measured from downloading to mounting images.
    → When pre-generated deltas exist, the time is reduced.
    Merging deltas is a bit slow, and more improvements are required.
    [Figure: With pre-generated deltas. Time to update images (sec), split into download and mount, for file-by-file delta and the proposed method on postgres .1 - .2, postgres .2 - .3, mysql .29 - .30, and mysql .30 - .31.]
    [Figure: With merging deltas on request. Time to update images (sec), split into merge, download, and mount, for file-by-file delta and the proposed method on postgres .1 - .3 and mysql .29 - .31.]


  27. Performance Degradation on Applications
    Evaluated when updating from postgres:13.1 to postgres:13.2.
    • Time to compare files in new images with diff(1) increased greatly.
    • This is due to the delta-applying overhead in Di3FS.
    • No performance degradation was seen in the pgbench benchmark.
    • Di3FS only handles files in images, and the delta-applied result is retained in memory.
    • New data and modifications are handled by the native FS.
    → Once the container has started, severe performance issues will not occur.

    Elapsed time for diff(1)
    Method       Time (sec)
    Di3FS        6.019
    Native FS    0.234

    pgbench result
    Method       Time per transaction (ms)   Transactions per second
    Di3FS        15.748                      634.997
    Native FS    15.747                      635.037


  28. Summary
    Objective: Reducing data size and time to update container images.
    Proposal: Updating method with delta encoding.
    Evaluation: Our method reduces size to 5 to 40% of that of a file-by-file delta.
    • Huge reduction in executable binaries and shared libraries.
    • Performance degradation is small except in some cases.
    Conclusion: Delta encoding is also effective in container image updating.
    • File-specific delta encoding methods will reduce data size further.
    Prototype implementation is available at
    https://github.com/naoki9911/d4c
