Container Features of the Linux Kernel (2014-04-11)

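These are the slides from my talk at Docker Meetup Tokyo #2.

The reference URIs are clickable, but that does not seem to work in the Speakerdeck viewer, so please download the PDF.

This deck is a slightly abridged version of the material for the study session held the following day, adjusted a little with Docker users in mind, so you may find it useful to look at that material as well.

The next day's material is here: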
https://speakerdeck.com/tenforward/linuxkontenaru-men-kontenafalseji-chu-tozui-xin-qing-bao-2014-04-12

tenforward

April 11, 2014
Transcript

  1. Container Features of the Linux Kernel
    Introductory Edition
    KATOH Yasufumi (加藤泰文)
    Docker Meetup Tokyo #2
    2014-04-11

  2. KATOH Yasufumi (加藤泰文)
    · http://www.ten-forward.ws/
    · @ten_forward
    · http://gplus.to/tenforward
    · https://github.com/tenforward

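  3. Work
    · FirstServer, Inc., Infrastructure Development Department
      - Service development
      - Investigating various technologies
      - In fact, I hardly use containers at work
        - Previously, a little Virtuozzo
        - Currently, Docker in a very simple way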

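  4. Activities outside work
    · Investigating container-related technologies
      - It started around 2010, when I began looking into cgroup to see whether it could be used in our services
      - Translating the lxc man pages
    · Plamo Linux maintainer
    · Formerly somewhat active in the Japan Asterisk Users Group, in connection with developing an IP telephony service
    · Translating the Jetspeed-2 documentation
    · "Linux Engineer Training Reader, Revised New Edition" (【改訂新版】Linuxエンジニア養成読本, Gijutsu-Hyohron)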

  5. I write a blog
    · http://d.hatena.ne.jp/defiant/

  6. I run a study group
    And the next one is tomorrow!!
    I would like to hold one in Tokyo again too, so thank you in advance for your support.

  7. Today's topic
    · A Docker novice talks briefly about the Linux kernel container features that Docker uses
      - Container basics
      - How containers work on Linux
      - About LXC
    · This may differ a little from "containers" as seen from Docker's point of view

  8. Container basics

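  9. What is a container?
    · OS-level virtualization
    · A feature provided by the kernel
    · Kernel features create (multiple) independent spaces and partition and allocate resources among them
      - Processes are grouped, and each group's resource space is isolated from the other groups
      - Resource limits are applied to the grouped processes
    · "Isolation" rather than "virtualization"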

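  10. Advantages of containers
    · High density is possible
      - Only one OS (kernel) is running
    · Low overhead
      - No hardware virtualization is needed
    · Fast startup
      - No virtual machine is booted; from the host OS a process is simply started, so it is almost the same as starting an ordinary program
    · You do not necessarily have to run a whole system (application containers)
      - For example, only httpd runs inside the container
    · They run fine on top of virtual machines, too!
      - These days KVM can run on top of KVM, though, so this is not unique to containers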

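  11. Disadvantages of containers
    · Systems and programs for a different OS cannot be run
      - Naturally, since a process is simply started on the host OS
    · Operations that involve the kernel are not possible
      - Because the running kernel stays the same
      - For example, loading different modules per container
    · The kernel implementation becomes complex
      - Because everything is implemented as kernel functionality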

  12. Container implementations on Linux
    Kernel features (+ patches) + userspace tools that use those kernel features
    · Kernel + patches + userspace tools
      - OpenVZ / Virtuozzo (commercial)
      - Linux VServer
    · Kernel + userspace tools (more than last time! :-)
      - LXC
      - libvirt (lxc driver)
      - systemd (systemd-nspawn)
      - vzctl for upstream kernel
      - lmctfy
      - docker (libcontainer), 0.9 and later

  13. How containers work on Linux

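  14. Linux kernel versions and containers
    · 3.0
      - The setns() system call was implemented (glibc 2.14 or later)
    · 3.8
      - The version in which the main container features were all in place
    · 3.9
      - The version in which the features completed in 3.8 became practical (that is, they were implemented for the file systems other than XFS)
    · 3.12
      - The implementation for XFS was completed
      - However, versions up to 3.12.8 have some problems when used with LXC
    · However, cgroup is currently in the middle of being rebuilt!
      - "All About the Linux Kernel: Cgroup's Redesign" (linux.com)
      - Functionally, what exists today is fully usable, but major changes are planned, so take care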
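
As a rough illustration of the version milestones on this slide, the sketch below compares the running kernel's release string with 3.8 and 3.12. The helper name `version_ge` is made up for this example, and it looks only at the version number, so a distribution kernel with backported features may behave differently from what the message suggests.

```shell
# version_ge A B: succeed when dotted version A is >= B.
# Only the first three numeric fields are compared; suffixes such as
# "-generic" or "+" are ignored. (Hypothetical helper, not part of any tool.)
version_ge() (
  a=${1%%-*}; b=${2%%-*}
  IFS=.
  set -- $a; a1=${1%%[!0-9]*}; a2=${2%%[!0-9]*}; a3=${3%%[!0-9]*}
  set -- $b; b1=${1%%[!0-9]*}; b2=${2%%[!0-9]*}; b3=${3%%[!0-9]*}
  if   [ "${a1:-0}" -ne "${b1:-0}" ]; then [ "${a1:-0}" -gt "${b1:-0}" ]
  elif [ "${a2:-0}" -ne "${b2:-0}" ]; then [ "${a2:-0}" -gt "${b2:-0}" ]
  else [ "${a3:-0}" -ge "${b3:-0}" ]
  fi
)

kernel=$(uname -r)   # assumes a Linux host
if version_ge "$kernel" 3.12; then
  echo "$kernel: 3.12 or later (implementation for XFS complete)"
elif version_ge "$kernel" 3.8; then
  echo "$kernel: 3.8 or later (main container features present)"
else
  echo "$kernel: older than 3.8"
fi
```

The comparison is numeric per field, so 3.12 correctly sorts after 3.8, which a plain string comparison would get wrong.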

  15. Features used to implement containers on Linux
    · Grouping processes and isolating them from other groups
      - → Namespace
    · Resource limits for the grouped processes
      - → Cgroups (control groups)
    · chroot (pivot_root)
    · Others
      - Networking (veth, macvlan)
      - Capabilities
      - Checkpoint/Restore (CRIU)
      - And so on...
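
Capabilities appear above among the other building blocks. As a small sketch of how one might inspect them, the helper below tests a single bit of a capability mask such as the `CapEff` field in /proc/self/status. `has_cap` is a hypothetical name, and the bit numbers (0 for CAP_CHOWN, 21 for CAP_SYS_ADMIN) are taken from linux/capability.h.

```shell
# has_cap MASK BIT: succeed when bit BIT is set in the hexadecimal MASK.
# (Hypothetical helper for illustration.)
has_cap() { [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]; }

CAP_CHOWN=0
CAP_SYS_ADMIN=21

# Effective capability mask of the current process (Linux only).
mask=0
if [ -r /proc/self/status ]; then
  mask=$(awk '/^CapEff:/ { print $2 }' /proc/self/status)
fi
mask=${mask:-0}

for name in CAP_CHOWN CAP_SYS_ADMIN; do
  eval "bit=\$$name"
  if has_cap "$mask" "$bit"; then
    echo "$name: present"
  else
    echo "$name: absent"
  fi
done
```

Container runtimes typically start processes with a reduced mask, so running this inside and outside a container is one way to see the difference.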

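  16. Types of namespace (1)
    · Mount Namespace: 2.4.19
      - Isolates the set of mounts visible to a process and the operations on them. A mount or umount inside a namespace does not affect other namespaces
      - (Reference) "Applying mount namespaces" (IBM developerWorks)
    · UTS Namespace: 2.6.19
      - Isolates the set of values returned by uname(2), such as the host name. setdomainname(2) and sethostname(2) change only the values inside the namespace
    · PID Namespace: 2.6.24
      - Isolates the PID space. In a new PID namespace, PIDs are assigned starting from PID 1. The parent can see into a child PID namespace (its processes also have PIDs in the parent's space), but the child cannot see the parent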
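
The UTS namespace behaviour described above can be tried with unshare(1) from util-linux. This is only a sketch: it assumes root privileges (CAP_SYS_ADMIN) and that `unshare` and `hostname` are installed, and it reports that it skipped the demonstration otherwise. The function name `uts_demo` and the host name `container-demo` are made up for the example.

```shell
# Set a host name inside a fresh UTS namespace and print it.
# The host's own name is not touched.
uts_demo() {
  if ! command -v unshare >/dev/null 2>&1; then
    echo "skipped: unshare(1) not found"
    return 0
  fi
  unshare --uts sh -c 'hostname container-demo && hostname' 2>/dev/null \
    || echo "skipped: could not create a UTS namespace or set the host name"
}

uts_demo
echo "host still reports: $(uname -n)"
```

When the demonstration runs, the first line printed is the name set inside the namespace, and the second line shows that the host name outside is unchanged.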

  17. Types of namespace (2)
    · IPC Namespace: 2.6.19
      - Isolates SysV IPC objects and POSIX message queues
    · User Namespace: 2.6.23 to 3.8
      - An independent UID/GID space with a mapping to the outside space (for example, uid/gid 0/0 inside the isolated space can be 1000/1000 outside)
    · Network Namespace: 2.6.26
      - Isolates network resources: network devices, addresses, routing tables, sockets, filtering
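
The UID mapping of a user namespace is exposed in /proc/PID/uid_map as lines of the form "inside-start outside-start length". The sketch below looks up the outside ID for an inside ID from such lines; `map_id` is a hypothetical helper, and the sample mapping mirrors the 0 to 1000 example on the slide.

```shell
# map_id INSIDE_ID: read uid_map-style lines on stdin and print the
# corresponding outside ID. Fails when the ID is not mapped.
map_id() {
  awk -v id="$1" '
    id >= $1 && id < $1 + $3 { print $2 + (id - $1); found = 1; exit }
    END { exit !found }
  '
}

# uid 0 inside the namespace is uid 1000 outside (sample data).
printf '0 1000 1\n' | map_id 0

# Mapping of the current process, where the file exists (Linux 3.5 or later).
if [ -r /proc/self/uid_map ]; then
  map_id 0 < /proc/self/uid_map || echo "uid 0 is not mapped here"
fi
```

The same lookup applies to gid_map, which uses the identical format.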

  18. Working with namespaces
    · Create a new process with clone(2)
    · Use unshare(2) to control the execution context without creating a new process
      - Usage example of unshare
    · Use setns(2) to associate a process with an existing namespace
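
From the shell, unshare(1) and nsenter(1) in util-linux wrap unshare(2) and setns(2). To see which namespaces a process currently belongs to, one can read the symbolic links under /proc/PID/ns, as in this sketch. It assumes a Linux kernel recent enough to expose those links; `ns_id` and `list_ns` are names invented for the example.

```shell
# ns_id "uts:[4026531838]" -> 4026531838
ns_id() { printf '%s\n' "$1" | sed 's/^.*\[\([0-9]*\)]$/\1/'; }

# list_ns [PID]: print "type id" for each namespace of PID
# (default: the current shell). Prints nothing where /proc/PID/ns is absent.
list_ns() {
  for f in /proc/"${1:-$$}"/ns/*; do
    [ -L "$f" ] || continue
    printf '%s %s\n' "${f##*/}" "$(ns_id "$(readlink "$f")")"
  done
}

list_ns
```

Two processes share a namespace exactly when the identifiers printed for that type are equal, which is a quick way to check whether a process runs inside a container.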

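  19. Cgroup (1)
    Groups processes and applies resource limits to each group. It is not a mechanism specific to containers.
    · cpu
      - CFS (Completely Fair Scheduler) bandwidth control: limits the total time that the tasks in a group may run within a unit of time (implemented in 3.2)
        - (Reference) CFS bandwidth control in Linux 3.2
      - Relative shares: specifies the proportion of CPU time allocated between groups. For example, GroupA=100 and GroupB=50 gives A:B = 2:1
    · cpuacct
      - Reports the CPU resources (CPU time) used within a group
    · cpuset
      - Assigns the CPUs and memory nodes a group may use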
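
The arithmetic behind the two cpu controls can be sketched as follows. The helper names are hypothetical; the inputs correspond to the cgroup v1 files cpu.shares, cpu.cfs_quota_us and cpu.cfs_period_us, and the shares calculation assumes that every group is busy at the same time.

```shell
# share_percent MINE OTHER...: percentage of CPU time a group receives when
# its cpu.shares is MINE and the sibling groups have the OTHER values.
share_percent() {
  mine=$1; total=0
  for s in "$@"; do total=$((total + s)); done
  awk -v m="$mine" -v t="$total" 'BEGIN { printf "%.1f\n", 100 * m / t }'
}

# cfs_cpus QUOTA_US PERIOD_US: upper bound in CPUs under bandwidth control.
cfs_cpus() {
  awk -v q="$1" -v p="$2" 'BEGIN { printf "%.2f\n", q / p }'
}

echo "GroupA: $(share_percent 100 50)%"    # A:B = 2:1
echo "GroupB: $(share_percent 50 100)%"
echo "quota 50000us per 100000us: $(cfs_cpus 50000 100000) CPU"
```

Shares only matter under contention, whereas the CFS quota is a hard ceiling even when the machine is otherwise idle.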

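  20. Cgroup (2)
    · devices
      - Specifies access permissions and restrictions for devices
    · freezer
      - Suspends all processes in a group
    · memory
      - Limits memory resources (user memory, kernel memory)
    · blkio (Block IO)
      - I/O weight controller (2.6.33 and later): specifies the priority of a group
      - I/O throttling (2.6.37 and later): specifies the total bytes/second that the processes in a group may transfer to or from a device
        - (Reference) New in Linux 2.6.37: "I/O throttling"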
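
These controllers are driven by writing to files inside a group's directory. The sketch below wraps a few of those writes in hypothetical helper functions and, so that it can be run safely anywhere, exercises them against a scratch directory instead of a real cgroup. On a cgroup v1 host one would pass a directory such as /sys/fs/cgroup/freezer/test01 and run as root; the device number 8:0 is only an assumed example.

```shell
# Each helper takes the directory of one cgroup as its first argument.
freeze()         { echo FROZEN > "$1/freezer.state"; }
thaw()           { echo THAWED > "$1/freezer.state"; }
# limit_read_bps DIR MAJOR:MINOR BYTES_PER_SEC  (I/O throttling)
limit_read_bps() { echo "$2 $3" > "$1/blkio.throttle.read_bps_device"; }
# set_io_weight DIR WEIGHT  (I/O weight controller, relative priority)
set_io_weight()  { echo "$2" > "$1/blkio.weight"; }

# Dry run: a scratch directory stands in for the cgroup directory.
grp=$(mktemp -d)
freeze "$grp"
limit_read_bps "$grp" 8:0 1048576     # 1 MiB/s for device 8:0
set_io_weight "$grp" 500
for f in "$grp"/*; do
  printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
done
rm -rf "$grp"
```

On a real host each file belongs to its own controller hierarchy, so the freezer and blkio writes would go to different directories.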

  21. Cgroup (3)
    · hugetlb
      - Limits on hugetlb usage (3.6 and later)
      - mm/hugetlb: add new HugeTLB cgroup
    · perf_event
      - Monitoring with the perf tool per group (performance analysis)
    · net_cls
      - Tags packets with an identifier so that traffic control (tc) and netfilter (3.14 and later) can act on them
    · net_prio
      - Specifies network priority between groups, per interface
      - New in Linux 3.3: Network priority cgroup
      - New in Linux 3.3: Network priority cgroup (2)
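
The identifier that net_cls attaches is written to net_cls.classid in the form 0xAAAABBBB, where AAAA and BBBB are the major and minor parts of a tc handle. The following sketch converts a handle into that value; `classid_from_handle` is a hypothetical helper, and the group name and interface in the comments are placeholders.

```shell
# classid_from_handle MAJOR:MINOR -> value for net_cls.classid.
# tc handles are written in hexadecimal, so "10:1" means 0x10 and 0x1.
classid_from_handle() {
  maj=${1%%:*}; min=${1##*:}
  printf '0x%04x%04x\n' "$((0x$maj))" "$((0x$min))"
}

classid_from_handle 10:1

# On a cgroup v1 host (as root) the values would be written like this:
#   classid_from_handle 10:1 > /sys/fs/cgroup/net_cls/test01/net_cls.classid
#   echo "eth0 5" > /sys/fs/cgroup/net_prio/test01/net_prio.ifpriomap
```

A tc filter of the cgroup type can then match on that class, while net_prio.ifpriomap takes plain "interface priority" pairs.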

  22. Cgroup (4)
    · Cgroup can be used independently of containers
    · It is implemented as a pseudo file system called cgroupfs
    # mount -t tmpfs cgroup_root /sys/fs/cgroup
    # mkdir /sys/fs/cgroup/memory
    # mount -t cgroup -o memory cgroup /sys/fs/cgroup/memory (mount the memory subsystem)
    # mkdir /sys/fs/cgroup/memory/test01 (create a group named "test01")
    # echo $$ > /sys/fs/cgroup/memory/test01/tasks (register the process in the group)
    # cat /sys/fs/cgroup/memory/test01/tasks (check the processes in the group)
    2824
    2837
    # echo 30M > /sys/fs/cgroup/memory/test01/memory.limit_in_bytes
    (set a memory upper limit of 30M on the group)
    # cat /sys/fs/cgroup/memory/test01/memory.limit_in_bytes (check the limit)
    31457280
    # cat /sys/fs/cgroup/memory/test01/memory.usage_in_bytes (check the current usage)
    565248
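
The session above writes 30M and reads back 31457280 because the kernel interprets the suffix as a power of 1024. A minimal sketch of that conversion, together with the usage ratio from the last two commands, might look like this. Both helper names are made up for the example, and only the K, M and G suffixes are handled.

```shell
# to_bytes SIZE: expand a K/M/G suffix (powers of 1024) into bytes.
to_bytes() {
  case $1 in
    *[Kk]) echo $(( ${1%?} * 1024 )) ;;
    *[Mm]) echo $(( ${1%?} * 1024 * 1024 )) ;;
    *[Gg]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
    *)     echo "$1" ;;
  esac
}

# usage_percent USAGE LIMIT: usage as a percentage of the limit.
usage_percent() {
  awk -v u="$1" -v l="$2" 'BEGIN { printf "%.1f\n", 100 * u / l }'
}

limit=$(to_bytes 30M)
echo "limit: $limit bytes"
echo "usage: $(usage_percent 565248 "$limit")% of the limit"
```

With the numbers from the slide, the shell in group test01 is using a little under 2% of its 30 MiB allowance.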

  23. Latest developments in LXC

  24. Docker and LXC
    · Docker was originally software based on LXC and aufs
      - As of 0.7 it can use storage other than aufs
        - To support CentOS?
      - As of 0.9 it runs even without LXC (and, given a driver, on other container implementations)
    · LXC
      - Development started around 2008, led by Daniel Lezcano
      - In September 2013 the maintainers changed to Serge Hallyn and Stéphane Graber
      - 1.0.0 was released in February 2014

  25. LXC 1.0.0 (1)
    · Released on 20 February 2014!!
    · A cleaned-up API: liblxc1 and command-line tools built on it
    · Unprivileged containers: ordinary users can run containers (User Namespace)
    · Bindings for various languages
      - lua (in tree)
      - python3 (in tree)
      - Go (out of tree)
      - ruby (out of tree)
    · Clone and snapshot features
    · Command-line tools tidied up (unneeded ones removed)
    · Improved monitoring
    · The 1.0 series is supported for 5 years
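
Unprivileged containers rely on the user namespace mapping, which LXC 1.0 takes from `lxc.id_map` lines in the container configuration. As a sketch, the helper below derives those lines from /etc/subuid and /etc/subgid style entries ("name:start:count"). The function name, the user alice and the ID ranges are all invented for illustration, and later LXC releases spell the key differently.

```shell
# idmap_lines USER SUBUID_FILE SUBGID_FILE: print lxc.id_map lines that map
# container IDs starting at 0 onto the subordinate ranges of USER.
idmap_lines() {
  awk -F: -v u="$1" '$1 == u { print "lxc.id_map = u 0 " $2 " " $3; exit }' "$2"
  awk -F: -v u="$1" '$1 == u { print "lxc.id_map = g 0 " $2 " " $3; exit }' "$3"
}

# Sample data standing in for /etc/subuid and /etc/subgid.
tmp=$(mktemp -d)
echo 'alice:100000:65536' > "$tmp/subuid"
echo 'alice:100000:65536' > "$tmp/subgid"
idmap_lines alice "$tmp/subuid" "$tmp/subgid"
rm -rf "$tmp"
```

With such a mapping, root inside the container corresponds to the unprivileged ID 100000 on the host.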

  26. LXC 1.0.0 (2)
    · Various backing stores can be used for a container's rootfs
      - Directory (default)
      - btrfs
      - zfs
      - lvm
      - loop device
      - aufs
      - overlayfs
    · Documentation updates
      - Expanded man pages
      - API documentation (liblxc)
      - Japanese man pages added (!)

  27. LXC 1.0.0 (3)
    · More templates: the major distributions now seem to be covered
      - CentOS added!!
    lxc-alpine lxc-cirros lxc-openmandriva lxc-ubuntu
    lxc-altlinux lxc-debian lxc-opensuse lxc-ubuntu-cloud
    lxc-archlinux lxc-download lxc-oracle
    lxc-busybox lxc-fedora lxc-plamo
    lxc-centos lxc-gentoo lxc-sshd
    · Download template
      - Creating an unprivileged container with lxc-create runs into various obstacles, so the rootfs is downloaded instead
      - rootfs images of the major distributions are built daily
        - centos, debian, fedora, gentoo, oracle, plamo, ubuntu
    · The current stable release is 1.0.3 (releases are fairly frequent)
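
For the download template, an invocation might look like the one printed by this sketch. It only prints the command and does not run it; the options after `--` are, as far as I know, the download template's distribution, release and architecture selectors, and the container name and the ubuntu/trusty/amd64 values are placeholders.

```shell
# download_cmd NAME DIST RELEASE ARCH: print an lxc-create command line that
# uses the download template. (Hypothetical helper, prints only.)
download_cmd() {
  echo "lxc-create -t download -n $1 -- -d $2 -r $3 -a $4"
}

download_cmd c1 ubuntu trusty amd64
```

Running the printed command as a user set up for unprivileged containers would fetch a prebuilt rootfs image rather than bootstrapping the distribution locally.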

  28. In closing

  29. Summary
    Main elements of Linux containers (there are others as well)
    · Namespace
    · Cgroups
    · Chroot (pivot_root)

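  30. Mailing list / translation
    · lxc JP group
      - A relaxed place to talk about containers. Mail arrives only once in a while. Despite the name lxc-jp, any topic is welcome, not just LXC.
    · Translation of the lxc man pages
      - Contributors wanted! (especially reviewers!!)
    · Translation of linuxcontainers.org
      - Contributors wanted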

  31. Information exchange meetup on container-based virtualization
    · I would like to hold one in Tokyo this year
    · Looking for speakers

  32. Thank you for listening


  33. Contact
    twitter @ten_forward
    www www.ten-forward.ws/
    github github.com/tenforward