Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Linux コンテナの基礎 / OSC 2017 Nagoya

Linux コンテナの基礎 / OSC 2017 Nagoya

OSC 2017 Nagoya の発表資料です。
参考となる情報にはPDF中からリンクをしていますが、資料中のリンクは Speaker Deck 上ではクリックできないので PDF をダウンロードしてご覧ください。

Avatar for tenforward

tenforward

May 26, 2017
Tweet

More Decks by tenforward

Other Decks in Technology

Transcript

  1. ࣗݾ঺հ Ճ౻ହจ • http://www.ten-forward.ws/ • @ten forward • http://gplus.to/tenforward •

    https://github.com/tenforward • http://d.hatena.ne.jp/defiant/ (ٕज़ϒϩά) 2
  2. ࣗݾ঺հ • LXC/LXD ͷ։ൃʹগ͠ࢀՃ • man page ͷ೔ຊޠ༁ • ެࣜϖʔδ

    (linuxcontainers.org) ຋༁ • όάϑΟοΫεͳͲগ͚ͩ͠ίʔυʹ΋ߩݙ • LXD ೔ຊޠϝοηʔδ 4
  3. ίϯςφͱ͸ • Χʔωϧ͔ΒݟΔͱී௨ʹϓϩηε͕ىಈ͢Δ͚ͩ • ىಈ͢Δࡍʹִ཭Λࢦࣔ͢Δ • ΧʔωϧͷػೳͰ (ෳ਺ͷ) ಠཱۭͨؒ͠Λ࡞Γग़͠ɼϦιʔ εΛ෼ׂɾ෼഑͢Δ

    • ϓϩηεΛάϧʔϓԽͯ͠ଞͱϦιʔεۭؒΛִ཭ • άϧʔϓԽͨ͠ϓϩηεʹର͢ΔϦιʔε੍ݶ • Ծ૝Խͱ͍͏ΑΓʮִ཭Խʯͱݴͬͨ΄͏͕Θ͔Γ΍͍͢ ͔΋ • Ծ૝తͳίϯϐϡʔλɾγεςϜΛ࠶ݱ͢ΔԾ૝Ϛγϯʹର ͯ͠ɺԾ૝తͳ OS ؀ڥΛఏڙ͢Δ • ˠ OS ϨϕϧͷԾ૝Խ 11
  4. ίϯςφͷϝϦοτ • ߴີ౓Խ͕Մೳ • ىಈ͍ͯ͠Δ OS (Χʔωϧ) ͸Ұͭ • Φʔόʔϔου͕খ͍͞

    • ϋʔυ΢ΣΞͷԾ૝Խ͕ෆཁ • ىಈ͕ૣ͍ • Ծ૝ϚγϯͷىಈͰ͸ͳ͘ɼϗετ OS ͔ΒݟͨΒ୯ʹϓϩ ηε͕ىಈ͍ͯ͠Δ͚ͩͳͷͰɼී௨ͷϓϩάϥϜ͕ىಈ͢ Δͷͱ΄ͱΜͲมΘΒͳ͍ • ඞͣ͠΋γεςϜΛಈ͔͢ඞཁ͸ͳ͍ (ΞϓϦέʔγϣϯί ϯςφ) • ྫ͑͹ίϯςφ಺Ͱ͸ httpd ͷΈ͕ಈ͍͍ͯΔ • ίϯςφʹϝϞϦΛݻఆతʹׂΓ౰ͯΔඞཁ͕ͳ͍ 12
  5. ίϯςφͷσϝϦοτ • ҟͳΔ OS ͷγεςϜ / ϓϩάϥϜ͸ಈ͔ͤͳ͍ • ୯ʹϗετ OS

    ্Ͱϓϩηε͕ىಈ͢Δ͚ͩͳͷͰ౰ͨΓલ • ΧʔωϧʹؔΘΔૢ࡞͸Ͱ͖ͳ͍ • ىಈ͍ͯ͠ΔΧʔωϧ͸มΘΒͳ͍ͷͰ • ίϯςφຖʹϩʔυ͢ΔϞδϡʔϧΛม͑ΔͳͲ • Χʔωϧͷ࣮૷͸ෳࡶʹͳΔ • શͯΧʔωϧͷػೳͱ࣮ͯ͠૷͞Ε͍ͯΔͷͰ 13
  6. Linux ͰίϯςφΛ࣮ݱ͢ΔͨΊͷػೳ Linux Χʔωϧʹؚ·ΕΔ৭ʑͳػೳΛ૊Έ߹Θͤͯίϯςφ؀ ڥΛ࡞੒͢ΔɻͦΕͧΕͷػೳ͸ίϯςφઐ༻ͷػೳͱ͍͏Θ͚ Ͱ͸ͳ͍ɻ • ϓϩηεΛάϧʔϓԽͯ͠ଞͷάϧʔϓͱִ཭ • OS

    Ϧιʔεͷִ཭ • ˠ Namespace (໊લۭؒ) • άϧʔϓԽͨ͠ϓϩηεʹର͢ΔϦιʔε੍ݶ • ϗετͷ෺ཧϦιʔεʹର͢Δ੍ݶ • ˠ cgroup (control group) 18
  7. Linux ͰίϯςφΛ࣮ݱ͢ΔͨΊͷػೳ • ͦͷଞ • ωοτϫʔΫ (veth, macvlan ͳͲ) •

    έʔύϏϦςΟ • chroot (pivot root) • bind mount • Checkpoint/Restore (CRIU) • ͳͲͳͲ 19
  8. Linux ͷίϯςφ࣮૷ྫ • Docker ΞϓϦέʔγϣϯίϯςφͷ࣮ߦʹಛԽɻίϯςφؔ࿈ͷॲཧ͸ runC ϓ ϩδΣΫτ಺ͷ libcontainer Λ࢖༻ɻ͍·΍ʮDockerʯͱ͍͏ݴ༿͕ࢦ

    ͢΋ͷ͸ίϯςφͰ͸͋Γ·ͤΜɻΞϓϦέʔγϣϯΛ؆୯ʹ։ൃͨ͠Γ ߏஙͨ͠Γ͢ΔͨΊͷϓϥοτϑΥʔϜɺΠϯϑϥɻ • runC (libcontainer) Docker ʹΑΔ Open Container Project ४ڌͷ࣮૷ • LXC/LXD Ubuntu Λத৺ʹ։ൃɻओʹγεςϜίϯςφΛ࣮ߦ͢Δ͜ͱΛલఏʹ࡞ ΒΕ͍ͯΔ͕ɺΞϓϦέʔγϣϯίϯςφͷ࣮ߦ΋Մೳɻඇಛݖίϯςφ ͕࣮ߦͰ͖Δɻ 20
  9. Linux ͷίϯςφ࣮૷ྫ • OpenVZ Linux ͷίϯςφ࣮૷ͱͯ͠͸ݹ͔͘Β͋Δ࣮૷ͷͻͱͭɻ2000 ೥͝Ζ ͔ΒɻΧʔωϧʹύονΛద༻͢ΔɻΧʔωϧʹ࣮૷͞Ε͍ͯΔίϯςφ ؔ࿈ػೳ͸ OpenVZ

    ༝དྷͷػೳ͕ଟ਺͋ΔɻOpenVZ Λϕʔεʹͨ͠঎ ༻൛ Virtuozzo ͕ଘࡏ͢Δɻ • rkt CoreOS ͕ࣾ։ൃ͢ΔΞϓϦέʔγϣϯίϯςφͷϥϯλΠϜɻ • systemd ͝ଘ஌ Linux ޲͚ͷ࠷ۙओྲྀͱͳͬͨ init ࣮૷ͷͻͱͭɻίϯςφΛѻ͏ ίϚϯυ΍࢓૊Έ΋಺แ͍ͯ͠Δ 21
  10. Linux ͷίϯςφ࣮૷ྫ • MINCS γΣϧεΫϦϓτͰॻ͔Εͨίϯςφ࣮૷ • bocker “Docker implemented in

    around 100 lines of bash” • haconiwa ίϯςφ࣮૷ɾ࡞੒ͷͨΊͷ (m)Ruby DSL • aqr perl Ͱॻ͔Εͨίϯςφ࣮૷ • Awesome Container ͦͷଞ৭ʑ·ͱ·ͬͯ·͢ 22
  11. Mount Namespace (2.4.19ʙ) • ϓϩηε͔Βݟ͍͑ͯΔϚ΢ϯτͷू߹ɼૢ࡞Λ෼཭͢Δɽ Namespace ಺ͷ mount, umount ͕ଞͷ

    Namespace ʹӨ ڹΛ༩͑ͳ͍Α͏ʹͰ͖Δ (༩͑ΔΑ͏ʹ΋Ͱ͖Δ) ˠ private/shared/slave • ࢀߟ: • Ϛ΢ϯτ໊લۭؒΛద༻͢Δ (IBM developerWorks) • Mount Namespace and shared subtrees (lwn.net) • Mount namespaces, mount propagation, and unbindable mounts (lwn.net) • Χʔωϧෟଐจॻ (Documentation/filesystems/sharedsubtree.txt) • σϑΥϧτ͸ private ͕ͩɺsystemd ͸/Λ shared ͰϚ΢ϯ τ͢Δ 26
  12. UTS Namespace (2.6.19ʙ) • ϗετ໊ͳͲɼuname(2) ͕ฦ͢஋ͷू߹Λ෼཭ɽ setdomainname(2), sethostname(2) Ͱ Namespace

    ಺ͷ ஋ͷΈมߋͰ͖Δ   user$ hostname enterprise --- (͜͜·Ͱϗετͷ Namespace) --- user$ sudo unshare --uts (৽͍͠ Namespace ࡞੒) root# hostname enterprise (ॳظ஋͸ϗετͱಉ͡) root# hostname utsns (ϗετ໊มߋ) root# hostname utsns root# exit logout --- (͔͜͜Βϗετͷ Namespace) --- user$ hostname enterprise   27
  13. PID Namespace (2.6.24ʙ) • PID ۭؒͷ෼཭ɽ৽͍͠ PID Namespace Ͱ͸ PID

    1 ͔Β ࢝·Δ PID ׂ͕Γ౰ͯΒΕΔɽ਌͔Βࢠͷ PID Namespace ͸ݟ͑Δ (਌ͷۭؒͷ PID Λ࣋ͭ) ͕ɼࢠ͔Β਌͸ݟ͑ͳ͍ 28
  14. IPC Namespace (2.6.19ʙ) • SysV IPC ΦϒδΣΫτɼPOSIX ϝοηʔδΩϡʔͷִ཭  

    # ipcs -q (ϗετͷ Namespace ্ͰϝοηʔδΩϡʔͷ֬ೝ) ------ Message Queues -------- key msqid owner perms used-bytes messages 0x4b79e805 32768 root 644 0 0 # unshare --ipc (৽ͨʹ IPC Namespace ࡞੒) # ipcs -q (৽ͨʹ࡞ͬͨ Namespace ͰΩϡʔΛ֬ೝ͢Δͱଘࡏ͠ͳ͍) ------ Message Queues -------- key msqid owner perms used-bytes messages   29
  15. User Namespace (3.8ʙ) • ಠཱͨ͠ UID/GID ۭؒͱ֎෦ۭؒͷϚοϐϯά (ྫ͑͹ɼ ִ཭ۭؒͰ͸ uid/gid

    0/0ɼ֎෦Ͱ͸ 1000/1000 ͱ͔Մೳ ʹͳΔ) • User Namespace ͸ҰൠϢʔβͰ࡞੒Ͱ͖ɺNamespace ಺ ͷಛݖϢʔβ͸ଞͷ Namespace Λ࡞੒Ͱ͖Δ (User Namespace Ҏ֎ͷ Namespace ͸ಛݖ͕ඞཁ) 30
  16. cgroup Namespace (4.6ʙ) • cgroup ͷִ཭ • /proc/$PID/cgroup ϑΝΠϧ಺ͷ cgroup

    ύε • namespace ಺ͰϚ΢ϯτͨ͠ cgroupfs πϦʔ • (͜ͷ Namespace Ͱ clone(2) ʹ༩͑Δϑϥά (32bit ੔਺) Λ࢖͍͖Γ·ͨ͠ :-) • Ubuntu 16.04 ͷ 4.4 Χʔωϧʹ͸όοΫϙʔτࡁ 32
  17. cgroup ͱ͸ ϓϩηεΛάϧʔϓԽ͠ɺάϧʔϓʹରͯ͠Ϧιʔε੍ݶΛߦ ͏ɻίϯςφઐ༻ͷ࢓૊ΈͰ͸ͳ͍ɻ • cgroup ͷಛ௃ • ػೳ͝ͱʹαϒγεςϜʹ෼͔ΕΔ •

    cgroupfs ΛϚ΢ϯτͯ͠σΟϨΫτϦͰάϧʔϓΛද͢ • ϓϩηεΛάϧʔϓ಺ͷ tasks ϑΝΠϧʹ௥Ճ͢Δͱؔ࿈͢ ΔλεΫ͕εϨου୯ҐͰάϧʔϓʹ௥Ճ͞ΕΔ • ෳ਺֊૚ߏ଄ɻվ଄ߏ଄͝ͱʹҟͳΔπϦʔΛ࡞੒Ͱ͖Δɻ ͨͩ͠ɺҰͭͷαϒγεςϜ͕ॴଐͰ͖ΔπϦʔ͸Ұͭ • πϦʔͷͲͷϨϕϧͷάϧʔϓʹ΋λεΫ͕ॴଐͰ͖Δ 37
  18. cgroup ͷαϒγεςϜ • cpu: 2.6.24 • CFS(Completely Fair Scheduler) bandwidth

    controlɽ୯Ґ ࣌ؒ಺ͷάϧʔϓ಺ͷλεΫ͕࣮ߦͰ͖Δ߹ܭ࣌ؒΛ੍ݶ͢ Δ (3.2 Ͱ࣮૷) • ૬ର഑෼ɽάϧʔϓؒͷ CPU ࣌ؒͷׂ౰ͷׂ߹Λࢦఆ͢Δɽ ྫ͑͹ GroupA=100,GroupB=50 ͱ͢Δͱ A:B=2:1 • cpuacct: 2.6.24 • άϧʔϓ಺ͷ CPU ϦιʔεͷϨϙʔτ (CPU ࣌ؒ) • cpuset: 2.6.24 • ׂΓ౰ͯΔ CPU, ϝϞϦϊʔυͷׂ౰ 39
  19. cgroup ͷαϒγεςϜ • device: 2.6.26 • σόΠε΁ͷΞΫηεڐՄɼ੍ݶͷࢦఆ • freezer: 2.6.28

    • άϧʔϓ಺ͷϓϩηεΛશͯҰ࣌ఀࢭ͢Δ • memory: 2.6.29 • ϝϞϦϦιʔεͷ੍ݶ (ϢʔβϝϞϦɼΧʔωϧϝϞϦ) • blkio (Block IO): • I/O weight controller(2.6.33 Ҏ߱) άϧʔϓͷ༏ઌ౓Λࢦ ఆ͢Δ • I/O throttling(2.6.37 Ҏ߱) άϧʔϓ಺ͷϓϩηεͷσόΠ εʹର͢Δૢ࡞਺ͷ߹ܭͷࢦఆ • (ࢀߟ)Linux2.6.37 ͷ৽ػೳ “I/O throttling” 40
  20. cgroup ͷαϒγεςϜ • hugetlb: 3.6 • cgroup ͔Βͷ hugetlb ͷ࢖༻

    • perf event: 2.6.39 • άϧʔϓ୯ҐͰ perf πʔϧͰϞχλϦϯά (ύϑΥʔϚϯε ղੳ) • net cls: 2.6.29 • ύέοτʹࣝผࢠΛ͚ͭɼτϥϑΟοΫίϯτϩʔϧ (tc) ͱ netfilter(3.14 Ҏ߱) ͰίϯτϩʔϧՄೳʹ • Linux 3.14 Ͱ net cls cgroup ʹ௥Ճ͞Εͨ netfilter ରԠ • net prio: 3.3 • άϧʔϓؒͰͷωοτϫʔΫͷ༏ઌ౓ΛΠϯλʔϑΣʔεຖ ʹࢦఆ͢Δ • Linux 3.3 ͷ৽ػೳ Network priority cgroup • Linux 3.3 ͷ৽ػೳ Network priority cgroup (2) 41
  21. cgroup ͷαϒγεςϜ • pids: 4.3 • fork() ΍ clone() ͰىಈͰ͖Δϓϩηε਺Λ੍ݶ͢Δ

    • LXC ͰֶͿίϯςφೖ໳ ୈ 30 ճ Linux Χʔωϧͷίϯς φػೳ [8] ʔ cgroup ͷ pids αϒγεςϜ • rdma: 4.11 • Remote Direct Memory Access 42
  22. cgroup ͷ࢖͍ํ cgroup ͸ίϯςφͱؔ܎ͳ͘࢖༻Մೳ   # mount -t tmpfs

    cgroup_root /sys/fs/cgroup # mkdir /sys/fs/cgroup/memory # mount -t cgroup -o memory cgroup /sys/fs/cgroup/memory (ϝϞϦαϒ γεςϜͷϚ΢ϯτ) # mkdir /sys/fs/cgroup/memory/test01 ("test01" ͱ͍͏άϧʔϓͷ࡞੒) # echo $$ > /sys/fs/cgroup/memory/test01/tasks (ϓϩηεΛάϧʔϓʹొ ࿥) # cat /sys/fs/cgroup/memory/test01/tasks (άϧʔϓ಺ͷϓϩηεͷ֬ೝ) 2824 2837 # echo 30M > /sys/fs/cgroup/memory/test01/memory.limit_in_bytes (άϧʔϓʹରͯ͠ϝϞϦ্ݶ 30M ͱ͍͏੍ݶΛઃఆ) # cat /sys/fs/cgroup/memory/test01/memory.limit_in_bytes (੍ݶ஋ͷ֬ ೝ) 31457280 # cat /sys/fs/cgroup/memory/test01/memory.usage_in_bytes (ݱࡏͷ࢖༻ ྔͷ֬ೝ) 565248   43
  23. ίϯςφͰ࢖͏ωοτϫʔΫػೳ ʙ ipvlan • macvlan ͱಉ༷ͷػೳɻMAC ΞυϨε͸શΠϯλʔϑΣʔ εͰڞ௨͕ͩɺผͷΞυϨεΛׂΓৼΔ (ׂΓ౰ͯΔ෺ཧΠ ϯλʔϑΣʔεͷ

    MAC ΞυϨεΛڞ༗͢Δ) • 2 ͭͷϞʔυ͕ଘࡏ L2 Ϟʔυ શΠϯλʔϑΣʔε͕ɺ਌ΠϯλʔϑΣʔεͱ ಉ͡ωοτϫʔΫʹଐ͢Δ L3 Ϟʔυ ΠϯλʔϑΣʔε͝ͱʹҧ͏ωοτϫʔΫͱ ͳΔɻ • ར༻γʔϯ • ਌ΠϯλʔϑΣʔε͕ແઢ LAN (ผͷ MAC ΞυϨεΛׂΓ ౰ͯΒΕͳ͍) • ਌ΠϯλʔϑΣʔεʹׂΓ౰ͯΒΕΔ MAC ΞυϨεͷ਺͕ ੍ݶ͞Ε͍ͯΔɻ΋͘͠͸ύϑΥʔϚϯεͷཧ༝͔Βଟ਺Λ ׂΓ౰ͯΒΕͳ͍ • ಛผͳωοτϫʔΫγφϦΦ 48
  24. σϞ 1. Network Namespace Λ࡞੒ 2. Network Namespace Λ֬ೝ 3.

    ࡞੒௚ޙͷ Network Namespace Λ֬ೝ 3.1 ΠϯλʔϑΣʔε 3.2 ϧʔςΟϯά 3.3 ϑΟϧλϦϯά 4. veth ϖΞͷ࡞੒ (veth0-host / veth0-ns) 5. ࡞੒௚ޙͷ veth ϖΞͷ֬ೝ 6. veth0-ns Λ Namespace netns01 ʹׂΓ͋ͯΔ 7. ϗετଆͷ veth ΠϯλʔϑΣʔεͷ֬ೝ 8. Namespace netns01 ಺ͷΠϯλʔϑΣʔεͷ֬ೝ 9. ϗετଆͷ veth0-host ʹΞυϨεΛׂΓ͋ͯΔ 10. Namespace netns01 ಺ͷ veth0-ns ʹΞυϨεΛׂΓ͋ͯΔ 11. ϗετଆͷ veth0-host Λ࡟আ 12. Namespace netns01 Λ࡟আ 50
  25. σϞ   NETNS="netns01" VETH="veth0" ip a ip netns add

    $NETNS ip netns list ip netns exec $NETNS ip link show ip netns exec $NETNS ip route ip netns exec $NETNS iptables -L -n -v ip link add name $VETH-host type veth peer name $VETH-ns ip link show | grep $VETH # on host ip link set $VETH-ns netns $NETNS ip link show | grep $VETH # on host ip netns exec $NETNS ip link show # in netns ip addr add 10.10.10.10/24 dev $VETH-host ip link set $VETH-host up ip addr show | grep veth ip netns exec $NETNS ip addr add 10.10.10.11/24 dev $VETH-ns ip netns exec $NETNS ip link set $VETH-ns up ip netns exec $NETNS ip addr show | grep veth ping -c 1 10.10.10.11 ip netns exec $NETNS ping -c 1 10.10.10.10 ip link delete $VETH-host ip netns delete $NETNS   51
  26. ·ͱΊ • ίϯςφͷ֓ཁ • Linux ʹ͓͚Δίϯςφͷ࢓૊Έ • ίϯςφ͸Χʔωϧʹ࣮૷͞Ε͍ͯΔ৭ʑͳػೳͷ૊Έ߹Θ ͤͰ࣮ݱ͞Ε͍ͯΔ •

    Namespace • OS Ϧιʔεͷִ཭ • cgroup • ϗετͷ෺ཧϦιʔεͷ੍ݶ • ωοτϫʔΫؔ࿈ػೳ • veth • macvlan • ίϯςφͰ࢖͑Δ໘ന͍ػೳ 53
  27. ίϯςφܕԾ૝Խͷ৘ใަ׵ձ • https://sites.google.com/site/containerstudy/ • http://ct-study.connpass.com/ • ίϯςφٕज़ʹؔ࿈͢Δ࿩୊Λѻ͏ • ίϯςφʹؔ࿈͢ΔΧʔωϧͷ࣮૷ʹ͍ͭͯ •

    ֤छπʔϧΩοτͷ঺հɼ࣮૷ʹ͍ͭͯ • ίϯςφٕज़Λ࢖ͬͨπʔϧ΍ιϑτ΢ΣΞͷ঺հ΍࣮૷ʹ ͍ͭͯ • ίϯςφٕज़ͷ׆༻ɾӡ༻ࣄྫ • ͦͷଞʮίϯςφʯͱ͍͏Ωʔϫʔυ͕গ͠Ͱ΋ೖ͍ͬͯΔ ٕज़ʹ͍ͭͯ • ͜Ε·Ͱେࡕͱ౦ژͰަޓʹ 8 ճ࣮ࢪɻ࣍ճ͸෱Ԭͷ༧ఆɻ 55