Slide 1

Slide 1 text

Linux Kernel Container Features: Introduction
KATOH Yasufumi (加藤泰文)
Docker Meetup Tokyo #2, 2014-04-11

Slide 2

Slide 2 text

KATOH Yasufumi (加藤泰文)
· http://www.ten-forward.ws/
· @ten_forward
· http://gplus.to/tenforward
· https://github.com/tenforward

Slide 3

Slide 3 text

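Work
· Firstserver, Inc. (ファーストサーバ株式会社), Infrastructure Development Department
  - Service development
  - Research into various technologies
  - In fact I hardly use containers at work
    - Previously, a little Virtuozzo
    - Now, Docker, only in very simple ways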

Slide 4

Slide 4 text

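Activities outside work
· Research into container-related technologies
  - It began around 2010, when I started looking into cgroup to see whether we could use it in our services
  - Translating the lxc man pages
· Plamo Linux maintainer
· Some past activity in the Japan Asterisk Users Group, which grew out of developing an IP telephony service
· Jetspeed-2 documentation translation
· 【改訂新版】Linuxエンジニア養成読本 ("Linux Engineer Training Reader, Revised New Edition", Gijutsu-Hyohron)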

Slide 5

Slide 5 text

I write a blog
· http://d.hatena.ne.jp/defiant/

Slide 6

Slide 6 text

I run a study group
The next one is tomorrow!! I would like to hold one in Tokyo again, so thank you in advance for your support.

Slide 7

Slide 7 text

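Today's topic
· A Docker novice talks a little about the Linux kernel container features that Docker uses
  - Container basics
  - How containers work on Linux
  - About LXC
· This may differ slightly from "containers" as seen from Docker's point of view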

Slide 8

Slide 8 text

Container basics

Slide 9

Slide 9 text

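What is a container?
· OS-level virtualization
· A feature provided by the kernel
· Kernel features create (multiple) independent spaces and partition and allocate resources among them
  - Processes are grouped, and each group's resource space is isolated from other groups
  - Resource limits are applied to the grouped processes
· "Isolation" rather than "virtualization"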

Slide 10

Slide 10 text

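Advantages of containers
· High density is possible
  - Only one OS (kernel) is running
· Low overhead
  - No hardware virtualization is needed
· Fast startup
  - No virtual machine boots. From the host OS, a process simply starts, so it is almost the same as launching an ordinary program
· You do not necessarily have to run a whole system (application containers)
  - For example, only httpd runs inside the container
· Runs without problems on top of a virtual machine too!
  - These days KVM can run on top of KVM, so this is not unique to containers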

Slide 11

Slide 11 text

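Disadvantages of containers
· Systems and programs built for a different OS cannot run
  - Unsurprising, since processes simply start on the host OS
· Operations that involve the kernel are not possible
  - The running kernel does not change
  - For example, you cannot load different modules per container
· The kernel implementation becomes complex
  - Because everything is implemented as kernel features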

Slide 12

Slide 12 text

Container implementations on Linux
Kernel features (+ patches) + userspace tools that use those kernel features
· Kernel + patches + userspace tools
  - OpenVZ / Virtuozzo (commercial)
  - Linux VServer
· Kernel + userspace tools (more than last time! :-)
  - LXC
  - libvirt (lxc driver)
  - systemd (systemd-nspawn)
  - vzctl for upstream kernel
  - lmctfy
  - docker (libcontainer), 0.9 and later

Slide 13

Slide 13 text

How containers work on Linux

Slide 14

Slide 14 text

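Linux kernel versions and containers
· 3.0
  - setns() system call implemented (glibc supports it from 2.14)
· 3.8
  - The version in which the main container features were all in place
· 3.9
  - The version in which the features completed in 3.8 became practical (= implemented for filesystems other than XFS)
· 3.12
  - Implementation for XFS completed
  - However, 3.12.8 and earlier have some problems when used with LXC
· However, cgroup is in the middle of being rebuilt right now!
  - "All About the Linux Kernel: Cgroup's Redesign" (linux.com)
  - Functionally, what exists today is perfectly usable, but it is expected to change considerably, so take care.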
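As a rough illustration of the version list above, the following Python sketch compares a kernel release string against those milestones. The milestone table is copied from the slide. The function names are invented for this example, and distribution kernels often backport features, so a plain version comparison is only a first approximation.

```python
import platform
import re

# Milestones as listed on the slide: (minimum kernel version, what arrived).
MILESTONES = [
    ((3, 0), "setns() system call"),
    ((3, 8), "main container features in place"),
    ((3, 9), "3.8 features usable on filesystems other than XFS"),
    ((3, 12), "XFS support completed"),
]


def parse_kernel_version(release):
    """Turn a release string such as '3.13.0-24-generic' into (3, 13)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def milestones_reached(release):
    """Return the milestone descriptions at or below the given release."""
    version = parse_kernel_version(release)
    return [text for minimum, text in MILESTONES if version >= minimum]


def current_milestones():
    """Same check for the kernel this interpreter is running on."""
    return milestones_reached(platform.release())
```

Calling `current_milestones()` on a Linux host lists which of the slide's milestones the running kernel version has passed.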

Slide 15

Slide 15 text

Features used to implement containers on Linux
· Grouping processes and isolating them from other groups
  - → Namespaces
· Resource limits on the grouped processes
  - → Cgroups (control groups)
· chroot (pivot_root)
· Others
  - Networking (veth, macvlan)
  - Capabilities
  - Checkpoint/Restore (CRIU)
  - and so on...

Slide 16

Slide 16 text

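Types of namespaces (1)
· Mount Namespace: 2.4.19
  - Isolates the set of mounts visible to a process and the operations on them. mount and umount inside a namespace do not affect other namespaces
  - (Reference) "Applying mount namespaces" (IBM developerWorks)
· UTS Namespace: 2.6.19
  - Isolates the set of values returned by uname(2), such as the hostname. setdomainname(2) and sethostname(2) change only the values inside the namespace
· PID Namespace: 2.6.24
  - Isolates the PID space. In a new PID namespace, PIDs are assigned starting from PID 1. The parent can see into the child PID namespace (its processes also have PIDs in the parent's space), but the child cannot see the parent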

Slide 17

Slide 17 text

Types of namespaces (2)
· IPC Namespace: 2.6.19
  - Isolates SysV IPC objects and POSIX message queues
· User Namespace: 2.6.23 ~ 3.8
  - An independent UID/GID space, mapped to the space outside (for example, uid/gid 0/0 inside the isolated space can be 1000/1000 outside)
· Network Namespace: 2.6.26
  - Isolates network resources: network devices, addresses, routing tables, sockets, filtering
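The UID/GID mapping of a user namespace is exposed in /proc/<pid>/uid_map and /proc/<pid>/gid_map. Each line holds three numbers: the first ID inside the namespace, the first ID outside, and the length of the range. Below is a minimal Python sketch of the lookup, using the slide's 0 to 1000 example as sample data. The helper names are hypothetical.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(field) for field in fields)
        entries.append((inside, outside, length))
    return entries


def to_outside(entries, inside_id):
    """Map an ID seen inside the namespace to the ID seen outside it."""
    for inside, outside, length in entries:
        if inside <= inside_id < inside + length:
            return outside + (inside_id - inside)
    return None  # the ID has no mapping


def to_inside(entries, outside_id):
    """Map an ID seen outside the namespace to the ID seen inside it."""
    for inside, outside, length in entries:
        if outside <= outside_id < outside + length:
            return inside + (outside_id - outside)
    return None  # the ID has no mapping


# The slide's example: root inside the container is uid 1000 outside.
SLIDE_EXAMPLE = parse_id_map("0 1000 1\n")
```

With this sample map, `to_outside(SLIDE_EXAMPLE, 0)` yields 1000, and any other inside ID is unmapped.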

Slide 18

Slide 18 text

Operating on namespaces
· clone(2) creates a new process
· unshare(2) controls the execution context without creating a new process
  - Example use of unshare
· setns(2) associates a process with an existing namespace
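The unshare example from the slide is not reproduced in this text. As a stand-in, here is a minimal Python sketch, assuming Linux with glibc, that calls unshare(2) through ctypes and reads a process's namespace identifiers from /proc/<pid>/ns. The flag values come from <linux/sched.h>. The helper names are hypothetical, and the call normally needs root (CAP_SYS_ADMIN) unless a user namespace is created as well.

```python
import ctypes
import os

# Flag values from <linux/sched.h>, keyed by the names used in /proc/<pid>/ns.
CLONE_FLAGS = {
    "mnt": 0x00020000,   # CLONE_NEWNS
    "uts": 0x04000000,   # CLONE_NEWUTS
    "ipc": 0x08000000,   # CLONE_NEWIPC
    "user": 0x10000000,  # CLONE_NEWUSER
    "pid": 0x20000000,   # CLONE_NEWPID
    "net": 0x40000000,   # CLONE_NEWNET
}


def build_flags(kinds):
    """Combine namespace kinds such as ['uts', 'ipc'] into one flag mask."""
    flags = 0
    for kind in kinds:
        flags |= CLONE_FLAGS[kind]
    return flags


def namespace_ids(pid="self", proc_root="/proc"):
    """Read the namespace links of a process, e.g. {'uts': 'uts:[4026531838]'}."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}


def unshare(kinds):
    """Move the calling process into new namespaces of the given kinds."""
    libc = ctypes.CDLL(None, use_errno=True)  # the C library of this process
    if libc.unshare(build_flags(kinds)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Run as root, `unshare(["uts"])` followed by `namespace_ids()` should show a different `uts` entry than before the call, and a hostname change made afterwards stays inside the new namespace. Python 3.12 and later also provide `os.unshare`, which could replace the ctypes call.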

Slide 19

Slide 19 text

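Cgroup (1)
Groups processes and applies resource limits to each group. It is not a container-specific mechanism.
· cpu
  - CFS (Completely Fair Scheduler) bandwidth control: limits the total time that the tasks in a group may run per unit of time (implemented in 3.2)
    - (Reference) "CFS bandwidth control in Linux 3.2"
  - Relative shares: sets the proportion of CPU time allocated among groups. For example, GroupA=100 and GroupB=50 gives A:B = 2:1
· cpuacct
  - Reports the CPU resources used by a group (CPU time)
· cpuset
  - Assigns CPUs and memory nodes to a group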
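The arithmetic behind the two cpu controls can be sketched in Python. This assumes the cgroup v1 interface, where relative shares are written to `cpu.shares` and bandwidth control uses `cpu.cfs_quota_us` and `cpu.cfs_period_us`. The function names are invented for this example.

```python
from fractions import Fraction


def relative_shares(shares):
    """Fraction of CPU time each group gets when all groups are busy.

    `shares` maps a group name to its cpu.shares value.
    """
    total = sum(shares.values())
    return {name: Fraction(value, total) for name, value in shares.items()}


def bandwidth_limit(quota_us, period_us):
    """Upper bound set by CFS bandwidth control, in units of one CPU.

    quota_us is the run time allowed per period (cpu.cfs_quota_us) and
    period_us is the length of that period (cpu.cfs_period_us).
    """
    return Fraction(quota_us, period_us)


# The slide's example: GroupA=100, GroupB=50 gives A:B = 2:1.
SLIDE_EXAMPLE = relative_shares({"GroupA": 100, "GroupB": 50})
```

Relative shares only matter while groups compete for the CPU, so an otherwise idle machine lets one group use more than its fraction. The bandwidth limit is a hard cap: a quota of 50000 in a period of 100000 allows at most half of one CPU.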

Slide 20

Slide 20 text

Cgroup (2)
· devices
  - Grants or restricts access to devices
· freezer
  - Suspends all processes in a group
· memory
  - Limits memory resources (user memory, kernel memory)
· blkio (Block IO)
  - I/O weight controller (2.6.33 and later): sets the priority of a group
  - I/O throttling (2.6.37 and later): sets the total bytes/second that the processes in a group may issue to a device
    - (Reference) New in Linux 2.6.37: "I/O throttling"

Slide 21

Slide 21 text

Cgroup (3)
· hugetlb
  - Limits hugetlb usage (3.6 and later)
  - mm/hugetlb: add new HugeTLB cgroup
· perf_event
  - Per-group monitoring with the perf tool (performance analysis)
· net_cls
  - Tags packets with an identifier so that they can be controlled with traffic control (tc) and netfilter (3.14 and later)
· net_prio
  - Sets the network priority among groups per interface
  - New in Linux 3.3: Network priority cgroup
  - New in Linux 3.3: Network priority cgroup (2)

Slide 22

Slide 22 text

Cgroup (4)
· Cgroups can be used independently of containers
· Implemented as a pseudo filesystem called cgroupfs

# mount -t tmpfs cgroup_root /sys/fs/cgroup
# mkdir /sys/fs/cgroup/memory
# mount -t cgroup -o memory cgroup /sys/fs/cgroup/memory
(mount the memory subsystem)
# mkdir /sys/fs/cgroup/memory/test01
(create a group named "test01")
# echo $$ > /sys/fs/cgroup/memory/test01/tasks
(register the process in the group)
# cat /sys/fs/cgroup/memory/test01/tasks
(check the processes in the group)
2824
2837
# echo 30M > /sys/fs/cgroup/memory/test01/memory.limit_in_bytes
(set a memory limit of 30M on the group)
# cat /sys/fs/cgroup/memory/test01/memory.limit_in_bytes
(check the limit)
31457280
# cat /sys/fs/cgroup/memory/test01/memory.usage_in_bytes
(check the current usage)
565248
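Because cgroupfs is driven entirely through files, the same steps can be scripted. The sketch below mirrors the session above in Python, assuming the cgroup v1 file names shown on the slide (`tasks`, `memory.limit_in_bytes`). The helper names are hypothetical, and on a real system it needs root and a mounted memory subsystem.

```python
import os

# Binary suffixes: "30M" means 30 * 1024 * 1024 bytes, as in the session.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '30M' into a number of bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def create_group(subsystem_dir, name):
    """mkdir <subsystem_dir>/<name>; on cgroupfs this creates the group."""
    path = os.path.join(subsystem_dir, name)
    os.makedirs(path, exist_ok=True)
    return path


def write_value(group_dir, filename, value):
    """Equivalent of: echo value > group_dir/filename"""
    with open(os.path.join(group_dir, filename), "w") as f:
        f.write("%s\n" % value)


def read_values(group_dir, filename):
    """Equivalent of: cat group_dir/filename, returned as a list of ints."""
    with open(os.path.join(group_dir, filename)) as f:
        return [int(item) for item in f.read().split()]


def limit_memory(subsystem_dir, name, pid, limit):
    """Create a group, register one process in it, and set its memory limit."""
    group = create_group(subsystem_dir, name)
    write_value(group, "tasks", pid)
    write_value(group, "memory.limit_in_bytes", parse_size(limit))
    return group
```

On a host set up as in the session, `limit_memory("/sys/fs/cgroup/memory", "test01", os.getpid(), "30M")` would repeat the steps for the current process. Taking the directory as a parameter keeps the helpers usable against an ordinary directory for experiments.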

Slide 23

Slide 23 text

Latest LXC developments

Slide 24

Slide 24 text

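Docker and LXC
· Docker was originally software built on LXC and aufs
  - 0.7 made it possible to use backends other than aufs
    - To support CentOS?
  - 0.9 made it run without LXC (with a suitable driver, other container systems work too)
· LXC
  - Development started around 2008, led by Daniel Lezcano
  - In September 2013, Serge Hallyn and Stéphane Graber took over as maintainers
  - 1.0.0 was released in February 2014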

Slide 25

Slide 25 text

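LXC 1.0.0 (1)
· Released on February 20, 2014!!
· API cleanup: liblxc1 and command-line tools built on it
· Unprivileged containers: ordinary users can run containers (User Namespace)
· Bindings for various languages
  - lua (in tree)
  - python3 (in tree)
  - Go (out of tree)
  - ruby (out of tree)
· Clone and snapshot features
· Command-line tools tidied up (unneeded ones removed)
· Improved monitoring
· The 1.0 series is supported for 5 years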

Slide 26

Slide 26 text

LXC 1.0.0 (2)
· Various backingstores can be used for a container's rootfs
  - directory (default)
  - btrfs
  - zfs
  - lvm
  - loop device
  - aufs
  - overlayfs
· Documentation updates
  - Expanded man pages
  - API documentation (liblxc)
  - Japanese man pages added (!)

Slide 27

Slide 27 text

LXC 1.0.0 (3)
· More templates: the major distributions are now more or less all covered
  - CentOS added!!

  lxc-alpine     lxc-cirros    lxc-openmandriva  lxc-ubuntu
  lxc-altlinux   lxc-debian    lxc-opensuse      lxc-ubuntu-cloud
  lxc-archlinux  lxc-download  lxc-oracle
  lxc-busybox    lxc-fedora    lxc-plamo
  lxc-centos     lxc-gentoo    lxc-sshd

· Download template
  - Running lxc-create for an unprivileged container hits various obstacles, so the rootfs is downloaded instead
  - rootfs images of the major distributions are built daily
    - centos, debian, fedora, gentoo, oracle, plamo, ubuntu
· The current stable release is 1.0.3 (releases come fairly often)

Slide 28

Slide 28 text

In closing

Slide 29

Slide 29 text

Summary
Main building blocks of Linux containers (there are others as well)
· Namespaces
· Cgroups
· Chroot (pivot_root)

Slide 30

Slide 30 text

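Mailing list / translation
· lxc JP group
  - A relaxed place to talk about containers. Mail arrives only once in a while. It is named lxc-jp, but any topic is fine, not only LXC.
· lxc man pages translation
  - Contributors wanted! (especially reviewers!!)
· linuxcontainers.org translation
  - Contributors wanted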

Slide 31

Slide 31 text

Container virtualization information exchange meeting (コンテナ型仮想化の情報交換会)
· I would like to hold one in Tokyo this year
· Speakers wanted

Slide 32

Slide 32 text

Thank you for your attention

Slide 33

Slide 33 text

twitter: @ten_forward
www: www.ten-forward.ws/
github: github.com/tenforward