cgroup v2 / 10th CTStudy

tenforward
October 29, 2016

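Slides presented at the 10th Container Virtualization Information Exchange Meeting @ Tokyo (第10回 コンテナ型仮想化の情報交換会@東京).
The PDF links to reference material, but links in the slides are not clickable on Speaker Deck, so please download the PDF to follow them.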

Transcript

  1. cgroup v2
    10th Container Virtualization Information Exchange Meeting @ Tokyo
    KATOH Yasufumi (加藤泰文), 2016-10-29

  2. Self-introduction: KATOH Yasufumi (加藤泰文)
    • http://www.ten-forward.ws/
    • @ten_forward
    • http://gplus.to/tenforward
    • https://github.com/tenforward
    • http://d.hatena.ne.jp/defiant/ (technical blog)
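  3. Self-introduction
    • Plamo Linux maintainer
    • Author of the gihyo.jp series "LXC で学ぶコンテナ入門 －軽量仮想化環境を実現する技術" (Introduction to Containers with LXC: the technology behind lightweight virtualization environments)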
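  4. Self-introduction
    • Takes a small part in LXC/LXD development
      • Japanese translation of the man pages
      • Translation of the official site (linuxcontainers.org)
      • A few code contributions, such as bug fixes
      • Japanese messages for LXD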
  5. Today's goal

  6. Goal
    • Introduce the very basics of cgroup v2

  7. cgroup recap

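  8. What is cgroup?
    cgroup groups processes and applies resource limits to each group. It is not a container-specific mechanism.
    • Functionality is split into subsystems (controllers)
    • cgroupfs is mounted, and each directory represents a group
    • Groups are operated on by reading and writing the files in their directories
    • Two versions exist: cgroup v1, which is widely used today, and cgroup v2, which became stable in kernel 4.5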
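  9. Terminology
    The terms are now clearly defined in the cgroup v2 documentation.
    • "cgroup" is written in lowercase
    • cgroup or cgroups?
      • The singular form is used
        • when referring to the feature
        • as a qualifier (as in "cgroup controllers")
      • The plural form is used
        • when explicitly referring to multiple cgroups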
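  10. cgroup hierarchy structure

  11. Characteristics of cgroup v1
    • Multiple hierarchies are allowed: cgroupfs can be mounted more than once
    • Membership in a group is per thread
    • Tasks can be attached to a node (directory) at any level of the cgroupfs tree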
  12. Problems with cgroup v1

  13. Multiple hierarchies
    • Any number of hierarchies can be created, and several controllers can be attached to each
    • Flexible and convenient... supposedly

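  14. Controller restrictions
    However:
    • A controller can belong to only one hierarchy
    • A controller that would be useful in several hierarchies (freezer, for example) can be used in only one specific hierarchy
    • Once a controller is attached to a hierarchy, it cannot be moved
    • All controllers attached to the same hierarchy must share the same structure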
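  15. In the end, there was not much flexibility after all
    • Creating one hierarchy per controller became the common practice
    • Only closely related controllers that make sense to manage with the same grouping are attached to the same hierarchy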

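  16. Relationships between controllers
    • There is no coordination between controllers
      • Multiple controllers cannot be made to work together
    • Behavior also differs from controller to controller
      • The cgroup.clone_children file of cpuset
      • The memory.use_hierarchy file of memory
      • And so on…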
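  17. Handling of tasks
    • Tasks can belong to a node at any level of the hierarchy
      • When both a parent and a child cgroup contain tasks, resource distribution becomes chaotic
    • Tasks are managed per thread
      • This is meaningless for some controllers

  18. cgroup v2 overview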

  19. History
    • Introduced in 3.16 as the "unified control group hierarchy"
      • It was mounted with the __DEVEL__sane_behavior option (a "sane behavior" option!!)
      • (References)
        • The unified control group hierarchy in 3.16 (lwn.net)
        • Linux カーネルのすべて: cgroup の再設計 (linux.com)
    • Became stable in 4.5
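  20. Characteristics
    • A single hierarchy
    • Management is per process
    • Can coexist with v1
      • For example, some controllers can be used through v2 while the others are used through v1
    • Operations are the same as in v1 (managing directories, reading and writing files)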
  21. Implementation
    • As of 4.8, only memory, io, and pids are available
    • The documentation describes cpu, but that controller has not been merged

  22. Operating cgroup v2

  23. Using cgroup v2
    • Mount it
      # mount -t cgroup2 cgroup2 /sys/fs/cgroup
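Before running the mount command above, it can be useful to check whether a cgroup v2 hierarchy is already mounted. The following is a minimal sketch, assuming the usual /proc/mounts field layout (device, mount point, filesystem type, ...). The function name cgroup2_mounts is hypothetical and does not come from the slides.

```shell
# Print the mount point of every cgroup2 filesystem listed in a mount table.
# $1: mount table to read (assumed to be in /proc/mounts format; that file
#     is used when no argument is given).
cgroup2_mounts() {
    awk '$3 == "cgroup2" { print $2 }' "${1:-/proc/mounts}"
}
```

Calling `cgroup2_mounts` with no argument prints nothing when no cgroup2 filesystem is mounted.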
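  24. Creating and removing cgroups
    • Immediately after mounting, only the root cgroup exists
    • Creating a directory creates a group
      # mkdir cgroup_name
    • Removing a directory removes the group
      # rmdir cgroup_name
    • Only a cgroup with no child cgroups and no processes can be removed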
  25. Attaching a process to a cgroup
    • Write the PID to cgroup.procs
      # echo $PID > /path/to/cgroup/cgroup.procs
    • The cgroup a process belongs to is listed in /proc/$PID/cgroup
      # cat /proc/$$/cgroup
      0::/cgroup_name
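On a system where v1 and v2 coexist, /proc/$PID/cgroup contains one line per hierarchy, and the v2 entry is the one of the form `0::<path>`, as in the output above. A small sketch for extracting just that path follows. The function name cgroup2_path_of is hypothetical, and the file argument exists only so the function can be tried on a saved copy.

```shell
# Print the cgroup v2 path from a /proc/<pid>/cgroup style file.
# The v2 entry has hierarchy ID 0 and an empty controller list: "0::<path>".
# $1: file to read (defaults to the entry for the current shell).
cgroup2_path_of() {
    sed -n 's/^0:://p' "${1:-/proc/$$/cgroup}"
}
```

For a process in another cgroup, pass its file explicitly, for example `cgroup2_path_of /proc/1/cgroup`.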
  26. Monitoring cgroup state
    • Every cgroup except the root has a cgroup.events file
      # cat cgroup.events
      populated 0
      (populated is 0 when neither this cgroup nor any of its descendants contains a process)
      # echo $$ > cgroup.procs
      (after adding a process…)
      # cat cgroup.events
      populated 1
      (populated is 1 when a process exists)
    • When the value of populated changes, a file-modified event is generated (poll, dnotify, inotify)
      $ inotifywait -m /sys/fs/cgroup/test01/cgroup.events
      (add a process to test01)
      /sys/fs/cgroup/test01/cgroup.events MODIFY
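A script reacting to these events usually needs the current populated value, not only the notification. The sketch below reads it from a cgroup directory. The function name cgroup_populated is hypothetical, and the sketch assumes cgroup.events holds one "key value" pair per line, as in the output above.

```shell
# Print the "populated" value (0 or 1) from a cgroup's cgroup.events file.
# $1: cgroup directory, e.g. /sys/fs/cgroup/test01
cgroup_populated() {
    awk '$1 == "populated" { print $2 }' "$1/cgroup.events"
}
```

One possible use is to remove a group once it has emptied: `[ "$(cgroup_populated /sys/fs/cgroup/test01)" = 0 ] && rmdir /sys/fs/cgroup/test01`.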
  27. Controlling controllers
    • The controllers available to each cgroup are listed in cgroup.controllers
      # cat cgroup.controllers
      io memory pids
    • The controllers to be used in child cgroups are controlled through cgroup.subtree_control
      (prefix a controller with "+" to enable it, or with "-" to remove or disable it)
      # echo "-pids +memory +io" > cgroup.subtree_control
      # cat cgroup.subtree_control
      io memory
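Because only controllers listed in cgroup.controllers can be enabled for children, a script may want to check availability before writing to cgroup.subtree_control. A minimal sketch, with the hypothetical function name controller_available:

```shell
# Succeed (exit status 0) if controller $2 is listed in the
# cgroup.controllers file of cgroup directory $1; fail otherwise.
controller_available() {
    for ctrl in $(cat "$1/cgroup.controllers"); do
        [ "$ctrl" = "$2" ] && return 0
    done
    return 1
}
```

It could be combined with the write shown above, for example: `controller_available /sys/fs/cgroup memory && echo "+memory" > /sys/fs/cgroup/cgroup.subtree_control`.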
  28. Constraints when a cgroup has children
    • A cgroup can distribute resources to its children only when it holds no processes itself
    • Only a cgroup without processes can enable controllers through its cgroup.subtree_control file
    • The root cgroup is exempt from this constraint
      # cat cgroup.procs
      3541
      3577
      # mkdir child
      # echo "+io" > cgroup.subtree_control
      bash: echo: write error: Device or resource busy
      # echo $$ > /sys/fs/cgroup/cgroup.procs
      (move the process back to the root, removing it from the current group)
      # echo "+io" > cgroup.subtree_control
      # cat cgroup.subtree_control
      io
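The "Device or resource busy" error above can be anticipated by checking whether the cgroup still holds processes. The sketch below uses the hypothetical function name cgroup_has_no_procs and does not special-case the root cgroup, which is exempt from the constraint. It reads the file content instead of testing the file size, on the assumption that cgroupfs pseudo-files typically report a size of 0 whatever they contain.

```shell
# Succeed (exit status 0) if the cgroup directory $1 has no member processes.
# The content of cgroup.procs is read because a size test such as "-s" is
# assumed to be unreliable on cgroupfs pseudo-files.
cgroup_has_no_procs() {
    [ -z "$(cat "$1/cgroup.procs")" ]
}
```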
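  29. Delegating to unprivileged users
    • Delegate a cgroup by granting an unprivileged user write permission to the cgroup directory and to its cgroup.procs file
    • The user can then distribute resources freely below that group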

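  30. Resource distribution models
    • Weights
      • A ratio between groups
    • Limits
      • A group can use the resource up to the configured amount (overcommit is possible)
    • Protections
      • The allocation is guaranteed as long as no ancestor exceeds its limit (overcommit is possible)
    • Allocation
      • Allocation of a finite resource (overcommit is not possible)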
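As a worked example of the Weights model: each group receives a share of the parent's resource equal to its own weight divided by the sum of the weights of all competing siblings. With two groups weighted 100 and 300, the first gets 100 / (100 + 300) = 25%. The sketch below does this arithmetic. The function name weight_percent is hypothetical, and the integer division truncates the result.

```shell
# Print the integer percentage of the resource that the first weight
# receives when competing with the remaining weights.
# Usage: weight_percent <own_weight> [<sibling_weight>...]
weight_percent() {
    own=$1
    total=0
    # "$@" includes the group's own weight, so total is the sum of all weights.
    for w in "$@"; do
        total=$((total + w))
    done
    echo $((own * 100 / total))
}
```

For example, `weight_percent 100 300` prints 25.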
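  31. Control files: file names
    • For weight-based resource distribution, the file is named "weight"
    • For absolute resource guarantees and limits, the files are named "min" and "max" respectively
    • For best-effort resource guarantees and limits, the files are named "low" and "high" respectively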
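  32. How to limit resources
    • Almost the same as in v1
      • Refer to the usage of each controller
    • For weights, the default value is 100 and the range is 1 to 10000
    • Where a default value can be set, it is registered with the "default" keyword
      echo "default 100" > control_file
      echo "8:0 150" > control_file
      (set the limit for "8:0" to 150)
      echo "8:0 default" > control_file
      (reset the limit for "8:0" to the default)
    • Use "max" to specify no limit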
  33. Configuration example
      # cat pids.max
      (show the limit on the number of processes)
      max
      (the default value)
      # echo "2" > pids.max
      (set the limit on the number of processes to 2)
      # cat pids.max
      2
      # echo $$ > cgroup.procs
      # cat pids.current
      (show the current number of processes)
      2
      # ( echo "test" | cat )
      bash: fork: retry: No child processes
      bash: fork: retry: No child processes
      bash: fork: retry: No child processes
      bash: fork: retry: No child processes
      bash: fork: Resource temporarily unavailable
      Terminated
      # echo "max" > pids.max
      (remove the limit on the number of processes)
      # cat pids.max
      max
      # ( echo "test" | cat )
      test
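The fork failures above occur because pids.current had already reached pids.max. The sketch below computes the remaining headroom from those two files and treats the "max" keyword as unlimited. The function name pids_headroom is hypothetical.

```shell
# Print how many more processes the cgroup in directory $1 may create
# before reaching pids.max, or "unlimited" when pids.max holds "max".
pids_headroom() {
    max=$(cat "$1/pids.max")
    current=$(cat "$1/pids.current")
    if [ "$max" = "max" ]; then
        echo unlimited
    else
        echo $((max - current))
    fi
}
```

In the session above, this would print 0 right before the failing pipeline and unlimited after "max" is written back.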
  34. Status of cgroup v2
    • It is stable, but
    • the CPU controller has met opposition and has not been merged
      • The case of the stalled CPU controller (lwn.net)
      • [Documentation] State of CPU controller in cgroup v2 (lwn.net) (a summary by Tejun Heo)