cgroup v2 / 10th CTStudy

tenforward
October 29, 2016

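Slides presented at the 10th Container Virtualization Information Exchange Meeting @ Tokyo (第10回 コンテナ型仮想化の情報交換会@東京).
Reference material is linked from within the PDF, but links in the slides cannot be clicked on Speaker Deck, so please download the PDF to follow them.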

Transcript

  1. cgroup v2
    10th Container Virtualization Information Exchange Meeting @ Tokyo (第10回コンテナ型仮想化の情報交換会@東京)
    KATOH Yasufumi (加藤泰文)
    2016-10-29

  2. Self-introduction
    KATOH Yasufumi (加藤泰文)
    • http://www.ten-forward.ws/
    • @ten_forward
    • http://gplus.to/tenforward
    • https://github.com/tenforward
    • http://d.hatena.ne.jp/defiant/ (technical blog)

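  3. Self-introduction
    • Plamo Linux maintainer
    • Author of the gihyo.jp series "LXC で学ぶコンテナ入門 -軽量仮想化環境を実現する技術" (Learning Containers with LXC: Technology for Lightweight Virtualization Environments)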

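  4. Self-introduction
    • Minor contributor to LXC/LXD development
      • Japanese translation of the man pages
      • Translation of the official site (linuxcontainers.org)
      • A few code contributions, such as bug fixes
      • Japanese messages for LXD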

  5. Today's goal

  6. Goal
    • Introduce the very basics of cgroup v2

  7. cgroup recap

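  8. What is cgroup?
    A mechanism that groups processes and applies resource limits to each group. It is not specific to containers.
    • Split into subsystems (controllers), one per function
    • cgroupfs is mounted, and each directory represents a group
    • Groups are operated on by reading and writing the files inside the directory
    • There are two versions: cgroup v1, which is widely used today, and cgroup v2, which became stable in the 4.5 kernel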

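  9. Terminology
    The cgroup v2 documentation defines these clearly.
    • "cgroup" is always lowercase
    • cgroup or cgroups?
      • The singular form is used
        • to refer to the feature as a whole
        • as a qualifier (as in "cgroup controllers")
      • The plural form is used
        • to refer explicitly to multiple individual cgroups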

  10. The cgroup hierarchy

  11. Characteristics of cgroup v1
    • Multiple hierarchies are possible: cgroupfs can be mounted more than once
    • Membership in a group is per thread
    • Tasks can be attached to a node (directory) at any level of the cgroupfs tree

  12. Problems with cgroup v1

  13. Multiple hierarchies
    • Any number of hierarchies can be created, and each can have any number of controllers attached
    • Flexible and convenient
    ...or so it was supposed to be

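  14. Controller restrictions
    However:
    • A controller can belong to only one hierarchy
    • A controller that would be useful in several hierarchies (freezer, for example) can be used in only one of them
    • Once a controller is attached to a hierarchy, it cannot be moved
    • All controllers attached to the same hierarchy must share the same group structure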

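  15. In the end
    There was not that much flexibility after all
    • Creating one hierarchy per controller became common practice
    • Only closely related controllers that make sense to manage with the same grouping are attached to the same hierarchy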

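  16. Relationships between controllers
    • No coordination between controllers
      • Multiple controllers cannot be made to work together
    • Behavior is inconsistent from controller to controller
      • the cgroup.clone_children file for cpuset
      • the memory.use_hierarchy file for memory
      • and so on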

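  17. Handling of tasks
    • Tasks can belong to a node at any level of the hierarchy
      • When both a parent and its child cgroup contain tasks, resource distribution becomes chaotic
    • Tasks are managed per thread
      • For some controllers this is meaningless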

  18. cgroup v2 overview

  19. History
    • Introduced in 3.16 as the "unified control group hierarchy"
      • Mounted with the __DEVEL__sane_behavior option (the "sane behavior" option!!)
      • (References)
        • The unified control group hierarchy in 3.16 (lwn.net)
        • Linux カーネルのすべて: cgroup の再設計 (All About the Linux Kernel: Cgroup's Redesign) (linux.com)
    • Became stable in 4.5

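  20. Characteristics
    • A single hierarchy
    • Managed per process
    • Can coexist with v1
      • For example, some controllers can be used through v2 while the others are used through v1
    • Operated the same way as v1 (managing directories, reading and writing files)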

  21. Implementation
    • As of 4.8, only memory, io, and pids
    • The documentation describes cpu, but that controller has not been merged

  22. Working with cgroup v2

  23. Using cgroup v2
    • Mount it

    # mount -t cgroup2 cgroup2 /sys/fs/cgroup
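Before mounting, it can help to check whether a cgroup2 filesystem is already present at the target path. The following is a minimal sketch, assuming a mounts table in the /proc/mounts layout; the function name `is_cgroup2_mounted` is hypothetical and not part of the original slides.

```shell
# Hypothetical helper (not from the slides): succeed if mount point $2 is
# listed in mounts file $1 with filesystem type cgroup2.
# Fields follow the /proc/mounts layout: device, mount point, type, options, ...
is_cgroup2_mounted() {
    awk -v mp="$2" '
        $2 == mp && $3 == "cgroup2" { found = 1 }
        END { exit !found }
    ' "$1"
}

# Example (the mount itself needs root):
#   is_cgroup2_mounted /proc/mounts /sys/fs/cgroup ||
#       mount -t cgroup2 cgroup2 /sys/fs/cgroup
```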

  24. Creating and removing cgroups
    • Right after mounting, only the root cgroup exists
    • Creating a directory creates a group

    # mkdir cgroup_name

    • Removing a directory removes the group

    # rmdir cgroup_name

    • Only a cgroup with no child cgroups and no processes can be removed
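The removal rule above (no child cgroups, no processes) can be checked before calling rmdir. This is a sketch under the assumption that child cgroups appear as subdirectories and member PIDs appear in cgroup.procs, as the slides describe; the name `cgroup_is_removable` is hypothetical.

```shell
# Hypothetical helper (not from the slides): succeed if the cgroup
# directory $1 has no child cgroups and no member processes.
cgroup_is_removable() {
    # Any subdirectory is a child cgroup.
    for child in "$1"/*/; do
        [ -d "$child" ] && return 1
    done
    # Pseudo-file sizes are not meaningful, so read the content
    # instead of testing the file with -s.
    [ -z "$(cat "$1/cgroup.procs")" ]
}

# Example:
#   cgroup_is_removable /sys/fs/cgroup/cgroup_name && rmdir /sys/fs/cgroup/cgroup_name
```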

  25. Adding a process to a cgroup
    • Write the PID to cgroup.procs

    # echo $PID > /path/to/cgroup/cgroup.procs

    • The cgroup a process belongs to is listed in /proc/$PID/cgroup

    # cat /proc/$$/cgroup
    0::/cgroup_name
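When v1 hierarchies are mounted alongside v2, /proc/$PID/cgroup contains several lines, and the v2 entry is the one that starts with "0::". A small sketch for extracting just that path follows; the function name `cgroup2_path_of` is an assumption made for illustration.

```shell
# Hypothetical helper (not from the slides): print the cgroup v2 path
# recorded in a /proc/<pid>/cgroup style file given as $1.
# The v2 entry has the form "0::<path>"; v1 entries look like
# "<id>:<controllers>:<path>" and are skipped.
cgroup2_path_of() {
    sed -n 's/^0:://p' "$1"
}

# Example:
#   cgroup2_path_of /proc/$$/cgroup
```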

  26. Monitoring cgroup state
    • Every cgroup except the root has a cgroup.events file

    # cat cgroup.events
    populated 0    (0 when neither this cgroup nor any descendant contains a process)
    # echo $$ > cgroup.procs    (add a process...)
    # cat cgroup.events
    populated 1    (1 when a process is present)

    • When the value of populated changes, a file-modified event is generated (poll, dnotify, inotify)

    $ inotifywait -m /sys/fs/cgroup/test01/cgroup.events
    (add a process to test01)
    /sys/fs/cgroup/test01/cgroup.events MODIFY
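The populated flag can also be read from a script. The sketch below parses cgroup.events and waits until a cgroup becomes empty; it uses simple polling with sleep as a stand-in for the inotify-based notification shown above, and both function names are hypothetical.

```shell
# Hypothetical helper (not from the slides): print the value of the
# "populated" key from cgroup.events in cgroup directory $1.
cgroup_populated() {
    awk '$1 == "populated" { print $2 }' "$1/cgroup.events"
}

# Hypothetical helper: block until cgroup $1 reports populated 0.
# Polling is an assumption made for simplicity; inotifywait on
# cgroup.events avoids the one-second delay. If cgroup.events is
# missing, this loop never ends.
wait_until_empty() {
    while [ "$(cgroup_populated "$1")" != 0 ]; do
        sleep 1
    done
}
```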

  27. Controlling controllers
    • The controllers available to each cgroup are listed in cgroup.controllers

    # cat cgroup.controllers
    io memory pids

    • The controllers available to child cgroups are controlled through cgroup.subtree_control

    (prefix a controller with "+" to enable it, or with "-" to disable it)
    # echo "-pids +memory +io" > cgroup.subtree_control
    # cat cgroup.subtree_control
    io memory
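Since only controllers listed in cgroup.controllers can be enabled for children, a script may want to check availability first. This is a minimal sketch; `controller_available` is a hypothetical name, and it assumes the space-separated format shown above.

```shell
# Hypothetical helper (not from the slides): succeed if controller $2
# is listed in cgroup.controllers of cgroup directory $1.
controller_available() {
    for c in $(cat "$1/cgroup.controllers"); do
        [ "$c" = "$2" ] && return 0
    done
    return 1
}

# Example:
#   controller_available . memory && echo "+memory" > cgroup.subtree_control
```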

  28. Constraints when a cgroup has children
    • A cgroup can distribute resources to its children only when it has no processes of its own
    • Only a cgroup without processes can enable controllers in its cgroup.subtree_control file
    • The root cgroup is exempt from this constraint

    # cat cgroup.procs
    3541
    3577
    # mkdir child
    # echo "+io" > cgroup.subtree_control
    bash: echo: write error: Device or resource busy
    # echo $$ > /sys/fs/cgroup/cgroup.procs
    (move the process back to the root, removing it from the current group)
    # echo "+io" > cgroup.subtree_control
    # cat cgroup.subtree_control
    io
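The "Device or resource busy" error above can be anticipated in a script by checking for member processes before writing to cgroup.subtree_control. The sketch below is one possible wrapper, not an established tool: the name `enable_controller` is hypothetical, and it relies on the earlier statement that only non-root cgroups have a cgroup.events file to recognize the root.

```shell
# Hypothetical wrapper (not from the slides): enable controller $2 for the
# children of cgroup directory $1, refusing early if $1 is a non-root
# cgroup that still has member processes.
enable_controller() {
    # Only non-root cgroups have cgroup.events, so its presence is used
    # here to tell a non-root cgroup from the root.
    if [ -e "$1/cgroup.events" ] && [ -n "$(cat "$1/cgroup.procs")" ]; then
        echo "refusing: $1 still has member processes" >&2
        return 1
    fi
    echo "+$2" > "$1/cgroup.subtree_control"
}
```

On a real cgroupfs the kernel still has the final say; the check only gives a clearer message than the raw write error.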

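  29. Delegating to unprivileged users
    • Delegate a cgroup to an unprivileged user by granting write access to the cgroup directory and its cgroup.procs file
    • The user can then distribute resources freely below that group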
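One way to grant the write access described above is to change ownership of the directory and its cgroup.procs file. The following is a sketch under that assumption; the function name `delegate_cgroup` is hypothetical, and changing ownership to another user requires root.

```shell
# Hypothetical helper (not from the slides): hand cgroup directory $1 and
# its cgroup.procs file over to user $2 (a user name or numeric UID).
# chown is an assumed mechanism; adjusting permissions would also work.
delegate_cgroup() {
    chown "$2" "$1" "$1/cgroup.procs"
}

# Example (as root; "alice" is a placeholder user name):
#   mkdir /sys/fs/cgroup/delegated
#   delegate_cgroup /sys/fs/cgroup/delegated alice
```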

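  30. Resource distribution models
    • Weights
      • Resources are divided between groups in proportion to their weights
    • Limits
      • A group can use a resource up to the configured amount (overcommit allowed)
    • Protections
      • An allocation is guaranteed as long as no ancestor has exceeded its limit (overcommit allowed)
    • Allocations
      • Exclusive allocation of a finite resource (overcommit not allowed)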
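For the weight model, each active sibling receives the fraction given by its weight divided by the sum of all sibling weights. The arithmetic can be sketched as follows; the function name `weight_share_percent` is hypothetical and the result is truncated to an integer percentage.

```shell
# Hypothetical helper (not from the slides): given the weights of all
# active sibling cgroups, print the integer percentage of the resource
# that the first one receives: 100 * first / sum(all).
weight_share_percent() {
    first=$1
    total=0
    for w in "$@"; do
        total=$((total + w))
    done
    echo $((100 * first / total))
}
```

For example, `weight_share_percent 100 300` prints 25: a group with weight 100 next to a sibling with weight 300 gets a quarter of the resource.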

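  31. Control files
    File names
    • For weight-based resource distribution, the file is named "weight"
    • For an absolute resource guarantee and limit, the files are named "min" and "max" respectively
    • For a best-effort resource guarantee and limit, the files are named "low" and "high" respectively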

  32. Setting resource limits
    • Almost the same as v1
      • See the usage of each controller
    • Weights default to 100 and range from 1 to 10000
    • Where a default can be set, it is registered with the "default" keyword

    echo "default 100" > control_file
    echo "8:0 150" > control_file    (set the value for "8:0" to 150)
    echo "8:0 default" > control_file    (reset the value for "8:0" to the default)

    • Use "max" for no limit
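Reading such a keyed control file back means looking up a key and falling back to the "default" line when the key has no entry of its own. The sketch below assumes the file lists one "key value" pair per line, matching the write format above; the name `keyed_value` is hypothetical.

```shell
# Hypothetical helper (not from the slides): print the value stored for
# key $2 in keyed control file $1, or the "default" value if the key
# has no explicit entry.
keyed_value() {
    awk -v key="$2" '
        $1 == key       { value = $2; found = 1 }
        $1 == "default" { fallback = $2 }
        END { print (found ? value : fallback) }
    ' "$1"
}

# Example (control_file is the placeholder name used above):
#   keyed_value control_file 8:0
```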

  33. Configuration example

    # cat pids.max    (show the limit on the number of processes)
    max    (the default)
    # echo "2" > pids.max    (set the limit to 2 processes)
    # cat pids.max
    2
    # echo $$ > cgroup.procs
    # cat pids.current    (show the current number of processes)
    2
    # ( echo "test" | cat )
    bash: fork: retry: No child processes
    bash: fork: retry: No child processes
    bash: fork: retry: No child processes
    bash: fork: retry: No child processes
    bash: fork: Resource temporarily unavailable
    Terminated
    # echo "max" > pids.max    (remove the limit on the number of processes)
    # cat pids.max
    max
    # ( echo "test" | cat )
    test
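The fork failure in the session above happens because pids.current has already reached pids.max. A small sketch that reports the remaining headroom makes this visible before forking; the function name `pids_headroom` is hypothetical.

```shell
# Hypothetical helper (not from the slides): print how many more
# processes cgroup directory $1 may hold, or "max" if no limit is set.
pids_headroom() {
    limit=$(cat "$1/pids.max")
    current=$(cat "$1/pids.current")
    if [ "$limit" = max ]; then
        echo max
    else
        echo $((limit - current))
    fi
}
```

With pids.max set to 2 and pids.current at 2, as in the example, the headroom is 0, so the next fork fails.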

  34. Status of cgroup v2
    • Stable, but
      • the CPU controller has met opposition and has not been merged
        • The case of the stalled CPU controller (lwn.net)
        • [Documentation] State of CPU controller in cgroup v2 (lwn.net) (summary by Tejun Heo)