The xxx Relationship Between Runtimes and cgroups
/ bpf_get_current_cgroup_id(void) and modern container runtimes

KONDO Uchio
January 28, 2021

Container Runtime Meetup #3

https://runtime.connpass.com/event/198071/

Transcript

  1. The xxx Relationship Between Runtimes and cgroups, with a side of bpf_get_current_cgroup_id(void)
    Uchio Kondo / Container Runtime Meetup #3
    * Photo by Fukuoka City

  2. Uchio Kondo / @udzura https://blog.udzura.jp/
    Senior Principal Engineer, Engineering Department, Technical Infrastructure Team
    #Ruby #mruby #Containers #eBPF #CRIU #Seccomp #RubyKaigi #CloudNativeDays #Zumba #Shiren

  3. ToC
    • Tracing and eBPF
    • Background knowledge for tracing containers
    • Container tracing with eBPF in practice
    • Container runtime support
    • (Bonus) BPF CO-RE

  4. eBPF and Containers

  5. About eBPF
    • https://speakerdeck.com/chikuwait/learn-ebpf

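  6. What is eBPF?
    • One of the technologies for running a program written in user space inside the kernel
    • Well suited to filtering (tcpdump, seccomp, bpftrace)
    • It can access kernel information, while safety is guaranteed to some degree, for example by refusing to run dangerous code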

  7. Use in tracing tools
    • bpftrace
    • BCC
    • BPF Performance Tools
      • execsnoop, runqlat, tcplife...
      • http://www.brendangregg.com/bpf-performance-tools-book.html

  8. Tracing containers
    • Two pieces of background knowledge
      • Linux Namespace
      • cgroup (v1/v2)

  9. Linux Namespace
    • A technology that carves out part of a namespace inside the OS and gives it independent resources (hostname, network, PID numbering, mount points, and so on).
    https://container-security.dev/namespace
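
As a rough illustration of how namespaces look from user space: each entry under `/proc/<pid>/ns` is a symlink whose target has the form `uts:[4026531838]`, and two processes share a namespace when the inode numbers match. The helper names below are hypothetical and not taken from the talk.

```python
import os
import re

# Target format of a /proc/<pid>/ns/* symlink, e.g. "uts:[4026531838]".
NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace symlink target into (kind, inode)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def read_namespaces(pid="self"):
    """Return {entry name: inode} for the namespaces readable for this pid."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in os.listdir(ns_dir):
        try:
            _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # skip entries we cannot read or do not recognise
        result[name] = inode
    return result
```

Calling `read_namespaces()` on the host and again for a process inside a container would normally show different inode numbers for kinds such as `uts`, `pid` and `mnt`.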

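  10. cgroup (Control Groups)
    • Groups processes and limits resource usage (CPU, memory, block I/O, number of processes) per group.
    • Unlike rlimit, a group can span users, and the group a task belongs to can be changed flexibly
    • There are v1 and v2 (v2 = since 2014/8, Linux 3.16)
    https://container-security.dev/cgroup
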
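
To make the v1/v2 distinction concrete, here is a minimal sketch that parses the text of `/proc/<pid>/cgroup`. It assumes the documented line format `hierarchy-ID:controller-list:path`, where the v2 entry is the line of the form `0::/path`. The function name is hypothetical.

```python
def parse_proc_cgroup(text):
    """Parse /proc/<pid>/cgroup text.

    Returns (v1, v2): v1 maps each v1 controller to its cgroup path,
    v2 is the unified-hierarchy path or None if there is no v2 entry.
    """
    v1, v2 = {}, None
    for line in text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        if hierarchy_id == "0" and controllers == "":
            v2 = path  # the single cgroup v2 entry
        else:
            for controller in controllers.split(","):
                v1[controller] = path
    return v1, v2
```

On a host running both hierarchies, a containerized process would typically show several v1 lines plus one `0::` line.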
  11. Implementations

  12. Tracing containers with eBPF
    • There are several strategies
    • Information from Linux Namespace or cgroup (v2) can be used

  13. Strategy (1)
    • Obtain the namespace information from task_struct→nsproxy and filter on it (cxray)
    https://github.com/mrtc0/cxray/blob/master/pkg/tracer/open/open.go#L…-L…
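
A user-space analogue of this strategy, as a sketch only: inside a BPF program the namespace inode would come from `task_struct->nsproxy`, whereas here it is assumed that each event already carries a `{kind: inode}` mapping. The event layout and function names are hypothetical and are not cxray's code.

```python
def in_other_namespace(host_ns, task_ns, kinds=("pid", "mnt")):
    """True if the task differs from the host in any of the given namespace kinds."""
    return any(
        kind in host_ns and kind in task_ns and host_ns[kind] != task_ns[kind]
        for kind in kinds
    )


def filter_container_events(events, host_ns):
    """Keep only events whose namespaces differ from the host's."""
    return [event for event in events if in_other_namespace(host_ns, event["ns"])]
```

Comparing more than one namespace kind is a design choice: a process may share some namespaces with the host while being isolated in others.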

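  14. Strategy (2)
    • Compare the tid obtained in the BPF program with the tid as seen from the host, and treat the task as a container if they do not match (Tracee)
      • Depends on task_struct
    https://github.com/aquasecurity/tracee/blob/main/tracee/tracee.bpf.c#L…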
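
The same comparison can be approximated from user space, assuming a kernel that exposes the `NSpid:` line in `/proc/<pid>/status`. That line lists the task's ID in each nested PID namespace, outermost first. This is an illustrative sketch with hypothetical function names, not Tracee's implementation.

```python
def nspid_chain(status_text):
    """Return the NSpid values from /proc/<pid>/status, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []


def looks_containerized(status_text):
    """Heuristic: the ID in the innermost namespace differs from the outermost one."""
    chain = nspid_chain(status_text)
    return len(chain) > 1 and chain[0] != chain[-1]
```

A process that is PID 1 inside its container but 12345 on the host would be reported as containerized by this heuristic.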

  15. Strategy (3)
    • Compare the cgroup v2 ID with that of the host
    • bpf-helpers(7)
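
On the user-space side, the ID to compare against can be looked up from the cgroup v2 directory. The sketch below rests on one assumption that should be checked on the target kernel: that the value returned by `bpf_get_current_cgroup_id()` matches the inode number of the cgroup directory (`name_to_handle_at(2)` is the more robust lookup). The function names are hypothetical.

```python
import os


def cgroup_id_of(path):
    """Assumed cgroup v2 ID of a cgroup directory: its inode number."""
    return os.stat(path).st_ino


def from_other_cgroup(event_cgroup_id, host_cgroup_id):
    """True if an event was emitted from a cgroup other than the host's."""
    return event_cgroup_id != host_cgroup_id
```

A tracer could call `cgroup_id_of()` once for the host's cgroup directory and then discard every event for which `from_other_cgroup()` returns False.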

  16. An implementation example I actually tried
    • udzura/copenclose(8)

  17. Using hostname (UTS NS) / Using CGroup v2 ID

  18. cgroup v2

  19. Runtime support status
    • Suda-san's article covers this in detail... (https://medium.com/nttlabs/cgroup-v2-596d035be4d7)
    • Even so, I did a quick survey of the situation as of 2021

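  20. Runtimes and cgroup settings
    • Cgroup Driver: how the cgroup assigned to a container is controlled
      • cgroupfs: direct file operations on cgroupfs
      • systemd: managed by systemd
    • Cgroup Version: whether v1 or v2 is used for resource limits
      • Determined by which filesystem is mounted at /sys/fs/cgroup
      • (This is the case for docker/containerd. Perhaps podman as well?)
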
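
The detection rule above can be sketched as follows. The runtimes themselves inspect the filesystem type of `/sys/fs/cgroup`. Parsing the text of `/proc/self/mounts` instead, as done here, is an assumption made to keep the example self-contained, and the function name and the returned labels are hypothetical.

```python
def detect_cgroup_mode(mounts_text, root="/sys/fs/cgroup"):
    """Classify the host as 'v2', 'hybrid', 'v1' or 'unknown' from mounts text."""
    root_fstype = None
    has_unified_submount = False
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        if mount_point == root:
            root_fstype = fstype
        elif mount_point.startswith(root + "/") and fstype == "cgroup2":
            has_unified_submount = True  # e.g. /sys/fs/cgroup/unified
    if root_fstype == "cgroup2":
        return "v2"
    if root_fstype == "tmpfs":
        return "hybrid" if has_unified_submount else "v1"
    return "unknown"
```

A check that looks only at the root mount would report a hybrid host as v1, because the root is a tmpfs in both cases.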
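  21. How do you use v2?
    • Switching the host to v2 mode requires changing a kernel boot parameter...
    • If the host Linux is booted in a mixed v1/v2 environment, Version is detected as v1
    • With Cgroup Driver=systemd, containers also come to belong to a v2 group! It seems systemd takes care of it?
    • Limit values are still written through the v1 API
    • The group ID can then be obtained in the usual way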
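
The slide does not name the boot parameter. On systemd-based distributions the commonly documented switch is `systemd.unified_cgroup_hierarchy=1`, and the sketch below assumes that is the one meant. It only inspects the text of `/proc/cmdline`, and the function name is hypothetical.

```python
def requests_unified_hierarchy(cmdline):
    """True if the kernel command line asks systemd for the unified (v2) hierarchy."""
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "systemd.unified_cgroup_hierarchy":
            return value.lower() in ("1", "yes", "true", "on")
    return False
```

Reading `/proc/cmdline` and passing its contents to this function shows whether the running host was booted with the switch.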

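  22. Support status in each container runtime
    • For high-level runtimes, the steps for changing the Cgroup Driver setting are shown.
    • For low-level runtimes, the support status is listed for reference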

  23. High-level runtimes
    • docker:
    • podman: systemd by default. To set it explicitly:
    • containerd: Example:
    • FYI: detection procedure
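
The configuration snippets shown on this slide were not captured in the transcript. As a hedged stand-in for the Docker case only, the commonly documented setting is `"exec-opts": ["native.cgroupdriver=systemd"]` in `/etc/docker/daemon.json`, and podman has a `--cgroup-manager=systemd` option. The helper below is hypothetical and only builds the JSON. It does not write any file or restart the daemon.

```python
import json


def with_systemd_cgroup_driver(daemon_config):
    """Return a copy of a Docker daemon.json dict with native.cgroupdriver=systemd."""
    config = dict(daemon_config)
    # Drop any existing cgroup driver choice before adding the systemd one.
    opts = [
        opt for opt in config.get("exec-opts", [])
        if not opt.startswith("native.cgroupdriver=")
    ]
    opts.append("native.cgroupdriver=systemd")
    config["exec-opts"] = opts
    return config


print(json.dumps(with_systemd_cgroup_driver({}), indent=2))
```

Other keys in the existing configuration are left untouched, so the result can be merged back into an existing `daemon.json`.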

  24. Low-level runtimes
    • runc, crun
      • Already support cgroup v2 and the systemd driver
    • runsc (gVisor)
      • An issue requesting support has been filed
      • At present it appears to fail with an error
    https://github.com/google/gvisor/issues/…
    $ sudo podman run --runtime `which runsc` -dt -p 10184:80/tcp httpd:2.4
    Error: OCI runtime error: systemd cgroup flag passed, but systemd cgroups not supported. See gvisor.dev/issue/193

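  25. Summary
    • The various runtimes already work with v2
    • If all you need is the cgroup ID, you can configure this and obtain it right away
    • cAdvisor and others are also working on support
    • Tracing of course, but PSI becomes usable too, and there is the dream of rootless Kubernetes... Our adventure has only just begun
    https://github.com/google/cadvisor/pull/…
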
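  26. Bonus: BPF CO-RE binaries
    • Running eBPF tools inside a container is hard...
    • With a technology called BPF CO-RE, precompiled BPF binaries can be run, and they work without depending on kernel header files or the clang command
    • However, it needs a recent kernel plus newer CONFIG options...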
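
The slide does not say which CONFIG options are needed. CO-RE generally relies on kernel BTF type information, so this sketch assumes the relevant checks are `CONFIG_DEBUG_INFO_BTF=y` in the kernel config and the presence of `/sys/kernel/btf/vmlinux`. Both function names are hypothetical.

```python
import os


def btf_enabled(kernel_config_text):
    """True if the kernel config text contains CONFIG_DEBUG_INFO_BTF=y."""
    return any(
        line.strip() == "CONFIG_DEBUG_INFO_BTF=y"
        for line in kernel_config_text.splitlines()
    )


def has_vmlinux_btf(path="/sys/kernel/btf/vmlinux"):
    """True if the running kernel exposes its BTF type information at this path."""
    return os.path.exists(path)
```

On many distributions the config text can be read from `/boot/config-<release>`. Where that file is missing, the sysfs check alone is still informative.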

  27. Example runtime environment for my tool

  28. Reference: test environment
    • The environment verified today is summarized at the link below. It is based on Ubuntu 20.10
    https://gist.github.com/udzura/…