Slide 1

Slide 1 text

The xxx Relationship Between Runtimes and cgroup, with a side of bpf_get_current_cgroup_id(void)
Uchio Kondo / Container Runtime Meetup #3
* Photo by Fukuoka City

Slide 2

Slide 2 text

Senior Principal Engineer Uchio Kondo (近藤 宇智朗) / @udzura https://blog.udzura.jp/ Technology Department, Technical Infrastructure Team #Ruby #mruby #Containers #eBPF #CRIU #Seccomp #RubyKaigi #CloudNativeDays #Zumba #Shiren

Slide 3

Slide 3 text

ToC • Tracing and eBPF • Prerequisite knowledge for tracing containers • Container tracing with eBPF in practice • Container runtime support • (Bonus) BPF CO-RE

Slide 4

Slide 4 text

eBPF and Containers

Slide 5

Slide 5 text

About eBPF • https://speakerdeck.com/chikuwait/learn-ebpf

Slide 6

Slide 6 text

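What is eBPF? • One of the technologies for running a program written in user space inside the kernel • Good at filtering (tcpdump, seccomp, bpftrace) • It can access kernel information, yet safety is guaranteed to a certain degree: for example, dangerous code will not run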

Slide 7

Slide 7 text

Use in tracing tools • bpftrace • BCC • BPF Performance Tools • execsnoop, runqlat, tcplife... • http://www.brendangregg.com/bpf-performance-tools-book.html

Slide 8

Slide 8 text

I want to trace containers • Two pieces of prerequisite knowledge • Linux Namespace • cgroup (v1/v2)

Slide 9

Slide 9 text

Linux Namespace • A technology that carves out part of the namespaces inside the OS and gives it independent resources (hostname, network, PID numbering, mount points, etc.). https://container-security.dev/namespace

Slide 10

Slide 10 text

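cgroup (Control Groups) • Groups processes and limits resource usage (CPU, memory, block I/O, number of processes) per group. • Unlike rlimit, a group can span users, and the group a task belongs to can be changed flexibly • There are two versions, v1 and v2 (v2 = since 2014/8, Linux 3.16) https://container-security.dev/cgroup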

Slide 11

Slide 11 text

Implementations

Slide 12

Slide 12 text

Tracing containers with eBPF • There are several strategies • Information from Linux Namespaces or from cgroup (v2) can be used

Slide 13

Slide 13 text

Strategy (1) • Obtain namespace information from task_struct→nsproxy and filter on it (cxray) https://github.com/mrtc0/cxray/blob/master/pkg/tracer/open/open.go#L…-L…
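Strategy (1) reads the namespace from `task_struct→nsproxy` inside the kernel. As a rough user-space analogue (not cxray's actual code), the same namespace identity is visible as the inode number in the `/proc/<pid>/ns/*` symlinks. The function names below are hypothetical, and the sketch only compares link strings that the caller has already read, for example with `os.readlink("/proc/self/ns/uts")`.

```python
import re
from typing import Tuple

# A /proc/<pid>/ns/* symlink target looks like "uts:[4026531838]".
NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(link: str) -> Tuple[str, int]:
    """Split a namespace symlink target into (kind, inode number)."""
    m = NS_LINK.match(link)
    if m is None:
        raise ValueError(f"not a namespace link: {link!r}")
    return m.group(1), int(m.group(2))


def in_other_namespace(host_link: str, task_link: str) -> bool:
    """Return True when the task lives in a different namespace than the host.

    Both links must refer to the same namespace kind (e.g. both "uts").
    """
    host_kind, host_ino = parse_ns_link(host_link)
    task_kind, task_ino = parse_ns_link(task_link)
    if host_kind != task_kind:
        raise ValueError("cannot compare namespaces of different kinds")
    return host_ino != task_ino
```

A tracer would filter out events whose namespace inode matches the host's and keep the rest.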

Slide 14

Slide 14 text

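Strategy (2) • Compare the tid obtained in the BPF program with the tid on the host, and treat the task as a container if they do not match (Tracee) • Depends on task_struct https://github.com/aquasecurity/tracee/blob/main/tracee/tracee.bpf.c#L…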
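The comparison in strategy (2) can be illustrated from user space with the `NSpid:` field of `/proc/<pid>/status`, which lists a task's id in each nested PID namespace, outermost first. This is only a sketch of the idea, not Tracee's implementation. The helper names are hypothetical, and a container that shares the host PID namespace would not be detected this way.

```python
from typing import List


def parse_nspid(status_text: str) -> List[int]:
    """Extract the ids from the 'NSpid:' line of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    raise ValueError("no NSpid line found")


def looks_containerized(status_text: str) -> bool:
    """Return True when the host-side id differs from the innermost id."""
    ids = parse_nspid(status_text)
    return ids[0] != ids[-1]
```

With `NSpid:\t12345\t1`, the task is 12345 on the host but 1 inside its own PID namespace, so it is classified as a container.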

Slide 15

Slide 15 text

Strategy (3) • Compare the cgroup v2 ID with the host's • bpf-helpers(7)
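For strategy (3), user space needs the ID of the container's cgroup to compare against the value `bpf_get_current_cgroup_id()` reports in the kernel. A minimal sketch follows. It assumes that on a 64-bit kernel the cgroup v2 ID equals the inode number of the cgroup's directory in the cgroup2 filesystem. The function names and the directory argument are hypothetical.

```python
import os


def cgroup_v2_id(cgroup_dir: str) -> int:
    """Return the inode number of a cgroup directory.

    Assumption: for cgroup v2 this matches the id that the BPF helper
    bpf_get_current_cgroup_id() returns for tasks in that cgroup.
    """
    return os.stat(cgroup_dir).st_ino


def event_belongs_to(event_cgroup_id: int, cgroup_dir: str) -> bool:
    """Decide whether a traced event came from the given cgroup."""
    return event_cgroup_id == cgroup_v2_id(cgroup_dir)
```

A tracer would pass the container's cgroup directory (somewhere under the cgroup2 mount point) and drop events whose ID does not match.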

Slide 16

Slide 16 text

An implementation example that actually uses it • udzura/copenclose(8)

Slide 17

Slide 17 text

Using hostname (UTS NS)
Using CGroup v2 ID

Slide 18

Slide 18 text

cgroup v2

Slide 19

Slide 19 text

Runtime support status • Suda-san's article covers this in detail… (https://medium.com/nttlabs/cgroup-v2-596d035be4d7) • That said, I did a quick survey of the situation as of 2021

Slide 20

Slide 20 text

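Runtimes and cgroup settings • Cgroup Driver: how the cgroup assigned to a container is controlled • cgroupfs: direct file operations on cgroupfs • systemd: managed by systemd • Cgroup Version: whether v1 or v2 is used for resource limits • Determined by which filesystem is mounted at /sys/fs/cgroup • (This is the case for docker/containerd. Probably the same for podman?)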
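The version check described above (which filesystem is mounted at /sys/fs/cgroup) can be sketched as follows. Runtimes may well use `statfs(2)` and compare the filesystem magic number instead. This sketch parses mount-table text in the format of `/proc/mounts`, and the function name is hypothetical.

```python
def cgroup_version_from_mounts(mounts_text: str) -> str:
    """Classify the host as "v2", "v1" or "unknown" from /proc/mounts text.

    Only the filesystem mounted exactly at /sys/fs/cgroup is considered:
    cgroup2 there means unified (v2) mode; anything else (typically tmpfs
    holding per-controller v1 mounts) is reported as v1.
    """
    result = "unknown"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == "/sys/fs/cgroup":
            # Later entries shadow earlier ones, so keep the last match.
            result = "v2" if fields[2] == "cgroup2" else "v1"
    return result
```

Note that a host with both v1 and v2 mounted (v2 under a subdirectory such as /sys/fs/cgroup/unified) is still classified as v1 by this rule.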

Slide 21

Slide 21 text

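How do you use v2? • Switching the host to v2 mode requires changing the kernel boot parameters... • If the host Linux is booted in an environment where v1 and v2 coexist, the Version is detected as v1 • Setting Cgroup Driver=systemd makes containers belong to a v2 group as well! It looks like systemd takes care of this? • Limit values are still written through the v1 API • The group ID can then be obtained in the usual way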
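Whether a container process also belongs to a v2 group in such a coexisting setup can be checked from `/proc/<pid>/cgroup`, where the v2 membership appears as the entry with hierarchy ID 0 and an empty controller list (`0::<path>`). A minimal sketch follows. The function name is hypothetical and the scope name in the usage note is made up for illustration.

```python
from typing import Optional


def unified_cgroup_path(proc_cgroup_text: str) -> Optional[str]:
    """Return the cgroup v2 path from /proc/<pid>/cgroup text, if any.

    v1 entries look like "12:memory:/docker/abc"; the v2 entry has
    hierarchy id 0 and no controller names: "0::/some/path".
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[0] == "0" and parts[1] == "":
            return parts[2]
    return None
```

For a process whose file contains `0::/system.slice/docker-abc.scope`, the function returns `/system.slice/docker-abc.scope`. Joining that path onto the cgroup2 mount point gives the directory for the group.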

Slide 22

Slide 22 text

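Support status by container runtime • For high-level runtimes, the steps for changing the Cgroup Driver setting are shown. • For low-level runtimes, the support status is listed for reference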

Slide 23

Slide 23 text

High-level runtimes • docker: • podman: systemd by default. To set it explicitly: • containerd: Example: • FYI: detection procedure

Slide 24

Slide 24 text

Low-level runtimes • runc, crun • Already support cgroup v2 and the systemd driver • runsc (gVisor) • An issue has been opened for support • Currently it appears to fail with an error https://github.com/google/gvisor/issues/…
$ sudo podman run --runtime `which runsc` -dt -p 10184:80/tcp httpd:2.4
Error: OCI runtime error: systemd cgroup flag passed, but systemd cgroups not supported. See gvisor.dev/issue/193

Slide 25

Slide 25 text

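Summary • The various runtimes already work with v2 • If all you need is the cgroup ID, you can configure and obtain it right away • cAdvisor and others are also working on support • Tracing, of course, but PSI becomes usable too, and there is the dream of rootless Kubernetes... Our adventure has only just begun https://github.com/google/cadvisor/pull/…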

Slide 26

Slide 26 text

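Bonus: BPF CO-RE binaries • Running eBPF tools inside a container is hard... • With a technology called BPF CO-RE, a precompiled BPF binary can run without depending on kernel header files or the clang command • However, it requires a recent kernel plus new CONFIG options...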
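One quick way to see whether a host meets the kernel-side requirement is to look for the kernel's BTF type information. The sketch below assumes that the "new CONFIG" on the slide refers to `CONFIG_DEBUG_INFO_BTF`, which exposes `/sys/kernel/btf/vmlinux`. The slide does not name the option. The function name is hypothetical.

```python
import os

# Assumption: kernels built with CONFIG_DEBUG_INFO_BTF expose their BTF here.
DEFAULT_BTF_PATH = "/sys/kernel/btf/vmlinux"


def has_kernel_btf(path: str = DEFAULT_BTF_PATH) -> bool:
    """Return True when the kernel's BTF blob is available at `path`."""
    return os.path.exists(path)
```

If this returns False, a CO-RE binary will generally not be able to relocate against the running kernel without BTF supplied from elsewhere.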

Slide 27

Slide 27 text

Example runtime environment for my tool

Slide 28

Slide 28 text

Reference: test environment https://gist.github.com/udzura/… • The environment I verified with today is summarized in this gist. Based on Ubuntu 20.10