KONDO Uchio
February 21, 2021
3 state-of-the-art technologies in Linux and future of the containers #SECKUN
2021.02.21 "New Security Business Careers" Symposium (「新しいセキュリティビジネスキャリア」シンポジウム)
Transcript
The Future of Linux Containers, Considered from the State of the Art of Their Underlying Technologies
"New Security Business Careers" Symposium
KONDO Uchio (近藤宇智朗) / GMO Pepabo, Inc. 2021.02.21
KONDO Uchio (@udzura)
Senior Principal, Technical Infrastructure Team, Engineering Department, GMO Pepabo, Inc.
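KONDO Uchio: Profile
• From Mikawa. Graduated from Jishukan High School (formerly the Yoshida domain school), then earned a bachelor's degree in Japanese Language and Literature from the Faculty of Letters, University of Tokyo (2007).
• Worked as an in-house systems engineer at a media company, then on e-commerce site development and online game development, before joining his current employer in 2013 and moving to Fukuoka at the same time.
• Active in the Ruby, container, and cloud native communities. Author of 「Webで使えるmrubyシステムプログラミング入門」 (C&R研究所).
• Favorite system call (lately): socketpair(2).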
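Today's talk
• The content is aimed at people who already have a (rough) understanding of container building blocks from my lecture. For explanations of the building blocks themselves, see the references.
• Reference 1: https://container-security.dev/
• Reference 2: 『コンテナ型仮想化概論』 (…, Cutt System)
• I will introduce the container-related technologies among those recently added to the kernel.
• If time allows, we will then consider the future of containers.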
cgroup v2
cgroup: a quick review
• One of the basic technologies of the Linux kernel.
• Groups processes and limits resource usage per group: CPU, memory, IO, number of processes, ...
Source: gihyo.jp/admin/serial/…/linux_containers/…
cgroup v1 example
• You operate it by running mkdir(2), read(2), write(2), and so on against a file system called cgroupfs.
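The mkdir/read/write workflow above can be sketched in Python as follows. This is a minimal sketch, not the speaker's code: the helper names are hypothetical, and the demo runs against a scratch directory so that it works without root. On a real v1 host the root would be a controller mount such as /sys/fs/cgroup/memory (an assumption about the host layout), where the kernel creates the control files itself.

```python
import os
import tempfile
from pathlib import Path


def create_group(controller_root: Path, name: str) -> Path:
    """mkdir(2): on cgroupfs, creating a directory creates the group."""
    group = controller_root / name
    group.mkdir(parents=True, exist_ok=True)
    return group


def set_memory_limit(group: Path, limit_bytes: int) -> None:
    """write(2): memory.limit_in_bytes is the v1 memory controller's limit file."""
    (group / "memory.limit_in_bytes").write_text(f"{limit_bytes}\n")


def add_process(group: Path, pid: int) -> None:
    """write(2): writing a PID to cgroup.procs moves that process into the group."""
    (group / "cgroup.procs").write_text(f"{pid}\n")


def read_value(group: Path, filename: str) -> str:
    """read(2): control files are read back as plain text."""
    return (group / filename).read_text().strip()


# Demo against a scratch directory (hypothetical stand-in for cgroupfs).
demo_root = Path(tempfile.mkdtemp())
demo_group = create_group(demo_root, "group-a")
set_memory_limit(demo_group, 256 * 1024 * 1024)
add_process(demo_group, os.getpid())
print(read_value(demo_group, "memory.limit_in_bytes"))
```

Because everything is a file operation, any language or even a shell can drive cgroup v1 this way.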
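cgroup v2
• Developed to overcome several shortcomings of v1, chiefly the design that required a separate directory for each controlled resource (controller).
• The major difference: in v1, a directory was mounted per controller and a process could belong to a different group in each one. In v2, only a single directory that bundles all controllers is mounted, and groups are created across all controllers at once.
• For containers, this model is the more convenient one.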
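One way to see this difference from user space is /proc/<pid>/cgroup, whose lines have the form hierarchy-ID:controller-list:path. The sketch below parses that format. The function name and the sample strings are made up for illustration; on v2 the single entry has hierarchy ID 0 and an empty controller list.

```python
def parse_proc_cgroup(text: str) -> dict:
    """Return {controller: group path}; a v2 entry is stored under 'unified'."""
    membership = {}
    for line in text.strip().splitlines():
        hierarchy_id, controllers, path = line.split(":", 2)
        if hierarchy_id == "0" and controllers == "":
            # cgroup v2: one entry covers every controller.
            membership["unified"] = path
        else:
            # cgroup v1: each hierarchy lists the controllers mounted on it.
            for controller in controllers.split(","):
                membership[controller] = path
    return membership


# Illustrative samples (not captured from a real host).
V1_SAMPLE = "4:memory:/group-a\n3:cpu,cpuacct:/group-b\n2:blkio:/group-a\n"
V2_SAMPLE = "0::/group-a\n"

v1_membership = parse_proc_cgroup(V1_SAMPLE)
v2_membership = parse_proc_cgroup(V2_SAMPLE)
print(v1_membership)
print(v2_membership)
```

In the v1 sample the same process sits in group-a for memory and group-b for cpu, which is exactly the per-controller membership that v2 removes.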
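Unified hierarchy
v1: /sys/fs/cgroup holds one directory per controller (/cpu, /memory, /blkio, ...), and each controller directory has its own groups (/group-a, /group-b, /group-c).
v2: /sys/fs/cgroup holds the groups directly (/group-a, /group-b), and each group directory contains cpu.*, memory.*, io.*, ...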
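Using cgroups in container runtimes
• (OCI-style) runtimes have the following two settings:
• Cgroup Driver: how to control the cgroup assigned to a container
  • cgroupfs: direct file operations on cgroupfs
  • systemd: management through systemd
• Cgroup Version: whether v1 or v2 is used for resource limits
  • Determined by which file system is mounted at /sys/fs/cgroup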
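The version check described above can be sketched by reading the file system type of the /sys/fs/cgroup mount from /proc/mounts-style text. This is a minimal sketch with a hypothetical function name and illustrative sample lines. It assumes the common layouts where v2 mounts cgroup2 directly and v1 (or hybrid mode) mounts a tmpfs there with per-controller mounts underneath.

```python
from typing import Optional


def cgroup_version(mounts_text: str, mountpoint: str = "/sys/fs/cgroup") -> Optional[str]:
    """Guess the cgroup version from the fstype mounted at `mountpoint`."""
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fstype, options, dump, pass
        if len(fields) >= 3 and fields[1] == mountpoint:
            fstype = fields[2]  # a later mount shadows an earlier one
    if fstype == "cgroup2":
        return "v2"
    if fstype == "tmpfs":
        return "v1"  # also matches hybrid setups in this simplified check
    return None


# Illustrative samples (not captured from a real host).
V2_MOUNTS = "cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0\n"
V1_MOUNTS = (
    "tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0\n"
    "cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0\n"
)

print(cgroup_version(V2_MOUNTS), cgroup_version(V1_MOUNTS))
```

On a live system the same function could be fed the contents of /proc/mounts.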
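New features in v2
• Unified Hierarchy
• PSI (Pressure Stall Information)
• The cgroup ID can be obtained from eBPF
• nsdelegate (important for unprivileged containers)
• clone3(2) can create a process directly inside a specific cgroup
• and more...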
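e.g. PSI (Pressure Stall Information)
• A load metric available system-wide or per cgroup
• Measures the share of each unit of time spent stalled on CPU, memory, or IO
• e.g. if some process in the group was delayed because of CPU for 45 seconds out of 1 minute: cpu some: 75.00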
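The 75.00 figure is simply 45 / 60 expressed as a percentage. The sketch below reproduces that arithmetic and parses the text format used by PSI files such as /proc/pressure/cpu or a cgroup's cpu.pressure. The function names and the sample line are illustrative, and the real avg60 value is a running average rather than an exact windowed ratio.

```python
def parse_pressure(text: str) -> dict:
    """Parse PSI lines like 'some avg10=0.00 avg60=0.00 avg300=0.00 total=0'."""
    result = {}
    for line in text.strip().splitlines():
        kind, *pairs = line.split()
        fields = dict(pair.split("=", 1) for pair in pairs)
        result[kind] = {
            # 'total' is stall time in microseconds; the averages are percentages.
            key: (int(value) if key == "total" else float(value))
            for key, value in fields.items()
        }
    return result


def stall_percent(stalled_seconds: float, window_seconds: float) -> float:
    """Share of the window during which at least one task was stalled."""
    return 100.0 * stalled_seconds / window_seconds


# Illustrative sample line (not captured from a real host).
SAMPLE = "some avg10=0.00 avg60=75.00 avg300=12.50 total=45000000\n"
pressure = parse_pressure(SAMPLE)
print(pressure["some"]["avg60"], stall_percent(45, 60))
```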
e.g. Tracking information in eBPF
• The bpf_get_current_cgroup_id(void) helper
• Returns the ID of the cgroup (v2) that the task which triggered the eBPF event belongs to.
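On the user-space side, the IDs reported by the helper have to be mapped back to cgroup directories. The sketch below does this under an assumption that is not stated in the talk: that the ID equals the inode number of the cgroup directory, which should hold on 64-bit hosts with reasonably recent kernels. The function names are hypothetical and the demo uses a scratch directory instead of /sys/fs/cgroup.

```python
import os
import tempfile


def cgroup_ids(cgroup_dirs):
    """Map cgroup ID -> directory path, ASSUMING the ID is the directory's inode."""
    return {os.stat(path).st_ino: path for path in cgroup_dirs}


def owner_of(event_cgroup_id, known):
    """Resolve an ID reported by a BPF program to a known cgroup directory."""
    return known.get(event_cgroup_id, "(host or unknown cgroup)")


# Demo with a scratch directory standing in for a container's cgroup.
scratch_root = tempfile.mkdtemp()
container_dir = os.path.join(scratch_root, "container-1")
os.mkdir(container_dir)

known = cgroup_ids([container_dir])
container_id = os.stat(container_dir).st_ino
print(owner_of(container_id, known))
```

A tracer could build such a table once per container and then filter events by ID without walking task_struct at all.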
How cgroup-v2 and PSI Impacts Cloud Native?
Uchio Kondo / GMO Pepabo, Inc. 2019.07.23 CloudNative Days Tokyo 2019
Image from pixabay: https://pixabay.com/images/id-3193865/
eBPF per containers
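What is eBPF?
• A technology for running programs written in user space inside the kernel
• Introduced to seccomp in 2012 and applied to SDN in Linux in 2013, and it has matured since
• Good at filtering (tcpdump, seccomp, bpftrace)
• It can access information inside the kernel, while safety is guaranteed to a degree: for example, dangerous code will not run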
eBPF application examples
eBPF tracing strategies for containers
• There are several strategies
• Information from Linux namespaces or from cgroup (v2) can be used
Example 1: Walk the task_struct information
• Obtain namespace information via task_struct→nsproxy and filter on it (cxray)
https://github.com/mrtc0/cxray/blob/master/pkg/tracer/open/open.go#L…-L…
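A user-space analogue of this namespace check is to compare the inode numbers behind /proc/<pid>/ns/* links, which read like pid:[4026531836]. The sketch below is not how cxray itself is implemented: the function names and the sample link values are illustrative only.

```python
import re


def ns_inode(link_target: str) -> int:
    """Extract the namespace inode from a link target such as 'pid:[4026531836]'."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", link_target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link_target!r}")
    return int(match.group(2))


def in_other_namespace(task_link: str, host_link: str) -> bool:
    """True when the task's namespace differs from the host's."""
    return ns_inode(task_link) != ns_inode(host_link)


# Illustrative values; on a live host they would come from
# os.readlink("/proc/<pid>/ns/pid") and os.readlink("/proc/1/ns/pid").
HOST_LINK = "pid:[4026531836]"
CONTAINER_LINK = "pid:[4026532201]"
print(in_other_namespace(CONTAINER_LINK, HOST_LINK))
```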
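Example 2: Compare the PID inside the namespace and on the host
• Compare the tid obtained inside the BPF program with the tid on the host; if they do not match, treat the task as running in a container (Tracee)
• Depends on task_struct
https://github.com/aquasecurity/tracee/blob/main/tracee/tracee.bpf.c#L…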
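The same comparison can be tried from user space with the NSpid field of /proc/<pid>/status, which lists the thread ID as seen from each nested PID namespace, outermost first. The sketch below is an analogue of the heuristic, not Tracee's code. The function names and sample texts are illustrative, and like the original heuristic it can miss a task whose IDs happen to coincide.

```python
def namespace_tids(status_text: str) -> list:
    """Return the NSpid values (outermost namespace first) from a status file."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(value) for value in line.split()[1:]]
    return []


def looks_containerized(status_text: str) -> bool:
    """Heuristic: the tid seen by the host differs from the tid seen by the task."""
    tids = namespace_tids(status_text)
    return len(tids) > 0 and tids[0] != tids[-1]


# Illustrative samples (not captured from a real host).
IN_CONTAINER = "Name:\tnginx\nNSpid:\t12345\t1\n"
ON_HOST = "Name:\tbash\nNSpid:\t4321\n"
print(looks_containerized(IN_CONTAINER), looks_containerized(ON_HOST))
```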
Example 3: Use the cgroup helper
https://github.com/udzura/copenclose/blob/master/src/bpf/copenclose.bpf.c
Implementation example
• copenclose(8)
• Kondo's PoC (BPF + Rust)
• A flag switches between the task_struct approach and the cgroup v2 ID approach
ランタイムとcgroupのxxxな関係, bpf_get_current_cgroup_id(void) を添えて (The xxx relationship between runtimes and cgroups, served with bpf_get_current_cgroup_id(void))
Uchio Kondo / Container Runtime Meetup #3
* Photo by Fukuoka City
seccomp
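seccomp: a quick review
• Filters the system calls that a program makes
• The rule can also vary depending on conditions on the system call's arguments
• Can implement a blacklist (denylist), a whitelist (allowlist), and so on
• Fine-grained flags exist: for example, only write the system call to the audit log, or make it return an arbitrary errno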
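The rule semantics listed above can be illustrated with a toy model in Python. This is not the seccomp API or libseccomp: the Rule class, the action strings, and the first-match-wins evaluation are all simplifications made up for this sketch, which only shows how an allowlist with argument conditions and per-rule actions behaves.

```python
import errno
import socket
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

ALLOW, LOG = "ALLOW", "LOG"


def errno_action(code: int) -> str:
    """Action meaning 'fail the system call with this errno'."""
    return f"ERRNO({code})"


@dataclass(frozen=True)
class Rule:
    syscall: str
    action: str
    # Optional condition on the system call arguments.
    arg_check: Optional[Callable[[Sequence[int]], bool]] = None


def evaluate(rules, default_action, syscall, args=()):
    """Toy model: the first matching rule wins, otherwise the default applies."""
    for rule in rules:
        if rule.syscall != syscall:
            continue
        if rule.arg_check is None or rule.arg_check(args):
            return rule.action
    return default_action


AF_UNIX = int(socket.AF_UNIX)
AF_INET = int(socket.AF_INET)

# Allowlist: anything not listed fails with EPERM.
DEFAULT = errno_action(errno.EPERM)
ALLOWLIST = [
    Rule("read", ALLOW),
    Rule("write", ALLOW),
    Rule("openat", LOG),  # audit-log only
    Rule("socket", ALLOW, arg_check=lambda a: a[0] == AF_UNIX),
]

print(evaluate(ALLOWLIST, DEFAULT, "socket", (AF_INET, 1, 0)))
```

Swapping the default to ALLOW and listing only the forbidden calls turns the same table into a denylist.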
Using seccomp (mruby)
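User space notification
• A technology that detects a system call through seccomp and delegates the decision about its behavior to a userland program
• Introduced in Linux 5.0 (2019/3)
• The system call blocks until the decision is made
• e.g. controlling device access in LXC
Source: gihyo.jp/admin/serial/…/linux_containers/…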
User space notification
Source: 「LXCで学ぶコンテナ入門」 Part 47, on the seccomp notify feature that broadens what unprivileged containers can do (gihyo.jp/admin/serial/…/linux_containers/…)
Implementation example (using mruby)
• Prepare an acceptor.rb like the following
Implementation example (using mruby)
• Start a program via the following invoker and have it call listen(2)
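Detecting listen(2) at startup
• Allow or deny can be controlled from the console on the acceptor.rb side.
• If denied, startup fails right there and the invoker process exits
• If allowed, startup continues and the program listens as if nothing had happened.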
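The allow/deny flow above can be modelled with two threads and a queue. This is only a toy model of the control flow: it does not use seccomp, and the Supervisor class and guarded_listen function are hypothetical stand-ins for acceptor.rb and the invoked program. What it shows is that the caller blocks until the supervisor answers, then either proceeds or fails.

```python
import queue
import threading


class Supervisor:
    """Stands in for acceptor.rb: receives notifications and answers each one."""

    def __init__(self, decide):
        self._decide = decide  # callable: syscall name -> True (allow) / False (deny)
        self._requests = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            syscall, reply = self._requests.get()
            reply.put(self._decide(syscall))

    def notify_and_wait(self, syscall: str) -> bool:
        """Send a notification and block until the supervisor has decided."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((syscall, reply))
        return reply.get(timeout=5)


def guarded_listen(supervisor: Supervisor, port: int) -> str:
    """Stands in for the invoked program reaching listen(2)."""
    if not supervisor.notify_and_wait("listen"):
        raise PermissionError("listen(2) denied by supervisor")
    return f"listening on port {port}"


allowing = Supervisor(lambda syscall: True)
denying = Supervisor(lambda syscall: False)
print(guarded_listen(allowing, 8080))
```

With the denying supervisor the same call raises PermissionError, which corresponds to the invoker failing at startup.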
Detecting listen(2) at startup
Output when denied / Output when allowed
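Applications?
• I ran an experiment that stops a process at "an arbitrary library function call" and creates a process dump with CRIU (*).
• LD_PRELOAD + wrapper function + a "do nothing" syscall + seccomp
https://udzura.hatenablog.jp/entry/…
(*) CRIU: Checkpoint and Restore In Userspace, a technology that saves the state of a process and resumes it from that point. https://criu.org/Main_Page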
Discussion
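New technologies let us do more, but...
• The moment a new technology "appears" and the moment it "spreads" do not coincide
• For example, cgroup v2 first appeared in 2013.
• Runtime support only progressed around 2019 to 2020
• Surrounding tools will probably appear much later still
• There is great value in examining a technology on its own while it is still emerging, to find out what problems (including security problems) and what possibilities it has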
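eBPF is becoming a basic technology of Linux
• Its range of application keeps widening
• Besides tracing, bandwidth control, and networking: cgroup (v2) device decisions, LSM BPF programs, and more...
• In the security context too, it is likely to become indispensable for tracing, auditing, and anomaly detection
• Even trying out a single prog type should be enough to get a feel for it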