
PhD Thesis Public Hearing: Scaling Telemetry Workloads in Cloud Applications: Techniques for Instrumentation, Storage, and Mining / PhD Defence

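Doctoral Dissertation Public Hearing (Final Examination)
Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University
Yuuki Tsubouchi (坪内 佑樹)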

Yuuki Tsubouchi (yuuk1)

February 25, 2025


Transcript

  1. Scaling Telemetry Workloads in Cloud Applications: Techniques for Instrumentation, Storage,

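    and Mining. Graduate School of Informatics, Kyoto University, Department of Intelligence Science and Technology. February 18, 2025. Doctoral dissertation public hearing. Yuuki Tsubouchi (坪内 佑樹)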
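  2. 5 Diagram labels: application system; engineers; users; Internet; system for operations monitoring; improving reliability so that users can use the service comfortably; data collection (measurement, storage, analysis); computing resources: increased load; increased operational burden.
    Contributions 1, 2, and 3: technical proposals that address the growing load of data collection for operations.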
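  3. 7 Outline: 1. Introduction 2. An instrumentation method inside the OS kernel (Contribution 1) 3. A storage architecture design method (Contribution 2) 4. A preprocessing method for automated fault localization (Contribution 3) 5. Conclusion
    Main revisions since the preliminary review: (P14-15) added the definition and use cases of telemetry; (P43) added a discussion of overhead caused by concurrency control in the Linux kernel; (P26) added the contribution of this research to industry; (P93) added a conclusion that runs through the three individual contributions; (P89) added cross-disciplinary applicability as a time-series analysis method.
    Corresponding thesis pages as listed on the slide: p. 1-2, 14; p. 91-92; p. 4; p. 31; p. 86-87.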
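  4. 12 Basic architecture of cloud applications (Background)
    Fig. 2.1: an example of a request-processing path, with steps (1) to (4). Communication follows a request-response pattern. Intermediate components terminate transport connections and relay them.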
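  5. 13 Reliability of cloud applications (Background)
    High reliability is required so that users can use services comfortably, e.g. 24/7/365 availability and low-latency responses.
    Impact of failures: of 1,819 system failures, 47% took two hours or more to resolve. Failure triggers: change-induced failures account for 49.5% of all failures, e.g. changes to application code, configuration files, or the underlying platform. [58] [13]
    Approaches that assume failures will occur and ask how to reduce their impact have become widespread. Fault tolerance that includes the operators' response is important. [14]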
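  6. 15 System monitoring with telemetry (item raised at the preliminary review)
    General definition: the process of recording instrument readings and transmitting them. Diagram labels: instrument, transmission, remote location, analysis.
    Definition in this research: automatically collecting data on performance and usage from systems, applications, and services, and transmitting it to a remote location for monitoring and analysis.
    Information obtained by visually inspecting physical equipment is limited, so telemetry monitors the logical state of hardware, software, and network communication. Telemetry [62,63,64]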
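  7. 16 Main types of telemetry data (Background)
    Time-oriented: numeric values (metrics) and strings (logs). Path-oriented: traces.
    Metrics: values that quantitatively measure system performance at a point in time, sampled at fixed intervals. Examples: CPU utilization, request response time.
    Logs: unstructured string records of events that occur in the system. Examples: error messages, user activity, system operations.
    Traces: structured data that represents the flow of a series of processing steps and communications through the system. For traces of network communication in particular: upper layer, request granularity; lower layer, flow granularity.
    Metrics are the subject of Contributions 2 and 3; traces are the subject of Contribution 1.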
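  8. 17 Main types of telemetry data: metrics (Background)
    Time-oriented: numeric values (metrics) and strings (logs). Topology-oriented Data: traces.
    Metrics: values that quantitatively measure system performance at a point in time, sampled at fixed intervals. Examples: CPU utilization, request response time. A metric such as cpu_seconds{instance=host1,…} is represented as an array of (timestamp, value) pairs, e.g. [(1709298600, 29851.26), …].
    Logs: unstructured string records of events that occur in the system. Examples: error messages, user activity, system operations.
    Traces: a set of structured data fragments that represents the flow of a series of processing steps and communications through the system. Granularity: request (application layer); flow or packet (infrastructure layer).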
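  9. 23 Growth of telemetry workloads (Background)
    Causes: growth in application workload and in the number of components; finer-grained telemetry data for a more precise understanding of the system.
    Instrumentation (more measurement and transmission processing): higher resource consumption for transferring and aggregating measurements; higher application processing latency.
    Storage (more data ingested): higher resource consumption for writes; larger disk footprint; higher resource consumption and latency for reads.
    Mining (more learning processing): lower accuracy of model output; longer execution time and higher resource consumption for learning.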
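  10. 24 Operational complexity introduced by telemetry systems (Background)
    Service providers must operate the telemetry system in addition to the application. Keeping operational complexity low is important for practical adoption.
    Instrumentation: ease of deployment is hurt by manual instrumentation work; maintainability is hurt by having to follow code changes in the instrumented application.
    Storage: ease of deployment is hurt by the burden of building, configuring, tuning, and backing up DB systems; maintainability is hurt by scale-out work, version upgrades, and re-tuning.
    Mining: ease of deployment is hurt by manual dataset labeling and model parameter tuning; maintainability is hurt by having to respond to accuracy loss when data distributions change (retraining, re-tuning, etc.).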
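  11. 26 Research purpose (item raised at the preliminary review)
    Users (general consumers, corporate staff, etc.): can use services comfortably thanks to improved reliability.
    Operators: can grasp the state of the system precisely through telemetry.
    Online service providers and cloud service providers: more efficient use of the computing resources consumed by telemetry workloads, achieved together with more efficient use of human resources through low operational complexity.
    Diagram labels: cloud, application, operator.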
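  12. 27 Research goal (Research purpose)
    Propose techniques that scale each layer efficiently as telemetry workloads grow, under the condition that the increase in operational complexity is kept in check.
    Diagram labels: telemetry system (instrumentation, storage, mining); operator; workload; resource consumption; processing latency.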
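  13. 28 Overview of this research (Research purpose). Telemetry system layers, with the operator as consumer:
    Instrumentation (Chapter 3, path-oriented): higher network connection rates also raise the measurement load. Proposal: instrumentation based on efficient aggregation inside the OS kernel.
    Storage (Chapter 4, time-oriented): a growing number of metrics raises the ingestion load. Proposal: tiering of in-memory and on-disk databases, with migration between tiers.
    Mining (Chapter 5, time-oriented): a growing number of metrics increases execution time and lowers accuracy. Proposal: preprocessing that automatically removes metrics unrelated to the failure.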
  14. 29 Overview of this research (Research purpose): the same figure as slide 13, with the problem side labeled "growth of telemetry workloads" (higher network connection rates, higher ingestion load, more metrics).
  15. 30 Overview of this research (Research purpose): the same figure as slide 13, with the solution side labeled "proposed scaling techniques" (in-kernel aggregation, database tiering and migration, metric-reduction preprocessing).
  16. 31 Instrumentation layer (Chapter 3, path-oriented) (Research purpose)
    Problem: higher network connection rates also raise the measurement load. Proposal: instrumentation based on efficient aggregation inside the OS kernel.
    Operational complexity: no modification of application code is required for instrumentation.
    Y. Tsubouchi, M. Furukawa, R. Matsumoto, Low Overhead TCP/UDP Socket-based Tracing for Discovering Network Services Dependencies, Journal of Information Processing (JIP), Vol.30, pp.260-268, Mar 2022.
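  17. 32 Storage layer (Chapter 4, time-oriented) (Research purpose)
    Problem: a growing number of metrics raises the ingestion load. Proposal: tiering of in-memory and on-disk databases, with migration between tiers.
    Operational complexity: solved within the scope of general-purpose DB systems, for which knowledge and implementations are highly reusable.
    Y. Tsubouchi, A. Wakisaka, K. Hamada, M. Matsuki, T. Kobayashi, H. Abe, R. Matsumoto, HeteroTSDB: A high-performance time-series database with automatic tiering across heterogeneous distributed KVSs (in Japanese), IPSJ Journal, Vol.62, No.3, pp.818-828, March 2021.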
  18. 33 Mining layer (Chapter 5, time-oriented) (Research purpose)
    Problem: a growing number of metrics increases execution time and lowers accuracy. Proposal: preprocessing that automatically removes metrics unrelated to the failure.
    Operational complexity: solved within an unsupervised learning framework that needs neither labeling nor model training. The design is robust to parameter changes, which reduces the tuning burden.
    Y. Tsubouchi and H. Tsuruta, MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications, IEEE Access, Vol. 12, pp. 37398-37417, March 2024.
  19. 35 Overview figure (as on slide 13), introducing Chapter 3: instrumentation based on efficient aggregation inside the OS kernel, motivated by higher network connection rates that also raise the measurement load.
    Y. Tsubouchi, M. Furukawa, R. Matsumoto, Low Overhead TCP/UDP Socket-based Tracing for Discovering Network Services Dependencies, Journal of Information Processing (JIP), Vol.30, pp.260-268, Mar 2022.
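  20. 36 Network call graph (Background)
    Operators want to know the call relationships among components (cloud load balancers, web app servers, message queues, database clusters): to learn the impact range of a change, and to learn per-link metrics (L7: request count, error count, response time, …; L4: bytes sent and received per second, RTT, …).
    Drawing these graphs used to be manual work, but it is increasingly automated from path-oriented data.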
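  21. 37 Instrumentation approaches for path-oriented data (Existing methods)
    A measurement point is placed somewhere on the network communication path (App, Proxy, kernel network stack, NIC, Switch).
    Application-intrusive: instrument the application code. Advantage: application context can be injected. Drawback: adding the code takes considerable effort.
    Application-non-intrusive: instrument somewhere other than the application. Its advantages and drawbacks are the reverse of the application-intrusive approach.
    This work focuses on instrumentation in the upper layer of the kernel (the socket). Compared with a proxy, there is no relaying overhead. Compared with a switch, the measurement load can be spread across end hosts.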
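  22. Instrumentation methods at the socket layer (Position of this research)
    Note: a flow is a unit of communication whose pair of endpoint addresses and ports is identical. Arrows in the figure show the flow of data from the kernel-side measurement point to the user-space agent.
    Streaming method: ✗ the number of measurement transfers to user space grows with the number of messages.
    Flow aggregation method: ✔ stores only measurements aggregated per flow, which reduces the amount of transferred data. ✗ When short-lived flows increase, the amount of transferred data also increases.
    Flow bundling method (proposed): bundles flows that share the same destination (Flow1 to Flow4 become Bundle 1 and Bundle 2). ✔ Reduces the amount of transferred data even when there are many short-lived flows.
    ✔ Low measurement overhead. ([96,97]) ([27,98])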
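  23. 39 Overview of Contribution 1 (Contribution)
    1. Proposes an in-kernel flow bundling method that reduces measurement overhead even in environments with many short-lived flows.
    2. Verified that measurement overhead (CPU load) stays sufficiently small even as the number of flows grows.
    An environment unfavorable to existing methods: between web app servers and a DB server, PHP applications are sometimes advised against persistent connections to the DB in order to prevent resource abuse [101]. Flows are therefore not kept alive, and short-lived flows multiply.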
  24. 40 The concept of flow bundling (Proposed method)
    A client service opens Flow 1, Flow 2, …, Flow N from ephemeral ports (e.g. 53421, 32346, 48901) to a server service on listen port 80. These flows are regarded as a single bundled flow to port 80.
  25. 41 Bundling different flows inside the kernel (Proposed method: flow bundling)
    Diagram labels: Kernel, User, Service, Agent, NIC, measurement point, Bundle 1, Bundle 2.
    Flow 1: "src_ip": "192.168.1.101", "src_port": 53421, "dst_ip": "192.168.1.200", "dst_port": 80, "recv_bytes": 2000, "sent_bytes": 500
    Flow 2: "src_ip": "192.168.1.101", "src_port": 61390, "dst_ip": "192.168.1.200", "dst_port": 80, "recv_bytes": 1000, "sent_bytes": 100
    Bundle 1: "src_ip": "192.168.1.101", "dst_ip": "192.168.1.200", "dst_port": 80, "recv_bytes": 3000, "sent_bytes": 600
    The ephemeral port is removed and the flows are merged. Numeric data is combined statistically (the example takes the sum).
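
The merge rule on this slide (drop the ephemeral source port, then sum the numeric fields) can be sketched in user-space Python. This only illustrates the rule and is not the in-kernel eBPF implementation. The function name `bundle_flows`, the extra `flows` counter, and the uniform key name `sent_bytes` are assumptions made for the example.

```python
from collections import defaultdict


def bundle_flows(flows):
    """Merge flows that differ only in the client's ephemeral source port."""
    bundles = defaultdict(lambda: {"recv_bytes": 0, "sent_bytes": 0, "flows": 0})
    for flow in flows:
        # The bundle key deliberately omits src_port (the ephemeral port).
        key = (flow["src_ip"], flow["dst_ip"], flow["dst_port"])
        bundle = bundles[key]
        bundle["recv_bytes"] += flow["recv_bytes"]  # statistic used here: sum
        bundle["sent_bytes"] += flow["sent_bytes"]
        bundle["flows"] += 1
    return dict(bundles)


# The two flows shown on the slide.
flows = [
    {"src_ip": "192.168.1.101", "src_port": 53421, "dst_ip": "192.168.1.200",
     "dst_port": 80, "recv_bytes": 2000, "sent_bytes": 500},
    {"src_ip": "192.168.1.101", "src_port": 61390, "dst_ip": "192.168.1.200",
     "dst_port": 80, "recv_bytes": 1000, "sent_bytes": 100},
]
bundles = bundle_flows(flows)
print(bundles)
```

Both flows collapse into one entry keyed by (source address, destination address, listen port), with 3000 received bytes and 600 sent bytes, matching Bundle 1 on the slide.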
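  26. 42 Implementation: overview (Proposed method)
    The kernel is extended with Linux's extended Berkeley Packet Filter (eBPF). Measurement programs 1 to 4 are attached with Linux kprobes to kernel functions in the socket layer: tcp_v4_connect(), inet_csk_accept(), tcp_sendmsg(), tcp_cleanup_rbuf() (UDP omitted).
    Each measurement program updates a hash map. Keys: {src_addr, dst_addr, listen_port, proto, pid}. Values: {counts, recv_bytes, send_bytes, …}.
    The user-space agent periodically fetches and deletes multiple items with batch operations.
    Diagram labels: Kernel, User, Service, System Call, Socket Layer, NIC, Agent.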
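  27. 43 Implementation: concurrency control inside the kernel (Proposed method; item raised at the preliminary review)
    To protect each memory region, synchronization mechanisms with low overhead are used.
    eBPF-managed region (hash map): updating a value inside an entry uses atomic instructions (fetch and add); inserting a map entry takes a fine-grained (per-bucket) spinlock; the agent's access takes a coarse-grained (whole-map) spinlock.
    Kernel-managed region (kernel functions): the measurement programs only read socket structures and the like, and do not lock.
    Note: experiments confirmed that the overhead is sufficiently small, but it may become non-negligible in environments with many CPU cores.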
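  28. 44 Evaluation setup (Evaluation)
    Benchmark: an echo client and server generate TCP or UDP traffic load, with an agent on each host. One trial lasts 30 seconds; the agent fetches batches every 1 second.
    Baselines: existing instrumentation methods that target the kernel socket layer, namely the streaming method and the in-kernel aggregation method.
    Evaluation items: 1. CPU load as the number of short-lived flows grows. 2. CPU load in a 1-to-N communication environment. 3. RTT overhead on the application.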
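  29. 46 2. CPU load as the number of communication destinations grows (Evaluation)
    As destinations with different listen ports increase, the bundling rate falls, so the CPU load of the proposed method should rise. Does it?
    Bundling rate: R = 1 − B/T, where B is the number of bundled flows (determined by the number of destinations) and T is the total number of flows (fixed at T = 10k). Plotted settings: R = 0.90, R = 0.94, R = 0.98.
    Result: as the number of services (destinations) increased, CPU utilization stayed at or below 2%.
    If the number of destinations is raised until R = 0, the advantage over existing methods disappears. (Label on the slide: T = 100k.)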
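
As a worked example of the bundling-rate formula R = 1 − B/T, the sketch below inverts the three R values plotted on the slide for T = 10k. The B values (1000, 600, 200) are derived here from the formula and are not stated on the slide; the function name `bundling_rate` is also an assumption.

```python
def bundling_rate(bundled_flows: int, total_flows: int) -> float:
    """R = 1 - B/T: the share of flows that bundling removes from the transfer."""
    if total_flows <= 0:
        raise ValueError("total_flows must be positive")
    return 1 - bundled_flows / total_flows


T = 10_000  # total number of flows, fixed as on the slide
for B in (1000, 600, 200):  # B values derived from R = 0.90, 0.94, 0.98
    print(f"B={B:5d}  R={bundling_rate(B, T):.2f}")

# When every flow ends up in its own bundle (B == T), nothing is saved.
print(bundling_rate(T, T))
```

The last line shows the degenerate case noted on the slide: once B reaches T, R is 0 and bundling transfers as many records as per-flow aggregation would.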
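  30. 47 3. Latency overhead added by measurement processing (Evaluation; TCP short-lived connections and UDP)
    Against an RTT of 300 μs, the overhead of the proposed method is at most 5.8 μs. Compared with no instrumentation, the overhead increase stays at 2% at most.
    The streaming method showed the smallest RTT.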
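  31. 48 Part 2: Summary of Contribution 1 (Summary) (Chapter 3, path-oriented)
    Problem: higher network connection rates also raise the measurement load. Proposal: instrumentation based on efficient aggregation inside the OS kernel.
    Evaluation: as the number of short-lived flows grew, the proposed method kept CPU utilization at or below 2.2%. RTT overhead rose by at most 2% relative to the uninstrumented state.
    Use: continuously and automatically building the network call graph.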
  32. 50 Overview figure (as on slide 13), introducing Chapter 4 (Research purpose): tiering of in-memory and on-disk databases with migration between tiers, motivated by the ingestion load that grows with the number of metrics.
    Y. Tsubouchi, A. Wakisaka, K. Hamada, M. Matsuki, T. Kobayashi, H. Abe, R. Matsumoto, HeteroTSDB: A high-performance time-series database with automatic tiering across heterogeneous distributed KVSs (in Japanese), IPSJ Journal, Vol.62, No.3, pp.818-828, March 2021.
  33. 52 Workload of metric storage (Background)
    The ingestion workload for metrics is proportional to two dimensions.
    (1) Resolution along the time axis (generally in the range of 1 to 60 seconds).
    (2) Number of metrics, e.g. cpu_seconds{instance=host1,…}, memory_total_bytes{instance=host1,…}, http_requests_count{instance=host1,…}, …, http_requests_count{instance=host99,…}.
    The number of metrics also grows when a metric is broken down more finely: cpu_seconds{instance=host1,…} decomposes into cpu_seconds{instance=host1,mode=user,core_no=1,…}, cpu_seconds{instance=host1,mode=system,core_no=1,…}, cpu_seconds{instance=host1,mode=user,core_no=2,…}, and so on.
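  34. 53 Scalability requirements for metric storage (Background)
    Ingestion throughput: Slack 12M datapoints / sec; Meta 700M datapoints / min; LYCorp 12.5M datapoints / min. [19] [32] [112] Common solutions: ingestion across multiple horizontally partitioned nodes; efficient writes into in-memory data structures.
    Storage capacity: Slack 12 TB / day; ByteDance 10 TB / day; LYCorp 2.7 TB / day; Mackerel 460 days. [19] [35] [69] [108] Common solutions: data compression techniques and long-term retention on media with low storage cost (SSD/HDD).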
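  35. 54 Classification of existing methods (Related work)
    Time-series DB management system approach (TSDBMS): a DBMS optimized for time-series data processing (Prometheus, Gorilla, InfluxDB, etc.) [31,33,35,79]. Compression: encoding that exploits the even spacing of timestamps and the temporal proximity of values. Structure: optimized for time series on top of the LSM tree used by disk-based KVSs, with files managed per fixed time window.
    Time-series data-oriented application approach (TSDA): built on top of a KVS, which is a general-purpose DB system (OpenTSDB, KairosDB) [29,30]. KVS: a DBMS that can store, retrieve, and manage data as a set of key-value pairs.
    Diagram labels: Client, App, DBMS, Transaction.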
  36. 55 Classification of existing methods (Related work): the same TSDBMS and TSDA classification as slide 35, with the choice made in this work.
    KVSs are widely used, and KVS services are widely offered as "DB as a Service" to automate DB operations.
    Because the TSDA approach is loosely coupled, it can offer users a choice of DBMS implementations.
    With operational complexity in mind, this work focuses on the TSDA approach.
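  37. 56 Ingestion efficiency of KVSs (Related work)
    More metrics means more keys in the KVS, so the efficiency of index lookups when appending data becomes the issue.
    Memory-based KVS: because memory excels at random access, a hash table is used; writes cost O(k). No data is kept on disk (except the commit log).
    Disk-based KVS: writes go to a sorted in-memory structure such as a balanced tree or skip list, at O(log n), and are then flushed to files on disk. Because the data is sorted, disk access is efficient.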
  38. 57 Ingestion efficiency of KVSs (Related work): the same comparison as slide 37, with the drawback of each type.
    Memory-based KVS (hash table, O(k) writes, no data on disk except the commit log): ✘ memory costs more per unit of capacity, so it is unsuitable for long-term retention.
    Disk-based KVS (sorted structure such as a balanced tree or skip list, O(log n) writes, flushed to disk): ✘ write efficiency drops when the number of keys is large.
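  39. 58 Summary of Contribution 2: combining ingestion efficiency with long-term retention (Contribution)
    Comparison of TSDBMS, disk-based TSDA, memory-based TSDA, and the proposed method:
    Operational complexity: TSDBMS is not loosely coupled; the TSDA approaches, including the proposed method, are loosely coupled.
    Ingestion efficiency: TSDBMS is optimized for storing time-series data; disk-based TSDA uses structures designed around disk access efficiency; memory-based TSDA and the proposed method are optimized for memory, which excels at random access.
    Storage capacity: TSDBMS uses time-series compression and similar techniques; disk-based TSDA stores on SSD/HDD; memory-based TSDA stores in memory; the proposed method stores only old data on SSD/HDD.
    Contributions: designed an architecture within the low-complexity TSDA approach that takes the strengths of both memory-based and disk-based KVSs; achieved 3.98 times the ingestion performance of the disk-based approach.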
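  40. 59 Proposed method: HeteroTSDB (Proposed method)
    Data path: Client, App, memory-based KVS, Flusher, disk-based KVS.
    Memory-based KVS: a memory buffer that holds data with recent timestamps; fast ingestion based on a hash table.
    Disk-based KVS: disk storage that holds data with old timestamps; storing on SSD/HDD lowers the cost of long-term retention.
    Data migration between the two tiers achieves both properties at once.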
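  41. 60 Tiering a memory-based KVS and a disk-based KVS (Proposed method)
    Memory-based KVS (hash table, O(k)): data points arrive at M ingestions/s; each arrival performs a lookup and an insert for its series (cpu_seconds{…}, memory_total_bytes{…}, http_requests_count{…}), accumulating d points per series.
    Disk-based KVS (balanced tree or skip list, O(log n)): the d data points of a series are written as one batch, which reduces the number of lookups to M / d ingestions/s.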
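  42. 61 Timer-based migration (Proposed method)
    Each key in the memory-based KVS (cpu_seconds{…}, memory_total_bytes{…}, http_requests_count{…}) carries its own TTL, e.g. 3511, 934, and 298 seconds remaining.
    A TTL (Time To Live) is set per key (e.g. 3600 seconds), and the key's data is moved to the disk-based KVS when the TTL reaches 0.
    Jitter is added when the TTL is set, which spreads out the timing of the moves.
    Moving data in one scheduled batch job would concentrate the ingestion load on the disk-based KVS.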
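
The tiering and timer-based migration described on slides 40 to 42 can be sketched as below. This is a minimal single-process illustration, not the HeteroTSDB implementation: plain dictionaries stand in for the memory-based and disk-based KVSs, the caller supplies the clock, and the class name `TieredStore`, its methods, and the 300-second default jitter are assumptions. Only the 3600-second TTL example comes from the slide.

```python
import random


class TieredStore:
    """Memory tier with per-key TTL plus jitter; expired keys move to a disk tier."""

    def __init__(self, ttl=3600.0, jitter=300.0, rng=None):
        self.ttl = ttl
        self.jitter = jitter              # assumed upper bound of the random offset
        self.rng = rng or random.Random()
        self.memory = {}                  # key -> list of (timestamp, value)
        self.deadline = {}                # key -> time at which the key's TTL hits 0
        self.disk = {}                    # key -> list of migrated batches

    def ingest(self, key, timestamp, value, now):
        if key not in self.memory:
            self.memory[key] = []
            # Jitter spreads expirations so keys do not all migrate at once.
            self.deadline[key] = now + self.ttl + self.rng.uniform(0, self.jitter)
        self.memory[key].append((timestamp, value))  # hash lookup + append

    def migrate(self, now):
        """Move every expired key to the disk tier as one batch per key."""
        expired = [k for k, t in self.deadline.items() if t <= now]
        for key in expired:
            batch = self.memory.pop(key)
            del self.deadline[key]
            self.disk.setdefault(key, []).append(batch)  # one write for d points
        return len(expired)


store = TieredStore(ttl=10.0, jitter=0.0)
for t in (0, 1, 2):
    store.ingest("cpu_seconds{instance=host1}", t, 29851.26 + t, now=t)
moved_early = store.migrate(now=5)    # TTL not yet reached
moved_late = store.migrate(now=10)    # TTL reached: the key is migrated
```

Writing the accumulated points of a key in one batch is what reduces disk-tier lookups from M to roughly M / d, as slide 41 states.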
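  43. 62 Evaluation setup (Evaluation)
    Benchmark: load is reproduced with an existing load generation tool [113], from a load generation client against the DB servers. One trial lasts 30 minutes, and the TTL of the proposed method is set to 10 minutes.
    Baseline: KairosDB, which follows the TSDA approach, is the comparison target. KairosDB uses Cassandra, a disk-based KVS.
    Proposed method: memory KVS: Redis; disk KVS: Cassandra.
    Evaluation items: 1. Comparison of ingestion efficiency. 2. Comparison of ingestion efficiency as the number of metrics grows. 3. Check of the inter-KVS migration performance of the proposed method.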
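  44. 63 1. Comparison of ingestion efficiency (Evaluation)
    Plot: ingestion throughput against the number of hosts (1 to 8), with the number of metrics fixed at 1M. Blue: KairosDB. Orange: proposed method.
    The proposed method (HeteroTSDB) reaches 3.98 times the baseline: 420k datapoints/s.
    Converted to Slack's workload of 12M datapoints/s, the calculation gives 229 hosts for the proposed method and 915 hosts for KairosDB.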
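  45. 64 2. Comparison of ingestion efficiency as the number of metrics grows (Evaluation)
    Plot: ingestion throughput against the number of metrics (100 to 1,000,000), with the overall send rate of data points fixed. Blue: KairosDB. Orange: proposed method. Annotated ratios: 2.32 times and 3.58 times.
    Scalability with respect to the number of metrics is higher than the baseline.
    This experiment did not clearly establish that index lookups are the bottleneck. Additional detailed profiling is needed in future work.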
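  46. 65 3. Check of the inter-KVS migration performance of the proposed method (Evaluation)
    Plot: throughput and memory usage against elapsed time (0 to 1800 seconds), with the number of hosts fixed at 1 and the number of metrics fixed at 1M. Blue: migration throughput per second. Red: memory usage of the memory-based KVS (MB). The point where TTLs expire is marked.
    When migration starts, migration throughput rises immediately and the memory usage of the memory KVS falls.
    Average migration throughput (52k/s) > ingestion throughput into the memory KVS (51k/s), which shows that the disk KVS is not the bottleneck.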
  47. 66 Part 3: Summary of Contribution 2 (Summary) (Chapter 4, time-oriented; overview figure as on slide 13)
    Purpose: combine ingestion efficiency with long-term data retention of one year or more. With operational complexity in mind, the proposed method is realized on top of existing KVSs.
    Evaluation 1: when ingesting 1 million metrics, a 3.98-fold performance improvement over the baseline.
    Evaluation 2: better scalability than the baseline over the range of 100 to 1 million metrics.
  48. 68 Overview figure (as on slide 13), introducing Chapter 5: preprocessing that automatically removes metrics unrelated to the failure, motivated by the longer execution time and lower accuracy caused by a growing number of metrics.
    Y. Tsubouchi and H. Tsuruta, MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications, IEEE Access, Vol. 12, pp. 37398-37417, March 2024.
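  49. 69 Automating fault localization with machine learning (Background)
    Flow: 1. failure detection triggers automated fault localization; 2. metrics from storage are the input; 3. the output goes to the operator. [94,96,124-136]
    Output: a ranking of the metrics that indicate the cause, e.g. 1. memory_total_bytes{instance=host4,…} 2. disk_write_io{instance=host4,…} 3. net_transmit_bytes{instance=host1,…} 4. …
    The expected execution time is on the scale of a few minutes.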
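  50. 70 Automating fault localization with machine learning (Background): the same flow as slide 49 (1. trigger on failure detection, 2. metrics as input, 3. ranking as output), with notes on the machine learning step. [94,96,124-136]
    No dataset contains a large number of pairs of metrics and root causes, so unsupervised learning is mainly used.
    Typical methods compute an anomaly score for each metric and capture how anomalies propagate between metrics.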
  51. 71 The problem of performance degradation in fault localization (Background): the same flow as slide 49, now with a growing number of metrics as input. [94,96,124-136]
    As the number of metrics grows, accuracy falls and execution time worsens. [23,24]
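  52. 72 The problem of performance degradation in fault localization (Background): the same flow as slide 49, with a feature reduction step inserted before the machine learning step. [94,96,124-136]
    As the number of metrics grows, accuracy falls and execution time worsens. [23,24]
    Feature reduction: remove the metrics that act as noise. [23,87]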
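  53. 73 Problem definition of feature reduction (Ours) (Background)
    Fig. 5.2: Three types of metrics on anomaly propagation for a failure. The figure models how an anomaly propagates at metric granularity after a fault occurs, starting from the root cause.
    M_A: metrics on which the effect appears directly. M_B: metrics on which the effect appears indirectly. M_C: unaffected metrics.
    Problem: once a failure is detected, identify M_A ∪ M_B as quickly as possible. The input is a fixed time range measured from just after failure detection (typically 30 to 60 minutes).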
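  54. 74 Existing feature reduction and its issues (Background)
    Anomaly-based reduction: removes time series that show no anomaly. It can pick up anomalies outside the failure period, leaving time series that should have been removed (false positives). [23,124,131]
    Redundancy-based reduction: removes time series with high correlation or shape similarity. Cause metrics (M_A) tend to resemble one another, so they can be removed by mistake (false negatives). [87,129,133]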
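  55. 75 Existing feature reduction and its issues (Background)
    Anomaly-based reduction removes time series that show no anomaly (risk: false positives). Redundancy-based reduction removes duplicates among time series with high correlation or shape similarity (risk: false negatives).
    Local: both handle only the anomaly or redundancy that appears in some of the metrics.
    Global: the aim here is to capture relevance to the "failure" of the system as a whole.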
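  56. 76 Observation and assumption (Background)
    Excerpt from Fig. 5.1: Change points in root fault metric, with the fault occurrence time marked.
    Observation: change points caused by the fault appear close to one another in time.
    Assumption: the time range in which change points are most concentrated is the failure period.
    In this way a global failure is captured from local features.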
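  57. 77 Overview of the contribution (Contribution)
    This research proposed a feature reduction method that captures global failures. The proposed method achieved the best accuracy and improved end-to-end accuracy and execution efficiency. It is the first study to compare heterogeneous feature reduction methods quantitatively.
    Comparison table (columns: method, type, learning type, feature captured, globality).
    Methods: FluxInfer-AD, BIRCH, K-S test, NSigma, PairCorr, k-Shape, HDBS+SBD, MetricSifter.
    Types: anomaly-based; redundancy-based; anomaly-based (MetricSifter).
    Learning types: semi-supervised (normal period specified); unsupervised.
    Features captured: Euclidean distance between normal and anomalous periods; distribution change and outliers; Pearson correlation; shape similarity; change points (MetricSifter).
    Globality: ✘ ✘ ✘ for the existing groups, ✔ for MetricSifter.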
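  58. 79 How does the proposed method work? (Proposed method)
    Fig. 5.5: An example of feature reduction using the MetricSifter framework.
    STEP 1: for each time series, detect candidate change points caused by the fault.
    STEP 2: split time into segments based on the distribution of change-point times.
    STEP 3: select the segment with the highest density.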
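  59. 80 STEP 1: Change point detection on univariate time series (Proposed method)
    From an existing change point detection framework [152], the components suited to this domain are selected.
    (1) Cost function, which decides the kind of change detected: the L2 model (mean shift).
    (2) Search method, the algorithm that searches for change points: Pelt, which finds the exact solution while speeding up the search by conditional pruning.
    (3) Penalty term, which constrains the number of change points detected: chosen heuristically based on BIC, with an additional custom, arbitrarily chosen coefficient ω.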
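  60. 81 STEP 2/3: Estimating the density of change points and splitting the distribution (Proposed method)
    Fig. 5.6: An example of segmentation.
    (1) Density estimation: kernel density estimation (KDE) is used to produce a discretized density of change-point times.
    (2) Segmentation: boundaries are drawn at the local minima (the figure is split into 10 segments).
    (3) The segment with the highest density is selected.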
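
STEP 2 and STEP 3 can be sketched with the standard library alone. This illustrates the idea under stated assumptions and is not the MetricSifter implementation: the Gaussian kernel, the bandwidth of 3.0, the one-unit time grid, and the reading of "highest density" as "segment holding the most change points" are all choices made for the example, and the function names are hypothetical.

```python
import math


def kde_on_grid(points, grid, bandwidth):
    """Gaussian kernel density estimate evaluated on a discrete time grid."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - p) / bandwidth) ** 2) for p in points)
        for g in grid
    ]


def local_minima(grid, density):
    """Grid positions where the density has a local minimum (segment borders)."""
    return [
        grid[i]
        for i in range(1, len(grid) - 1)
        if density[i] < density[i - 1] and density[i] <= density[i + 1]
    ]


def densest_segment(points, borders, start, end):
    """Return (lo, hi) of the half-open segment holding the most change points."""
    edges = [start] + list(borders) + [end]
    segments = list(zip(edges[:-1], edges[1:]))
    return max(segments, key=lambda s: sum(s[0] <= p < s[1] for p in points))


# Change-point times gathered from all metrics in STEP 1 (made-up values):
# one stray point, a cluster around t = 52, and another stray point.
change_points = [10, 50, 51, 52, 53, 54, 90]
grid = list(range(0, 101))
density = kde_on_grid(change_points, grid, bandwidth=3.0)
lo, hi = densest_segment(change_points, local_minima(grid, density), 0, 101)
kept = [p for p in change_points if lo <= p < hi]
print((lo, hi), kept)
```

Metrics whose change points fall inside the selected segment would be kept as failure-related, and the rest removed. The stray change points at 10 and 90 land in other segments and are dropped.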
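  61. 82 Evaluation setup (Evaluation)
    Datasets (132 in total): synthetic, from numerical simulation of failures; empirical, from failures reproduced by fault injection into two standard benchmark applications.
    Baselines: the group of anomaly-based reduction methods and the group of redundancy-based reduction methods.
    Evaluation items: 1. Correctness of feature reduction on its own. 2. End-to-end accuracy and execution time. 3. Parameter sensitivity and ablation study.
    Evaluation metrics: for feature reduction, the standard metrics for classification problems (Recall / Specificity / Balanced Accuracy); for end-to-end, the proportion of ranking outputs that contain the correct answer (a standard metric).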
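
The three feature-reduction metrics named above can be computed as follows. The sketch assumes that the positive class is "failure-related metric that should be kept" (M_A ∪ M_B in the problem definition) and that the negative class is every other metric. The function name `reduction_scores` and the metric names `m1` to `m10` are made up for the example.

```python
def reduction_scores(kept, relevant, all_metrics):
    """Recall, specificity, and balanced accuracy of a feature reduction result."""
    kept, relevant, all_metrics = set(kept), set(relevant), set(all_metrics)
    irrelevant = all_metrics - relevant
    tp = len(kept & relevant)      # failure-related metrics that were kept
    fn = len(relevant - kept)      # failure-related metrics wrongly removed
    tn = len(irrelevant - kept)    # unrelated metrics correctly removed
    fp = len(kept & irrelevant)    # unrelated metrics wrongly kept
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return recall, specificity, (recall + specificity) / 2


all_metrics = [f"m{i}" for i in range(1, 11)]   # 10 metrics in total
relevant = ["m1", "m2", "m3", "m4"]             # ground truth: related to the failure
kept = ["m1", "m2", "m3", "m9"]                 # output of some reduction method
recall, specificity, balanced_accuracy = reduction_scores(kept, relevant, all_metrics)
print(recall, specificity, balanced_accuracy)
```

Balanced accuracy averages the two rates, so a method cannot score well simply by keeping everything (perfect recall, zero specificity) or removing everything (the reverse).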
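  62. 84 Combinations of feature reduction and fault localization methods (Evaluation)
    Feature reduction: the proposed method; the group of anomaly-based reduction methods; the group of redundancy-based reduction methods; None.
    Automated fault localization: Random Selection; CallGraph + PageRank; PC + PageRank; PC + HT; LiNGAM + PageRank; LiNGAM + HT; RCD.
    All combinations were tested.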
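  63. 85 2: End-to-end evaluation (synthetic), excerpt (Evaluation)
    Overall results: mean top-5 accuracy over the combinations with all fault localization methods.
    Method: accuracy (note)
    Ideal: 0.344 (ideal value)
    MetricSifter: 0.299 (best)
    NSigma: 0.241 (runner-up)
    None: 0.175 (without feature reduction)
    MetricSifter achieves accuracy close to the ideal method. The plots shown are for PC+HT and random selection, with a line marking the median accuracy.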
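  64. 86 2: End-to-end evaluation (empirical) (Evaluation): SS-small, 64 metrics
    Box plots: top-5 accuracy. Line plots: execution time. A line marks the median accuracy. A representative subset of the combinations is shown.
    MetricSifter has the best top-5 accuracy, and its execution efficiency is higher than that of anomaly-based reduction.
    Redundancy-based reduction (HDBS-SBD / HDBS-R) has the best execution time but the lowest accuracy.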
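  65. 87 2: Details on the empirical data (large scale, >100 metrics) (Evaluation)
    Datasets: SS-medium (184 metrics), SS-large (1312), TT-small (383), TT-medium (1349).
    Only one fault localization method (RCD) finished processing within a realistic time (within 3600 seconds). The others did not finish within a realistic time because the fault localization algorithms have no parallelism.
    With more than 1000 metrics, accuracy was very low in every case.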
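  66. 88 3: Parameter sensitivity and ablation study (Evaluation)
    Blue: full MetricSifter. Brown: MetricSifter with STEP 1 only.
    When the STEP 1 (change point detection) parameter ω is low, correctness drops. STEP 2/3 then recover accuracy.
    When the parameter is set appropriately, the accuracy difference is small. This is thought to be because change point detection is already too accurate on clean synthetic data.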
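  67. 89 Discussion: cross-disciplinary applicability (item raised at the preliminary review)
    Conditions for applying the proposed method: 1. The anomaly is detectable as change points in time-series data. 2. The effect of an anomaly propagates within the system (nervous systems, power grids, viral infection, weather, etc.). 3. Propagation is reasonably short in time and varies little.
    Problems of the same form outside information and communications: robotics, analysis of sensor data from machines (temperature, vibration, current, pressure); space engineering, monitoring of satellite systems (high-dimensional data with hundreds to thousands of variables); medicine, analysis of biosignals to detect sudden changes in a patient's condition. [173] [174] [175] [140]
    In meteorology, propagation takes days to months, so the method may not be applicable.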
  68. 90 Part 4: Summary of Contribution 3 (Summary) (Chapter 5, time-oriented; overview figure as on slide 13)
    The first study to carry out a quantitative comparative evaluation of feature reduction.
    Proposed a method that captures a global failure from a set of local change points.
    Synthetic data: best accuracy; end-to-end accuracy improved by 24%.
    Empirical data: end-to-end, improved accuracy, execution efficiency, or both.
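  69. 92 Conclusion: telemetry workload scaling
    Diagram: users reach the cloud application over the Internet; the telemetry system (instrumentation, storage, mining) serves the operator; resource consumption rises as workloads grow.
    Contribution 1: a low-overhead instrumentation method that bundles network flows inside the kernel (CPU utilization at or below 2.2%, RTT overhead at most 6 μs).
    Contribution 2: a tiered architecture over heterogeneous KVSs that combines ingestion efficiency with long-term retention (up to 3.98 times the throughput of the conventional approach).
    Contribution 3: a feature reduction method that exploits the concentration of change points in failure-related metrics (+4.5% average accuracy and 45-52% better average execution time than conventional methods).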
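  70. 93 Conclusion: design guidelines for telemetry systems (item raised at the preliminary review)
    Basic principle: data reduction such as sampling, aggregation, and feature reduction should be applied where context is rich (instrumentation and mining).
    Instrumentation: application context such as processes, sockets, and transactions. Contribution 1 aggregates on the basis of sockets.
    Storage: no data reduction; aim instead at more efficient use of computing resources.
    Mining: operational context such as failures and alerts. Contribution 3 reduces features on the basis of failure occurrence.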
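  71. 94 Conclusion: significance of this research
    Academic contribution: this research originally formulated a problem called "telemetry workload scaling", with "keeping operational complexity low" as its constraint. It divided the telemetry system into three layers, organized the issues of each layer, and presented technical proposals to solve them.
    Social significance: as DX accelerates and online services grow in scale, telemetry system workloads will keep increasing. With finite computing and human resources, telemetry workloads must be processed more efficiently while operational complexity is reduced. This research is considered to contribute to reducing operators' effort and improving service reliability.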
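  72. 95 Deployment of this research in practice
    Instrumentation layer (Contribution 1): published as a Go library. ※2
    Storage layer (Contribution 2): already applied as the architecture of a server monitoring SaaS. ※1
    Mining layer (Contribution 3): published as a Python library ※3; adoption is under consideration at the author's current employer.
    ※2 and ※3 have no production use yet, so outreach activities are planned.
    ※1 https://mackerel.io/ja/blog/entry/weekly/20180126
    ※2 https://github.com/yuuki/go-conntracer-bpf
    ※3 https://github.com/ai4sre/metricsifter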
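  73. 96 Future prospects
    1. From Collect-First to Use-First (overall optimization of the three telemetry layers): research on closed-loop systems that feed back data usage patterns and adapt automatically to collect only the data that is needed.
    2. Failure management with LLMs (workload scaling of the mining layer under new technology): for automated fault localization with LLMs, research on generating "failure snapshots" through reduction and compression that respect the upper limit on prompt length.
    3. Telemetry for distributed deep learning infrastructure (systems other than cloud applications): research on new telemetry systems for optimizing distributed training workloads and improving fault tolerance on large GPU clusters.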
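  74. 97 Research achievements: awards
    IPSJ Internet and Operation Technology Symposium 2020, Best Paper Award: Y. Tsubouchi, H. Tsuruta, M. Furukawa, TSifter: A dimensionality reduction method for time-series data aimed at rapid diagnosis of performance anomalies in microservices (in Japanese), December 2020.
    IPSJ Internet and Operation Technology Symposium 2020, Best Presentation Award: Y. Tsubouchi, TSifter: A dimensionality reduction method for time-series data aimed at rapid diagnosis of performance anomalies in microservices (in Japanese), December 2020.
    FY2020 IPSJ Yamashita SIG Research Award: Y. Tsubouchi, Transtracer: Automatic tracing of inter-process dependencies by monitoring the endpoints of TCP/UDP communication in distributed systems (in Japanese), 2020.
    IPSJ Internet and Operation Technology Symposium 2019 (IOTS2019), Best Paper Award: Y. Tsubouchi, M. Furukawa, R. Matsumoto, Transtracer: Automatic tracing of inter-process dependencies by monitoring the endpoints of TCP/UDP communication in distributed systems (in Japanese), December 2019.
    IPSJ Internet and Operation Technology Symposium 2019 (IOTS2019), sponsor award (C.O.Conv Award): Y. Tsubouchi, M. Furukawa, R. Matsumoto, Transtracer: Automatic tracing of inter-process dependencies by monitoring the endpoints of TCP/UDP communication in distributed systems (in Japanese), December 2019.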
  75. 98 Research achievements: journals and international conferences
    Journals:
    (Contribution 1) Y. Tsubouchi, M. Furukawa, R. Matsumoto, Low Overhead TCP/UDP Socket-based Tracing for Discovering Network Services Dependencies, Journal of Information Processing (JIP), Vol.30, pp.260-268, March 2022.
    (Contribution 2) Y. Tsubouchi, A. Wakisaka, K. Hamada, M. Matsuki, T. Kobayashi, H. Abe, R. Matsumoto, HeteroTSDB: A high-performance time-series database with automatic tiering across heterogeneous distributed KVSs (in Japanese), IPSJ Journal, Vol.62, No.3, pp.818-828, March 2021.
    (Contribution 3) Y. Tsubouchi and H. Tsuruta, MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications, IEEE Access, Vol. 12, pp. 37398-37417, March 2024.
    Y. Tsubouchi, F. Ino, M. Okita, S. Yamakawa, T. Kashiwagi, K. Hagihara, High-throughput SHA-1 computation with SSE instructions for deduplication storage (in Japanese), IEICE Transactions D, 96(10), pp.2101-2109, October 2013.
    International conferences:
    (Contribution 1) Y. Tsubouchi, M. Furukawa, R. Matsumoto, Transtracer: Socket-Based Tracing of Network Dependencies among Processes in Distributed Applications, The 1st IEEE International COMPSAC Workshop on Advanced IoT Computing (AIOT 2020), July 2020.
    (Contribution 2) Y. Tsubouchi, A. Wakisaka, K. Hamada, M. Matsuki, H. Abe, R. Matsumoto, HeteroTSDB: An Extensible Time Series Database for Automatically Tiering on Heterogeneous Key-Value Stores, The 43rd Annual IEEE International Computers, Software & Applications Conference (COMPSAC), pp. 264-269, July 2019.
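  76. 99 Research achievements: domestic symposia (refereed)
    (Contribution 3) Y. Tsubouchi, H. Tsuruta, M. Furukawa, TSifter: A dimensionality reduction method for time-series data aimed at rapid diagnosis of performance anomalies in microservices (in Japanese), Proceedings of the IPSJ Internet and Operation Technology Symposium, 2020, 9-16 (2020-11-26), December 2020.
    Y. Tsubouchi, M. Aoyama, Meltria: A system for dynamically generating datasets for anomaly detection and root cause analysis in microservices (in Japanese), Proceedings of the IPSJ Internet and Operation Technology Symposium, 2021, 63-70 (2021-11-18), November 2021.
    Y. Hayashi, K. Matsubara, K. Washikita, Y. Tsubouchi, Design of a monitoring dashboard for microservice-based systems based on situation awareness and cognitive psychology (in Japanese), Proceedings of the IPSJ Internet and Operation Technology Symposium, 2021, 97-98 (2021-11-18), December 2021.
    H. Tsuruta, Y. Tsubouchi, A root cause diagnosis method for performance anomalies in distributed systems based on the interpretability of machine learning (in Japanese), Proceedings of the IPSJ Internet and Operation Technology Symposium, 2021, 24-31 (2021-11-18), November 2021.
    (Contribution 1) Y. Tsubouchi, M. Furukawa, R. Matsumoto, Transtracer: Automatic tracing of inter-process dependencies by monitoring the endpoints of TCP/UDP communication in distributed systems (in Japanese), Proceedings of the Internet and Operation Technology Symposium, 2019, 64-71 (2019-11-28), December 2019.
    (Contribution 2) Y. Tsubouchi, A. Wakisaka, K. Hamada, M. Matsuki, H. Abe, R. Matsumoto, HeteroTSDB: A time-series database architecture for automatic tiering using heterogeneous key-value stores (in Japanese), Proceedings of the IPSJ Internet and Operation Technology Symposium, 2018, 7-15 (2018-11-29), December 2018.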
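  77. 100 Research achievements: domestic proceedings (non-refereed)
    Y. Hayashi, K. Matsubara, K. Washikita, Y. Tsubouchi, Effects of dashboard UI design on situation awareness in the monitoring of microservice-based systems (in Japanese), No.2022-IOT-56, Vol.38, pp.1-8, March 2022.
    R. Matsumoto, Y. Tsubouchi, Design of a transparent privilege separation scheme over TCP based on the privilege information of client processes (in Japanese), IPSJ SIG Technical Reports, Internet and Operation Technology (IOT), No.2020-IOT-49, Vol.11, pp.1-6, May 2020.
    Y. Hayashi, R. Iseda, K. Matsubara, K. Washikita, Y. Tsubouchi, R. Matsumoto, A study of system state visualization methods for distributed systems with dynamic adaptability (in Japanese), IPSJ SIG Technical Reports, Internet and Operation Technology (IOT), No.2020-IOT-48, Vol.22, pp.1-8, March 2020.
    Y. Tsubouchi, M. Furukawa, R. Matsumoto, A concept for automatic tracing of dependencies among network services, toward a superorganism-style data center (in Japanese), Multimedia, Distributed, Cooperative, and Mobile (DICOMO2019) Symposium, 6A-2, pp. 1169-1174, July 2019.
    Y. Tsubouchi, R. Matsumoto, A concept for a distributed cooperative query cache in a superorganism-style data center (in Japanese), IPSJ SIG Technical Reports, Internet and Operation Technology (IOT), No.2019-IOT-45, Vol.14, pp.1-7, May 2019.
    R. Matsumoto, Y. Tsubouchi, G. Miyashita, A reactive container execution platform technology, toward a distributed data center OS (in Japanese), IPSJ SIG Technical Reports, Internet and Operation Technology (IOT), No.2019-IOT-45, Vol.12, pp.1-8, March 2019.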
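  78. Research summary: Scaling Telemetry Workloads in Cloud Applications
    Background and purpose: toward a telemetry system that scales efficiently.
    1. Telemetry for cloud applications: applications are growing more complex, and operations management through telemetry is essential.
    2. Growth of telemetry workloads: within the telemetry system, workloads are growing in each of the instrumentation, storage, and mining layers.
    3. Telemetry workload scaling: the aim is to scale efficiently in the face of problems such as rising consumption of computing resources, while taking operational complexity into account.
    Issues: the issues raised by growing telemetry workloads.
    1. Instrumentation (growing measurement overhead): when short-lived network communication increases, conventional measurement incurs a high cost for transferring data out of the OS kernel of the measured host.
    2. Storage (growing ingested data volume and long-term retention): as the number of metrics grows, it is hard to combine better ingestion efficiency with long-term retention of one year or more.
    3. Mining (lower accuracy and execution efficiency of fault localization): as the number of metrics grows, applying existing feature reduction still fails to capture the failure of the system as a whole, and false positives and false negatives increase.
    Contributions: solving the issues of each layer as workloads grow.
    1. More efficient measurement processing [1]: bundling TCP/UDP communication events inside the OS kernel improves transfer efficiency.
    2. More efficient ingestion and long-term retention [2]: tiering heterogeneous KVSs achieves both efficient index lookups and storage on inexpensive media.
    3. Removing variables unrelated to the failure as preprocessing for fault localization [3]: feature reduction that accounts for the concentration in time of each series' change points during a failure improves fault localization accuracy and time.
    [1] Y. Tsubouchi, M. Furukawa, R. Matsumoto, Low Overhead TCP/UDP Socket-based Tracing for Discovering Network Services Dependencies, Journal of Information Processing (JIP), Vol.30, pp.260-268, March 2022.
    [2] Y. Tsubouchi, A. Wakisaka, K. Hamada, M. Matsuki, T. Kobayashi, H. Abe, R. Matsumoto, HeteroTSDB: A high-performance time-series database with automatic tiering across heterogeneous distributed KVSs (in Japanese), IPSJ Journal, Vol.62, No.3, pp.818-828, March 2021.
    [3] Y. Tsubouchi and H. Tsuruta, MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications, IEEE Access, Vol. 12, pp. 37398-37417, March 2024.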