Slide 1

Slide 1 text

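Scaling Telemetry Workloads in Cloud Applications: Techniques for Instrumentation, Storage, and Mining
Graduate School of Informatics, Kyoto University, Department of Intelligence Science and Technology
February 18, 2025, doctoral dissertation public hearing
Yuuki Tsubouchi (坪内 佑樹)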

Slide 2

Slide 2 text

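Biography
- 2012: Entered the master's program at the Graduate School of Information Science and Technology, Osaka University; withdrew from the program in 2013.
- 2013: Joined Hatena Co., Ltd.; worked on developing and operating web services and a system operations management service.
- 2019: Joined SAKURA internet Inc.; engaged in research on collecting and analyzing cloud telemetry.
- 2020: Transferred into the doctoral program at the Graduate School of Informatics, Kyoto University; left the program in 2023 after completing the required research guidance.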

Slide 3

Slide 3 text

0. Overview of the dissertation

Slide 4

Slide 4 text

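Improving reliability for users of online services
Diagram labels: users, Internet, application system, engineers, system for operations monitoring.
Goal: improve reliability so that users can use the service comfortably.
Data collection is divided into a three-layer structure: measurement, storage, analysis.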

Slide 5

Slide 5 text

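Technical proposals for the growing load of data collection for operations
Diagram labels: users, Internet, application system, engineers, system for operations monitoring; data collection split into measurement, storage, and analysis.
Problems: growing load on computing resources; growing operational load.
Contributions 1, 2, and 3 address these layers.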

Slide 6

Slide 6 text

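Outline of the talk
1. Introduction
2. Proposal of an in-kernel (OS kernel) instrumentation method (Contribution 1)
3. Proposal of a storage architecture design method (Contribution 2)
4. Proposal of a preprocessing method for automated fault localization (Contribution 3)
5. Conclusion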

Slide 7

Slide 7 text

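Outline (1. Introduction, 2. Contribution 1, 3. Contribution 2, 4. Contribution 3, 5. Conclusion) with the main revisions made after the preliminary review:
- (Slides P14-15) Added the definition of telemetry and its use cases.
- (Slide P43) Added a discussion of the overhead caused by concurrency control inside the Linux kernel.
- (Slide P26) Added the contribution of this research to industry.
- (Slide P93) Added a conclusion that ties the three individual contributions together.
- (Slide P89) Added the cross-domain applicability of the method as a time-series analysis technique.
Thesis pages listed on the slide: p. 1-2, 14; p. 91-92; p. 4; p. 31; p. 86-87.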

Slide 8

Slide 8 text

1. Introduction (Chapter 1 and Chapter 2)

Slide 9

Slide 9 text

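Background: the spread of cloud computing
Online service providers build applications in cloud environments and deliver services to users over the Internet.
Examples: social networking, e-commerce, online games, media streaming, payments, IoT, …
Diagram labels: Cloud, Applications, Datacenters, users.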

Slide 10

Slide 10 text

Background: basic architecture of cloud applications (Fig. 2.1)
- Front-end layer
- Back-end layer: business-logic processing
- Database (DB): management of business data by clusters

Slide 11

Slide 11 text

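Background: basic architecture of cloud applications (Fig. 2.1)
- Load balancing and redundancy
- Asynchronous processing
- Heterogeneous DB systems per use case
Multiple techniques for availability and scale-out are layered on top of each other, which makes the architecture complex.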

Slide 12

Slide 12 text

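Background: basic architecture of cloud applications (Fig. 2.1)
An example path of request processing (steps 1 to 4 in the figure).
- Communication follows a request-response style.
- Intermediate components terminate transport connections and relay them.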

Slide 13

Slide 13 text

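Background: reliability of cloud applications
High reliability is required so that users can use services comfortably, e.g., 24/7 availability and low-latency responses.
- Impact of failures: of 1,819 system failures, 47% took two hours or more to resolve.
- Triggers of failures: change-induced failures account for 49.5% of all failures (changes to application code, configuration files, infrastructure systems, etc.). [58] [13]
Approaches that assume failures will happen and focus on reducing their impact have become widespread. Fault tolerance that includes the operators' response is important. [14]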

Slide 14

Slide 14 text

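Background: three layers of fault tolerance (Fig. 2.2: adapted from Figure 1-1 of [60])
- The impact of a fault in an inner layer propagates to the outer layers.
- The fault-tolerance mechanism of each layer contains this propagation.
This work focuses on the outermost layer:
- Failure detection, root-cause localization, and recovery
- Capacity expansion
- …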

Slide 15

Slide 15 text

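System monitoring with telemetry [62,63,64] (added in response to the preliminary review)
General definition: the process of recording the readings of an instrument and transmitting them.
Definition in this work: automatically collecting data on performance and usage from systems, applications, and services, and transmitting it to a remote location for monitoring and analysis.
- Information obtained by visually inspecting physical equipment is limited.
- Telemetry monitors the logical state of hardware, software, and network communication.
Diagram labels: instrument, transmission, remote location, analysis.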

Slide 16

Slide 16 text

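Background: main types of telemetry data
Time-oriented (Contributions 2 and 3):
- Numeric values (metrics): values that quantitatively measure system performance at a point in time, sampled at fixed intervals. Examples: CPU utilization, request response time.
- Strings (logs): records of events occurring in the system as unstructured strings. Examples: error messages, user activity, system operations.
Path-oriented (Contribution 1):
- Traces: structured data that represents the flow of a series of processing steps or communications passing through the system. In particular, traces of network communication: upper layer at request granularity, lower layer at flow granularity.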

Slide 17

Slide 17 text

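Background: main types of telemetry data (metrics)
Time-oriented data:
- Numeric values (metrics): values that quantitatively measure system performance at a point in time, sampled at fixed intervals. Examples: CPU utilization, request response time. A metric such as cpu_seconds{instance=host1,…} is represented as an array of (timestamp, value) pairs, e.g. [(1709298600, 29851.26), …].
- Strings (logs): records of events occurring in the system as unstructured strings. Examples: error messages, user activity, system operations.
Topology-oriented data:
- Traces: a collection of structured data fragments that represent the flow of a series of processing steps or communications passing through the system. Request granularity (application layer); flow or packet granularity (infrastructure layer).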

Slide 18

Slide 18 text

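Background: main types of telemetry data (traces)
Path-oriented traces: structured data that represents the flow of a series of processing steps or communications passing through the system.
In particular, traces of network communication:
- Upper layer: request granularity
- Lower layer: flow granularity
Diagram: call graph among services A, B, C, D with endpoints 10.0.10.1:80, 10.0.20.1:3306, 10.0.30.1:9200, 10.0.40.1:9092 (listen ports 80, 3306, 9200, 9092). One item on the slide is marked "out of scope of this work".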

Slide 19

Slide 19 text

Background: telemetry system (Fig. 2.3: Overview of one possible telemetry system.)
This work divides the telemetry system into three layers: the instrumentation layer, the storage layer, and the mining layer.

Slide 20

Slide 20 text

Background: telemetry system, instrumentation (Fig. 2.3: Overview of one possible telemetry system.)
- Instruments are embedded in the application system.
- Data is sent to central storage.

Slide 21

Slide 21 text

Background: telemetry system, storage (Fig. 2.3: Overview of one possible telemetry system.)
- Transmitted data is ingested into the DB system.
- The mining layer queries the DB for the data it needs.

Slide 22

Slide 22 text

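Background: telemetry system, mining (Fig. 2.3: Overview of one possible telemetry system.)
- Manual mining: provides visualized views and alerts that indicate anomalies.
- Automated mining: reduces the operators' burden through automated, machine-learning-based data analyzers (the target of Contribution 3).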

Slide 23

Slide 23 text

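Background: growth of telemetry workloads
Causes:
- Growth in application workload and in the number of components
- Finer-grained telemetry data for a more precise understanding of the system
Instrumentation: the volume of measurement and transmission grows.
- Higher resource consumption for transferring and aggregating measurements
- Higher processing latency in the application
Storage: the volume of ingested data grows.
- Higher resource consumption for writes
- Larger disk footprint
- Higher resource consumption and latency for reads
Mining: the volume of learning computation grows.
- Lower accuracy of model output
- Longer execution time and higher resource consumption for learning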

Slide 24

Slide 24 text

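Background: the operational complexity that telemetry systems bring
- Service providers must operate the telemetry system in addition to the application.
- Keeping operational complexity low is important for practical adoption.
Ease of deployment:
- Instrumentation: manual instrumentation work
- Storage: the burden of building, configuring, tuning, and backing up DB systems
- Mining: manual labeling of datasets; tuning of model parameters
Ease of maintenance:
- Instrumentation: keeping up with code changes in the instrumented target
- Storage: scale-out work, version upgrades, re-tuning
- Mining: handling accuracy degradation caused by changes in data distribution (retraining, re-tuning, etc.)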

Slide 25

Slide 25 text

Research objective

Slide 26

Slide 26 text

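Research objective (added in response to the preliminary review)
Actors: users (general consumers, staff at companies, etc.); online service providers and their operators; cloud service providers; the application running in the cloud.
- Operators can grasp the system precisely through telemetry.
- Efficient use of the computing resources consumed by telemetry workloads.
- Efficient use of human resources through low operational complexity.
- Users can use the service comfortably thanks to improved reliability.
The aim is to achieve these together.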

Slide 27

Slide 27 text

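Research goal
Under the condition that the increase in operational complexity is kept low, propose techniques that scale each layer efficiently as telemetry workloads grow.
Diagram labels: telemetry system (instrumentation, storage, mining), operator; axes: workload versus resource consumption and processing latency.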

Slide 28

Slide 28 text

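Overview of this research
- Instrumentation, path-oriented data (Chapter 3): as the network connection rate grows, the instrumentation load grows too → an instrumentation method based on efficient aggregation inside the OS kernel.
- Storage, time-oriented data (Chapter 4): a growing number of metrics increases the ingestion load → a method for tiering in-memory and on-disk DBs and migrating data between tiers.
- Mining, time-oriented data (Chapter 5): a growing number of metrics increases execution time and lowers accuracy → a preprocessing method that automatically removes metrics unrelated to the failure.
Diagram labels: telemetry system, operator.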

Slide 29

Slide 29 text

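Overview of this research (build step: "growth of telemetry workloads")
- Instrumentation, path-oriented data (Chapter 3): as the network connection rate grows, the instrumentation load grows too → an instrumentation method based on efficient aggregation inside the OS kernel.
- Storage, time-oriented data (Chapter 4): a growing number of metrics increases the ingestion load → tiering of in-memory and on-disk DBs with inter-tier migration.
- Mining, time-oriented data (Chapter 5): a growing number of metrics increases execution time and lowers accuracy → preprocessing that automatically removes metrics unrelated to the failure.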

Slide 30

Slide 30 text

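Overview of this research (build step: "proposal of scaling techniques")
- Instrumentation, path-oriented data (Chapter 3): as the network connection rate grows, the instrumentation load grows too → an instrumentation method based on efficient aggregation inside the OS kernel.
- Storage, time-oriented data (Chapter 4): a growing number of metrics increases the ingestion load → tiering of in-memory and on-disk DBs with inter-tier migration.
- Mining, time-oriented data (Chapter 5): a growing number of metrics increases execution time and lowers accuracy → preprocessing that automatically removes metrics unrelated to the failure.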

Slide 31

Slide 31 text

Research objective: instrumentation layer, path-oriented data (Chapter 3)
Problem: as the network connection rate grows, the instrumentation load grows too.
Proposal: an instrumentation method based on efficient aggregation inside the OS kernel.
Operational complexity: no modification of application code is needed for instrumentation.
Y. Tsubouchi, M. Furukawa, R. Matsumoto, Low Overhead TCP/UDP Socket-based Tracing for Discovering Network Services Dependencies, Journal of Information Processing (JIP), Vol.30, pp.260-268, Mar 2022.

Slide 32

Slide 32 text

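Research objective: storage layer, time-oriented data (Chapter 4)
Problem: a growing number of metrics increases the ingestion load.
Proposal: a method for tiering in-memory and on-disk DBs and migrating data between tiers.
Operational complexity: solved within the scope of general-purpose DB systems, whose knowledge and implementations are highly reusable.
坪内佑樹, 脇坂朝人, 濱田健, 松木雅幸, 小林隆浩, 阿部博, 松本亮介, HeteroTSDB: 異種分散KVS間の自動階層化による高性能な時系列データベース, IPSJ Journal (情報処理学会論文誌), Vol.62, No.3, pp.818-828, March 2021.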

Slide 33

Slide 33 text

Research objective: mining layer, time-oriented data (Chapter 5)
Problem: a growing number of metrics increases execution time and lowers accuracy.
Proposal: a preprocessing method that automatically removes metrics unrelated to the failure.
Operational complexity: solved within an unsupervised-learning framework that needs neither labeling nor model training. The design is robust to parameter changes, which reduces the tuning burden.
Y. Tsubouchi and H. Tsuruta, MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications, IEEE Access, Vol. 12, pp. 37398-37417, March 2024.

Slide 34

Slide 34 text

2. Proposal of an in-kernel (OS kernel) instrumentation method (Contribution 1): instrumentation layer (Chapter 3)

Slide 35

Slide 35 text

Position of this part in the overall research: instrumentation layer, path-oriented data (Chapter 3)
As the network connection rate grows, the instrumentation load grows too → an instrumentation method based on efficient aggregation inside the OS kernel.
(Chapter 4 covers storage tiering; Chapter 5 covers the preprocessing method for mining.)
Y. Tsubouchi, M. Furukawa, R. Matsumoto, Low Overhead TCP/UDP Socket-based Tracing for Discovering Network Services Dependencies, Journal of Information Processing (JIP), Vol.30, pp.260-268, Mar 2022.

Slide 36

Slide 36 text

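Background: network call graph
Operators want to know the call relationships between components (diagram: Cloud, Load Balancers, Web app servers, Database Clusters, Message queues).
- They want to know the impact range of a change.
- They want per-link metrics. L7: request count, error count, response time, … L4: bytes sent and received per second, RTT, …
Such graphs used to be drawn by hand, but are increasingly generated automatically from path-oriented data.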

Slide 37

Slide 37 text

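Existing methods: instrumentation approaches for path-oriented data
A measurement point is placed somewhere on the network communication path (diagram: App, Proxy, Network Stack in the kernel, NIC, Switch; user space and kernel space).
- Application-intrusive: instrument the application code. Advantage: application context can be injected. Drawback: adding code takes a lot of effort.
- Application-non-intrusive: instrument somewhere other than the application. Advantages and drawbacks are the reverse of the application-intrusive approach.
This work focuses on instrumentation at the upper layer of the kernel (sockets).
- Versus a proxy: no relay overhead.
- Versus a switch: the measurement load can be distributed across end hosts.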

Slide 38

Slide 38 text

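Positioning of this work: instrumentation methods at the socket layer
(In each diagram a measurement point in the kernel passes data to an agent in user space; arrows show the flow of data. A flow is a unit of communication whose pair of addresses and ports at both ends is identical.)
- Streaming method ([96,97]): measurements are passed to user space through a queue. ✗ The number of measurement transfers to user space grows with the number of messages.
- Flow aggregation method ([27,98]): ✔ only measurements aggregated per flow are stored, which reduces the amount of transferred data. ✗ When short-lived flows increase, the amount of transferred data increases as well.
- Flow bundling method (proposed): flows with the same destination are bundled (Flow1 to Flow4 become Bundle 1 and Bundle 2). ✔ The amount of transferred data is reduced even when there are many short-lived flows.
The slide also carries the label "✔ low measurement overhead".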

Slide 39

Slide 39 text

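Overview of Contribution 1
1. Propose an in-kernel flow bundling method that reduces the measurement overhead even in environments with many short-lived flows.
2. Verify that the measurement overhead (CPU load) stays sufficiently small even when the number of flows grows.
An environment that is unfavorable to existing methods (diagram: Web App Servers, connections, DB Server): in PHP applications, persistent connections to the DB are sometimes discouraged in order to prevent resource abuse [101]. Flows are therefore not kept alive, and short-lived flows multiply. The proposed method addresses this case.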

Slide 40

Slide 40 text

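Proposed method: the concept of flow bundling
Diagram: client services connect from ephemeral ports (53421, 32346, 48901) to a server service listening on port 80 (Flow 1, Flow 2, …, Flow N).
These flows are regarded as a single bundled flow between the client service and the server's listen port 80.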

Slide 41

Slide 41 text

Proposed method: bundling different flows inside the kernel
Diagram: flow bundling method (proposed); a measurement point in the kernel, between the service and the NIC, hands Bundle 1 and Bundle 2 to the agent in user space.
Flow 1: "src_ip": "192.168.1.101", "src_port": 53421, "dst_ip": "192.168.1.200", "dst_port": 80, "recv_bytes": 2000, "send_bytes": 500
Flow 2: "src_ip": "192.168.1.101", "src_port": 61390, "dst_ip": "192.168.1.200", "dst_port": 80, "recv_bytes": 1000, "sent_bytes": 100
Bundle 1: "src_ip": "192.168.1.101", "dst_ip": "192.168.1.200", "dst_port": 80, "recv_bytes": 3000, "sent_bytes": 600
The ephemeral port is dropped and the flows are merged. Numeric data is aggregated statistically (summed in this example).
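The merge shown on this slide can be sketched in a few lines. This is a user-space illustration only, not the in-kernel eBPF implementation: the function name `bundle_flows` is hypothetical, and the byte counter is named `sent_bytes` throughout as an assumption, since the slide mixes `send_bytes` and `sent_bytes`.

```python
from collections import defaultdict

def bundle_flows(flows):
    """Merge flows that share (src_ip, dst_ip, dst_port).

    The ephemeral src_port is dropped from the key, so many short-lived
    flows to the same listening port collapse into one bundle whose
    byte counters are summed.
    """
    bundles = defaultdict(lambda: {"recv_bytes": 0, "sent_bytes": 0})
    for flow in flows:
        key = (flow["src_ip"], flow["dst_ip"], flow["dst_port"])
        bundles[key]["recv_bytes"] += flow["recv_bytes"]
        bundles[key]["sent_bytes"] += flow["sent_bytes"]
    return dict(bundles)

# The two flows from the slide (counter name normalized to sent_bytes).
flows = [
    {"src_ip": "192.168.1.101", "src_port": 53421,
     "dst_ip": "192.168.1.200", "dst_port": 80,
     "recv_bytes": 2000, "sent_bytes": 500},
    {"src_ip": "192.168.1.101", "src_port": 61390,
     "dst_ip": "192.168.1.200", "dst_port": 80,
     "recv_bytes": 1000, "sent_bytes": 100},
]
bundled = bundle_flows(flows)
```

With the slide's two flows, the single resulting bundle carries 3000 received and 600 sent bytes, matching Bundle 1.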

Slide 42

Slide 42 text

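Proposed method: implementation overview
Diagram: in user space, the service and the agent; in the kernel, the socket layer (reached via system calls), a hash map, and the NIC.
- Measurement programs 1 to 4 are attached to kernel functions with Linux kprobes: tcp_v4_connect(), inet_csk_accept(), tcp_sendmsg(), tcp_cleanup_rbuf() (UDP omitted).
- The kernel is extended using the Linux extended Berkeley Packet Filter (eBPF).
- The measurement programs update a map structure. Keys: {src_addr, dst_addr, listen_port, proto, pid}. Values: {counts, recv_bytes, send_bytes, …}.
- The agent periodically fetches and deletes multiple items with batch operations.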

Slide 43

Slide 43 text

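Proposed method: concurrency control inside the kernel (added in response to the preliminary review)
To protect each memory region, low-overhead synchronization mechanisms are used.
- Kernel-managed region (kernel functions): the measurement programs only read structures such as the socket structure and take no locks.
- eBPF-managed region (hash map), updating the value inside an entry: atomic instructions are used (fetch and add).
- eBPF-managed region (hash map), inserting a map entry: fine-grained spinlock (per bucket).
- Access from the agent: coarse-grained spinlock (whole map).
Note: the experiments confirmed that the overhead is sufficiently small, but it may become non-negligible in environments with many CPU cores.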

Slide 44

Slide 44 text

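Evaluation setup (diagram: client and server, each with an agent)
Benchmark:
- An echo client and server generate TCP or UDP communication load.
- Each trial runs for 30 seconds; the batch fetch interval is 1 second.
Baselines: existing instrumentation methods that target the kernel socket layer.
- Streaming method
- In-kernel aggregation method
Evaluation items:
1. CPU load as the number of short-lived flows grows
2. CPU load in a 1-to-N communication environment
3. RTT overhead on the application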

Slide 45

Slide 45 text

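Evaluation 1: CPU load as the number of short-lived TCP flows grows
- Proposed method: CPU utilization stays at or below 2.2%.
- Streaming method: CPU utilization rises up to 21.3%.
- In-kernel aggregation method: CPU utilization rises up to 11.5%.
Similar results were obtained in the experiment that increases the UDP message rate.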

Slide 46

Slide 46 text

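Evaluation 2: CPU load when the number of communication peers increases
When the number of peers with different listen ports increases, the bundling ratio drops. Should the CPU load of the proposed method then increase?
Bundling ratio: R = 1 − B/T, where B is the number of bundled flows (determined by the number of peers) and T is the total number of flows (fixed).
Plotted settings: R = 0.90, R = 0.94, R = 0.98. The slide also carries the labels T = 10k and T = 100k.
Result: as the number of services (peers) increased, CPU utilization stayed at or below 2%.
If B is increased until R = 0, the advantage over existing methods disappears.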
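As a small worked example of the definition R = 1 − B/T, the sketch below computes the bundling ratio and the inverse direction (how many bundles correspond to a given ratio). The function names are hypothetical, and pairing T = 10k with the three plotted ratios is an assumption made only for illustration.

```python
def bundling_ratio(bundled_flows: int, total_flows: int) -> float:
    """R = 1 - B/T: the fraction of flow records saved by bundling."""
    return 1.0 - bundled_flows / total_flows

def bundles_for_ratio(ratio: float, total_flows: int) -> int:
    """Invert R = 1 - B/T to get B, rounded to the nearest integer."""
    return round((1.0 - ratio) * total_flows)

total = 10_000  # assumed total number of flows T
bundles = {r: bundles_for_ratio(r, total) for r in (0.90, 0.94, 0.98)}
no_benefit = bundling_ratio(total, total)  # B == T means nothing was merged
```

Under this assumption, R = 0.90, 0.94, and 0.98 correspond to 1000, 600, and 200 bundles, and B = T gives R = 0, the point at which bundling no longer reduces the number of records.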

Slide 47

Slide 47 text

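Evaluation 3: latency overhead introduced by measurement (short-lived TCP connections and UDP)
- Against an RTT of 300 μs, the overhead of the proposed method is at most 5.8 μs.
- Compared with no instrumentation, the overhead increase stays at no more than 2%.
- The streaming method showed the smallest RTT.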

Slide 48

Slide 48 text

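Part 2 summary: Contribution 1 (Chapter 3, instrumentation layer, path-oriented data)
Problem: as the network connection rate grows, the instrumentation load grows too.
Proposal: an instrumentation method based on efficient aggregation inside the OS kernel.
Evaluation: as the number of short-lived flows increased, the proposed method kept CPU utilization at or below 2.2%. The RTT overhead relative to the uninstrumented state stayed at no more than 2%.
Use case: continuous, automatic construction of the network call graph.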

Slide 49

Slide 49 text

3. Proposal of a storage architecture design method (Contribution 2): storage layer (Chapter 4)

Slide 50

Slide 50 text

Position of this part in the overall research: storage layer, time-oriented data (Chapter 4)
A growing number of metrics increases the ingestion load → a method for tiering in-memory and on-disk DBs and migrating data between tiers.
(Chapter 3 covers in-kernel instrumentation for tracing; Chapter 5 covers the preprocessing method for mining.)
坪内佑樹, 脇坂朝人, 濱田健, 松木雅幸, 小林隆浩, 阿部博, 松本亮介, HeteroTSDB: 異種分散KVS間の自動階層化による高性能な時系列データベース, IPSJ Journal (情報処理学会論文誌), Vol.62, No.3, pp.818-828, March 2021.

Slide 51

Slide 51 text

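Background: workload of metric storage
The ingestion workload for metrics is proportional to two dimensions:
(1) Resolution along the time axis (generally in the range of 1 to 60 seconds)
(2) The number of metrics, e.g. cpu_seconds{instance=host1,…}, memory_total_bytes{instance=host1,…}, http_requests_count{instance=host1,…}, …, http_requests_count{instance=host99,…}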

Slide 52

Slide 52 text

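Background: workload of metric storage
The ingestion workload for metrics is proportional to two dimensions:
(1) Resolution along the time axis (generally in the range of 1 to 60 seconds)
(2) The number of metrics, e.g. cpu_seconds{instance=host1,…}, memory_total_bytes{instance=host1,…}, http_requests_count{instance=host1,…}, …, http_requests_count{instance=host99,…}
The number of metrics also grows when a metric is broken down at finer granularity. For example, cpu_seconds{instance=host1,…} decomposes into:
cpu_seconds{instance=host1,mode=user,core_no=1,…}
cpu_seconds{instance=host1,mode=system,core_no=1,…}
cpu_seconds{instance=host1,mode=user,core_no=2,…}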

Slide 53

Slide 53 text

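Background: scalability requirements for metric storage
Ingestion throughput [19] [32] [112]:
- Slack: 12M datapoints / sec
- Meta: 700M datapoints / min
- LYCorp: 12.5M datapoints / min
Common solutions: ingestion across multiple horizontally partitioned nodes; efficient writes to in-memory data structures.
Storage capacity [19] [35] [69] [108]:
- Slack: 12 TB / day
- ByteDance: 10 TB / day
- LYCorp: 2.7 TB / day
- Mackerel: 460 days
Common solutions: data compression and long-term storage on media with a low storage cost (SSD/HDD).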

Slide 54

Slide 54 text

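Related work: classification of existing methods
Time-series DB management system approach (TSDBMS) [31,33,35,79] (diagram: Client → DBMS):
- A DBMS optimized for processing time-series data (Prometheus, Gorilla, InfluxDB, etc.).
- Compression: encodings that exploit the regular spacing of timestamps and the temporal locality of values.
- Structure: optimized for time series on the basis of the LSM tree used in disk-based KVSs; files are managed per fixed time window.
Time-series data-oriented application approach (TSDA) [29,30] (diagram: Client → App → DBMS):
- Built on top of a KVS, which is a general-purpose DB system (OpenTSDB, KairosDB).
- KVS: a DBMS that stores, retrieves, and manages data as a set of key-value pairs.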

Slide 55

Slide 55 text

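Related work: classification of existing methods
Time-series DB management system approach (TSDBMS):
- A DBMS optimized for processing time-series data (Prometheus, Gorilla, InfluxDB, etc.).
- Compression: encodings that exploit the regular spacing of timestamps and the temporal locality of values.
- Structure: optimized for time series on the basis of the LSM tree used in disk-based KVSs; files are managed per fixed time window.
Time-series data-oriented application approach (TSDA):
- Built on top of a KVS, which is a general-purpose DB system (OpenTSDB, KairosDB).
- KVS: a DBMS that stores, retrieves, and manages data as a set of key-value pairs.
- KVSs are widely used.
- KVS services are widely offered as "DB as a Service" to automate DB operations.
- Because the TSDA approach is loosely coupled, it can offer users a choice of DBMS implementation.
With operational complexity in mind, this work focuses on the TSDA approach.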

Slide 56

Slide 56 text

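Related work: ingestion efficiency of KVSs
A growing number of metrics means a growing number of KVS keys, so the efficiency of index lookups when appending data becomes the issue.
Memory-based KVS:
- Memory is good at random access, so a hash table is used; writes are O(k).
- No data is kept on disk (except the commit log).
Disk-based KVS:
- Writes go to a sorted structure in memory such as a balanced tree or skip list, at O(log n), and are then flushed to files on disk.
- Because the data is sorted, disk access is efficient.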

Slide 57

Slide 57 text

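Related work: ingestion efficiency of KVSs
A growing number of metrics means a growing number of KVS keys, so the efficiency of index lookups when appending data becomes the issue.
Memory-based KVS:
- Memory is good at random access, so a hash table is used; writes are O(k).
- No data is kept on disk (except the commit log).
- ✘ Memory is expensive per unit of capacity, so it is unsuitable for long-term retention.
Disk-based KVS:
- Writes go to a sorted structure in memory such as a balanced tree or skip list, at O(log n), and are then flushed to disk.
- Because the data is sorted, disk access is efficient.
- ✘ Write efficiency drops when the number of keys is large.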

Slide 58

Slide 58 text

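Summary of Contribution 2: combining ingestion efficiency with long-term retention
Comparison table on the slide (columns: TSDBMS; TSDA memory-based; TSDA disk-based; TSDA proposed method. Rows: operational complexity, ingestion efficiency, storage capacity). Cell labels in slide order: time-series compression, etc.; optimized for storing time-series data; not loosely coupled; stored on SSD/HDD; structure designed for disk access efficiency; optimized for memory, which is good at random access; only old data stored on SSD/HDD; loosely coupled; stored in memory.
Contributions:
- Designed an architecture, within the low-operational-complexity TSDA approach, that takes the strengths of both memory-based and disk-based KVSs.
- Achieved 3.98 times the ingestion performance of the disk-based approach.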

Slide 59

Slide 59 text

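Proposed method: HeteroTSDB (diagram: Client → App → memory-based KVS → Flusher → disk-based KVS)
- Memory-based KVS: a memory buffer that holds data with recent timestamps; fast ingestion based on a hash table.
- Disk-based KVS: disk storage that holds data with old timestamps; storing on SSD/HDD lowers the cost of long-term retention.
- Data is migrated from the memory-based KVS to the disk-based KVS, which achieves both goals.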

Slide 60

Slide 60 text

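Proposed method: tiering a memory-based KVS and a disk-based KVS
- Memory-based KVS (hash table, O(k)): data points arrive at M ingestions/s; each arrival looks up its key (cpu_seconds{…}, memory_total_bytes{…}, http_requests_count{…}) and inserts the point, accumulating d points per key.
- Disk-based KVS (balanced tree or skip list, O(log n)): the d data points of a key are written in one batch, which reduces the number of lookups to M / d ingestions/s.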

Slide 61

Slide 61 text

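Proposed method: timer-based migration
Diagram: keys in the memory-based KVS (cpu_seconds{…}, memory_total_bytes{…}, http_requests_count{…}) carry remaining TTLs such as 3511, 934, and 298, and move to the disk-based KVS when they expire.
- Moving data in one batch job would skew the ingestion load on the disk-based KVS.
- A TTL (Time To Live) is set per key (e.g., 3600 seconds), and the data is moved when the TTL reaches 0.
- Jitter is added when the TTL is set, which spreads out the timing of the moves.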
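The TTL-with-jitter idea can be sketched as follows. This is an illustrative model, not the HeteroTSDB implementation: the function names, the uniform jitter distribution, and the ±10% jitter width are all assumptions; the slide only states that jitter is added when the TTL is set.

```python
import random

def ttl_with_jitter(base_ttl: float, jitter_fraction: float,
                    rng: random.Random) -> float:
    """Return base_ttl shifted uniformly within +/- jitter_fraction of itself."""
    return base_ttl + rng.uniform(-jitter_fraction, jitter_fraction) * base_ttl

def keys_to_migrate(expiry_by_key: dict, now: float) -> list:
    """Keys whose TTL has run out at time `now` and should move to disk."""
    return [key for key, expiry in expiry_by_key.items() if expiry <= now]

rng = random.Random(0)   # seeded only to make the sketch reproducible
BASE_TTL = 3600.0        # the slide's example value, in seconds
JITTER = 0.1             # assumed jitter width (not stated in the source)

# All keys are created at time 0; each gets its own jittered expiry time.
expiry = {f"metric-{i}": ttl_with_jitter(BASE_TTL, JITTER, rng)
          for i in range(1000)}

early = keys_to_migrate(expiry, now=BASE_TTL * (1 - JITTER) - 1)
at_base = keys_to_migrate(expiry, now=BASE_TTL)
late = keys_to_migrate(expiry, now=BASE_TTL * (1 + JITTER) + 1)
```

Without jitter, every key created at the same moment would expire at exactly 3600 seconds; with jitter, only part of them are due at that instant and the rest follow over the surrounding window.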

Slide 62

Slide 62 text

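Evaluation setup (diagram: load generation client → DB servers)
Benchmark:
- An existing load generation tool [113] is used to reproduce the load.
- Each trial runs for 30 minutes; the TTL of the proposed method is set to 10 minutes.
Baseline:
- KairosDB, which follows the TSDA approach, is the comparison target.
- KairosDB uses Cassandra, a disk-based KVS.
Proposed method: memory KVS: Redis; disk KVS: Cassandra.
Evaluation items:
1. Comparison of ingestion efficiency
2. Comparison of ingestion efficiency as the number of metrics increases
3. Confirmation of the migration performance between KVSs in the proposed method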

Slide 63

Slide 63 text

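Evaluation 1: comparison of ingestion efficiency
Plot: ingestion throughput versus number of hosts (1 to 8); blue: KairosDB, orange: proposed method; the number of metrics is fixed at 1M.
- The proposed method (HeteroTSDB) reaches 3.98 times the baseline, at 420k datapoints/s.
- Translated to Slack's workload of 12M datapoints/s, the calculation gives 229 hosts for the proposed method and 915 hosts for KairosDB.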
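The host count for the proposed method can be reproduced with a short calculation. The sketch assumes that the 420k datapoints/s figure was measured at the largest configuration of 8 hosts and that throughput scales linearly with the number of hosts; neither assumption is stated explicitly on the slide, and the function name is hypothetical.

```python
import math

def hosts_needed(target_rate: float, measured_rate: float,
                 measured_hosts: int) -> int:
    """Hosts required for target_rate, assuming linear scaling with host count."""
    per_host_rate = measured_rate / measured_hosts
    return math.ceil(target_rate / per_host_rate)

# Slack-scale workload (12M datapoints/s) against the measured
# 420k datapoints/s, assumed here to be the 8-host result.
proposed_hosts = hosts_needed(12_000_000, 420_000, 8)
```

Under these assumptions the result is 229 hosts, in line with the slide. The KairosDB figure depends on its measured throughput, which the slide does not give directly, so it is not recomputed here.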

Slide 64

Slide 64 text

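Evaluation 2: comparison of ingestion efficiency as the number of metrics increases
Plot: ingestion throughput versus number of metrics (100 to 1,000,000); blue: KairosDB, orange: proposed method; the overall sending rate of data points is fixed. Annotated ratios: 2.32x and 3.58x.
- The proposed method scales better than the baseline as the number of metrics increases.
- This experiment did not clearly establish that index lookup is the bottleneck. More detailed profiling is needed in future work.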

Slide 65

Slide 65 text

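Evaluation 3: confirmation of the migration performance between KVSs in the proposed method
Plot: elapsed time (0 to 1800 seconds); blue: migration throughput (/s); red: memory usage of the memory-based KVS (MB); the number of hosts is fixed at 1 and the number of metrics at 1M. An annotation marks the point where TTLs expire.
- Once migration starts, the migration throughput rises immediately and the memory usage of the memory KVS falls.
- Average migration throughput (52k/s) > ingestion throughput into the memory KVS (51k/s).
- This shows that the disk KVS is not the bottleneck.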

Slide 66

Slide 66 text

Part 3 summary: Contribution 2 (Chapter 4, storage layer, time-oriented data)
Problem: a growing number of metrics increases the ingestion load.
Proposal: a method for tiering in-memory and on-disk DBs and migrating data between tiers.
Objective: combine ingestion efficiency with long-term data retention of one year or more. With operational complexity in mind, the proposed method is realized on top of existing KVSs.
Evaluation:
- When ingesting one million metrics, a 3.98x performance improvement over the baseline.
- Better scalability than the baseline over the range of 100 to one million metrics.

Slide 67

Slide 67 text

4. Proposal of a preprocessing method for automated fault localization (Contribution 3): mining layer (Chapter 5)

Slide 68

Slide 68 text

Position of this part in the overall research: mining layer, time-oriented data (Chapter 5)
A growing number of metrics increases execution time and lowers accuracy → a preprocessing method that automatically removes metrics unrelated to the failure.
(Chapter 3 covers in-kernel instrumentation for tracing; Chapter 4 covers storage tiering.)
Y. Tsubouchi and H. Tsuruta, MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications, IEEE Access, Vol. 12, pp. 37398-37417, March 2024.

Slide 69

Slide 69 text

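Background: automating fault localization with machine learning [94,96,124-136]
Flow: 1. failure detection triggers automated fault localization; 2. metrics from storage are the input; 3. the output for the operator is a ranking of the metrics that indicate the cause, e.g.:
1. memory_total_bytes{instance=host4,…}
2. disk_write_io{instance=host4,…}
3. net_transmit_bytes{instance=host1,…}
4. …
The expected execution time is on the scale of a few minutes.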

Slide 70

Slide 70 text

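Background: automating fault localization with machine learning [94,96,124-136]
Flow: 1. failure detection triggers automated fault localization; 2. metrics from storage are the input; 3. the output for the operator is a ranking (1. …, 2. …, 3. …).
Machine learning in this setting:
- There is no dataset that contains a large number of pairs of metrics and root causes.
- Unsupervised learning is mainly used.
- An anomaly score is computed for each metric.
- Anomaly propagation between metrics is captured.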

Slide 71

Slide 71 text

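Background: the problem of performance degradation in fault localization [94,96,124-136]
Flow: 1. failure detection triggers automated fault localization (machine learning); 2. a growing number of metrics from storage are the input; 3. the output for the operator is a ranking.
As the number of metrics grows, both accuracy and execution time get worse. [23,24]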

Slide 72

Slide 72 text

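Background: the problem of performance degradation in fault localization [94,96,124-136]
Flow: 1. failure detection triggers the pipeline; 2. a growing number of metrics from storage pass through feature reduction and then into automated fault localization (machine learning); 3. the output for the operator is a ranking.
- As the number of metrics grows, both accuracy and execution time get worse. [23,24]
- Feature reduction removes the metrics that act as noise. [23,87]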

Slide 73

Slide 73 text

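Background: problem definition of feature reduction (ours)
Fig. 5.2: Three types of metrics on anomaly propagation for a failure. (A model of anomaly propagation at metric granularity after a fault occurs, starting from the root cause.)
- M_A: metrics on which the impact appears directly
- M_B: metrics on which the impact appears indirectly
- M_C: unaffected metrics
Problem: once a failure is detected, identify M_A ∪ M_B as quickly as possible.
The input is a fixed time range measured from the moment right after failure detection (typically 30 to 60 minutes).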

Slide 74

Slide 74 text

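Background: existing feature reduction and its issues
Anomaly-based reduction [23,124,131]: removes time series that show no anomaly.
- Issue: it can pick up anomalies outside the failure period, so series that should have been removed are kept (false positives).
Redundancy-based reduction [87,129,133]: removes time series with high correlation or shape similarity.
- Issue: root-cause metrics (M_A) tend to resemble one another, so they can be removed by mistake (false negatives).
Diagram labels: failure period; time series that should be removed.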

Slide 75

Slide 75 text

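Background: existing feature reduction and its issues
Anomaly-based reduction: removes time series that show no anomaly (risk of false positives).
Redundancy-based reduction: removes duplicates among time series with high correlation or shape similarity (risk of false negatives).
Both are local: they handle only the anomaly or redundancy that appears in a subset of the metrics.
Global view: the goal is to capture each metric's relevance to the "failure" of the system as a whole.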

Slide 76

Slide 76 text

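Background: observation and assumption (excerpt from Fig. 5.1: Change points in root fault metric.)
Observation: change points caused by the fault appear at times close to one another, around the fault occurrence time.
Assumption: the time range in which change points are most concentrated is the failure period.
Idea: capture the global failure from local features.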

Slide 77

Slide 77 text

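Overview of Contribution 3
- This work proposed a feature reduction method that captures the failure globally.
- The proposed method achieved the best accuracy among the compared methods and improved end-to-end accuracy and execution efficiency.
- It is the first study to quantitatively compare heterogeneous feature reduction methods.
Comparison table on the slide (columns: method, category, learning type, captured feature, globality):
- Methods: FluxInfer-AD, BIRCH, K-S test, NSigma, PairCorr, k-Shape, HDBS+SBD, MetricSifter
- Categories: anomaly-based, redundancy-based
- Learning types: semi-supervised (the normal period is specified), unsupervised
- Captured features: Euclidean distance between the normal and anomalous periods, changes in distribution and outliers, Pearson correlation, shape similarity, change points
- Globality: ✘ for the existing groups, ✔ for MetricSifter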

Slide 78

Slide 78 text

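Proposal: failure-oriented feature reduction, MetricSifter
1. Detect "change points" in each time series as local events.
2. Identify the "time range of the failure" as the global event: the highest peak of the distribution of change-point times over time t.
3. If a series has a change point within the "time range of the failure", keep it; otherwise remove it.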

Slide 79

Slide 79 text

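How does the proposed method work?
Fig. 5.5: An example of feature reduction using the MetricSifter framework.
STEP 1: For each time series, detect candidate change points that may stem from the fault.
STEP 2: Split time into segments based on the distribution of change-point times.
STEP 3: Select the segment with the highest density.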

Slide 80

Slide 80 text

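Proposed method, STEP 1: change point detection on univariate time series
From the existing framework for change point detection [152], the components suited to this domain are selected.
(1) Cost function (the kind of change to detect): L2 model (mean shift).
(2) Search method (the algorithm that searches for change points): the Pelt method, which finds the exact solution while pruning under certain conditions for speed.
(3) Penalty term (constrains the number of detected change points): determined heuristically based on BIC, with an additional ad hoc coefficient ω introduced in this work.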
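To make the roles of the cost function and the penalty concrete, here is a minimal sketch of penalized change point detection with the L2 (mean-shift) cost. It uses plain optimal partitioning by dynamic programming, which solves the same objective as PELT but without PELT's pruning, so it runs in O(n²). The `penalty` argument is a plain number here; the BIC-based penalty with the coefficient ω described on the slide is not reproduced, and all function names are hypothetical.

```python
def l2_cost(prefix, prefix_sq, start, end):
    """Sum of squared deviations from the mean over signal[start:end]."""
    n = end - start
    total = prefix[end] - prefix[start]
    total_sq = prefix_sq[end] - prefix_sq[start]
    return total_sq - total * total / n

def detect_change_points(signal, penalty):
    """Return sorted change point indices minimizing L2 cost + penalty per segment."""
    n = len(signal)
    prefix, prefix_sq = [0.0], [0.0]
    for x in signal:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    best = [0.0] * (n + 1)   # best[t]: minimal penalized cost of signal[:t]
    best[0] = -penalty       # so that a single segment costs its L2 cost only
    last = [0] * (n + 1)     # last[t]: start index of the final segment of signal[:t]
    for t in range(1, n + 1):
        best[t], last[t] = min(
            (best[s] + l2_cost(prefix, prefix_sq, s, t) + penalty, s)
            for s in range(t)
        )

    # Walk the segment starts backwards to recover the change points.
    change_points = []
    t = n
    while last[t] > 0:
        t = last[t]
        change_points.append(t)
    return sorted(change_points)

step_signal = [0.0] * 20 + [5.0] * 20   # a single mean shift at index 20
flat_signal = [1.0] * 30                # no change at all
step_cps = detect_change_points(step_signal, penalty=10.0)
flat_cps = detect_change_points(flat_signal, penalty=10.0)
```

A larger penalty yields fewer change points and a smaller one yields more, which is the trade-off the penalty term controls.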

Slide 81

Slide 81 text

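Proposed method, STEP 2/3: estimating the density distribution of change points and splitting it
Fig. 5.6: An example of segmentation.
(1) Density estimation: kernel density estimation (KDE) is used to generate a discretized density distribution.
(2) Segmentation: boundaries are drawn at the local minima (the figure is split into 10 segments).
(3) The segment with the highest density is selected.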
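The two steps can be sketched end to end as below. This is an illustration under stated assumptions rather than the MetricSifter implementation: a Gaussian kernel with a hand-picked bandwidth is used, the density is evaluated on an integer time grid, and "highest density" is approximated by the segment that contains the most change points. All function names and the sample data are hypothetical.

```python
import math

def gaussian_kde(points, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `points` on `grid`."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - p) / bandwidth) ** 2) for p in points)
            for g in grid]

def local_minima(density):
    """Grid indices where the density stops falling and starts rising."""
    return [i for i in range(1, len(density) - 1)
            if density[i] < density[i - 1] and density[i] <= density[i + 1]]

def select_failure_window(change_points, n_steps, bandwidth):
    """Split [0, n_steps) at density minima; return the busiest segment."""
    density = gaussian_kde(change_points, list(range(n_steps)), bandwidth)
    bounds = [0] + local_minima(density) + [n_steps]
    segments = list(zip(bounds[:-1], bounds[1:]))
    return max(segments,
               key=lambda seg: sum(seg[0] <= cp < seg[1] for cp in change_points))

def sift(change_points_by_metric, n_steps, bandwidth=2.0):
    """Keep metrics that have a change point inside the selected window."""
    all_cps = [cp for cps in change_points_by_metric.values() for cp in cps]
    if not all_cps:
        return []
    lo, hi = select_failure_window(all_cps, n_steps, bandwidth)
    return [name for name, cps in change_points_by_metric.items()
            if any(lo <= cp < hi for cp in cps)]

# Hypothetical STEP 1 output: change point indices per metric.
detected = {"a": [50], "b": [51], "c": [52], "d": [10], "e": []}
kept = sift(detected, n_steps=100)
```

In this toy input the change points of `a`, `b`, and `c` cluster around t = 50, so that segment is selected; `d`, whose change point lies far away, and `e`, which has none, are dropped.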

Slide 82

Slide 82 text

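Evaluation setup
Datasets (132 datasets in total):
- Synthetic: numerical simulation of failures
- Empirical: failures reproduced by injecting faults into two standard benchmark applications
Baselines:
- The group of anomaly-based reduction methods
- The group of redundancy-based reduction methods
Evaluation items:
1. Correctness of feature reduction on its own
2. End-to-end accuracy and execution time
3. Parameter sensitivity and ablation study
Evaluation metrics:
- Feature reduction: standard metrics for classification problems (Recall / Specificity / Balanced Accuracy)
- End-to-end: the fraction of cases in which the ranking output contains the correct answer (a standard metric)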

Slide 83

Slide 83 text

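Evaluation 1: feature reduction on its own (synthetic data)
Plot: accuracy per feature reduction method.
- MetricSifter reached a mean accuracy of 0.981, the best value.
- The redundancy-based group scored low overall, because time series within M_A ∪ M_B that are similar or correlated get removed.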

Slide 84

Slide 84 text

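Evaluation: combinations of feature reduction and fault localization methods
Feature reduction:
- Proposed method
- The group of anomaly-based reduction methods
- The group of redundancy-based reduction methods
- None
Automated fault localization:
- Random Selection
- CallGraph + PageRank
- PC + PageRank
- PC + HT
- LiNGAM + PageRank
- LiNGAM + HT
- RCD
All combinations were tested.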

Slide 85

Slide 85 text

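Evaluation 2: end-to-end evaluation (synthetic data, excerpt)
Plot panels: PC+HT, random selection; a line marks the median accuracy.
Overall result (mean top-5 accuracy over the combinations with all fault localization methods):
Method         Accuracy   Note
Ideal          0.344      ideal value
MetricSifter   0.299      best
NSigma         0.241      runner-up
None           0.175      without feature reduction
MetricSifter achieves accuracy close to the ideal method.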

Slide 86

Slide 86 text

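Evaluation 2: end-to-end evaluation (empirical data), SS-small dataset, 64 metrics
Plot: box plot: top-5 accuracy; line: execution time; a line marks the median accuracy. A representative subset of the combinations is shown.
- MetricSifter has the best top-5 accuracy, and its execution efficiency is higher than that of anomaly-based reduction.
- Redundancy-based reduction (HDBS-SBD / HDBS-R) has the best execution time but the lowest accuracy.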

Slide 87

Slide 87 text

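Evaluation 2: details on empirical data (large scale, >100 metrics)
Datasets: SS-medium (184 metrics), SS-large (1312), TT-small (383), TT-medium (1349).
- Only one fault localization method (RCD) finished within a realistic time (3600 seconds).
- The others did not finish within a realistic time because their fault localization algorithms are not parallelized.
- With more than 1000 metrics, accuracy was very low in every case.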

Slide 88

Slide 88 text

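Evaluation 3: parameter sensitivity and ablation study
Plot: blue: full MetricSifter; brown: MetricSifter with STEP 1 only.
- When the STEP 1 (change point detection) parameter ω is low, correctness drops. However, STEP 2/3 recover the accuracy.
- When the parameter is set appropriately, the difference in accuracy is small. This is attributed to change point detection being too accurate on clean synthetic data.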

Slide 89

Slide 89 text

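Discussion: cross-domain applicability (added in response to the preliminary review)
Conditions under which the proposed method applies:
1. The effect can be detected as change points in time-series data.
2. The impact of an anomaly propagates within the system (nervous systems, power grids, virus infection, weather, etc.).
3. The propagation time is reasonably short and has small variance.
Problems of the same form in fields other than information and communications: [173] [174] [175]
- Robotics: analysis of sensor data from machines (temperature, vibration, current, pressure)
- Space engineering: monitoring of satellite systems (high-dimensional data with hundreds to thousands of variables)
- Medicine: analysis of biosignals to detect sudden changes in a patient's condition
In meteorology, propagation takes days to months, so the method may not be applicable. [140]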

Slide 90

Slide 90 text

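Part 4 summary: Contribution 3 (Chapter 5, mining layer, time-oriented data)
Problem: a growing number of metrics increases execution time and lowers accuracy.
Proposal: a preprocessing method that automatically removes metrics unrelated to the failure.
- The first study to carry out a quantitative comparative evaluation of feature reduction.
- Proposed a method that captures the global failure from a set of local change points.
- Synthetic data: best accuracy; end-to-end accuracy improved by 24%.
- Empirical data: end-to-end, improved accuracy, execution efficiency, or both.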

Slide 91

Slide 91 text

5. Conclusion (Chapter 6)

Slide 92

Slide 92 text

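Conclusion: telemetry workload scaling
Diagram labels: users, Internet, cloud application, telemetry system (instrumentation, storage, mining), operator; resource consumption rises as the workload grows.
- Contribution 1: a low-overhead instrumentation method based on bundling network flows inside the kernel (CPU utilization at or below 2.2%, RTT overhead at most 6 μs).
- Contribution 2: a tiered architecture of heterogeneous KVSs that combines ingestion efficiency with long-term retention (up to 3.98x higher throughput than before).
- Contribution 3: a feature reduction method that exploits the temporal concentration of change points in failure-related metrics (on average +4.5% accuracy and a 45-52% improvement in average execution time compared with before).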

Slide 93

Slide 93 text

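Conclusion: design guidelines for telemetry systems (added in response to the preliminary review)
Basic principle: data reduction such as sampling, aggregation, and feature reduction should be applied where context is rich (instrumentation and mining).
- Instrumentation, application context: processes, sockets, transactions, etc. Contribution 1 aggregates on the basis of sockets.
- Storage: no data reduction; aim to improve the utilization efficiency of computing resources instead.
- Mining, operational context: failures, alerts, etc. Contribution 3 reduces features on the basis of failure occurrence.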

Slide 94

Slide 94 text

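Conclusion: significance of this research
Academic contribution:
- With "keeping operational complexity low" as the constraint, this work formulated its own problem, called "telemetry workload scaling".
- It divided the telemetry system into three layers, organized the challenges of each layer, and presented technical proposals to solve them.
Social significance:
- As digital transformation (DX) accelerates and online services scale out, the workload of telemetry systems will keep growing.
- With finite computing and human resources, improving the processing efficiency of telemetry workloads while reducing operational complexity is necessary.
- This work is expected to contribute to reducing operators' effort and improving service reliability.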

Slide 95

Slide 95 text

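Deployment of this research in practice
- Instrumentation layer (Contribution 1): released as a Go library. ※2
- Storage layer (Contribution 2): already applied as the architecture of a server monitoring SaaS. ※1
- Mining layer (Contribution 3): released as a Python library. ※3
- Adoption at the author's current employer is under consideration.
※2 and ※3 have no production use cases yet, so promotion activities are planned.
※1 https://mackerel.io/ja/blog/entry/weekly/20180126
※2 https://github.com/yuuki/go-conntracer-bpf
※3 https://github.com/ai4sre/metricsifter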

Slide 96

Slide 96 text

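Future work
1. From Collect-First to Use-First (overall optimization of the three telemetry layers): research on closed-loop systems that feed back data usage patterns and adapt automatically to collect only the data that is needed.
2. Failure management with LLMs (workload scaling of the mining layer for new technology): for automating fault localization with LLMs, research on methods that generate a "failure snapshot" through reduction and compression that respect the prompt length limit.
3. Telemetry for distributed deep learning infrastructure (systems other than cloud applications): research on new telemetry systems for optimizing distributed training workloads and improving fault tolerance in large GPU clusters.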

Slide 97

Slide 97 text

Research achievements: awards
- Best Paper Award, IPSJ Internet and Operation Technology Symposium 2020: 坪内佑樹, 鶴田博文, 古川雅大, TSifter: マイクロサービスにおける性能異常の迅速な診断に向いた時系列データの次元削減手法, December 2020.
- Best Presentation Award, IPSJ Internet and Operation Technology Symposium 2020: 坪内佑樹, TSifter: マイクロサービスにおける性能異常の迅速な診断に向いた時系列データの次元削減手法, December 2020.
- IPSJ Yamashita SIG Research Award, FY2020: 坪内佑樹, Transtracer: 分散システムにおけるTCP/UDP通信の終端点の監視によるプロセス間依存関係の自動追跡, 2020.
- Best Paper Award, IPSJ Internet and Operation Technology Symposium 2019 (IOTS2019): 坪内佑樹, 古川雅大, 松本亮介, Transtracer: 分散システムにおけるTCP/UDP通信の終端点の監視によるプロセス間依存関係の自動追跡, December 2019.
- Sponsor award (シー・オー・コンヴ賞), IPSJ Internet and Operation Technology Symposium 2019 (IOTS2019): 坪内佑樹, 古川雅大, 松本亮介, Transtracer: 分散システムにおけるTCP/UDP通信の終端点の監視によるプロセス間依存関係の自動追跡, December 2019.

Slide 98

Slide 98 text

Research achievements: journal papers and international conferences
Journal papers:
- (Contribution 1) Y. Tsubouchi, M. Furukawa, R. Matsumoto, Low Overhead TCP/UDP Socket-based Tracing for Discovering Network Services Dependencies, Journal of Information Processing (JIP), Vol.30, pp.260-268, March 2022.
- (Contribution 2) 坪内佑樹, 脇坂朝人, 濱田健, 松木雅幸, 小林隆浩, 阿部博, 松本亮介, HeteroTSDB: 異種分散KVS間の自動階層化による高性能な時系列データベース, IPSJ Journal (情報処理学会論文誌), Vol.62, No.3, pp.818-828, March 2021.
- (Contribution 3) Y. Tsubouchi and H. Tsuruta, MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications, IEEE Access, Vol. 12, pp. 37398-37417, March 2024.
- 坪内佑樹, 伊野文彦, 置田真生, 山川聡, 柏木岳彦, 萩原兼一, 重複排除ストレージのためのSHA-1計算システムのSSE命令による高スループット化, IEICE Transactions D (電子情報通信学会論文誌 D), 96(10), pp.2101-2109, October 2013.
International conferences:
- (Contribution 1) Y. Tsubouchi, M. Furukawa, R. Matsumoto, Transtracer: Socket-Based Tracing of Network Dependencies among Processes in Distributed Applications, The 1st IEEE International COMPSAC Workshop on Advanced IoT Computing (AIOT 2020), July 2020.
- (Contribution 2) Y. Tsubouchi, A. Wakisaka, K. Hamada, M. Matsuki, H. Abe, R. Matsumoto, HeteroTSDB: An Extensible Time Series Database for Automatically Tiering on Heterogeneous Key-Value Stores, The 43rd Annual IEEE International Computers, Software & Applications Conference (COMPSAC), pp. 264-269, July 2019.

Slide 99

Slide 99 text

99 ݚڀۀ੷ɹࠃ಺γϯϙδ΢Ϝʢࠪಡ෇ʣ ɾ ʢߩݙ̏ʣ௶಺༎थ, ௽ాതจ, ݹ઒խେ, TSifter: ϚΠΫϩαʔϏεʹ͓͚Δੑೳҟৗͷਝ଎ͳ਍அʹ޲͍ͨ࣌ ܥྻσʔλͷ࣍ݩ࡟ݮख๏, ৘ใॲཧֶձΠϯλʔωοτͱӡ༻ٕज़γϯϙδ΢Ϝ࿦จू, 2020, 9-16 (2020- 11-26), 2020೥12݄. ɾ ௶಺༎थ, ੨ࢁਅ໵, MeltriaɿϚΠΫϩαʔϏεʹ͓͚Δҟৗݕ஌ɾݪҼ෼ੳͷͨΊͷσʔληοτͷಈతੜ੒ γεςϜ, ৘ใॲཧֶձΠϯλʔωοτͱӡ༻ٕज़γϯϙδ΢Ϝ࿦จू, 2021, 63-70 (2021-11-18), 2021೥11݄. ɾ ྛ༑Ղ, দݪࠀ໻, ࿯๺ݡ, ௶಺༎थ, Situation Awarenessͱೝ஌৺ཧֶʹ΋ͱ͍ͮͨϚΠΫϩαʔϏεܕγες Ϝ޲͚؂ࢹμογϡϘʔυͷઃܭ, ৘ใॲཧֶձΠϯλʔωοτͱӡ༻ٕज़γϯϙδ΢Ϝ࿦จू, 2021, 97-98 (2021-11-18), 2021೥12݄. ɾ ௽ాതจ, ௶಺༎थ, ෼ࢄγεςϜͷੑೳҟৗʹର͢Δػցֶशͷղऍੑʹجͮ͘ݪҼ਍அख๏, ৘ใॲཧֶձ Πϯλʔωοτͱӡ༻ٕज़γϯϙδ΢Ϝ࿦จू, 2021, 24-31 (2021-11-18), 2021೥11݄. ɾ ʢߩݙ̍ʣ௶಺༎थ, ݹ઒խେ, দຊ྄հ, Transtracer: ෼ࢄγεςϜʹ͓͚ΔTCP/UDP௨৴ͷऴ୺఺ͷ؂ࢹʹΑ Δϓϩηεؒґଘؔ܎ͷࣗಈ௥੻, Πϯλʔωοτͱӡ༻ٕज़γϯϙδ΢Ϝ࿦จू, 2019, 64-71 (2019-11-28), 2019೥12݄. ɾ ʢߩݙ̎ʣ௶಺༎थ, ࿬ࡔேਓ, ᖛా݈, দ໦խ޾, Ѩ෦ത, দຊ྄հ, HeteroTSDB: ҟछࠞ߹Ωʔ バ ϦϡʔετΞ Λ༻͍ͨࣗಈ֊૚ԽͷͨΊͷ࣌ܥྻ デ ʔλ ベ ʔεΞʔΩςΫνϟ, ৘ใॲཧֶձΠϯλʔωοτͱӡ༻ٕज़γϯ ϙδ΢Ϝ࿦จू, 2018, 7-15 (2018-11-29), 2018೥12݄.

Slide 100

Slide 100 text

100 ݚڀۀ੷ɹࠃ಺ձٞ࿥ʢࠪಡͳ͠ʣ ɾ ྛ༑Ղ, দݪࠀ໻, ࿯๺ݡ, ௶಺༎थ, ϚΠΫϩαʔϏεܕγεςϜͷ؂ࢹʹ͓͚ΔμογϡϘʔυUIઃܭʹىҼ ͢Δঢ়گೝࣝ΁ͷӨڹ, No.2022-IOT-56, Vol.38, pp.1-8, 2022೥3݄. ɾ দຊ྄հ, ௶಺༎थ, ΫϥΠΞϯτϓϩηεͷݖݶ৘ใʹجͮ͘TCPΛհͨ͠ಁաతͳݖݶ෼཭ํࣜͷઃܭ, ৘ ใॲཧֶձݚڀใࠂΠϯλʔωοτͱӡ༻ٕज़ʢIOTʣ, No.2020-IOT-49, Vol.11, pp.1-6, 2020೥5݄. ɾ ྛ༑Ղ, ҏ੎ా࿇, দݪࠀ໻, ࿯๺ݡ, ௶಺༎थ, দຊ྄հ, ಈతదԠੑΛ࣋ͭ෼ࢄγεςϜΛର৅ͱͨ͠γεςϜ ঢ়ଶՄࢹԽख๏ͷݕ౼, ৘ใॲཧֶձݚڀใࠂΠϯλʔωοτͱӡ༻ٕज़ʢIOTʣ, No.2020-IOT-48, Vol.22, pp.1-8, 2020೥3݄. ɾ ௶಺༎थ, ݹ઒խେ, দຊ྄հ, ௒ݸମܕσʔληϯλʔΛ໨ࢦͨ͠ωοτϫʔΫαʔϏεؒґଘؔ܎ͷࣗಈ௥ ੻ͷߏ૝, ϚϧνϝσΟΞɺ෼ࢄɺڠௐͱϞόΠϧʢDICOMO2019ʣγϯϙδ΢Ϝ, 6A-2, pp. 1169-1174, 2019 ೥7݄. ɾ ௶಺༎थ, দຊ྄հ, ௒ݸମܕσʔληϯλʔʹ͓͚Δ෼ࢄڠௐΫΤϦΩϟογϡߏ૝, ৘ใॲཧֶձݚڀใࠂ Πϯλʔωοτͱӡ༻ٕज़ʢIOTʣ, No.2019-IOT-45, Vol.14, pp.1-7, 2019೥5݄. ɾ দຊ྄հ, ௶಺༎थ, ٶԼ߶ี, ෼ࢄܕσʔληϯλʔOSΛ໨ࢦͨ͠ϦΞΫςΟϒੑΛ࣋ͭίϯςφ࣮ߦج൫ٕ ज़, ৘ใॲཧֶձݚڀใࠂΠϯλʔωοτͱӡ༻ٕज़ʢIOTʣ, No.2019-IOT-45, Vol.12, pp.1-8, 2019೥3݄.

Slide 101

Slide 101 text

Appendix

Slide 102

Slide 102 text

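Research summary: Scaling Telemetry Workloads in Cloud Applications
Background and objective: toward a telemetry system that scales efficiently
1. Telemetry for cloud applications: applications are becoming more complex, and operations management through telemetry is essential.
2. Growth of telemetry workloads: in telemetry systems, workloads are growing in each of the instrumentation, storage, and mining layers.
3. Telemetry workload scaling: the objective is to scale efficiently in the face of problems such as rising consumption of computing resources, while taking operational complexity into account.
Challenges: problems caused by growing telemetry workloads
1. Instrumentation (growing measurement overhead): when short-lived network communication increases, conventional measurement incurs a high cost for transferring data out of the OS kernel being measured.
2. Storage (growing ingested data volume and long-term retention): as the number of metrics grows, it is difficult to combine better ingestion efficiency with long-term retention of one year or more.
3. Mining (lower accuracy and execution efficiency of fault localization): as the number of metrics grows, existing feature reduction fails to capture the failure of the system as a whole, and false positives and false negatives increase.
Contributions: solving the challenges that arise in each layer as workloads grow
1. More efficient measurement [1]: bundling TCP/UDP communication events inside the OS kernel improves transfer efficiency.
2. More efficient ingestion and long-term retention [2]: tiering heterogeneous KVSs achieves both efficient index lookups and storage on inexpensive media.
3. Removing failure-unrelated variables as preprocessing for fault localization [3]: feature reduction that exploits the concentration in time of each series' change points when a failure occurs improves fault localization accuracy and time.
[1] Y. Tsubouchi, M. Furukawa, R. Matsumoto, Low Overhead TCP/UDP Socket-based Tracing for Discovering Network Services Dependencies, Journal of Information Processing (JIP), Vol.30, pp.260-268, March 2022.
[2] 坪内佑樹, 脇坂朝人, 濱田健, 松木雅幸, 小林隆浩, 阿部博, 松本亮介, HeteroTSDB: 異種分散KVS間の自動階層化による高性能な時系列データベース, IPSJ Journal (情報処理学会論文誌), Vol.62, No.3, pp.818-828, March 2021.
[3] Y. Tsubouchi and H. Tsuruta, MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications, IEEE Access, Vol. 12, pp. 37398-37417, March 2024.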