Slide 1

Slide 1 text

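A Concept for a Method to Immediately Diagnose the Causes of Anomalies in Distributed Applications
Yuuki Tsubouchi (D1), Okabe-Miyazaki Laboratory
Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University
Research meeting, May 7, 2020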

Slide 2

Slide 2 text

Outline
1. Background and objectives
2. Related work
3. Proposed method
4. Planned experiments
5. Summary and future plans

Slide 3

Slide 3 text

1. Background and objectives

Slide 4

Slide 4 text

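The difficulty of identifying the causes of anomalies in distributed applications
・Distributed architectures that combine small services, rather than building a single huge application, are on the rise
・Because many components communicate with one another to operate, identifying the cause of an anomaly becomes difficult
  ・Complex network dependencies
  ・Dynamic changes in dependencies
  ・A large number of metrics monitored per component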

Slide 5

Slide 5 text

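Requirements for identifying the cause of an anomaly
・Low noise: system administrators should be notified only of alerts on symptoms that adversely affect users
  ・Even 100% CPU utilization does not necessarily affect users
  ・As noise increases, the administrators' cognitive load rises, which leads to alerts being overlooked
・Immediacy: once an anomaly is detected, its cause should be identified immediately
  ・Immediacy should be maintained even as the number of components and the number of metrics grow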

Slide 6

Slide 6 text

Issues in prior work
・Root cause analysis based on per-request tracing [1]
  ・Instrumentation code must be added to the application
・Root cause analysis using the dependency graph between components together with metrics [2][3]
  ・When dependencies or the workload change, the dependency extraction process must be redone => immediacy is not satisfied
  ・Alternatively, an indicator representing the service level (SLO) must be designed and monitored for each component, which takes effort
[1]: H. Jayathilaka, C. Krintz, and R. Wolski, "Performance monitoring and root cause analysis for cloud-hosted web applications," WWW, pp. 469-478, 2017.
[2]: J. Thalheim, A. Rodrigues, I. Akkus, et al., "Sieve: Actionable Insights from Monitored Metrics in Distributed Systems," ACM/IFIP/USENIX Middleware, pp. 14-27, 2017.
[3]: J. Lin, P. Chen, and Z. Zheng, "Microscope: Pinpoint performance issues with causal graphs in micro-service environments," ICSOC, pp. 3-20, 2018.

Slide 7

Slide 7 text

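Research objectives
1. Low noise, and less effort spent designing indicators for each component
  ↪ Monitor only the indicators that represent the service (response time, response error rate, and so on)
  ・Triggered by a monitoring alert, explore the TCP connection dependency graph and the metrics on each node
  ・Treat metrics whose time-series features correlate with the representative indicator as candidate causes
2. Immediacy
  ・Analyzing every metric on every node would make the processing slow
  ↪ Speed up metric retrieval by keeping metrics distributed across the nodes and querying them there
  ↪ Speed up the time-series correlation analysis with GPGPU
  ↪ Reuse past processing results for fast decisions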

Slide 8

Slide 8 text

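Expected contributions of the research
・As long as system administrators collect metrics on each component in advance and set monitoring alerts on only a small number of representative indicators, the location of the cause can be inferred immediately when an alert fires
・For anomaly scenarios that occurred in a real environment, the proposed method achieves an accuracy of ○○% (figure not yet filled in on the slide)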

Slide 9

Slide 9 text

2. Related work

Slide 10

Slide 10 text

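Related work ①: Per-request tracing
・Assign an identifier to each request at the application layer, embed it in the subsequent requests, and propagate it to the downstream processes
・Using the identifier, the latency between any pair of components can be measured
・Advantage: detailed tracing inside the application is possible
・Issue: the application code must be changed to take the measurements
B. H. Sigelman, et al., "Dapper, a Large-Scale Distributed Systems Tracing Infrastructure," Technical report, Google, 2010.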
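The identifier propagation described above can be sketched roughly as follows. This is a minimal illustration, not Dapper's API: the header name `x-trace-id` and the functions `start_request` and `call_downstream` are hypothetical, and the downstream component is modeled as a plain Python callable instead of a network call.

```python
import time
import uuid

TRACE_HEADER = "x-trace-id"  # hypothetical header name, chosen for this sketch


def start_request():
    """Assign a fresh identifier to an incoming request."""
    return {TRACE_HEADER: uuid.uuid4().hex}


def call_downstream(headers, component, handler, spans):
    """Embed the identifier in the outgoing request and time the call.

    `handler` stands in for the downstream component; `spans` collects
    (trace_id, component, elapsed_seconds) records.
    """
    outgoing = {TRACE_HEADER: headers[TRACE_HEADER]}  # propagate the same id
    started = time.perf_counter()
    result = handler(outgoing)
    elapsed = time.perf_counter() - started
    spans.append((headers[TRACE_HEADER], component, elapsed))
    return result
```

Even in this toy form, every call site has to pass the headers along explicitly, which is the code-change cost the slide points out.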

Slide 11

Slide 11 text

Related work ②: Sieve
・Proposes a system that derives actionable hints from an existing monitoring infrastructure
・Identifies dependencies between components through time-series cluster analysis
・Issue: to follow configuration changes and workload changes, the analysis steps must be redone from the beginning
[2]: J. Thalheim, A. Rodrigues, I. Akkus, et al., "Sieve: Actionable Insights from Monitored Metrics in Distributed Systems," ACM/IFIP/USENIX Middleware, pp. 14-27, 2017.

Slide 12

Slide 12 text

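Related work ③: Microscope
・Without adding instrumentation code to the microservices, efficiently builds a causal graph between components and infers the cause of an anomaly in real time
・Issue 1: it narrows down candidates by looking at fluctuations in each component's representative metric, but does not output other metrics that may be the cause
・Issue 2: application knowledge is required to choose the representative metric for each component
[3]: J. Lin, P. Chen, and Z. Zheng, "Microscope: Pinpoint performance issues with causal graphs in micro-service environments," ICSOC, pp. 3-20, 2018.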

Slide 13

Slide 13 text

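Summary of the issues in related work
・Related work ③, Microscope, sets monitoring alerts only on the representative indicator (SLO) of the service as a whole, so it satisfies the low-noise requirement
・Microscope satisfies the immediacy requirement by parallelizing the construction of the causal graph
・However, a representative indicator must be designed not only for the service as a whole but also for each component, and collected as time-series data
・The more components there are, the higher the cost of designing the indicators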

Slide 14

Slide 14 text

3. Proposed method

Slide 15

Slide 15 text

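Premises of the proposed method
・Build on the causal inference technique of related work ③, Microscope
・Define indicators only for the service as a whole and set monitoring alerts on them
・Use all metrics on each host
・Use a complete network dependency graph [4], with processes as nodes and TCP connections as edges, to build the causal graph of the anomaly
[4] Y. Tsubouchi, M. Furukawa, and R. Matsumoto, "Transtracer: Automatic tracing of inter-process dependencies by monitoring the endpoints of TCP/UDP communication in distributed systems" (in Japanese), Proceedings of the Internet and Operation Technology Symposium, 2019, pp. 64-71 (2019-11-28), December 2019.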

Slide 16

Slide 16 text

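Workflow of the proposed method (figure)
[Figure: a dependency graph with a Frontend node and Component nodes. Each node has multiple metrics (reqs/sec, errors/sec, latency, CPU usage, …). Step 1: detect an anomaly in the representative indicator. Step 2: build the causal graph of the anomaly from the dependency graph. Output: a root-cause candidate list of (component, metric, score) entries.]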

Slide 17

Slide 17 text

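Workflow of the proposed method
0. Monitor the representative metric of the front node and collect the dependency graph
1. Detect an SLO violation
2. Run causal inference based on the dependency graph
  ・Obtain the nodes adjacent to the front node
  ・On each node, test whether it holds metrics that correlate with the time-series features of the service-wide representative indicator
  ・If the threshold is exceeded, add the (node, metric) pair to the root-cause candidate list
  ・Keep traversing the dependency graph until all nodes have been visited
  ・Rank the candidate list using the degree of correlation as the score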
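Step 2 of the workflow above could look roughly like the following sketch. It is an illustration under several assumptions, not the proposed implementation: the dependency graph is taken to be an adjacency list, the score is the absolute Pearson correlation, the traversal is breadth-first, the front node's own metrics are tested as well, and the names `pearson`, `rank_root_cause_candidates`, and the default `threshold=0.8` are all hypothetical.

```python
import math
from collections import deque


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_root_cause_candidates(dep_graph, metrics, slo_series, front, threshold=0.8):
    """Traverse the dependency graph from `front` and rank correlated metrics.

    dep_graph:  {node: [adjacent nodes]}        (assumed adjacency-list form)
    metrics:    {node: {metric_name: [samples]}}
    slo_series: samples of the service-wide representative indicator
    Returns a list of (node, metric, score), highest score first.
    """
    candidates = []
    visited = {front}
    queue = deque([front])
    while queue:  # breadth-first until every reachable node has been visited
        node = queue.popleft()
        for name, series in metrics.get(node, {}).items():
            score = abs(pearson(slo_series, series))
            if score > threshold:  # keep only pairs that exceed the threshold
                candidates.append((node, name, score))
        for neighbor in dep_graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    candidates.sort(key=lambda c: c[2], reverse=True)
    return candidates
```

The output has the same (component, metric, score) shape as the candidate list in the workflow figure. A plain correlation score is a stand-in here; the slides leave the actual test and inference algorithm open.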

Slide 18

Slide 18 text

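Candidate algorithms for causal graph inference and testing
・Causal graph inference
  ・PC algorithm: handles only the co-occurrence of events
  ・Bayesian method: handles the temporal order between events in addition to their co-occurrence
  ・If the metric series are time-synchronized with one another, the Bayesian method may be the more appropriate choice
・Conditional independence tests for time-series data
  ・G2 test and others
・Use the correlation coefficient as the ranking score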
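To make the G2 test mentioned above concrete, here is a minimal sketch of the G2 (log-likelihood ratio) statistic, G2 = 2 Σ O ln(O / E), for two series. It simplifies heavily: it tests unconditional independence only (the slide refers to conditional independence), it discretizes each series by splitting at its median, and the helper names `binarize`, `g2_statistic`, and `g2_p_value_2x2` are hypothetical. The p-value helper applies only to a 2x2 table (1 degree of freedom).

```python
import math
from collections import Counter
from statistics import median


def binarize(series):
    """Map each sample to 1 if it is above the series median, else 0."""
    m = median(series)
    return [1 if v > m else 0 for v in series]


def g2_statistic(xs, ys):
    """G2 = 2 * sum(O * ln(O / E)) over the contingency table of xs and ys."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed counts O for each (x, y) cell
    count_x = Counter(xs)
    count_y = Counter(ys)
    total = 0.0
    for (x, y), observed in joint.items():  # empty cells contribute nothing
        expected = count_x[x] * count_y[y] / n  # E under independence
        total += observed * math.log(observed / expected)
    return 2.0 * total


def g2_p_value_2x2(g2):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(g2, 0.0) / 2.0))
```

A small p-value suggests the two discretized series are not independent. A conditional variant would compute the same statistic within each stratum of the conditioning variables and sum the results.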

Slide 19

Slide 19 text

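Proposed speed-ups (just ideas)
・Speed up the statistical tests on time-series data
  ・Exploit the data parallelism across series and process them in parallel on a GPU
・Speed up metric retrieval
  ・Keep only the most recent data locally on each node and query the nodes in a distributed fashion
・Based on past inference results, decide whether the current anomaly is the same case as an earlier one and, if so, reuse the past inference result
  ・For situations where few anomaly cases are available, deliberately inject anomalies into the real environment and let the system learn from them in advance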
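The idea of keeping only recent data on each node and querying the nodes in a distributed way might be sketched as follows. This is an in-process illustration under stated assumptions: `NodeMetricStore` and `query_all` are hypothetical names, the retention limit of 300 samples is arbitrary, and a real deployment would replace the direct method call with a network request to each node.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class NodeMetricStore:
    """Hypothetical per-node store that retains only the most recent samples."""

    def __init__(self, max_samples=300):
        self.max_samples = max_samples
        self._series = {}

    def append(self, metric, value):
        # deque(maxlen=...) silently drops the oldest sample once it is full.
        buf = self._series.setdefault(metric, deque(maxlen=self.max_samples))
        buf.append(value)

    def query(self):
        """Return a snapshot of the retained samples for every metric."""
        return {name: list(buf) for name, buf in self._series.items()}


def query_all(stores):
    """Query every node concurrently and gather the results per node."""
    with ThreadPoolExecutor(max_workers=max(1, len(stores))) as pool:
        futures = {node: pool.submit(store.query) for node, store in stores.items()}
        return {node: future.result() for node, future in futures.items()}
```

Because each node answers only for its own recent window, the time to gather all series is bounded by the slowest node instead of growing with the total number of metrics, which is the effect the slide is after.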

Slide 20

Slide 20 text

4. Planned experiments

Slide 21

Slide 21 text

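Evaluation items
・Immediacy
  ・How quickly candidate root-cause locations can be found after an anomaly is detected
  ・Evaluate how the execution time changes as the number of components and the number of metrics grow
・Accuracy of root-cause inference
  ・Compare with related work ③, Microscope, and check whether there is a difference in accuracy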

Slide 22

Slide 22 text

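Evaluation baselines
・Immediacy
  ・Related work ③, Microscope, takes 12 seconds when the analysis runs on a single physical machine
  ・Outage impact time is generally reported to users in units of minutes, so a recovery delay of up to 60 seconds is acceptable
・Precision and recall
  ・Related work ③, Microscope, achieves a precision of 88% and a recall of 85%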

Slide 23

Slide 23 text

5. Summary and future plans

Slide 24

Slide 24 text

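Summary and future plans
・Proposed a method that, when an anomaly occurs in a distributed application, infers the location of the cause by designing and monitoring only a small number of representative indicators, while achieving both immediacy and low noise
・Future plans
  ・Run a preliminary experiment to see how much the inference time of the proposed method can be shortened
  ・Organize the progress made by then and present it at IOT50 (deadline: June 18)
  ・If good results are obtained, present them at IOTS2020 (deadline: early September)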

Slide 25

Slide 25 text

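Future plans
・Run a preliminary experiment to see how much the inference time of the proposed method can be shortened
・Organize the progress made by then and present it at IOT50 (deadline: June 18)
・If good results are obtained, present them at IOTS2020 (deadline: early September)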