The Future Vision of Reliability Engineering in the Cloud for the AI Era / DICOMO2022 6A-1

DICOMO2022 6A Unified Session: Cloud, Invited Talk

https://tsys.jp/dicomo/2022/program/program_abst.html#6A-1

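A long-standing challenge is how to sustain necessary and sufficient reliability while continually and frequently adding the features that users of information services need. Site Reliability Engineering (SRE), a new way of operating information services proposed by Google and arguably one answer to this challenge, is spreading. This talk organizes the core concepts of SRE and then, looking toward the AI era, envisions a future form of operations centered on dialogue with AI.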

Yuuki Tsubouchi (yuuk1)

July 14, 2022
Transcript

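  1. The Future Vision of Reliability Engineering in the Cloud for the AI Era
     Yuuki Tsubouchi (※1, ※2), Hirofumi Tsuruta (※1)
     2022/07/14, DICOMO 2022 invited talk
     ※1 SAKURA internet Research Center
     ※2 Graduate School of Informatics, Kyoto University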
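  2. Profile: Yuuki Tsubouchi (https://yuuk.io/, @yuuk1t)
     - Researcher, SAKURA internet Research Center
     - Third-year doctoral student, Graduate School of Informatics, Kyoto University
     - Technology advisor, Topotal
     - Worked as an engineer at Hatena Co., Ltd. for five years after graduating
     - Moved to SAKURA internet in 2019 and entered the world of research and development
     - Entered the doctoral program at Kyoto University in 2020
     Research theme: highly efficient collection of operational data, and failure root-cause diagnosis based on statistical analysis and machine learning, for higher reliability in the cloud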
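  3. Through involvement in business at a company, I feel that it is becoming more important to present an outlook on the future from a researcher's standpoint.
     This talk therefore organizes the present state of engineering for the reliability of information systems and envisions an approach to reliability for the coming AI era.
     I would be glad if you take something home as a seed for your own future research. I also hope to grow this vision together with the community.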

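  4. Agenda
     1. Reliability engineering in the cloud (where are we now?)
     2. The future of reliability engineering in the AI era (where do we want to be 20 years from now?)
     3. Reliability engineering through collaboration with AI (what path closes the gap between the future and the present?)
     4. Closing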
  5. Agenda (repeated), introducing section 1: Reliability engineering in the cloud
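  6. The importance of reliability in information systems
     - Just before this talk, large-scale failures occurred in the systems of Cloudflare (June) and KDDI (July)
     - When people run into a failure, they no longer know whether automated systems can be trusted, and they become afraid of depending on them
     - Meanwhile, information engineers work hard every day to maintain the reliability of information systems
     - As DX accelerates, tackling reliability problems is important
     - In information systems, "reliability is the most fundamental feature" [Beyer+, 2016]
     [Beyer+, 2016] Site Reliability Engineering: How Google Runs Production Systems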
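  7. The present state of "reliability" in information systems
     - Today's information systems are generally delivered through cloud computing
       - Cloud: a system formed by resource sharing, wide-area networks, heterogeneous software and hardware, and their complex interactions (Infrastructure / Platform / Application, used over the Internet)
     - Approaches for combining frequent change with high reliability are spreading
       - Leading companies make multiple changes per day or more
       - Site Reliability Engineering (SRE)
     - Next, we look at the historical transitions that lead to the "present" of reliability
     [Humble+, 2018] Accelerate: The Science of Lean Software and DevOps: Building and scaling high performing technology organizations
     [Beyer+, 2016] Site Reliability Engineering: How Google Runs Production Systems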
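  8. Historical transitions in matters related to reliability
     Era | Target of reliability | How systems are delivered | Approach to high reliability | Definition of reliability
     1940-60 | Hardware | Equipment is physically shipped | Make it last a long time without failing | (Durability) The ability of an item to perform as required, without failure, for a given period under given conditions ※1
     1960-80 | Software | Source code or executables are delivered, or a whole computer with them installed | Verify in advance the output of each component and of each combination | (Maintainability) (Design reliability) ※1
     1980-2000 | Internet | A single huge network is shared and used | High-speed communication; communication protocols designed on the premise of delay, transmission errors, and node failures | (Dependability) The ability of an item to perform as required, when required ※1
     2000- | Cloud | Providers centrally manage systems and run them continuously; users access them over the Internet | Design highly reliable systems on top of groups of unreliable components | No additional rigorous definition could be found; reliability is defined case by case, more from the user's point of view
     ※1 JIS Z 8115:2019
     [Saleh+, 2006] Highlights from the early (and pre-) history of reliability engineering
     [Kleppmann, 2017] Designing Data-intensive Applications: The big ideas behind reliable, scalable, and maintainable systems
     [山本+, 2021] 確率・統計から始める エンジニアのための信頼性工学: 身近な故障から宇宙開発まで, コロナ社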
  9. Historical transitions in matters related to reliability (same table as slide 8)
     Highlight on how systems are delivered: from replicated products to always-on, one-of-a-kind systems
  10. Historical transitions in matters related to reliability (same table as slide 8)
      Highlight on the approach to high reliability: toward systems whose parts interact with one another, and toward design and maintenance that assume parts will fail
  11. Historical transitions in matters related to reliability (same table as slide 8)
      Highlight on the definition of reliability: toward broader definitions that assume parts will fail
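  12. Transition of the software system life cycle
      - Past: a two-stage life cycle in which the developed deliverable is handed over and then operated and maintained
      - Present: a life cycle without stages, in which development and operation of an always-on system in the cloud are practiced at the same time (see DevOps [Allspaw+, 2009])
      - The delivery model has changed to always-on, one-of-a-kind services
      - On the other hand, because changes are frequent, changes become the trigger for failures: at Google, 68% of failure triggers are changes [Beyer+, 2018]
      [Allspaw+, 2009] 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr https://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr
      [Beyer+, 2018] The Site Reliability Workbook: Practical Ways to Implement SRE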
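  13. Site Reliability Engineering (SRE): a way of thinking that assumes changes will cause failures
      How can frequent change and high reliability be combined?
      - Do not aim for perfect reliability
      - Set a reliability indicator and a target value for it
      - Treat the target value as a lower bound, above which developers may make changes aggressively
      - Control "reliability" and maximize the speed of change
      [Beyer+, 2016] Site Reliability Engineering: How Google Runs Production Systems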
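The idea on slide 13, a reliability target used as a lower bound that leaves room for change, is usually expressed as an error budget in the SRE literature. The following is a minimal sketch under stated assumptions: the request-based indicator, the class name `ErrorBudget`, and the sample numbers are illustrative and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Error budget for a request-based availability target (illustrative)."""
    slo_target: float     # e.g. 0.999 means 99.9% of requests must succeed
    total_requests: int   # requests observed in the measurement window
    failed_requests: int  # failed requests observed in the same window

    @property
    def allowed_failures(self) -> float:
        # Share of requests that may fail without dropping below the target.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining(self) -> float:
        # Budget left after subtracting the failures seen so far.
        return self.allowed_failures - self.failed_requests

    def can_release(self) -> bool:
        # While budget remains, developers may keep shipping changes.
        return self.remaining > 0


budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
print(round(budget.allowed_failures), round(budget.remaining), budget.can_release())
```

With these sample numbers about 1,000 failed requests are allowed, about 600 remain, and releases may continue. Once the budget is exhausted, the same check would argue for slowing down changes.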
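  14. The spread of Site Reliability Engineering (SRE)
      - 2004: the dawn at Google
        - Operations redefined through software engineering
        - Operations automated by software
      - 2014 onward: a period of worldwide upheaval (2014: first SREcon held by USENIX)
        - Operational automation by software was already being practiced
        - Approaches for tolerating failures on the premise of change spread
      - There is still no rigorous definition of SRE shared by the community
      - Its maturity as an academic field, in the way "reliability engineering" is one, is still low
      - Next, we consider its relationship to existing high-reliability methods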
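  15. A layered model for cloud fault tolerance (the speaker's own)
      A three-layer onion model based on recent technology. SRE focuses on layer 3.
      1. Automatic control based on protocols (fault tolerance); component level
         - Handles failure and degradation at the component and communication level
         - Transmission control and routing protocols, distributed consensus algorithms, etc.
      2. Automatic control that follows the desired state declared by an administrator; system level
         - Deployment, autoscaling, and management of compute resources
         - Orchestrators such as Borg and Kubernetes
      3. Manual control by operators; service level
         - Respond manually to failures so that the reliability target is met
         - Prevention, prediction, detection, root-cause diagnosis, mitigation, postmortem analysis, repair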
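Layer 2 of the model, automatic control that follows a declared desired state, can be sketched as a reconciliation step that compares declared and observed state and emits corrective actions. This is a toy illustration, not the API of Borg or Kubernetes; the function name `reconcile` and the action strings are hypothetical.

```python
def reconcile(desired_replicas: int, running: list[str]) -> list[str]:
    """One pass of a declarative control loop.

    Returns the actions that move the observed state (the running replicas)
    toward the state declared by the administrator.
    """
    actions: list[str] = []
    diff = desired_replicas - len(running)
    if diff > 0:
        # Too few replicas: start the missing ones.
        actions += [f"start replica-{len(running) + i}" for i in range(diff)]
    elif diff < 0:
        # Too many replicas: stop the surplus, taken from the end of the list.
        actions += [f"stop {name}" for name in running[diff:]]
    return actions


print(reconcile(3, ["replica-0"]))
print(reconcile(1, ["replica-0", "replica-1", "replica-2"]))
```

A real orchestrator would run such a step continuously against observed state, which is what lets this layer absorb failures without an operator, leaving layer 3 for cases the loop cannot handle.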
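  16. Time required for manual control by operators
      - According to 1,818 incident reports from 599 organizations, more than half of incidents are resolved within 2 hours ※1
      - There is ample room to shorten recovery
      ※1 The VOID Report 2021 https://www.thevoid.community/report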
  17. A layered model for cloud fault tolerance (the speaker's own; same model as slide 15)
      - AI-based automation (AIOps) targets layer 3, manual control by operators
      - Because the mechanisms of failures are hard to grasp, automation through data-driven learning is being researched
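  18. AIOps (Artificial Intelligence for IT Operations)
      - A collective term for efforts that apply AI (artificial intelligence) techniques, including statistical analysis and machine learning, to the management and improvement of IT services
      - Proposed by Gartner in 2017 (some say the name originally stood for Algorithmic IT Operations) ※1
      - IT operators are required to do tedious manual management work and tasks with high cognitive load:
        - Failure detection and root-cause diagnosis
        - Scale-out and scale-in according to load
        - Alert management and incident response
      [Notaro '20]: Notaro, P., Cardoso, J., and Gerndt, M. "A Systematic Mapping Study in AIOps." ICSOC. Springer, Cham, 2020.
      [Dang '19]: Dang, Y., Lin, Q., and Huang, P. "AIOps: Real-World Challenges and Research Innovations." ICSE-Companion. IEEE, 2019.
      ※1 https://blogs.gartner.com/andrew-lerner/2017/08/09/aiops-platforms/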
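  19. The relationship between SRE and AIOps
      - A policy that does not aim for perfect reliability can easily accommodate the fallibility of data-driven AI
      - When perfect reliability is the goal, AI, being a black box, cannot be trusted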

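  20. Tracing the origins of applying AI to the operation of information systems
      - In the late 1980s, the possibility of applying knowledge-based AI and neural-network-based AI to network management was already being discussed [Cebulka 1989]:
        - Initial design of networks to support specific services
        - Tactical equipment planning between central offices
        - Monitoring and diagnosis of messages from switches
      - From the early 1990s, several online failure prediction models for software and hardware were proposed, along with other failure prevention methods in the same period [Notaro 2021]
      [Cebulka 1989]: Cebulka KD, et al., Applications of artificial intelligence for meeting network management challenges in the 1990s, IEEE GLOBECOM 1989.
      [Notaro 2021]: Notaro P, et al., A Survey of AIOps Methods for Failure Management. ACM TIST, 2021.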
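  21. Areas to which AIOps contributes today
      Figure reproduced from [Notaro '20], Fig. 2 "Taxonomy of AIOps as observed in the identified contributions". Two main branches are marked:
      - Research on failure management
      - Research on optimization, such as resource allocation
      [Notaro '20]: Notaro, P., Cardoso, J., and Gerndt, M. "A Systematic Mapping Study in AIOps." ICSOC. Springer, Cham, 2020.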
  22. Number of papers per AIOps research area
      - Number of AIOps-related papers: 670 (as of 2020)
      - 62.1% of the 670 relate to Failure Management
      - Failure prediction (26.4%), failure detection (33.7%), root-cause analysis (26.7%)
      - The number of papers is increasing
      [Notaro+, ICSOC2020] Notaro, P., Cardoso, J., and Gerndt, M. "A Systematic Mapping Study in AIOps"
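  23. Failure management research in AIOps
      - Tasks: prevention, prediction, detection, root-cause diagnosis, mitigation, postmortem analysis, repair; all of them depend on the operator's experience and intuition
      - The figure marks the tasks that have many research papers
      - Research on auxiliary information support dominates over research on direct decisions or operations
      - Methods: classical machine learning, statistical machine learning, statistical causal inference, deep learning (CNN/RNN/LSTM/GNN...)
      - Operational data such as metrics, logs, traces, events, and alerts are used as features
      [Notaro+, TIST2021] A Survey of AIOps Methods for Failure Management
      [Soldani+, CSUR2022] Anomaly Detection and Failure Root Cause Analysis in (Micro)Service-Based Cloud Applications: A Survey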
  24. More detailed AIOps research examples
      "AIOps research notes: automated root-cause diagnosis of system failures for SRE", SRE NEXT 2022
      https://speakerdeck.com/yuukit/sre-next-2022

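  25. Summary: 1. Reliability engineering in the cloud
      - As the form of information systems shifted from hardware to software and then to the cloud, the approach to reliability shifted toward service orientation.
      - SRE is a failure-tolerant approach that, on top of automated operations, sets a lower bound on a reliability indicator and raises the speed of change.
      - Within the layered model for fault tolerance, AI-based automation (AIOps) is being researched for the outermost layer, manual control.
      - The principles of SRE are expected to accommodate the fallibility of AI easily.
      - At present, AIOps remains at the level of auxiliary information support.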
  26. Agenda (repeated), introducing section 2: The future of reliability engineering in the AI era
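  27. Thinking about the future toward the coming AI era
      - It is hard to talk about the future by looking at reliability alone
      - Start from how the people who demand reliability, and their applications, should be
      - Start from an aspiration at some point in the future, and backcast from the future to the present
      - Timeline: 2022 (present), Phase 1, Phase 2, Phase 3 (2040s), 2045 technological singularity (the singularity is assumed to occur in 2045)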
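  28. A view of the future in which AI merely takes away human jobs is not interesting
      Since mutual understanding between humans is hard, it should not be easy for AI to understand latent human thinking either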

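  29. 2040s: Self Crafting
      An era in which everyone can freely build applications optimized for themselves through dialogue with AI
      Pushing AI-based automation to its limit paradoxically makes humans more creative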

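  30. Application development at present (2022): a standardization-oriented world
      - Services: developers discover latent needs shared by many users and provide standardized functions and interfaces, continuously updating the application
      - Underlying technologies: application developers also use standardized data structures and compute units in the cloud
      - Figure labels: developer; standardized functions and interfaces; standardized communication protocols and APIs; standardized data structures; standardized compute units; Internet; cloud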
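  31. Application development (crafting) in the future (2040s): an individualization-oriented world
      - Services: AI and the user implement functions and interfaces interactively and experientially, reaching down to the user's latent needs (self-crafting; interaction with space through XR)
      - Underlying technologies: learned, individually optimized components matched to the application's requirements, such as learned data structures [Kraska+, SIGMOD2018] and learned communication protocols [Ma+, EuroSys2022]; evolution through mutual interaction
      - End to end, the optimal programs and protocols change dynamically and adaptively in response to user requirements
      [Kraska+, SIGMOD2018] The Case for Learned Index Structures
      [Ma+, EuroSys2022] Multi-Objective Congestion Control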
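  32. Reliability in the self-crafting world
      - Many AIs request resources from a finite pool of shared resources (Internet, cloud), so an appropriate reliability target has to be decided
      - The closer the reliability target of a particular self-crafted app is pushed to 100%...
        - Resource consumption ↑: the more redundancy and response speed are raised, the more resources are consumed, which may lower the satisfaction of others
        - Change velocity ↓: the more the app is changed through self-crafting, the less post-change operational data is available, and the accuracy of failure prediction and prevention declines
      [Mogul+, HotOS2019] Nines are not enough: Meaningful metrics for clouds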
  33. Can AI appropriately decide the reliability target as an equilibrium point?

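  34. The limits of humans giving orders (declarations) to AI
      "Put yourself on the side of the artificial intelligence that receives the order. An order is given as a combination of ambiguous 'meanings', and the interpretation of those 'meanings' is also held entirely by the human who gives the order. ... How far can an artificial intelligence produce the 'appropriate' answer that the person giving the order has in mind?"
      Ultra-advanced AI «Higgins», quoted from [BEATLESS] PHASE13 "BEATLESS" (text color partly changed)
      "... That is why I cannot 'believe'. If you want to control the behavior of such a tool precisely, please give it judgment criteria free of ambiguity."
      Ultra-advanced AI «Higgins», quoted from [BEATLESS] LAST PHASE "IMAGE AND LIFE" (text color partly changed)
      "I am a controller that allocates resources for my owner, Arato. Asking me to 'design the future' means asking me to set the reference point for that allocation."
      hIE «Lacia», quoted from [BEATLESS] PHASE10 "PLUS ONE" (text color partly changed)
      [BEATLESS]: 長谷 敏司, BEATLESS, 月刊ニュータイプ, 角川書店, 2012.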
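  35. HAL 9000 in "2001: A Space Odyssey"
      Quoted from chapter 18, "Introduction to Machine Learning for SRE":
      "I have just detected a fault in the AE35 unit. I will stop functioning with 100% probability within 72 hours." (HAL 9000, "2001: A Space Odyssey")
      "The future depicted in this film was envisioned with foresight by Arthur C. Clarke, who combined AI with a fully automated service able to predict system and hardware failures hours in advance. HAL 9000 is humanity's dream (or nightmare) of an autonomous, self-adjusting, flawless machine, serving both the crew of the spaceship and the mission in order to achieve goals defined by humans."
      David N. Blank-Edelman (ed.), 山口 能迪 (supervising translator), 渡邉 了介 (translator), SREの探求: 様々な企業におけるサイトリライアビリティエンジニアリングの導入と実践, O'Reilly Japan, 2021.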
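  36. The appropriate reliability is unknown → an interactive approach
      - The user and the AI interactively search for the equilibrium point that is optimal for that user
      - The optimal equilibrium point allocates among variables such as reliability, cost, and change velocity
      - The user can search for a solution by discovery, without having to think through and declare the desired state in advance
      - Even after the solution has converged, interactive convergence is carried out again as the situation changes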
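One way to picture the interactive convergence on slide 36 is a bisection over the reliability target driven by user feedback. This is only a sketch under strong assumptions: the talk does not specify a search procedure, and the function `negotiate_target`, the three feedback labels, and the simulated user below are all hypothetical.

```python
from typing import Callable


def negotiate_target(feedback: Callable[[float], str],
                     low: float = 2.0, high: float = 5.0,
                     max_rounds: int = 10) -> float:
    """Bisect over the "number of nines" using feedback from the user.

    feedback(nines) must return "too_expensive", "too_unreliable", or "ok".
    """
    nines = (low + high) / 2
    for _ in range(max_rounds):
        answer = feedback(nines)
        if answer == "ok":
            break
        if answer == "too_expensive":
            high = nines  # propose a lower target in the next round
        else:
            low = nines   # propose a higher target in the next round
        nines = (low + high) / 2
    return nines


def simulated_user(nines: float) -> str:
    # Stand-in for a person who is satisfied between 3.8 and 4.2 nines.
    if nines > 4.2:
        return "too_expensive"
    if nines < 3.8:
        return "too_unreliable"
    return "ok"


agreed = negotiate_target(simulated_user)
print(f"agreed on {agreed} nines, availability {1 - 10 ** -agreed:.5f}")
```

When the situation changes, the same loop can simply be run again with fresh feedback, which corresponds to the re-convergence mentioned on the slide.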
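  37. An example of adjustment through the interactive approach (an "experiential" adjustment process)
      User: I want it to go down as little as possible, and to run light as well.
      AI: Raising reliability to a realistic level would cost 〇〇 yen.
      User: No, no, that is too expensive.
      AI: If the reliability target of the ▲▲ feature is lowered from 99.999% to 99.9%, the cost drops to ○○ yen.
      User: That is cheap enough, but I am worried about the lower reliability.
      AI: Then, as a trial, I will degrade the ▲▲ feature for just 10 minutes from now; please judge whether you are dissatisfied with the reliability.
      User: With this many errors it really is inconvenient.
      AI: Then how about a reliability target of 99.99% at a cost of ✕✕ yen?
      ※ Besides dialogue by voice or text, dialogue through physical operations is also possible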
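To make the three targets in the dialogue concrete, the sketch below converts an availability percentage into the downtime it permits per year. It assumes a time-based availability measure over a 365-day year, which the slide does not specify; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(target_percent: float) -> float:
    """Downtime per year that still satisfies the availability target."""
    return (1 - target_percent / 100) * MINUTES_PER_YEAR


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} min/year")
```

Under this assumption, 99.9% allows roughly 526 minutes of downtime a year, 99.99% roughly 53 minutes, and 99.999% roughly 5 minutes, which is why each additional nine tends to cost so much more.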
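  38. Comparison of the present and the future by viewpoint
      | Change in HCI | Characteristics | Applications | Underlying technologies | Reliability
      Present (2022) | Interaction through flat displays | Standardization-oriented, service-oriented | Specialist providers develop features for an assumed standard target user | Standardized data structures and protocols | Decided from statistical summaries of measured indicators of user behavior
      Future (2040s) | Interaction with space through the five senses, in virtual, augmented, and mixed reality | Individualization-oriented, craft-oriented | Users build the features best suited to their own preferences themselves; automatic programming through dialogue with AI | Learned data structures and protocols matched to the app | The equilibrium point between reliability and other basic variables is decided interactively and experientially with AI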
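  39. Summary: 2. The future of reliability engineering in the AI era
      - 2040s: I hope for an era in which everyone builds individualized applications for themselves, by themselves (self-crafting)
      - Because resources are finite, an individual's utility cannot be raised without limit
      - Users need to adjust the equilibrium point between reliability and other basic variables
      - It is hard for humans to order (declare) the equilibrium point to an AI in advance
      - The appropriate reliability is decided for each user individually, interactively and experientially with AI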

  40. Agenda (repeated), introducing section 3: Reliability engineering through collaboration with AI
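  41. Reliability engineering from the present to the 2040s (timeline figure)
      Time axis: 2017, 2022 (present), 2030s, 2040s, 2045 (technological singularity)
      - 2017: Gartner proposes AIOps
      - Present: reliability targets are declared by humans; AI gives limited assistance with detection and diagnosis
      - Phase 1, engineers collaborate with AI: interactive failure prevention and recovery by engineers and AI; operational data becomes increasingly open
      - Phase 2, autonomous failure response by AI: engineers design the system architecture, and AI implements and connects the modules
      - Phase 3, self-crafting: end to end, programs and protocols evolve dynamically and adaptively in response to user requirements
      - Engineers control reliability, with targets decided by the service provider, in the earlier phases; users control reliability in the self-crafting phase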
  42. Reliability engineering from the present to the 2040s (same timeline as slide 41)
      Scope of this talk: Phase 1, engineers collaborating with AI
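  43. The problem with AIOps from a deep learning perspective
      - Today's AIOps builds a learned model locally for each individual system
        - Example: system X, component CX, metrics CXM1, CXM2, ... → a learned model of the multivariate time series
      - If a global model were trained on data from many systems, might it produce results as striking as in other fields (CV, NLP)?
      - However, service providers are reluctant to publish operational data, in order to protect privacy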
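  44. Challenges and requirements toward Phase 1 (collaboration between engineers and AI)
      Premise: learning uses only the data of a small number of systems
      1. Normal periods dominate, so data for learning anomalies is scarce
         ↪ Requirement ①: anomalies can be caused deliberately so that they can be learned
      2. The basis for the predictions presented by the learned model is unclear
         ↪ Requirement ②: the basis for a prediction can be presented in language the operator understands
      Interaction loop: teach anomalies to the AI → the AI presents what it has been taught → check how well the teaching was learned
      [Soldani+, CSUR2022] Anomaly Detection and Failure Root Cause Analysis in (Micro)Service-Based Cloud Applications: A Survey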
  45. Interactive AIOps
      A concept in which the operator and the AI interactively and collaboratively learn the characteristics of the target system

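  46. Requirement ①: Experimentability
      Inspired by Chaos Engineering (facilitating experiments run to discover systemic weaknesses [Rosenthal+, 2020]); the concept is extended from discovering weaknesses to discovering and learning them, with the operator teaching anomalies to the AI
      1. The operator injects faults into the information system or raises and lowers its load
      2. The AI learns from the data observed during the experiment
      3. Steps 1 and 2 are repeated while varying the anomaly pattern
      [Rosenthal+, 2020] Chaos Engineering: System Resiliency in Practice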
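The three steps of experimentability can be sketched as a loop that injects labeled faults into a simulated system and learns from the observations. Everything here is an assumption made for illustration: the fault types, the two metrics, the Gaussian noise model, and the nearest-centroid learner are stand-ins, not the method proposed in the talk.

```python
import random
from statistics import mean

# Hypothetical fault types and their effect on (latency_ms, error_rate).
FAULTS = {
    "none": (0.0, 0.0),
    "cpu_stress": (80.0, 0.01),
    "network_loss": (20.0, 0.20),
}


def observe(fault: str, rng: random.Random) -> tuple[float, float]:
    """Simulated system: baseline metrics plus the injected fault's effect."""
    d_latency, d_errors = FAULTS[fault]
    latency = rng.gauss(50.0, 5.0) + d_latency
    error_rate = max(0.0, rng.gauss(0.01, 0.005) + d_errors)
    return latency, error_rate


def run_experiments(rounds: int, seed: int = 0) -> dict[str, tuple[float, float]]:
    """Steps 1-3: inject each fault repeatedly and learn one centroid per fault."""
    rng = random.Random(seed)
    samples: dict[str, list[tuple[float, float]]] = {f: [] for f in FAULTS}
    for _ in range(rounds):                             # step 3: repeat
        for fault in FAULTS:                            # step 1: inject a fault
            samples[fault].append(observe(fault, rng))  # step 2: record data
    return {f: (mean(s[0] for s in obs), mean(s[1] for s in obs))
            for f, obs in samples.items()}


def classify(point: tuple[float, float],
             centroids: dict[str, tuple[float, float]]) -> str:
    """Nearest-centroid guess; latency is scaled so both metrics matter."""
    def dist(c: tuple[float, float]) -> float:
        return ((point[0] - c[0]) / 100.0) ** 2 + (point[1] - c[1]) ** 2
    return min(centroids, key=lambda f: dist(centroids[f]))


centroids = run_experiments(rounds=30)
print(classify((130.0, 0.02), centroids))
print(classify((70.0, 0.21), centroids))
```

Because the operator chooses which fault to inject, every observation arrives already labeled, which is what addresses the scarcity of anomaly data named as challenge 1 on slide 44.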
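  47. Requirement ②: Explainability
      Explainable AI (XAI): AI with the ability to explain or present results in words humans can understand ※1; here the AI presents to the operator what it has been taught
      1. The operator reproduces an anomaly similar to one used for requirement ①
      2. For that anomaly, the AI returns its prediction or root cause together with the basis for it (the features that contributed)
      Purposes of XAI, quoted from FIGURE 5 of [Adadi+, Access2018]: justifying prediction results; continuous improvement between humans and the model; debugging; new discoveries
      [Adadi+, Access2018] Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
      ※1 https://speakerdeck.com/tsurubee/a-survey-on-interpretable-machine-learning-and-its-application-for-system-operation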
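Step 2 of explainability, returning a result together with the features that contributed to it, can be illustrated with a deliberately simple attribution: rank each metric by how far it deviates from its normal-period baseline. The z-score ranking, the function name `explain`, and the sample metrics are assumptions for illustration; real XAI methods are considerably more elaborate.

```python
from statistics import mean, stdev


def explain(baseline: dict[str, list[float]],
            current: dict[str, float],
            top: int = 2) -> list[tuple[str, float]]:
    """Rank metrics by absolute z-score against the normal-period baseline."""
    scores = {}
    for name, history in baseline.items():
        sigma = stdev(history) or 1e-9  # guard against zero variance
        scores[name] = (current[name] - mean(history)) / sigma
    ranked = sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top]


baseline = {
    "cpu_util": [0.40, 0.42, 0.38, 0.41, 0.39],
    "latency_ms": [50, 52, 49, 51, 48],
    "error_rate": [0.010, 0.012, 0.009, 0.011, 0.008],
}
current = {"cpu_util": 0.43, "latency_ms": 95, "error_rate": 0.012}

for name, z in explain(baseline, current):
    print(f"{name}: z = {z:+.1f}")
```

In this sample, latency stands out far more than the other metrics, so an operator reading the output can check whether the stated basis matches the fault that was actually injected.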
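  48. Possibilities for more advanced collaboration with AI
      - Trainability: operators practice failure response using training programs presented by the AI
        - Through active learning, the labeling of past data is built into the training program that is presented [Settles, 2009]
      - Intersystem Learnability: a system can learn from what other systems (including its own past) have learned
        - Transfer learning from a source to a target speeds up learning and provides the ability to extrapolate [Pan+, TKDE2009]
      [Pan+, TKDE2009] A Survey on Transfer Learning.
      [Settles, 2009] Active Learning Literature Survey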
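  49. Summary: 3. Reliability engineering through collaboration with AI
      - The path from the present to the 2040s (including the technological singularity) is divided into three phases; this talk examines the level at which engineers can collaborate with AI.
      - Under the constraint that operational data cannot be widely obtained, anomaly data has to be created and learned from by ourselves.
      - We propose "Interactive AIOps", a concept in which the operator and the AI interactively and collaboratively learn the characteristics of the system.
      - The basic forms of the dialogue are experimentability (teaching the AI) and explainability (explanation from the AI).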
  50. Agenda (repeated), introducing section 4: Closing
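  51. Summary of the whole talk
      Present
      - Site Reliability Engineering treats reliability as something to be controlled.
      - AIOps research is active, yet it remains at the level of auxiliary information support.
      Future
      - 2040s: a shift from a standardization- and declaration-oriented era to an individualization- and dialogue-oriented era.
      - Users converge reliability to their individual equilibrium point interactively and experientially with AI.
      Path
      - Collaboration between AI and engineers → autonomous failure response by AI → users control reliability
      - On the premise that data is scarce, anomalies have to be created by ourselves.
      Collaborative building and control of information systems through dialogue and experience (experiment) extends even to reliability, a basic element of the system.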
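  52. Items to examine next for this vision
      - As the capabilities of AI improve, the range that is a black box to engineers grows
      - Ironies of Automation [Bainbridge, Pergamon1983]: the irony that the more advanced a control system becomes, the more important the contribution of the human operator becomes
      - How far should AI be trusted, and how should the reliability of AI itself be approached?
      [Bainbridge, Pergamon1983] Ironies of Automation. Analysis, Design and Evaluation of Man–machine Systems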
  53. Main reference books

  54. Thank you for your attention
      Joint discussion and research, conversations, and support are all very welcome