SREへの機械学習適用に関するサーベイ / A Survey for Cases of Applying Machine Learning to SRE


Yuuki Tsubouchi (yuuk1)

March 27, 2019


  1. A Survey for Cases of Applying Machine Learning to SRE
    2019/03/27, Machine Learning Meetup KANSAI #4 LT
    Yuuki Tsubouchi, Researcher, SAKURA internet Research Center (@yuuk1t / id:y_uuki)
    (C) Copyright 1996-2019 SAKURA Internet Inc.
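  2. What is Site Reliability Engineering?
    ・Reliability: the degree to which users can comfortably use a service
    ・An engineering discipline that aims to control the reliability of computer systems
    ・Rebuilds traditional system administration through software engineering
    ・Monitoring, incident response, change management, capacity planning, provisioning, efficiency and performance…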
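  3. Server autoscaling
    ・[1]: PerfEnforce: a dynamic scaling engine for analytics with performance guarantees
    ・An engine that scales the number of DB servers during a query session on an OLAP system such as Redshift, so that the controller keeps query time from exceeding its target
    ・Compares online learning (perceptron) as the predictive method with reinforcement learning (Q-learning) or feedback control (PI) as the reactive methods
    [1]: Figure 1. PerfEnforce deployment
    ・In the evaluation, the perceptron performed well
    ・Reactive methods are, as expected, slow to respond to sudden changes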
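To make the reactive side of this comparison concrete, here is a minimal sketch of a discrete-time PI controller that resizes a cluster from the observed query time. It is not PerfEnforce's actual controller: the class name `PIScaler`, the gains, the size limits, and the use of relative error are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class PIScaler:
    """Hypothetical reactive scaler: PI control on relative query-time error."""
    target_seconds: float          # desired query time (assumed target)
    initial_servers: int           # cluster size when the error is zero
    kp: float = 2.0                # proportional gain (illustrative value)
    ki: float = 1.0                # integral gain (illustrative value)
    min_servers: int = 1
    max_servers: int = 16
    _integral: float = field(default=0.0, init=False)

    def next_size(self, observed_seconds: float) -> int:
        # Positive error means queries ran slower than the target.
        error = (observed_seconds - self.target_seconds) / self.target_seconds
        # The integral term accumulates past error, so a persistent miss
        # keeps pushing the cluster size up even if the current error is small.
        self._integral += error
        raw = self.initial_servers + self.kp * error + self.ki * self._integral
        # Round up so that fractional demand is never under-provisioned,
        # then clamp to the allowed cluster sizes.
        return max(self.min_servers, min(self.max_servers, math.ceil(raw)))


scaler = PIScaler(target_seconds=10.0, initial_servers=4)
for observed in (15.0, 10.0, 5.0):
    print(observed, "->", scaler.next_size(observed))
```

Because the controller only reacts after a slow query has been observed, a sudden load change is corrected one or more steps late, which is the weakness of reactive methods noted on the slide.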
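  4. Cloud resource control
    ・[2]: Self-Adaptive and Self-Configured CPU Resource Provisioning for Virtualized Servers Using Kalman Filters
    ・Uses Kalman filters to adaptively allocate CPU resources to VMs
    [2]: Figure 1. Virtualized prototype and control system.
    ・Fixing the number of CPUs statically leaves resources either insufficient or idle
    ・The controller tracks each VM's CPU utilization and, when it reaches a threshold, changes the number of CPUs according to the Kalman filter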
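The tracking idea can be sketched with a one-dimensional Kalman filter. This is a simplified illustration, not the controller from [2]: the random-walk model, the noise variances `q` and `r`, the `headroom` factor, and the names `ScalarKalman` and `cores_needed` are assumptions.

```python
import math


class ScalarKalman:
    """1-D Kalman filter with a random-walk model of CPU usage (in cores)."""

    def __init__(self, x0: float, p0: float = 1.0, q: float = 0.01, r: float = 0.1):
        self.x = x0   # current usage estimate
        self.p = p0   # variance of the estimate
        self.q = q    # process noise: how fast true usage is assumed to drift
        self.r = r    # measurement noise: how noisy each sample is assumed to be

    def update(self, measurement: float) -> float:
        # Predict: under a random walk the estimate carries over,
        # but its uncertainty grows by the process noise.
        p_pred = self.p + self.q
        # Correct: the Kalman gain weighs the new sample against the prediction.
        gain = p_pred / (p_pred + self.r)
        self.x = self.x + gain * (measurement - self.x)
        self.p = (1.0 - gain) * p_pred
        return self.x


def cores_needed(estimated_usage: float, headroom: float = 1.25) -> int:
    """Allocate the estimated usage plus headroom, rounded up to whole cores."""
    return max(1, math.ceil(estimated_usage * headroom))


kf = ScalarKalman(x0=1.0)
for sample in (1.2, 1.9, 2.1, 2.0, 2.2):
    estimate = kf.update(sample)
    print(f"usage={sample:.1f} estimate={estimate:.2f} cores={cores_needed(estimate)}")
```

A larger `q` makes the filter follow load changes faster at the cost of reacting to noise, while a larger `r` smooths the estimate but responds more slowly.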
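  5. Automatic tuning of middleware configuration
    ・[3]: Automatic Database Management System Tuning Through Large-Scale Machine Learning
    ・Automatically tunes MySQL/Postgres configuration, reaching performance close to that of an expert's configuration
    ・Applies factor analysis to the metrics, then k-means clustering, to extract the important ones
    ・Uses Lasso for feature selection of the configuration items most strongly correlated with overall system performance
    ・The tuner changes settings while actually measuring performance, and settles on good values
    [3]: Figure 4.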
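The Lasso step can be illustrated with a small pure-Python coordinate-descent solver. This is a sketch under assumptions, not the implementation from [3]: each row of `X` is taken to be one tuning trial with normalized configuration values, `y` the measured performance, and the function names and `alpha` value are chosen for illustration.

```python
def soft_threshold(rho: float, alpha: float) -> float:
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iters=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho = 0.0  # correlation of column j with the residual that ignores column j
            z = 0.0    # mean squared value of column j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


def select_features(weights):
    """Indices of the columns that Lasso kept (non-zero weight)."""
    return [j for j, wj in enumerate(weights) if wj != 0.0]


# Toy data: performance depends strongly on column 0, barely on column 2,
# and not at all on column 1.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [3.05, 2.95, -3.05, -2.95]
weights = lasso_coordinate_descent(X, y, alpha=0.1)
print(select_features(weights))
```

The L1 penalty drives weakly related columns to exactly zero, which is what makes Lasso usable as a feature selector: raising `alpha` keeps fewer configuration items, lowering it keeps more.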
  6. Survey papers
    ・[4]: A Control Theoretical View of Cloud Elasticity: Taxonomy, Survey and Challenges (2018)
      ・A survey of research that applies control-theoretic methods to cloud elasticity
      ・Centers on feedback control and fuzzy control rather than machine learning
    ・[5]: Adaptation in Cloud Resource Configuration: A Survey (2016)
      ・A survey of research that applies adaptive methods to cloud resource configuration
      ・Classifies the approaches into heuristics, control theory, machine learning, and queueing theory
    ・[6]: Resource Management in Clouds: Survey and Research Challenges (2015)
    ・[7]: What Does Control Theory Bring to Systems Research? (2009)
  7. References
    ・[1]: ORTIZ, Jennifer, et al. PerfEnforce: a dynamic scaling engine for analytics with performance guarantees. arXiv preprint arXiv:1605.09753, 2016.
    ・[2]: KALYVIANAKI, Evangelia; CHARALAMBOUS, Themistoklis; HAND, Steven. Self-adaptive and self-configured CPU resource provisioning for virtualized servers using Kalman filters. In: Proceedings of the 6th international conference on Autonomic computing. ACM, 2009. p. 117-126.
    ・[3]: VAN AKEN, Dana, et al. Automatic database management system tuning through large-scale machine learning. In: Proceedings of the 2017 ACM International Conference on Management of Data. ACM, 2017. p. 1009-1024.
  8. References
    ・[4]: ULLAH, Amjad, et al. A control theoretical view of cloud elasticity: taxonomy, survey and challenges. Cluster Computing, 2018, 21.4: 1735-1764.
    ・[5]: HUMMAIDA, Abdul R.; PATON, Norman W.; SAKELLARIOU, Rizos. Adaptation in cloud resource configuration: a survey. Journal of Cloud Computing, 2016, 5.1: 7.
    ・[6]: JENNINGS, Brendan; STADLER, Rolf. Resource management in clouds: Survey and research challenges. Journal of Network and Systems Management, 2015, 23.3: 567-619.
    ・[7]: ZHU, Xiaoyun, et al. What does control theory bring to systems research? ACM SIGOPS Operating Systems Review, 2009, 43.1: 62-69.