SREへの機械学習適用に関するサーベイ / A Survey for Cases of Applying Machine Learning to SRE

MACHINE LEARNING Meetup KANSAI #4 LT
https://mlm-kansai.connpass.com/event/119084/

Yuuki Tsubouchi (yuuk1)

March 27, 2019

Transcript

  1. SAKURA Internet Inc. (C) Copyright 1996-2019 SAKURA Internet Inc

    SAKURA internet Research Center
    A Survey for Cases of Applying Machine Learning to SRE
    2019/03/27
    Yuuki Tsubouchi, Researcher
    Machine Learning Meetup KANSAI #4 LT
    @yuuk1t / id:y_uuki
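  2. What is Site Reliability Engineering?

    ・Reliability: the degree to which users can comfortably use a service
    ・An engineering discipline that aims to control the reliability of computer systems
    ・Rebuilds traditional systems administration through software engineering
    ・Monitoring, incident response, change management, capacity planning, provisioning, efficiency and performance…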
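  3. Server autoscaling

    ・[1]: PerfEnforce: a dynamic scaling engine for analytics with performance guarantees
      ・An engine in which, during a query session on an OLAP system such as Redshift, a controller scales the number of DB servers so that query time does not exceed the target
      ・Compares online learning (perceptron) as a predictive method with reinforcement learning (Q-learning) or feedback control (PI) as reactive methods
      ・In the evaluation, the perceptron gave good results
      ・As expected, the reactive methods are slow to respond to sudden changes
    [1]: Figure 1. PerfEnforce deployment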
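The reactive feedback-control (PI) idea mentioned on the slide above can be sketched as follows. This is a minimal illustration, not the PerfEnforce implementation: the class name `PIScaler`, the gains, the server limits, and the use of a relative query-time error are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PIScaler:
    """Reactive PI controller mapping query-time error to a server count (hypothetical)."""

    target_s: float          # target query time in seconds (assumed objective)
    kp: float = 2.0          # proportional gain (illustrative value)
    ki: float = 1.0          # integral gain (illustrative value)
    min_servers: int = 1
    max_servers: int = 32
    integral: float = 0.0    # running sum of past errors

    def step(self, observed_s: float, base_servers: int) -> int:
        # Relative error: positive when queries run slower than the target.
        error = (observed_s - self.target_s) / self.target_s
        self.integral += error
        raw = base_servers + self.kp * error + self.ki * self.integral
        # Clamp to the allowed cluster size.
        return max(self.min_servers, min(self.max_servers, round(raw)))


scaler = PIScaler(target_s=10.0)
on_target = scaler.step(observed_s=10.0, base_servers=4)  # no error, stays at base
too_slow = scaler.step(observed_s=20.0, base_servers=4)   # positive error, scales out
```

The controller only reacts after a slow query has been observed, which is one way to see why reactive methods lag behind sudden load changes. A practical controller would likely also need anti-windup handling for the integral term, which this sketch omits.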
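  4. Cloud resource control

    ・[2]: Self-Adaptive and Self-Configured CPU Resource Provisioning for Virtualized Servers Using Kalman Filters
      ・Adaptively allocates CPU resources to VMs using a Kalman filter
      ・Problem: when the number of CPUs is fixed statically, resources end up either insufficient or left over
      ・A controller tracks each VM's CPU utilization and, when it reaches a threshold, changes the number of CPUs according to the Kalman filter
    [2]: Figure 1. Virtualized prototype and control system.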
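To make the tracking step on the slide above concrete, here is a minimal sketch of a scalar Kalman filter that smooths noisy CPU usage measurements and derives a CPU count from the estimate. It assumes a simple random-walk model of usage; the names `ScalarKalman` and `cpus_needed`, the noise variances, and the 80% headroom rule are hypothetical and are not taken from the paper, whose controllers are more elaborate.

```python
import math


class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model (assumed)."""

    def __init__(self, x0, p0=1.0, q=0.01, r=0.1):
        self.x = x0  # current estimate of CPU usage, in cores
        self.p = p0  # variance of the estimate
        self.q = q   # process noise variance (illustrative value)
        self.r = r   # measurement noise variance (illustrative value)

    def update(self, z):
        # Predict: the state is assumed unchanged, but uncertainty grows.
        p_pred = self.p + self.q
        # Correct: blend prediction and measurement using the Kalman gain.
        gain = p_pred / (p_pred + self.r)
        self.x = self.x + gain * (z - self.x)
        self.p = (1.0 - gain) * p_pred
        return self.x


def cpus_needed(estimated_cores, headroom=0.8, max_cpus=16):
    """Pick a CPU count so that estimated usage stays below the headroom fraction."""
    return max(1, min(max_cpus, math.ceil(estimated_cores / headroom)))


kf = ScalarKalman(x0=1.0)
first = kf.update(3.0)       # the estimate moves toward the measurement
for _ in range(49):
    latest = kf.update(3.0)  # repeated measurements pull the estimate to 3 cores
allocation = cpus_needed(latest)
```

The gain decides how much each new measurement is trusted: a larger `q` makes the filter follow changes faster, while a larger `r` makes it smoother.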
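  5. Automatic tuning of middleware configuration

    ・[3]: Automatic Database Management System Tuning Through Large-Scale Machine Learning.
      ・Automatically tunes MySQL/Postgres settings, reaching performance close to that of expert-chosen settings
      ・Applies factor analysis to the metrics, then K-Means clustering, to extract the important ones
      ・Uses Lasso for feature selection of the settings most strongly correlated with overall system performance
      ・The tuner changes settings, measures the actual results, and settles on good values
    [3]: Figure 4.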
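The Lasso-based selection of settings described on the slide above can be illustrated with a small, dependency-free sketch. It solves the Lasso objective by cyclic coordinate descent and keeps the settings with non-zero weights. The knob names, the toy data, and the value of `alpha` are hypothetical, and this is not the tuning system's actual code.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; this is the effect of the L1 penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of column j with the residual that ignores j
            z = 0.0    # sum of squares of column j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


def select_knobs(names, weights):
    """Keep only the knobs whose Lasso weight is non-zero."""
    return [name for name, weight in zip(names, weights) if weight != 0.0]


# Hypothetical standardized knob settings for four benchmark runs.
knob_names = ["buffer_pool_size", "log_file_size", "max_connections"]
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
# Hypothetical measured performance change, driven mostly by the first knob.
y = [3.05, 2.95, -3.05, -2.95]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = select_knobs(knob_names, weights)
```

In this toy data the third knob has a small real effect, but the L1 penalty shrinks its weight to exactly zero, so only the dominant knob is selected. Raising or lowering `alpha` trades off how many settings survive.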
  6. Survey papers

    ・[4]: A Control Theoretical View of Cloud Elasticity: Taxonomy, Survey and Challenges (2018)
      ・A survey of research that applies control-theoretic methods to cloud elasticity
      ・Centers on feedback control and fuzzy control rather than machine learning
    ・[5]: Adaptation in Cloud Resource Configuration: A Survey (2016)
      ・A survey of research that applies adaptive methods to cloud resource configuration
      ・Classifies the work into heuristics, control theory, machine learning, and queueing theory
    ・[6]: Resource Management in Clouds: Survey and Research Challenges (2015)
    ・[7]: What Does Control Theory Bring to Systems Research? (2009)
  7. References

    ・[1]: ORTIZ, Jennifer, et al. PerfEnforce: a dynamic scaling engine for analytics with performance guarantees. arXiv preprint arXiv:1605.09753, 2016.
    ・[2]: KALYVIANAKI, Evangelia; CHARALAMBOUS, Themistoklis; HAND, Steven. Self-adaptive and self-configured CPU resource provisioning for virtualized servers using Kalman filters. In: Proceedings of the 6th international conference on Autonomic computing. ACM, 2009. p. 117-126.
    ・[3]: VAN AKEN, Dana, et al. Automatic database management system tuning through large-scale machine learning. In: Proceedings of the 2017 ACM International Conference on Management of Data. ACM, 2017. p. 1009-1024.
  8. References

    ・[4]: ULLAH, Amjad, et al. A control theoretical view of cloud elasticity: taxonomy, survey and challenges. Cluster Computing, 2018, 21.4: 1735-1764.
    ・[5]: HUMMAIDA, Abdul R.; PATON, Norman W.; SAKELLARIOU, Rizos. Adaptation in cloud resource configuration: a survey. Journal of Cloud Computing, 2016, 5.1: 7.
    ・[6]: JENNINGS, Brendan; STADLER, Rolf. Resource management in clouds: Survey and research challenges. Journal of Network and Systems Management, 2015, 23.3: 567-619.
    ・[7]: ZHU, Xiaoyun, et al. What does control theory bring to systems research? ACM SIGOPS Operating Systems Review, 2009, 43.1: 62-69.