工学としてのSRE再訪 / Revisiting SRE as Engineering

SRE NEXT 2024 IN TOKYO.

Yuuki Tsubouchi (yuuk1)

August 03, 2024

Transcript

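  1. Yuuki TSUBOUCHI / yuuk1. Senior Researcher, SAKURA internet Research Center; Technology Advisor, Topotal; Ph.D. Candidate, Graduate School of Informatics, Kyoto University. https://yuuk.io/

    SRE NEXT speaking history: 2020, "An overview of SRE and the research I am working on" (keynote); 2022 (open call) and 2023 (open call), "AIOps Research Notes" and "An Invitation to SRE Papers". An SRE researcher.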
  2. "Web operations is a craft, not a science." ※1

    ※1 John Allspaw and Jesse Robbins (eds.), Masanori Kado (trans.), "Web Operations" (Japanese edition: ウェブオペレーション: サイト運用管理の実践テクニック), O'Reilly Japan.
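  3. My own view of SRE, based on five years of experience as an engineer: "Thoughts on SRE, 2019" https://blog.yuuk.io/entry/2019/thinking-sre

    From craft to engineering. SRE quantifies reliability and controls it to an appropriate value. SRE is Software Engineering. The structure of reliability control is layered so that the whole can be seen at a glance (Figure III-1 ※1).
    ※1 Beyer, Betsy, et al., "Site Reliability Engineering: How Google Runs Production Systems", O'Reilly Media, Inc., 2016.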
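  4. What has changed? The engineering character of SRE.

    Diagram ※1: desired state, current state, the gap between them, and what to do about it (action 1, action 2, ...).
    Craft-like: means-oriented, subjective, local viewpoint.
    Engineering-like: goal-oriented, objective, bird's-eye view of the whole.
    ※1 "開発者のためのシステムズエンジニアリング導入の薦め" (A Recommendation to Adopt Systems Engineering, for Developers), version 1.1, IPA, 2017.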
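  5. What is this talk about? Revisiting SRE as engineering.

    I want to discuss unsolved engineering problems (open challenges). In the Japanese community, however, the background of how SRE became an engineering discipline is rarely talked about.
    ※1 Sheena Iyengar (author), Yuko Sakurai (trans.), "THINK BIGGER" (Japanese edition: 「最高の発想」を生む方法: コロンビア大学ビジネススクール特別講義), NewsPicks, 2023.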
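  6. Route of this revisit:

    1. Historical background of the move to engineering
    2. Examples of unsolved problems
    3. Fields that connect to SRE
    Familiar problems and questions that go to the root. Exploring from SREcon, toward questions that lead to undiscovered problems. Look outside the box ※1.
    ※1 Sheena Iyengar (author), Yuko Sakurai (trans.), "THINK BIGGER" (Japanese edition: 「最高の発想」を生む方法: コロンビア大学ビジネススクール特別講義), NewsPicks, 2023.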
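  7. "Web operations is a craft, not a science." Web Operations (original published 2010) ※1

    Definition: "A specialized field of IT systems management that includes the development, operation, maintenance, tuning, and repair of web applications."
    "There is no 'right way' anywhere. All there is is the fact that it works for now, and the resolve to make it better next time."
    "You must deeply understand networking, routing, switching, firewalls, load balancing, high availability, disaster recovery, TCP and UDP services, NOC management, hardware specifications, several Unix environments, several web server technologies, caching technologies, database technologies, storage infrastructure, cryptography, algorithms, trend analysis, capacity planning, and more."
    ※1 John Allspaw and Jesse Robbins (eds.), Masanori Kado (trans.), "Web Operations" (Japanese edition: ウェブオペレーション: サイト運用管理の実践テクニック), O'Reilly Japan.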
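  8. SRE Book (original published 2016) ※1

    "In the introduction to [Bur99], I argued that system administration is a form of human-computer engineering. Some reviewers strongly rejected this, saying that 'it has not yet reached the stage where it can be called engineering.' At that time, I felt that this field had lost its way, was trapped in its own wizard-like culture, and could no longer see the direction it should take."
    Mark Burgess: Ph.D. in theoretical physics; author of CFEngine, the predecessor of tools such as Chef and Puppet; proponent of computer immunology and promise theory.
    "Network and system administration is a branch of *engineering* that concerns the operational management of human–computer systems." Quoted from [Bur99]: "Principles of Network and System Administration", 1999.
    ※1 Betsy Beyer et al. (eds.), Ryuji Tamagawa (trans.), "SRE サイトリライアビリティエンジニアリング: Googleの信頼性を支えるエンジニアリングチーム", O'Reilly Japan, 2016.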
  9. USENIX LISA

    ・An international conference on system administration that began in 1987
    ・Merged into USENIX SREcon in 2022
    "Because system administration overlaps with many other areas of computing, it is generally forgotten in academia as a supporting role." (translated)
    USENIX board: "No, we are academics, and there is nothing scientific, research-worthy, or interesting about system administration." (translated)
    ※1 Thomas Limoncelli, "LISA made LISA obsolete (That's a compliment!)", 2022. https://www.usenix.org/publications/loginonline/lisa-made-lisa-obsolete-thats-compliment
    ※2 Mark Burgess, "Computer Immunology", USENIX LISA 1998.
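  10. The Morning Paper on Operability

    "When I started thinking about this talk, my impression was that most papers (or at least most of the papers I have read) do not say much about operational issues. But when I looked back through my own collection, I was surprised at how many papers do touch on operations-related issues." (translated) https://blog.acolyer.org/2016/09/21/the-morning-paper-on-operability/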
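  11. "Although it was originally motivated by the familiar idea that 'since I am a software engineer, this is how I would want to deal with repetitive work', Site Reliability Engineering has now become more than that: a set of principles, practices, and incentives, and a field of focus within the vast domain of software engineering."

    SRE Book (original published 2016), excerpted from Chapter 1 "Introduction", Section 1.4 "The End of the Beginning". ※1
    ※1 Betsy Beyer et al. (eds.), Ryuji Tamagawa (trans.), "SRE サイトリライアビリティエンジニアリング: Googleの信頼性を支えるエンジニアリングチーム", O'Reilly Japan, 2016.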
  12. Some representative contributions that are (arguably) grounded in engineering thinking:

    ・Budgeting reliability: decision-making based on error budgets
    ・Developer productivity metrics: quantitative indicators such as DORA and SPACE ※1
    ・Development organization design: adaptive organization design for software products with Team Topologies ※2
    ・Observability: debugging by deduction, based on telemetry
    ※1 N. Forsgren, J. Humble, and G. Kim, "Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations", IT Revolution, 2018.
    ※2 Skelton, Matthew, and Manuel Pais, "Team Topologies: Organizing Business and Technology Teams for Fast Flow", IT Revolution, 2019.
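  13. The problem that trace data is almost never looked at ※1

    Finer-grained signals -> higher cost and overhead -> sampling -> missed edge cases.
    Desired state: wouldn't it be enough to trace only around the time a failure occurs?
    Approach ※2: keep a fixed amount of traces in a local in-memory buffer pool on each node and, once a problem is detected, go back and collect the data from all nodes.
    ※1 Paige Cruz, "99.99% of Your Traces Are (Probably) Trash", SREcon24 Americas, 2024.
    ※2 Zhang, Lei et al., "The Benefit of Hindsight: Tracing Edge-Cases in Distributed Systems", NSDI, 2022.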
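The retroactive collection idea on this slide can be illustrated with a small sketch. This is not the implementation from the NSDI paper: the class `TraceBuffer`, the function `collect_retroactively`, the node names and the span contents are all hypothetical, and a bounded `collections.deque` stands in for the per-node buffer pool.

```python
from collections import deque


class TraceBuffer:
    """Fixed-capacity in-memory buffer holding the most recent spans on one node."""

    def __init__(self, capacity):
        # A deque with maxlen silently evicts the oldest span when it is full.
        self._spans = deque(maxlen=capacity)

    def record(self, trace_id, timestamp, payload):
        self._spans.append((trace_id, timestamp, payload))

    def collect(self, trace_id):
        """Return the spans still buffered for one trace, oldest first."""
        return [s for s in self._spans if s[0] == trace_id]


def collect_retroactively(buffers, trace_id):
    """After a symptom is detected, pull the matching spans from every node."""
    spans = []
    for node, buf in buffers.items():
        spans.extend((node, ts, payload) for _, ts, payload in buf.collect(trace_id))
    return sorted(spans, key=lambda s: s[1])  # order by timestamp


# Hypothetical two-node system; every span is buffered locally, nothing is shipped.
buffers = {"frontend": TraceBuffer(capacity=3), "db": TraceBuffer(capacity=3)}
buffers["frontend"].record("t1", 1.0, "GET /cart")
buffers["db"].record("t1", 1.2, "SELECT cart")
buffers["frontend"].record("t2", 2.0, "GET /home")
buffers["db"].record("t1", 2.5, "timeout")  # the symptom is detected here

# Only now is trace t1 gathered from all nodes.
edge_case = collect_retroactively(buffers, "t1")
```

In this sketch the cost of shipping data is paid only for traces that turn out to matter; the trade-off is that spans older than the buffer capacity are already gone when the symptom is detected.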
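  14. The problem that incident response is hard to improve: identify the bottleneck in TTR (Time to Resolve) quantitatively.

    Figure reproduced from Fig. 2 of Xiaoyun Li, et al., "Going through the Life Cycle of Faults in Clouds: Guidelines on Fault Handling", ISSRE'22.
    1. Measure the time spent in each stage of the life cycle.
    2. Within each stage, identify where similar causes account for a large amount of time.
    3. Starting from the largest, make fundamental improvements in priority order.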
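The three steps on this slide can be sketched roughly as follows. The stage names (`detect`, `triage`, `mitigate`), the cause labels and the minutes are invented sample data for illustration; they are not the life-cycle stages or figures from the ISSRE'22 paper.

```python
from collections import defaultdict

# Hypothetical incident records: minutes spent per life-cycle stage, plus a cause label.
incidents = [
    {"cause": "config", "stages": {"detect": 5, "triage": 40, "mitigate": 15}},
    {"cause": "config", "stages": {"detect": 3, "triage": 55, "mitigate": 10}},
    {"cause": "capacity", "stages": {"detect": 20, "triage": 10, "mitigate": 30}},
]


def stage_totals(records):
    """Steps 1 and 2: total minutes per (stage, cause) pair."""
    totals = defaultdict(int)
    for record in records:
        for stage, minutes in record["stages"].items():
            totals[(stage, record["cause"])] += minutes
    return dict(totals)


def rank_bottlenecks(records):
    """Step 3: order the (stage, cause) pairs from largest to smallest total."""
    return sorted(stage_totals(records).items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_bottlenecks(incidents)
worst_stage_and_cause, worst_minutes = ranking[0]
```

With this sample data the largest entry is triage of configuration-related incidents, so that is where improvement work would start.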
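  15. Couldn't SLOs be used for all kinds of decisions?

    The remaining error budget can inform choices such as:
    ・Identify the cause first, or prioritize recovery?
    ・Carry out fundamental countermeasures afterwards, or not?
    ・Increase or decrease redundancy?
    ・Add or remove alerts?
    ・...
    Control adaptively so that the remaining budget is not used up.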
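As a minimal sketch of using the remaining error budget as a decision input: the function names, the request counts and the 25% threshold below are assumptions made for illustration, not a policy stated in the talk.

```python
def error_budget_remaining(slo, total_requests, failed_requests):
    """Fraction of the error budget still unspent (negative once the SLO is violated)."""
    allowed_failures = total_requests * (1.0 - slo)
    if allowed_failures == 0:
        raise ValueError("an SLO of 100% (or zero traffic) leaves no error budget")
    return 1.0 - failed_requests / allowed_failures


def decide(remaining, threshold=0.25):
    """Hypothetical policy: protect reliability when little budget is left."""
    if remaining < threshold:
        return "prioritize recovery and reliability work"
    return "keep shipping changes"


# A 99.9% SLO over 1,000,000 requests allows about 1,000 failures.
healthy = error_budget_remaining(0.999, 1_000_000, 250)    # about 0.75 left
depleted = error_budget_remaining(0.999, 1_000_000, 900)   # about 0.10 left
```

The same remaining-budget value could feed any of the choices listed on the slide; only the thresholds and the actions attached to them would differ.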
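  16. Shouldn't it be possible to derive the system architecture from SLOs? ※1, ※2

    Diagram labels: SLOs, Workloads, System Architecture, capacity, high availability, load balancing, caching, asynchronous processing, Data Structure, incident response organization.
    ※1 Yoshifumi Yamaguchi, "信頼性目標とシステムアーキテクチャー" (Reliability Objectives and System Architecture), SRE NEXT 2023. https://speakerdeck.com/ymotongpoo/reliability-objective-and-system-architecture
    ※2 r9y, https://r9y.dev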
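  17. Looking for connections from SREcon

    A certain number of SREcon presentations have an academic background, and the fields are varied:
    ・Distributed systems
    ・Reliability engineering / resilience engineering / safety engineering
    ・Sociology
    ・Cognitive science
    It is not unusual for researchers, Ph.D. holders, and doctoral students to speak. Search query: "site:https://www.usenix.org PhD SREcon"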
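  18. Gray Failure @SREcon24 Americas (skipped in the talk)

    The problem that failure detectors do not notice even when the application is failing.
    Ze Li and Ryan Huang, "Gray Failure: The Achilles' Heel of Cloud-Scale Systems", SREcon24 Americas.
    A case study of the gray failure problem, which was presented at international academic venues such as HotOS'17. The topic was also covered at SREcon23 EMEA and SREcon22 EMEA.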
  19. Metastable Failure @SREcon23 Americas (skipped in the talk)

    The problem where a trigger degrades the system, and the system stays degraded even after the trigger is removed.
    Kyle Lexmond, "We're Still Down: A Metastable Failure Tale", SREcon23 Americas.
    A failure pattern originally presented at the international workshop HotOS'21 ※1.
    ※1 Bronson, Nathan Grasso et al., "Metastable failures in distributed systems", HotOS'21.
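A toy model can make the "stays degraded after the trigger is removed" behaviour concrete. It assumes one common sustaining mechanism, retries of failed requests, and every number and name in it is made up for illustration; it is not taken from the talk or the HotOS'21 paper.

```python
def simulate(base_load, capacity, retries_per_failure, spike, ticks):
    """Toy model: each failed request comes back as retries on the next tick.

    The spike (the trigger) is applied only on the first tick.
    Returns the number of failed requests per tick.
    """
    retry_load = 0
    failures = []
    for tick in range(ticks):
        offered = base_load + retry_load + (spike if tick == 0 else 0)
        failed = max(0, offered - capacity)  # requests beyond capacity fail
        failures.append(failed)
        retry_load = failed * retries_per_failure
    return failures


# Same one-tick trigger, different retry amplification.
recovers = simulate(80, 100, retries_per_failure=1, spike=50, ticks=6)
stays_down = simulate(80, 100, retries_per_failure=3, spike=50, ticks=6)
```

With one retry per failure the backlog drains once the spike is gone. With three retries per failure the retries alone keep the offered load above capacity, so the failure sustains itself although the base load (80) is below capacity (100).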
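  20. Cross-System Interaction Failures @SREcon23 Americas (skipped in the talk)

    Systematizes failures that arise from interactions between systems, and proposes testing and verification methods to prevent them.
    The original paper was presented at the international conference EuroSys'23 ※1. One of the authors also presented at SREcon.
    ※1 Tang, Lilia et al., "Fail through the Cracks: Cross-System Interaction Failures in Modern Cloud Systems", EuroSys 2023.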
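  21. Measuring Reliability Culture @SREcon24 Americas

    Measures reliability culture quantitatively and qualitatively through interviews and surveys of engineers at Meta.
    Kathryn (Casey) Bouskill, "Measuring Reliability Culture to Optimize Tradeoffs: Perspectives from an Anthropologist", SREcon24 Americas.
    54% of teams "Find it hard to identify reliability gaps": the concrete actions needed to improve reliability are unclear, or they are hard to prioritize.
    The speaker holds a doctorate in anthropology and a master's degree in epidemiology.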
  22. Ironies of Automation @SREcon19 Asia

    The irony that the more humans are removed through automation, the more advanced the skills demanded of the humans who remain.
    Tanner Lund, "Ironies of Automation: A Comedy in Three Parts", SREcon19 Asia.
    Based on a 1983 paper by the cognitive psychologist Bainbridge ※1. The problem persists as of 2018 ※2, and a new irony has been pointed out: automated systems conceal the human operators' lack of ability.
    ※1 L. Bainbridge, "Ironies of Automation", Automatica, Vol.19, No.6, pp.775–779, 1983.
    ※2 B. Strauch, "Ironies of Automation: Still Unresolved After All These Years", IEEE Transactions on Human-Machine Systems, Vol.48, No.5, pp.419–433, 2018.
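  23. Process Feeling @SREcon21

    David D. Woods, Laura Nolan, "You've Lost That Process Feeling: Some Lessons from Resilience Engineering", SREcon21, 2021.
    ・The speaker, David Woods, is a leading authority on resilience engineering.
    ・Argues for the importance of "Process Feel": understanding the state of a complex system intuitively.
    ・Operators at nuclear power plants sensed whether things were normal from the steady sound of the control system's counters.
    ・Even when the system is normal within the range of its SLOs, partial anomalies can be noticed right away.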
  24. Controlling the Costs of Coordination @SREcon20 Americas

    Laura Maguire, "The Secret Lives of SREs - Controlling the Costs of Coordination across Remote Teams", SREcon20 Americas, 2020.
    ・Concentrating information on the incident commander causes cognitive overload, so a more distributed coordination model is proposed.
    ・Published as a doctoral dissertation in 2020 ※1.
    ・In her doctoral research in Integrated Systems Engineering, the speaker studied 62 incident response cases from 4 organizations.
    ※1 Laura Maguire, "Controlling the Costs of Coordination in Large-scale Distributed Software Systems", Dissertation, The Ohio State University, 2020.
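  25. Implications drawn from SREcon

    1. Systems are becoming more complex, and new failure patterns are emerging. ([Gray | Metastable | Cross-System Interactions] Failures)
    2. The answer to complexity may not be simplification? ("Only complexity can reduce complexity.")
    3. Do not dismiss things as "culture"; measure culture as well. (Measuring Reliability Culture)
    4. Are there limits to removing operators through automation? (Ironies of Automation)
    5. How can the cognitive load on operators be reduced? (Process Feeling: the individual operator's cognitive load in understanding the system. Controlling the Costs of Coordination: the cognitive load of sharing information among operators.)
    For questions that are not specific to computers, other academic fields offer insight.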
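  26. Summary of the revisit

    1. Historical background of the move to engineering: from the literature of USENIX LISA, software reliability engineering, Internet operations technology, and others; the possibility of SRE as a development of Human-Computer Engineering.
    2. Examples of unsolved problems: familiar problems and questions that go to the root ※1.
    3. Fields that connect to SRE: systems engineering, resilience engineering, cognitive science, anthropology, sociology.
    ※1 Sheena Iyengar (author), Yuko Sakurai (trans.), "THINK BIGGER" (Japanese edition: 「最高の発想」を生む方法: コロンビア大学ビジネススクール特別講義), NewsPicks, 2023.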
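  27. Is it really "engineering > craft"? Isn't craft something to coexist with rather than something to eliminate?

    "Systems engineering is a science and, at the same time, an art." ※1
    "Engineering is the study of creating culture." ※2
    "... originally, technology and art were almost one and the same." ※2
    ※1 "開発者のためのシステムズエンジニアリング導入の薦め" (A Recommendation to Adopt Systems Engineering, for Developers), version 1.1, IPA, 2017.
    ※2 Hiroshi Harashima, "文化創造学としての工学" (Engineering as the Study of Creating Culture), Journal of the IEICE (電子情報通信学会誌), Vol.99, No.4, 2016.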
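  28. The contrasts presented in this talk

    Craft-like: means-oriented, tool-driven, subjective, reactive, local viewpoint, magical.
    Engineering-like: goal-oriented, theory-driven, objective, proactive, bird's-eye view of the whole.
    Further axis: top-down / bottom-up.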
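  29. Technology (craft) and engineering ※1

    "Technology may be defined as the totality of the 'skills' that humans devise and use, based on their own sense of purpose and aiming to achieve their goals, in order to live out their lives."
    "The word engineering can be defined as such technology turned into an academic discipline":
    (1) It has been put into a form that can be widely communicated, through language and other means.
    (2) It takes the form of "knowledge" systematized by specialized field.
    In general, technology has not necessarily been "turned into knowledge".
    ※1 Yoichiro Murakami, "工学の歴史と技術の倫理" (The History of Engineering and the Ethics of Technology), Iwanami Shoten, 2006.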
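  30. Systems Engineering ※1

    Definition (by the Japan Council on Systems Engineering): an approach and means, spanning multiple specialized fields, for making systems successful.
    (1) Goal orientation and a bird's-eye view of the whole
    (2) Integration of diverse specialized fields
    (3) Abstraction and modeling
    (4) Discovery and evolution through iteration
    ※1 "開発者のためのシステムズエンジニアリング導入の薦め" (A Recommendation to Adopt Systems Engineering, for Developers), version 1.1, IPA, 2017.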
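  31. Software Engineering

    "Software engineering is the application of a systematic, disciplined, quantifiable approach to the development, testing, deployment, operation, and maintenance of software systems." ※1
    ※1 Ivar Jacobson, et al., "モダン・ソフトウェアエンジニアリング" (Modern Software Engineering, Japanese edition), Shoeisha, 2020.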
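  32. SRE gives system administration a purpose and a bird's-eye view of the whole

    Diagram labels: SRE, DevOps, functional requirements, non-functional requirements (reliability), product management, users, observability / monitoring, delivery, incident management, change management, fast hypothesis testing, appropriate control, SLI/SLO.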
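  33. The Wiener interface

    Source: Naotaka Fujii, "現実とは？ 脳と意識とテクノロジーの未来" (Hayakawa Shinsho), p. 14, Kindle Edition.
    Wiener wrote something interesting in the preface to the Japanese edition of "Cybernetics":
    > Suppose there are two variables concerning our situation, one of which we cannot control and the other of which we can adjust. We then wish to set the value of the adjustable variable appropriately, based on the values of the uncontrollable variable from the past up to the present, so as to bring about the situation most favorable to us. The method for achieving this is nothing other than Cybernetics.
    I find this way of thinking very interesting; in fact, it could be said to describe the purpose of engineering, or of technology. Wiener says the world can be divided into a "controllable world" and an "uncontrollable world", and I think the boundary between them might well be named the "Wiener interface". (Masahiko Inami)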
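  34. Internet technology

    "There was once a time when 'Internet technology' was said not to be an academic subject, or not to be material for papers." ※1
    ※1 Yasuo Okabe, "インターネット技術分野の次世代研究者育成" (Fostering Next-Generation Researchers in the Field of Internet Technology), IEICE Technical Report (信学技報), vol. 111, no. 321, IA2011-40, pp. 31-32, 2011.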
  35. What is a theory?

    According to the book "世界標準の経営理論" ※2, there are various views on what a theory is, but there is near consensus on the following: the purpose of a theory is to answer How, When, and Why ※1.
    ・how: a causal relationship such as "X -> Y".
    ・when: the range within which the theory holds (boundary condition).
    ・why: an explanation of why the causal relationship is so.
    ※1 "The primary goal of a theory is to answer the questions of how, when, and why, unlike the goal of description, which is to answer the question of what". (Bacharach, 1989, pp.498)
    ※2 Akie Iriyama, "世界標準の経営理論", Diamond, Inc., 2019.
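  36. User Uptime @SREcon21

    Anika Mukherji, "User Uptime in Practice", SREcon, 2021.
    ・An availability metric used for Google's G Suite, proposed at USENIX NSDI, a top conference in the systems field ※1.
    ・Based on each user's actual uptime, availability is evaluated over multiple time windows at once.
    ・Conventional availability metrics are biased toward active users and do not account for partial outages.
    ※1 Hauer, et al., "Meaningful Availability", USENIX NSDI 2020.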
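A small worked example may help show the bias toward active users. It assumes the ratio form of user uptime, total user uptime divided by total user uptime plus downtime, and leaves out the multi-window evaluation; the two users and all their numbers are invented sample data.

```python
# Hypothetical per-user observations over one hour.
users = [
    # A heavy user who was fine for the whole hour.
    {"requests": 1000, "failed": 0, "uptime_min": 60, "downtime_min": 0},
    # A light user for whom the service was down for the whole hour.
    {"requests": 10, "failed": 10, "uptime_min": 0, "downtime_min": 60},
]


def success_ratio(observations):
    """Conventional request-based availability: successful / total requests."""
    total = sum(u["requests"] for u in observations)
    failed = sum(u["failed"] for u in observations)
    return (total - failed) / total


def user_uptime(observations):
    """User uptime: sum of per-user uptime over sum of per-user uptime + downtime."""
    up = sum(u["uptime_min"] for u in observations)
    down = sum(u["downtime_min"] for u in observations)
    return up / (up + down)


request_based = success_ratio(users)   # dominated by the heavy user
time_based = user_uptime(users)        # both users weigh the same per minute
```

Here the request-based figure is about 99%, because the heavy user dominates the request count, while user uptime is 50%, because one of the two users could not use the service at all.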