サービス立ち上げ期におけるSREの取り組み / SRE efforts in the service launch phase

Takeshi Kondo

January 19, 2022

Transcript

  1. SRE efforts in the service launch phase
    Takeshi Kondo / @chaspy 2022/01/19
    [iCARE Dev Meetup #29] The struggles and joys of engineers launching a new service
  2. Who am I chaspy chaspy_ SRE at sisterwith.com Takeshi Kondo

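  3. What I will talk about today / Audience
    • What I will talk about
      • How the SRE way of thinking helps during the service launch phase
      • How to apply SRE thinking and put it into practice
      • This is mostly the content of the blog post: https://blog.sisterwith.com/blog/sre-for-sister
    • Audience
      • People who are unsure how to think about reliability when launching a service
      • People who want to practice SRE but do not know where to start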
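  4. Tl;dr
    • SRE thinking can be applied even in the service launch phase
    • Imagine what your users expect in terms of reliability
    • Build an SRE roadmap that matches the scale of your service and organization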
  5. Agenda
    1. What is SRE?
    2. Personal development and SRE
    3. Case study at sister
    4. A ladder for practicing SRE
  6. Agenda
    1. What is SRE?
    2. Personal development and SRE
    3. Case study at sister
    4. A ladder for practicing SRE
  7. What is SRE?
    • SRE = Site Reliability Engineering
    • Its origin is "having software engineers run service operations" (*1)
    • Its core concept is SLI/SLO (*2): turn the service level users expect into indicators, and use them as a guide for deciding whether to invest in feature development or non-functional development
    *1 Site Reliability Engineering: https://sre.google/sre-book/introduction/ "our Site Reliability Engineering teams focus on hiring software engineers to run our products and to create systems to accomplish the work that would otherwise be performed, often manually, by sysadmins."
    *2 Service Level Indicator / Service Level Objective
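  8. A common refrain (citation needed)
    • "Isn't that only necessary for a service as large as Google's?"
    • "In personal development or at a startup, building features that users will actually use obviously comes first, so SRE has nothing to do with us!"
    • (I am exaggerating.)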
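  9. What is SRE? (revisited)
    • -> Its core concept is SLI/SLO: turn the service level users expect into indicators, and use them as a guide for deciding whether to invest in feature development or non-functional development
    • In other words...
      • Are we providing the service level users expect?
      • Can we minimize the time during which we fail to provide it?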
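The first question on this slide, whether the expected service level is being provided, can be made concrete with a small calculation. The following is only a minimal sketch, not something taken from the talk: the function names, the request counts, and the 99.9% target are all hypothetical values chosen for illustration.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Return the SLI: the fraction of events that met user expectations."""
    if total_events == 0:
        # Assumption for this sketch: with no traffic, nothing failed.
        return 1.0
    return good_events / total_events


def meets_slo(sli: float, slo_target: float) -> bool:
    """Return True when the measured SLI reaches the SLO target."""
    return sli >= slo_target


# Hypothetical numbers: 10,000 requests, 10 of which failed or were too slow.
sli = availability_sli(good_events=9_990, total_events=10_000)
print(f"SLI = {sli:.4f}, meets 99.9% SLO: {meets_slo(sli, 0.999)}")
```

What counts as a "good" event (a successful response, a response under some latency threshold, and so on) is the part that has to come from imagining what users expect; the arithmetic itself stays this simple.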
  10. An aside: a 100% reliability target is the wrong target
    • 100% is the wrong reliability target (*3)
    • Each additional nine (99.9%, 99.99%, ...) comes at a steep increase in cost
    • 100% is impossible = we should start from the premise that failures will happen
    *3 Site Reliability Engineering: https://sre.google/sre-book/introduction/ "The error budget stems from the observation that 100% is the wrong reliability target for basically everything"
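To see why each extra nine is expensive, it helps to convert an availability target into the downtime it allows. The sketch below assumes a 30-day window and a purely time-based availability definition; both are illustrative assumptions, and the function name is hypothetical.

```python
def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability target allows within the window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


# Each extra nine shrinks the allowed downtime by a factor of ten.
for target in (0.99, 0.999, 0.9999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target:.2%} -> {minutes:.1f} minutes per 30 days")
```

Under these assumptions, 99.9% leaves roughly 43 minutes per 30 days and 99.99% leaves roughly 4 minutes, which is less time than most manual incident responses take.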
  11. Agenda
    1. What is SRE?
    2. Personal development and SRE
    3. Case study at sister
    4. A ladder for practicing SRE
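  12. How should we think of the personal development phase?
    • Users may be few, but they exist
    • If users are satisfied, usage grows
    • If users cannot use the service satisfactorily, or it does not work the way they expect
      • Users leave easily
    Whether in personal development or large-scale development, reliability that meets user expectations is as important as feature development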
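  13. An aside: reliability is one of the most important features
    • Reliability Is the Most Important Feature (*4)
    • If a system is not reliable, users will not trust it
    • If users do not trust a system, they will not use it
    • Systems spread through network effects, so a system without users has no value
    • Choose what you measure carefully
    *4 The Site Reliability Workbook: https://sre.google/workbook/reaching-beyond/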
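  14. SRE in personal development: where to start?
    • 1. Are we providing the service level users expect?
    • 2. Can we minimize the time during which we fail to provide it?
    • In other words...
      • Make it possible to catch a faulty change before it reaches production
      • Notice right away when a faulty change does reach production
      • When a faulty change reaches production, make its cause investigable
      • When a faulty change reaches production, be able to release the fix quickly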
  15. Agenda
    1. What is SRE?
    2. Personal development and SRE
    3. Case study at sister
    4. A ladder for practicing SRE
  16. Practices at sister

  17. Practices at sister
    • Developer Productivity
    • Observability
    • Testing
    • Security
  18. Practices at sister

  19. Agenda
    1. What is SRE?
    2. Personal development and SRE
    3. Case study at sister
    4. A ladder for practicing SRE
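  20. A ladder for practicing SRE
    • Three stages, depending on organization size and phase
      • Launch period (sister is here)
      • Up to production operation (team size up to 10 people)
      • From the start of production operation through the growth period (up to 50 people)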
  21. A ladder for practicing SRE
    • Launch period (sister is here)
      • Developer Productivity (Local Environment)
      • Release Engineering, Unit Test, CICD
      • Observability (Logging, Metrics, Tracing)
    Speed up the cycle of developing, applying, and verifying changes
    Build mechanisms for noticing problems quickly
  22. A ladder for practicing SRE
    • Up to production operation (team size up to 10 people)
      • Continuous Library Update (renovate/dependabot)
      • Data Protection
      • Availability (AutoScaling, Redundancy)
      • Performance Improvement
    Prepare for growth in the number of users and the volume of data
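  23. A ladder for practicing SRE
    • From the start of production operation through the growth period (up to 50 people)
      • E2E Test Automation
      • SLI/SLO/Error Budget Policy
      • Incident Response Management / Training
      • Load Test / Stress Test
    Secure the reliability the organization and team are aiming for
    Prepare and design with the next several years in mind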
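The "SLI/SLO/Error Budget Policy" item in this stage can be illustrated with a small calculation. This is a hedged sketch under stated assumptions rather than the policy used at sister: the function names, the request counts, and the simple rule "freeze features once the budget is spent" are hypothetical, and a real policy would be agreed on by the team.

```python
def error_budget_remaining(slo_target: float, good: int, total: int) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    allowed_bad = (1.0 - slo_target) * total
    if allowed_bad == 0:
        # Assumption for this sketch: no traffic or a 100% target leaves no budget.
        return 0.0
    actual_bad = total - good
    return 1.0 - actual_bad / allowed_bad


def release_decision(remaining: float) -> str:
    """Hypothetical policy: keep shipping while any budget remains."""
    if remaining > 0:
        return "ship features"
    return "freeze features and prioritize reliability work"


# Hypothetical window: 100,000 requests against a 99.9% SLO, 50 of them bad.
remaining = error_budget_remaining(0.999, good=99_950, total=100_000)
print(f"budget remaining: {remaining:.0%} -> {release_decision(remaining)}")
```

The point of such a policy is the one made earlier in the talk: the indicator becomes the guide for choosing between feature development and non-functional development.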
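  24. Summary
    • SRE thinking can be applied even in the service launch phase
    • Imagine what your users expect in terms of reliability
    • Build an SRE roadmap that matches the scale of your service and organization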
  25. In closing
    • sister (sisterwith.com) is looking for "big sisters" (mentors) and "little sisters" (mentees)
    • If you have any SRE-related topics, feel free to send a Twitter DM!
    • https://twitter.com/_chaspy
  26. Thank you! chaspy chaspy_ SRE at sisterwith.com Takeshi Kondo