Slide 1

Slide 1 text

SRE Initiatives in the Service Launch Phase
Takeshi Kondo / @chaspy
2022/01/19
[iCARE Dev Meetup #29] The Struggles and Joys of Engineers Launching a New Service

Slide 2

Slide 2 text

Who am I chaspy chaspy_ SRE at sisterwith.com Takeshi Kondo

Slide 3

Slide 3 text

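Today's topics / Audience
• Topics
  • How the SRE way of thinking helps during a service's launch phase
  • How to apply SRE thinking and put it into practice
  • This talk mostly follows the blog post: https://blog.sisterwith.com/blog/sre-for-sister
• Audience
  • People who are unsure how to think about reliability when launching a service
  • People who want to practice SRE but do not know where to start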

Slide 4

Slide 4 text

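TL;DR
• SRE thinking applies even in the service launch phase
• Imagine what your users expect in terms of reliability
• Build an SRE roadmap that matches the scale of your service and organization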

Slide 5

Slide 5 text

Agenda
1. What is SRE?
2. Solo development and SRE
3. Case study at sister
4. A ladder for practicing SRE

Slide 6

Slide 6 text

Agenda
1. What is SRE?
2. Solo development and SRE
3. Case study at sister
4. A ladder for practicing SRE

Slide 7

Slide 7 text

What is SRE?
• SRE = Site Reliability Engineering
• Its origin is "having software engineers carry out service operations" (*1)
• Its core concepts are SLI/SLO (*2): turn the service level users expect into metrics, and use them as a guide for whether to invest in feature development or non-functional work

*1 Site Reliability Engineering: https://sre.google/sre-book/introduction/
"our Site Reliability Engineering teams focus on hiring software engineers to run our products and to create systems to accomplish the work that would otherwise be performed, often manually, by sysadmins."
*2 Service Level Indicator / Service Level Objective
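As a rough illustration of the SLI/SLO idea on this slide, the sketch below measures an availability SLI (the fraction of successful requests) and compares it with an SLO target. The class and function names, the 99.9% target, and the request counts are all assumptions made for this example; they do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class AvailabilitySLO:
    """Hypothetical SLO: the fraction of requests that must succeed."""
    target: float  # e.g. 0.999 means 99.9%


def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the measured fraction of successful requests."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing failed
    return good_requests / total_requests


def meets_slo(sli: float, slo: AvailabilitySLO) -> bool:
    """True when the measured level is at or above the objective."""
    return sli >= slo.target


# Illustrative numbers only.
slo = AvailabilitySLO(target=0.999)
sli = availability_sli(good_requests=99_950, total_requests=100_000)
print(f"SLI = {sli:.4%}, SLO met: {meets_slo(sli, slo)}")
```

When the SLI falls below the target, that is the signal to shift investment from features toward non-functional work.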

Slide 8

Slide 8 text

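A common refrain (citation needed)
• "Isn't that only necessary for services at Google's scale?"
• "In solo development or at a startup, building features that users will actually use obviously comes first, so SRE has nothing to do with us!"
• (Exaggerated for effect)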

Slide 9

Slide 9 text

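What is SRE? (revisited)
• -> Its core concepts are SLI/SLO: turn the service level users expect into metrics, and use them as a guide for whether to invest in feature development or non-functional work
• In other words...
  • Are we delivering the service level users expect?
  • Can we minimize the time during which we fail to deliver it?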

Slide 10

Slide 10 text

Aside: 100% is the wrong reliability target
• 100% is the wrong reliability target (*3)
• Each additional nine (99.9%, 99.99%, ...) costs substantially more to achieve
• 100% is impossible, so we should work from the premise that failures will happen

*3 Site Reliability Engineering: https://sre.google/sre-book/introduction/
"The error budget stems from the observation that 100% is the wrong reliability target for basically everything"
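To make the cost of each extra nine concrete, the sketch below converts an availability target into the downtime it permits. The 30-day window and the function name are assumptions chosen for illustration.

```python
from datetime import timedelta


def allowed_downtime(target: float, window: timedelta = timedelta(days=30)) -> timedelta:
    """Downtime that a given availability target permits within the window."""
    return window * (1.0 - target)


for target in (0.99, 0.999, 0.9999, 1.0):
    print(f"{target:.2%} -> {allowed_downtime(target)}")
```

Over 30 days, 99% allows about 7.2 hours of downtime, 99.9% about 43 minutes, and 99.99% about 4.3 minutes, while 100% allows none at all, which is why each step is so much harder to sustain than the last.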

Slide 11

Slide 11 text

Agenda
1. What is SRE?
2. Solo development and SRE
3. Case study at sister
4. A ladder for practicing SRE

Slide 12

Slide 12 text

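How should we view the solo development phase?
• There may be few users, but they exist
• If users are satisfied, usage grows
• If users cannot use the service satisfactorily, or cannot use it the way they expect
  • they leave easily

In solo development and large-scale development alike, reliability that meets user expectations matters just as much as feature development.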

Slide 13

Slide 13 text

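Aside: reliability is one of the most important features
• Reliability Is the Most Important Feature (*4)
• If a system is not reliable, users will not trust it
• If users do not trust a system, they will not use it
• Systems spread through network effects, so a system with no users has no value
• Choose what you measure carefully

*4 The Site Reliability Workbook: https://sre.google/workbook/reaching-beyond/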

Slide 14

Slide 14 text

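SRE in solo development: where to start?
• 1. Are we delivering the service level users expect?
• 2. Can we minimize the time during which we fail to deliver it?
• In other words...
  • Catch faulty changes before they reach production
  • Notice quickly when a faulty change does reach production
  • When a faulty change reaches production, be able to investigate its cause
  • When a faulty change reaches production, be able to release the fix quickly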
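One way to reason about "minimizing the time we fail to deliver" is to split an incident into the time taken to notice it and the time taken to release the fix. The sketch below does that arithmetic. The `Incident` record, its field names, and the timestamps are hypothetical and only serve the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical record of one faulty change reaching production."""
    released_at: datetime  # the faulty change went live
    detected_at: datetime  # someone (or an alert) noticed the problem
    fixed_at: datetime     # the fix was released


def time_to_detect(incident: Incident) -> timedelta:
    """How long the problem went unnoticed."""
    return incident.detected_at - incident.released_at


def time_to_fix(incident: Incident) -> timedelta:
    """How long investigation and releasing the fix took after detection."""
    return incident.fixed_at - incident.detected_at


def user_impact(incident: Incident) -> timedelta:
    """Total time users did not get the expected service level."""
    return incident.fixed_at - incident.released_at


# Made-up timestamps for illustration.
incident = Incident(
    released_at=datetime(2022, 1, 19, 10, 0),
    detected_at=datetime(2022, 1, 19, 10, 20),
    fixed_at=datetime(2022, 1, 19, 11, 5),
)
print("time to detect:", time_to_detect(incident))
print("time to fix:   ", time_to_fix(incident))
print("user impact:   ", user_impact(incident))
```

Monitoring and alerting shorten the first interval; logs, traces, and a fast release pipeline shorten the second; tests before release aim to avoid the incident entirely.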

Slide 15

Slide 15 text

Agenda
1. What is SRE?
2. Solo development and SRE
3. Case study at sister
4. A ladder for practicing SRE

Slide 16

Slide 16 text

Practices at sister

Slide 17

Slide 17 text

Practices at sister
• Developer Productivity
• Observability
• Testing
• Security

Slide 18

Slide 18 text

Practices at sister

Slide 19

Slide 19 text

Agenda
1. What is SRE?
2. Solo development and SRE
3. Case study at sister
4. A ladder for practicing SRE

Slide 20

Slide 20 text

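A ladder for practicing SRE
• Three stages, depending on organization size and phase
  • Launch phase (sister is here)
  • Up to production operation (team size up to 10)
  • From the start of production operation through the growth phase (up to 50)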

Slide 21

Slide 21 text

A ladder for practicing SRE
• Launch phase (sister is here)
  • Developer Productivity (Local Environment)
  • Release Engineering, Unit Test, CICD
  • Observability (Logging, Metrics, Tracing)

Speed up the cycle of developing, deploying, and verifying.
Build mechanisms for noticing problems quickly.

Slide 22

Slide 22 text

A ladder for practicing SRE
• Up to production operation (team size up to 10)
  • Continuous Library Update (renovate/dependabot)
  • Data Protection
  • Availability (AutoScaling, Redundancy)
  • Performance Improvement

Prepare for growth in the number of users and the volume of data.

Slide 23

Slide 23 text

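A ladder for practicing SRE
• From the start of production operation through the growth phase (up to 50)
  • E2E Test Automation
  • SLI/SLO/Error Budget Policy
  • Incident Response Management / Training
  • Load Test / Stress Test

Guarantee the reliability that the organization and team are aiming for.
Prepare and design with the next several years in mind.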
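An error budget policy ties the SLO back to the feature-versus-reliability decision. The sketch below computes how much of the budget remains and maps it to a next focus. The 25% threshold, the wording of the outcomes, and the function names are assumptions for illustration, not a policy described in the talk.

```python
def error_budget_remaining(slo_target: float, good: int, total: int) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    if not 0.0 < slo_target < 1.0:
        # A 100% target leaves no budget at all, so it is rejected here.
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total == 0:
        return 1.0  # no traffic, so no budget has been spent
    allowed_failures = (1.0 - slo_target) * total
    actual_failures = total - good
    return 1.0 - actual_failures / allowed_failures


def next_focus(remaining: float) -> str:
    """Hypothetical policy mapping the remaining budget to a team focus."""
    if remaining <= 0.0:
        return "reliability work only"
    if remaining < 0.25:
        return "prioritize reliability"
    return "feature development"


# Illustrative numbers: a 99.9% SLO with 500 failures in 1,000,000 requests.
remaining = error_budget_remaining(0.999, good=999_500, total=1_000_000)
print(f"budget remaining: {remaining:.0%} -> {next_focus(remaining)}")
```

Writing the thresholds down in advance is the point of the policy: the team agrees on what happens when the budget runs out before an incident forces the discussion.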

Slide 24

Slide 24 text

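Summary
• SRE thinking applies even in the service launch phase
• Imagine what your users expect in terms of reliability
• Build an SRE roadmap that matches the scale of your service and organization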

Slide 25

Slide 25 text

Closing
• sister (sisterwith.com) is looking for "big sisters" (mentors) and "little sisters" (mentees)
• If you have any SRE-related topic to discuss, feel free to DM me on Twitter!
• https://twitter.com/_chaspy

Slide 26

Slide 26 text

Thank you! chaspy chaspy_ SRE at sisterwith.com Takeshi Kondo