Who owns the Service Level?
Takeshi Kondo
May 15, 2022
SRE NEXT 2022
https://sre-next.dev/2022/
Transcript
Who owns the Service Level? Takeshi Kondo / @chaspy 2022/05/15
SRE NEXT 2022
Who am I
Takeshi Kondo (chaspy / chaspy_)
Engineering Manager, Site Reliability at Recruit Co., Ltd.
https://chaspy.me
Background: product introduction - StudySapuri
Hello, again! SRE NEXT
Today's topic: what it takes to realize SRE
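TL;DR / What it takes to realize SRE
• A culture of continuous improvement based on indicators
• A platform that supports self-contained teams
• A technology strategy for changing priorities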
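Agenda
• Background: what does it mean to realize SRE?
• SRE in the StudySapuri business, looking back at its history
• Lessons and failures from the 2020 "SLO Review"
• SRE and technology strategy
• Summary and what comes next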
The background part is a recap, with the same content as the six-company joint study session held in March.
Realizing SRE means
that development teams have acquired the capability to control reliability
What is Site Reliability Engineering in the first place? Not like this
• The service maintains "high reliability (close to 100%)"
• SLI/SLO are being met
• The on-call rotation is handled by the development team
https://github.com/twitter/twemoji
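What is Site Reliability Engineering in the first place? Like this!
• The service maintains "the reliability that users expect"
• SLI/SLO are defined and used as indicators for deciding priority between non-functional and functional requirements
• The team has agreed on monitoring methods and policies that allow an appropriate response when an SLO violation occurs
• All of the above is reviewed periodically
https://github.com/twitter/twemoji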
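Development teams acquire the capability to control reliability: Like this!
Diagram: SRE supports the development team in acquiring reliability-related capability; the development team controls the reliability of its own services by itself.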
Team Topologies
• Four team patterns
  • Stream Aligned
  • Platform
  • Enabling
  • Complicated Subsystem
• Three communication patterns
  • Collaboration
  • X as a Service
  • Facilitation
https://pub.jmam.co.jp/book/b593881.html
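Development teams acquire the capability to control reliability: Like this!
Diagram: SRE (Platform Team / Enabling Team) builds the platform and culture that support the development teams' self-containment; the development team (Stream Aligned Team) can prepare what it needs by itself = self-contained.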
• Support is temporary and must not be allowed to drag on
Aside: the word "enabling" has both positive and negative connotations
https://ja.wikipedia.org/wiki/%E3%82%A4%E3%83%8D%E3%83%BC%E3%83%96%E3%83%AA%E3%83%B3%E3%82%B0
SRE Team Vision / Mission / Values https://blog.studysapuri.jp/entry/sre-vision-mission-values
Mission: build the platform and culture that let self-contained teams keep delivering products quickly and safely
Why is self-containment important?
Diagram: SRE (Platform Team / Enabling Team) builds the platform and culture that support self-containment; the development team (Stream Aligned Team) can prepare what it needs by itself.
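Why is self-containment important? Not "VS", but "And"
• Dev and Ops (not "vs")
  • Get feedback from users quickly (DevOps)
• Dev and Infrastructure (not "vs")
  • Build through self-service to shorten lead time
• Productivity and Reliability (not "vs")
  • Productivity and reliability depend on each other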
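Why is self-containment important?
• A self-contained team
  • can keep getting feedback from the production environment
  • can make fast decisions with minimal communication
  • can respond to non-functional requirements as well as functional ones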
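Summary: what realizing SRE means
• Development teams have acquired the capability to control reliability
  • This is the state in which development teams are "self-contained"
  • The SRE team supports this through the platform and by fostering culture
• Achieving this requires diverse perspectives that go beyond product development
  • Knowing what users expect / Product Management
  • High development productivity / Development Skills
  • How much cost to spend on non-functional requirements / Business Development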
StudySapuri SRE, looking back at its history: after @chaspy joined
• 2018: @chaspy joins
• 2019: Application Platform migrated to Kubernetes
• 2020: Microservices Readiness put in place
  • Service ownership defined
  • Design Doc / Production Readiness Checklist
  • Self-service infrastructure (Terraform monorepo)
  • SLI/SLO
• 2021: SLI/SLO operation fully handed over to the development teams
As a Platform Team, we build the platform. As an Enabling Team, we foster culture such as SLI/SLO in the development organization.
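StudySapuri SRE, looking back at its history: 2021
• COVID-19 pandemic, increase in traffic
• Evolution of the platform
  • Terraform monorepo
  • Loadtest Platform
  • Monorepo CI with GitHub Actions
• Organizational changes
  • Technology Strategy Group launched
  • Employees transferred to Recruit due to a business transfer; the Quipper Japan branch was liquidated
  • chaspy appointed EM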
StudySapuri SRE, looking back at its history: 2022
• A new form of SRE practice
  • Partially Embedded SRE
  • Partially Enabling SRE in development team
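Organization size over time (chart, five data points)
Developers: 35, 53, 54, 73, 114
SRE: 4, 5, 7, 7, 7
"Developers" is the number of Web Engineers (frontend & backend) across both StudySapuri and Quipper. Native engineers are excluded. From 2022, contractors are also counted. In 2021 and earlier we also worked with contractors.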
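StudySapuri SRE, looking back at its history
• In every period the team has acted as both a Platform Team and an Enabling Team
• Since 2019 in particular, with "self-containment" as the theme, we have built a platform that minimizes the number of requests made to us
• At the same time we stepped into "building culture" in the development teams and fostered a culture of watching SLI/SLO across the organization
• → "SLO Review" at SRE NEXT 2020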
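"SLO Review" at SRE NEXT 2020
• An initiative that created a culture of reviewing SLOs in the development organization
• Introduced bottom-up to 2 products and 15 teams
• The four steps of the climb
  • Decide ownership of systems and of the organization
  • Run the process alone and establish how to do it
  • Define SLI/SLO together with developers and run the process
  • Define an Error Budget Policy and act on it (not yet achieved)
• Lessons learned
  • Provide standardized SLIs
  • Put the configuration into code at an early stage
  • Make the learning curve steep
https://blog.studysapuri.jp/entry/2020/01/30/slo-review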
What went well (annotations on the "SLO Review" slide)
• We were committed to thoroughly lowering the development teams' cognitive load
• We ran feedback cycles in order to reduce goal uncertainty
https://blog.studysapuri.jp/entry/2020/01/30/slo-review
What did not go well? (annotation on the "SLO Review" slide, at the step "Define an Error Budget Policy and act on it (not yet achieved)")
• Why did it not work?
https://blog.studysapuri.jp/entry/2020/01/30/slo-review
(Recap) What does it mean to realize SRE?
• Development teams have acquired the capability to control reliability
• Being able to "act based on indicators"
https://blog.studysapuri.jp/entry/2020/01/30/slo-review
Diagram: the development team defines and periodically reviews SLI/SLO (achieved), and acts according to the Error Budget Policy (did not go well).
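To make the SLI/SLO → Error Budget Policy → action flow in this slide concrete, here is a minimal Python sketch. It is not the speaker's implementation: the `SLOWindow` shape, the 25% threshold, and the action wording are all assumptions chosen for illustration. It uses the common definition of an error budget as the number of bad requests the SLO target allows within a window.

```python
from dataclasses import dataclass


@dataclass
class SLOWindow:
    """Request counts for one SLI over an SLO window (hypothetical shape)."""
    slo_target: float      # e.g. 0.999 means 99.9% of requests must be good
    total_requests: int
    good_requests: int


def error_budget_remaining(window: SLOWindow) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    allowed_bad = (1.0 - window.slo_target) * window.total_requests
    if allowed_bad == 0:
        raise ValueError("a 100% target or an empty window leaves no error budget")
    actual_bad = window.total_requests - window.good_requests
    return 1.0 - actual_bad / allowed_bad


def policy_action(remaining: float) -> str:
    """Map remaining budget to an action; thresholds and wording are illustrative."""
    if remaining < 0.0:
        return "SLO violated: triage a technical issue into the roadmap"
    if remaining < 0.25:
        return "budget nearly spent: review recent changes"
    return "within budget: continue feature work"
```

For example, with a 99.9% target and 1,000,000 requests, about 1,000 bad requests are allowed; 500 bad requests leave roughly half of the budget unspent.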
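Why did we not get as far as "acting"?
• At the time, actions on SLO violations were delegated to the Product Manager / team
• It is not that nothing could be done
  • But teams could only do what fit into the improvement time they already had budget for (one day every two weeks)
  • This is called QB Day
  • Short-term handling of errors, minor performance improvements, and so on
  • Long-term, fundamental measures such as architecture changes and infrastructure work were difficult
• We did not reach the point where "priority between functional and non-functional requirements can be judged based on indicators"
• If the indicators do not help with prioritization, then for the development teams they merely added more work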
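Why did we not get as far as "acting"?
• Lack of authority and budget to deal with reliability problems (non-functional requirements)
• Feature development had higher priority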
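The illusion of "stopping releases" when an SLO is violated
• Stopping all development to fix the problem whenever an SLO is violated is not realistic
  • To do that, the business owner has to agree to this way of operating
  • To do that, a "perfect SLI/SLO" is needed, but a "perfect SLI/SLO" does not exist
  • With the organization growing and roles split across Biz Dev / Product / CS, the communication cost of doing this by on-the-ground judgment is too high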
SLI/SLO operation: a comment from one developer
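"SLO Review" at SRE NEXT 2020: summary of what happened afterwards
• Creating a culture of "defining and observing reliability indicators" was valuable
• The indicators did not grow into something useful for decisions on business strategy
• Reason 1. The product development teams had neither the decision-making authority nor the budget to change the balance between non-functional and functional requirements
  • The incentive for new feature development was strong
  • There was no mechanism for making that kind of technology strategy / technology investment across all products
• Reason 2. The reliability indicators were not easy for everyone in Biz/Dev/SRE to understand
  • The SLIs of backend APIs do not represent user experience, and measures concerning latency are hard to explain to TPMs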
2021: Technology Strategy Group launched in the StudySapuri Elementary, Junior High and High School Product Development Department
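Current organization chart: the Elementary, Junior High and High School Product Development Department has 17 groups under it
Products: StudySapuri Elementary / Junior High / High School / University Entrance Exam courses, StudySapuri For TEACHERS, StudySapuri For SCHOOL
Labels on the chart: TPM (BtoB TPM, BtoC TPM, ForSCHOOL TPM, cross-functional), BtoC, BtoB, QA, Development Support, SRE, Technology Strategy, Coaching, New Development 1, Enhancement, Learning Support, Native (iOS, Android), New Development 2, Career Path / Self-direction, Communication Support, ForSCHOOL Mobile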
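Disclaimer
• The Technology Strategy Group was launched by the previous manager
• As of last year @chaspy was Lead of the DevOps WG -> now EM/Lead
• Following the previous manager's departure, the department head also serves as EM of the Technology Strategy Group and runs it together with other EMs
• The group was not launched because SLO violations could not be dealt with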
Technology strategy is: what to do, and what not to do, in the product development organization
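Why is a technology strategy needed?
• To keep up with changes in the business
  • Make systems, code, and the organization easy to change
  • Deal in advance with factors that hinder change (repaying technical debt)
  • Gain the ability to maintain that state over the long term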
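Why is a technology strategy "group" needed?
• How a technology strategy is decided differs by organization
  • A single CTO may decide top-down
  • Everyone may decide bottom-up by consensus
  • Or something in between
• The StudySapuri elementary, junior high and high school development organization is taking on the challenge of building a mechanism in which technology strategy does not depend on one person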
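What is the Technology Strategy Group?
• Purpose
  • Make the product development organization and its systems more resilient to change
• Goals
  • Define the technical vision and policy
  • Bring technical issues and debt under control
  • Establish an improvement cycle and acquire self-diagnostic ability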
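What we did in the first year
• Declared a budget of new development : enhancement : technical debt reduction = 1:1:1
• Decided on technical debt reduction projects and built them into the development roadmap
• Decided how to manage technical issues and how to triage them when they arise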
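A mechanism is now in place for taking action on SLO violations
• A fixed budget is allocated to non-functional requirements, and a structure is in place to manage and carry out the technical issues that arise
• Technical issues that used to be hard to carry out when an SLO was violated can now be built into the plan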
Working groups
• Purpose
  • Make the product development organization and its systems more resilient to change
• Goals
  • Define the technical vision and policy
  • Bring technical issues and debt under control
  • Establish an improvement cycle and acquire self-diagnostic ability
Working groups: DevOps WG, Cross-functional WG, Backend WG, Frontend WG
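Bring technical issues and debt under control
• Create a place and mechanism where technical issues can be submitted
• Prioritize technical issues
• Connect them to the next period's development roadmap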
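Create a place and mechanism where technical issues can be submitted
• A spreadsheet where anyone can submit technical issues
• Issues can be submitted casually with a reacji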
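Refinement of submitted issues
• Put the hardship (= the pain) into words
• Write down any "how" that comes to mind at this point (this column is optional, since all that is needed is to be able to assign a priority)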
Prioritizing issues
• Map issues by the frequency and intensity of the pain
• Give technical issues a relative evaluation
Connect to the next period's development roadmap
• Visualization of pain
• Status of connection to the development roadmap
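Of course, it is not perfect
• Technical issues cannot really be measured by frequency and intensity
  • The evaluation is qualitative
  • The participating members may be biased
  • Each issue is placed at the centroid of several members' votes, so its precision is questionable
• Some issues cannot be started right away because of development resources, technical difficulty, or risk
• Prioritizing issues takes time
• etc…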
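The mapping described above (each member votes on the frequency and intensity of the pain, and the issue is placed at the centroid of the votes) could be sketched as follows. This is only an illustration under stated assumptions: the 1-5 scale, the `Vote` type, and the use of frequency × intensity as a ranking score are not from the slides, which only mention a relative evaluation on a frequency/intensity map.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Vote:
    frequency: float  # how often the pain occurs (hypothetical 1-5 scale)
    intensity: float  # how much it hurts (hypothetical 1-5 scale)


def centroid(votes: list[Vote]) -> Vote:
    """Place an issue at the centroid (mean position) of all members' votes."""
    if not votes:
        raise ValueError("at least one vote is required")
    return Vote(mean(v.frequency for v in votes), mean(v.intensity for v in votes))


def rank_issues(votes_by_issue: dict[str, list[Vote]]) -> list[tuple[str, float]]:
    """Rank issues by frequency * intensity at the centroid (an assumed score)."""
    scored = []
    for issue, votes in votes_by_issue.items():
        c = centroid(votes)
        scored.append((issue, c.frequency * c.intensity))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Averaging hides disagreement between voters, which is one way to read the precision concern raised in the slide; recording the spread of votes next to the centroid would be a possible refinement.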
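However, we are now at a starting line that does not depend on specific individuals
• Anyone can raise a technical issue
• Every issue that is raised is always handled
• Everyone can face the problem together: "the problem vs. us"
• It helps secure the controllability of reliability over the long term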
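Purpose and activities of the DevOps WG
• Purpose: set up so that development teams acquire self-diagnostic ability
• Members: WebDev / QA / SRE from each area
• Activities
  • Value stream mapping
  • Identifying and measuring candidate metrics / indicators
  • Running DX Criteria
  • Resolving factors that block the value stream (e.g. E2E automation)
  • Outreach outside the Product Development Department
• We started by validating metrics and assessments that looked promising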
Value stream mapping: Meta Issue (Epic) creation to development
Value stream mapping: development complete to sprint review, QA to production release
Identifying and measuring candidate metrics / indicators
• Using the value stream mapping as input, we picked out hypotheses about the benefits of improvement and the things that block them
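One candidate metric that falls out of a value stream map is the elapsed time between milestones. The sketch below is a hypothetical example, not what the DevOps WG measured: the milestone names are loosely taken from the stages named in the slides, and the data shape is an assumption.

```python
from datetime import datetime

# Milestones loosely follow the stages named in the value stream map (assumed names).
MILESTONES = ["epic_created", "development_done", "sprint_review", "production_release"]


def stage_durations(milestones: dict[str, datetime]) -> dict[str, float]:
    """Return the hours between consecutive milestones, keyed as 'from -> to'."""
    durations = {}
    for start, end in zip(MILESTONES, MILESTONES[1:]):
        delta = milestones[end] - milestones[start]
        durations[f"{start} -> {end}"] = delta.total_seconds() / 3600
    return durations


def bottleneck(durations: dict[str, float]) -> str:
    """Return the stage that took the longest, as a candidate for improvement."""
    return max(durations, key=durations.get)
```

Aggregating these durations over many epics, rather than reading a single one, would give a steadier picture of which stage blocks the value stream.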
Q https://blog.studysapuri.jp/entry/2020/08/17/dx-criteria-system
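Outreach outside the Product Development Department: presentation at the BtoC All Hands
https://blog.studysapuri.jp/entry/2020/08/17/dx-criteria-system
• A place to learn about the state of the business beyond one's own group
  • Market news
  • State of the business
  • Product KPIs
  • SLI / developer productivity
  • Various other topics
    • What is SRE?
    • What are microservices? What is good about them?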
Outreach outside the Product Development Department: presentation at the BtoC All Hands
• Presented "SRE for Beginners" to people outside the development department
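Purpose and activities of the DevOps WG
• Purpose: set up so that development teams acquire self-diagnostic ability
• We are trying to create a culture in which development teams act based on indicators
• Isn't that the same as watching SLI/SLO to observe reliability?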
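SRE and technology strategy
• The DevOps WG's activities extend "realizing SRE" on the cultural side
  • The indicators we should watch are not only system reliability indicators
  • Look at indicators for everything and make decisions from them
• From here on we will run an improvement cycle on this culture-building itself
  • 1. The effectiveness of candidate metrics becomes clear, and they are visualized
  • 2. Development teams can look at them and think of actions
  • 3. Development teams run the cycle of action -> improvement
  • 4. Evaluate whether steps 1-3 themselves are going well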
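SRE and technology strategy: summary
• Realizing SRE requires the budget and authority to act when an SLO is violated
• On top of that, it requires a technology strategy that can prioritize which technical issues to solve
• The StudySapuri Elementary, Junior High and High School Product Development Department is taking on the challenge of realizing this technology strategy as a group, without depending on one person
• A culture of looking at everything through indicators matters for reliability too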
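Summary / What it takes to realize SRE
• A culture of continuous improvement based on indicators
• A platform that supports self-contained teams
• A technology strategy for changing priorities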
Realizing SRE means
that development teams have acquired the capability to control reliability
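SRE "NEXT" in StudySapuri
• With regard to "reliability", the SRE Team is starting to fulfil its role as an Enabling Team
• What comes next for the SRE Team
  • To make the organization more sustainable and scalable, expand onboarding for the platform and provide assessments that let development teams judge for themselves how far they have acquired reliability-related capability
  • Focus on developing a platform that delivers not only reliability but also development productivity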
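Who owns the Service Level?
• The Service Level belongs to everyone involved in the product
• Let's evolve reliability indicators into ones that everyone can care about
  • Use SLI/SLO on the client side (WebFrontend/Native), which represent user experience directly
• Let's run an improvement cycle on "acting on indicators" itself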
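As a rough illustration of a client-side SLI of the kind suggested above, the sketch below counts the share of page loads that succeeded and finished within a latency threshold. The `PageLoad` event shape and the 2000 ms threshold are assumptions for this example, not values from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageLoad:
    """One page load as reported by a web or native client (hypothetical shape)."""
    succeeded: bool
    duration_ms: float


def client_side_sli(events: list[PageLoad], threshold_ms: float = 2000.0) -> float:
    """Return the fraction of page loads that succeeded within the threshold."""
    if not events:
        raise ValueError("no events in the window")
    good = sum(1 for e in events if e.succeeded and e.duration_ms <= threshold_ms)
    return good / len(events)
```

Because the number reads as "the share of page loads that felt fine to the user", it may be easier to discuss across Biz, Dev, and SRE than a backend API latency percentile.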
Thank you!
Takeshi Kondo (chaspy / chaspy_)
Engineering Manager, Site Reliability at Recruit Co., Ltd.
https://chaspy.me