SRE を実現するための組織マネジメント / Management to achieve SRE
Takeshi Kondo
March 12, 2022
https://line.connpass.com/event/236497/
Transcript
Organizational Management to Achieve SRE. Takeshi Kondo / @chaspy, 2022/03/12, 6-company joint SRE study session
Who am I: Takeshi Kondo (chaspy / chaspy_), Engineering Manager, Site Reliability at Recruit Co., Ltd. https://chaspy.me
Affiliation: Recruit Co., Ltd., Product Division, Product Development Management Office, Product Development Office, Learning Domain Product Development Unit, Elementary/Junior High/High School Product Development Department, Elementary/Junior High/High School SRE Group, Group Manager
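What I will talk about today: a case study of using the Recruit Group's "mission management" to support a development team in acquiring SRE capability.
In other words, a case study of (Partially) Embedded / Enabling SRE.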
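Tl;dr
• There are two ways for a development team to acquire reliability-related capability
  • Embedded SRE (from Pure SRE): brought in from outside the team
  • Enabling SRE (in the Team): built up from inside the team
• The best pattern depends on organization size and phase
  • Small organization / early development phase: Embedded SRE pattern
  • Medium-to-large organization / maturing development team: Enabling SRE pattern
• These two patterns can be designed through management
• It does not have to be 100/0: practicing them even "partially" is effective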
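Disclaimer
• I present this as an example of management, but the results are thanks to the members who took on the missions, the SREs, and everyone on the development team. Thank you, as always!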
Agenda
• Premise: what does it mean to achieve SRE?
• Looking back at the history of SRE at "Study Sapuri"
• Case study: (Partially) Embedded / Enabling SRE
• Summary and next steps
Achieving SRE means that the development team has acquired the capability to control reliability.
What is Site Reliability Engineering in the first place? Not like this:
• The service maintains "high reliability (≈100%)"
• The SLI/SLO are being met
• The development team runs the on-call rotation
(Emoji: https://github.com/twitter/twemoji)
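What is Site Reliability Engineering in the first place? Like this!
• The service maintains "the reliability that users expect"
• SLI/SLO are defined and used as an indicator for prioritizing non-functional requirements against functional requirements
• The team has agreed on monitoring methods and a policy that allow it to respond appropriately when an SLO violation occurs
• All of the above is reviewed regularly
(Emoji: https://github.com/twitter/twemoji)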
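The idea of using an SLO as a prioritization indicator can be sketched with an error budget. The following is a minimal illustration, not something shown in the talk: the `SloWindow` shape, the function names, and the "budget exhausted means reliability work comes first" policy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloWindow:
    """Request counts for one SLO window (hypothetical shape, not from the talk)."""
    total_requests: int
    failed_requests: int


def error_budget_remaining(window: SloWindow, slo_target: float) -> float:
    """Fraction of the error budget still unspent; negative means the SLO is violated."""
    allowed_failures = window.total_requests * (1.0 - slo_target)
    if allowed_failures <= 0:
        # No traffic or a 100% target: there is no budget to spend.
        return 0.0 if window.failed_requests == 0 else float("-inf")
    return 1.0 - window.failed_requests / allowed_failures


def next_priority(window: SloWindow, slo_target: float) -> str:
    """Pick what to work on next under an assumed, team-agreed policy."""
    if error_budget_remaining(window, slo_target) <= 0.0:
        return "reliability work"
    return "feature work"


healthy = SloWindow(total_requests=1_000_000, failed_requests=400)
burned = SloWindow(total_requests=1_000_000, failed_requests=1_200)
print(next_priority(healthy, slo_target=0.999))  # feature work
print(next_priority(burned, slo_target=0.999))   # reliability work
```

A real policy would likely be more graded (for example, alerting on burn rate before the budget is gone); the point here is only that an agreed SLO turns "features or reliability?" into a question the team can answer from data.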
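The development team acquires the capability to control reliability: Like this!
[Diagram] SRE: supports the development team in acquiring reliability-related capability. Development team: is able to control the reliability of its own service by itself.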
Team Topologies
• 4 team patterns
  • Stream Aligned
  • Platform
  • Enabling
  • Complicated Subsystem
• 3 communication patterns
  • Collaboration
  • X as a Service
  • Facilitation
https://pub.jmam.co.jp/book/b593881.html
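The development team acquires the capability to control reliability: Like this!
[Diagram] SRE (Platform Team / Enabling Team): builds the platform and culture that support the development team's self-containment. Development team (Stream Aligned Team): can prepare what it needs by itself (self-contained).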
The SRE Team's Vision / Mission / Values https://blog.studysapuri.jp/entry/sre-vision-mission-values
Mission: build the platform and culture that let self-contained teams keep delivering products quickly and safely.
(Before that) Product introduction
Looking back at the history of SRE at "Study Sapuri"
• 2019: Migrated the Application Platform to Kubernetes
• 2020: Established Microservices Readiness
  • Defined service ownership
  • Design Doc / Production Readiness Checklist
  • Self-service Infrastructure (terraform monorepo)
  • SLI/SLO
• 2021: Fully handed over SLI/SLO operation to the development teams
As a Platform Team: building the Platform. As an Enabling Team: fostering culture such as SLI/SLO in the development organization.
[Chart: change in organization size, developers vs. SRE] The developer count covers Web Engineers (frontend & backend) at both Study Sapuri and Quipper. Native engineers are excluded.
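In 2021, the strategy changed to creating Enabling SREs from within the development teams
• The "reliability" concerns of the development organization were becoming more specific to the application and domain
  • Load testing
  • Domain-specific Pod Auto Scaling
  • Measuring Frontend Performance and improving SLI/SLO
  • QA automation
• Rather than having a single SRE Team act as the Enabling Team, the strategy changed to creating an Enabling SRE in each development team
https://blog.studysapuri.jp/entry/2022/02/17/sre-study-session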
Creating an Enabling SRE in the development team
Situation around 2020: [Diagram] The SRE team, as an Enabling Team, was Facilitating each development team (Stream Aligned Team).
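As of 2022: [Diagram] The SRE team acts as a Platform Team (X as a Service) and an Enabling Team (Facilitation) toward the development team (Stream Aligned Team). Inside the development team, an Enabling SRE provides Facilitation to the other members, so the structure becomes fractal.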
Pure SRE vs Embedded SRE https://www.slideshare.net/newrelic/sreiously-defining-the-principles-habits-and-practices-of-site-reliability-engineering-112178269
Situation around 2020: [Diagram] The SRE team, as Pure SRE, was Facilitating each development team.
As of 2022: [Diagram] Pure SRE provides X as a Service and Facilitating to the development team. Inside the development team, an Embedded SRE is Facilitating the other members, so the structure becomes fractal.
Mission management (Emoji: https://github.com/twitter/twemoji)
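Mission management at Recruit
• Each member aligns their Will / Can / Must with their manager
• For each mission, the allocation (percentage), the content, and the achievement criteria are agreed upon
• The reporting line for a mission does not have to be the manager of the member's own team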
Mission management at Recruit: [Diagram] An EM and members across the SRE team and the development team. 30% of a member's missions is set to SRE-related work.
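What missions were actually set
• Development team member (developing the junior high school course renewal)
  • Promoting self-containment in the infrastructure domain: 30%
  • Missions for product development: 70%
• SRE member
  • Supporting the development team's developer productivity: 20%
  • Supporting the Production Release: 20%
  • Missions for SRE: 60%
https://studysapuri.jp/course/junior/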
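The allocations above can be written down as data and checked mechanically. This is only an illustrative sketch: the `Mission` type, the `cross_team` flag, and the function names are hypothetical and are not part of Recruit's mission management; only the titles and percentages come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mission:
    title: str
    percent: int      # share of the member's time, in whole percent
    cross_team: bool  # True if the mission reaches across the SRE/dev boundary


def validate(missions: list[Mission]) -> None:
    """Raise ValueError unless the allocations add up to exactly 100%."""
    total = sum(m.percent for m in missions)
    if total != 100:
        raise ValueError(f"allocations sum to {total}%, expected 100%")


def cross_team_share(missions: list[Mission]) -> int:
    """Return the percentage of time spent on cross-team missions."""
    validate(missions)
    return sum(m.percent for m in missions if m.cross_team)


developer = [
    Mission("Promote self-containment in the infrastructure domain", 30, True),
    Mission("Product development", 70, False),
]
sre_member = [
    Mission("Support the development team's developer productivity", 20, True),
    Mission("Support the Production Release", 20, True),
    Mission("SRE team missions", 60, False),
]
print(cross_team_share(developer))   # 30
print(cross_team_share(sre_member))  # 40
```

Keeping the split explicit like this makes the "partial" nature of the arrangement visible: neither member is moved 100% to the other side.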
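Mission management at Recruit: [Diagram] Development team member: promoting self-containment in the infrastructure domain (30%) and missions related to product development (70%). SRE member: supporting developer productivity and the Production Release (40%) and other missions (60%).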
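What the manager did
• Regular 1on1s with each member
• Mid-term retrospectives on the missions
• Creating a mission tree that visualizes the missions
• Setting up a forum where members explain their missions to each other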
The mission tree that visualizes the missions https://blog.studysapuri.jp/entry/2022/02/25/sre-mission-tree
What happened (1)
• The productivity improvement cycle accelerated
  • The cycle of surfacing issues -> implementation -> feedback -> improvement sped up
What happened (2)
• SRE Culture spread: a pre-mortem was held https://blog.studysapuri.jp/entry/pre-mortem
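What happened (3)
• Support for alert handling
  • SRE supports explaining the alert itself and how to investigate it
  • The response itself is carried out by the development team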
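What was the result
• The full renewal of the Study Sapuri junior high school course was released without any major incident
• Alert response by the development team became a reality
https://studysapuri.jp/course/junior/ (Emoji: https://github.com/twitter/twemoji)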
What was it that we actually did this time?
[Diagram] A member of the development team acts as a (Partially) Enabling SRE, Facilitating the other members from inside the team. An SRE moves from the Pure SRE team into the development team as a (Partially) Embedded SRE and works with it through Collaboration.
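Observations on this pattern
• Facilitating by an Enabling SRE is better built from the "inside"
  • It can be applied in a form that better fits the development team's way of operating
• For technical implementation, it is better for a Pure SRE who knows the Platform well to be Embedded from the "outside" and work through Collaboration
  • Pure SRE knows Infrastructure such as ArgoCD and GitHub Actions well
  • Running the cycle of issue discovery, implementation, and feedback quickly leads to a better Platform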
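Tl;dr
• There are two ways for a development team to acquire reliability-related capability
  • Embedded SRE (from Pure SRE): brought in from outside the team
  • Enabling SRE (in the Team): built up from inside the team
• The best pattern depends on organization size and phase
  • Small organization / early development phase: Embedded SRE pattern
  • Medium-to-large organization / maturing development team: Enabling SRE pattern
• These two patterns can be designed through management
• It does not have to be 100/0: practicing them even "partially" is effective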
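Going forward, we will work on the following to further improve the scalability of the development teams
• Support for acquiring SRE Capability
  • Adopting Enabling SREs in development teams through mission management
  • Creating and running an SRE maturity assessment
  • Onboarding support for acquiring SRE knowledge and skills
• Developer Success / support for improving development productivity
  • Developing the Platform as a Product
  • Developer Support
[Slide annotation: "this case study"]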
Special Thanks
• @kyontan
  • As Embedded SRE
• @ravelll
  • As Enabling SRE
• Everyone involved in the full renewal of the "Study Sapuri" junior high school course
• The SRE team members
Thank you! Takeshi Kondo (chaspy / chaspy_), Engineering Manager, Site Reliability at Recruit Co., Ltd. https://chaspy.me
Bonus: SRE maturity assessment