Takeshi Kondo
February 09, 2023
Technology
ポストモーテム運用を支える文化と技術 / Culture and Technology Supporting Postmortem Operations
https://findy.connpass.com/event/273197/
Transcript
Culture and Technology Supporting Postmortem Operations • Takeshi Kondo / @chaspy • 2023/02/07 • "How have we responded to incidents? Learning about postmortems together" Lunch LT
Who am I • Takeshi Kondo (chaspy / chaspy_) • Engineering Manager, Site Reliability and Web Application Development at Recruit Co., Ltd. • https://chaspy.me
Premise: product introduction - Study Sapuri (スタディサプリ)
What I will cover today • The culture and technology that "postmortem operations" depend on
What I will not cover today • "Postmortem operations" themselves
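Outline • Current state of postmortem operations • History of postmortem operations • Culture supporting postmortem operations • Technology supporting postmortem operations • Summary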
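Current state of postmortem operations • After an incident occurs, someone says "let's write a postmortem" • The people involved get together and share it • Action items are filed as issues for each team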
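How postmortems are carried out casually • Even minor incidents are treated as "a chance to learn"; the purpose has sunk in • The one thing in place is a Slack custom response that returns the issue template, which may be lowering the hurdle to writing one…?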
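An article I wrote long ago is still being cited today • Findy invited me to this event because they had seen that article 🙏 • 2019… • Search for 「障害対応とポストモーテム スタディサプリ」 ("incident response and postmortem Study Sapuri")!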
History of postmortem operations • First commit of the Issue Template: May 2019 • The template was borrowed from the SRE book • The template has hardly been updated since then • TTD/TTR are recorded
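As a rough illustration of the TTD/TTR fields, here is a minimal Python sketch. The slide does not define the terms, so this assumes TTD is time to detect (incident start to detection) and TTR is time to recover (incident start to recovery). The `Incident` class, its field names, and the timestamps are hypothetical and not taken from the deck.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical record of the timestamps a postmortem might capture."""
    started_at: datetime    # when the problem began
    detected_at: datetime   # when a person or an alert noticed it
    recovered_at: datetime  # when service was restored

    def ttd(self) -> timedelta:
        # Assumed definition: time to detect = detection - start
        return self.detected_at - self.started_at

    def ttr(self) -> timedelta:
        # Assumed definition: time to recover = recovery - start
        return self.recovered_at - self.started_at


# Made-up example values, for illustration only
incident = Incident(
    started_at=datetime(2023, 1, 10, 9, 0),
    detected_at=datetime(2023, 1, 10, 9, 12),
    recovered_at=datetime(2023, 1, 10, 10, 30),
)
print(incident.ttd())  # 0:12:00
print(incident.ttr())  # 1:30:00
```

Some teams measure TTR from detection rather than from the start; whichever definition is chosen, writing it into the template keeps the numbers comparable across postmortems.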
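Culture supporting postmortems • Make sure it never becomes one person's fault: Design Doc, Production Readiness Checklist • Respond quickly and together: incident response flow • Learn from incidents: postmortem sharing sessions, postmortem reading sessions • Slide annotations: standardize; foster a sense of purpose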
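Design Doc / Production Readiness Checklist • Standardize to guard against careless slips • Review by multiple people makes it hard to call something "an individual's fault" • If someone makes a mistake in a solo operation with no review, the cause inevitably ends up pinned on that individual • Search for 「Production Readiness スタディサプリ」!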
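Incident response flow • The incident response flow and incident levels are defined • Incidents can be reported through a Slack workflow • We encourage reporting as soon as you think "could this be an incident?"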
Incident response flow • Example report from the recent CircleCI incident • The relevant people are mentioned automatically
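Postmortem reading sessions • The SRE team holds postmortem reading sessions as part of onboarding • Nobody can read them all (and the number keeps growing), so "recommended" postmortems are managed with a label • Ones with a lot to learn from • Ones that help in understanding the current architecture • Ones that are a useful reference for how to act when an incident occurs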
8 recommended postmortems
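The label-based curation described above can be sketched as a simple filter. This is only an illustration in Python: the `Postmortem` class, the label names, and the sample titles are hypothetical, and the deck does not say how the labels are stored or queried.

```python
from dataclasses import dataclass, field


@dataclass
class Postmortem:
    """Hypothetical stand-in for a postmortem issue and its labels."""
    title: str
    labels: set[str] = field(default_factory=set)


def recommended(postmortems, label="recommended"):
    """Return the postmortems carrying the given label, preserving order."""
    return [p for p in postmortems if label in p.labels]


# Made-up sample data, for illustration only
posts = [
    Postmortem("Postmortem A", {"recommended", "database"}),
    Postmortem("Postmortem B", {"network"}),
    Postmortem("Postmortem C", {"recommended"}),
]
for p in recommended(posts):
    print(p.title)
```

A single label keeps the onboarding reading list short even as the total number of postmortems grows.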
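Summary: culture supporting postmortems • Small measures that lower the hurdle: Issue Template, Slack custom response • Standardization: Production Readiness Checklist, incident response flow, level definitions • Fostering the sense that this is "for learning" • At the start, I think the only way is to keep saying it and keep writing • Searching past Slack messages shows I often asked after an incident, "Could you write one up?" • chaspy has probably written the largest number… • I think writing blog posts also helped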
Technology supporting postmortems • Are you able to prevent major incidents in advance? • Are you able to take appropriate risks? • Is it easy to do a "just to be safe" check?
Technology supporting postmortem operations • Load testing • Canary Release • E2E Test Automation • Database restore
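Load testing • Teams identify Performance Risk in the Production Readiness Checklist, and a load test is proposed if needed • A template for the load test plan: Requirements are written down so that SRE and the development team share the same expectations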
Load testing • An environment where you can write Gatling code and run the test • Test results are posted to the PR • Load tests can be iterated on quickly • Report generation
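Load testing • Preparing the environment is not exactly easy, but the hurdle has been lowered • Database (restored from production; described later) • Application (available by creating a Pull Request) • EKS Node Group • Test code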
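Canary Release • Uses Argo Rollouts • Used for changes that carry high risk without changing functionality, such as a Rails upgrade • "That was a nice try" • By releasing from 1% and rolling back as soon as errors appeared, we kept the damage to a minimum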
E2E Test Automation • Please see the blog! • Search: 「スタディサプリ E2E」 • It catches a fair number of defects and prevents production incidents
Database restore • Please see the blog for details on this as well! • Search: 「スタディサプリ データベースリストア」 ("Study Sapuri database restore")
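Summary • I introduced the culture and technology that support postmortem operations • On the process and culture side, standardization and fostering a sense of purpose are what matter • On the technology side, it is the accumulation of recurrence-prevention work after each incident • Culture and technology work together • As this work accumulates, "the same incident" becomes less likely to happen again • "New incidents" become a chance to learn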
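What I did not cover today (happy to discuss in the speaker talk) • Evaluating incidents and assigning levels • Measuring MTTR / MTTD • How to carry out follow-up tasks while continuing development • Incidents and SLI/SLO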
Thank you!