ポストモーテム運用を支える文化と技術 / Culture and Technology Supporting Postmortem Operations

Takeshi Kondo
February 09, 2023

Transcript

 1. Culture and Technology Supporting Postmortem Operations
  Takeshi Kondo / @chaspy

  2023/02/07

  "How Have We Responded to Incidents? Learning Postmortems Together" Lunch LT

 2. Who am I
  chaspy chaspy_
  Engineering Manager

  Site Reliability and Web Application Development

  at Recruit Co., Ltd.
  Takeshi Kondo
  https://chaspy.me

 3. Background: product introduction - Study Sapuri (スタディサプリ)

 4. What I will talk about today

  The culture and technology that "postmortem operations" depend on

 5. What I will not talk about today

  Techniques for running "postmortem operations" themselves

 6. Outline
  • Current state of postmortem operations

  • History of postmortem operations

  • Culture supporting postmortem operations

  • Technology supporting postmortem operations

  • Summary

 7. Outline
  • Current state of postmortem operations

  • History of postmortem operations

  • Culture supporting postmortem operations

  • Technology supporting postmortem operations

  • Summary

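 8. Current state of postmortem operations
  • After an incident, someone says "Let's write a postmortem"

  • The people involved get together and share it

  • Action items are filed as issues in each team's backlog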

 9. Postmortems being held casually
  Even minor incidents are treated as "a chance to learn"
  The purpose has taken root
  The one trick we use is a Slack custom response that returns the issue template; perhaps that lowers the barrier to writing…?

 10. An article I wrote long ago is still being cited
  Findy invited me to speak this time because they had seen this article 🙏

  2019…

  Search for "障害対応とポストモーテム スタディサプリ" (incident response and postmortems, Study Sapuri)!

 11. Outline
  • Current state of postmortem operations

  • History of postmortem operations

  • Culture supporting postmortem operations

  • Technology supporting postmortem operations

  • Summary

 12. History of postmortem operations
  • The first commit of the issue template was in May 2019

  • The template has hardly been updated since then

 13. History of postmortem operations
  • The template was adapted from the SRE book

  • The first commit of the issue template was in May 2019

 14. History of postmortem operations
  • TTD/TTR were added to the template
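
To make the TTD/TTR fields concrete, here is a minimal sketch of how they could be derived from an incident timeline. The slide does not define the terms, so this assumes TTD is the time from the start of impact to detection and TTR is the time from detection to recovery. The class, its field names, and the sample timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IncidentTimeline:
    # Hypothetical field names; the real template fields are not shown in the deck.
    started_at: datetime
    detected_at: datetime
    recovered_at: datetime

    def ttd(self) -> timedelta:
        """Time to detect: start of impact to detection (assumed definition)."""
        return self.detected_at - self.started_at

    def ttr(self) -> timedelta:
        """Time to recover: detection to recovery (assumed definition)."""
        return self.recovered_at - self.detected_at


# Made-up timestamps, for illustration only.
timeline = IncidentTimeline(
    started_at=datetime(2023, 1, 5, 10, 0),
    detected_at=datetime(2023, 1, 5, 10, 12),
    recovered_at=datetime(2023, 1, 5, 11, 2),
)
print("TTD:", timeline.ttd())
print("TTR:", timeline.ttr())
```

Some teams measure TTR from the start of impact instead; whichever definition is used, recording it next to the fields keeps postmortems comparable.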

 15. Outline
  • Current state of postmortem operations

  • History of postmortem operations

  • Culture supporting postmortem operations

  • Technology supporting postmortem operations

  • Summary

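 16. Culture supporting postmortems
  • Make sure no single person ends up being blamed

    • Design Doc

    • Production Readiness Checklist

  • Respond quickly, and together

    • Incident response flow

  • Learn from incidents

    • Postmortem sharing sessions

    • Postmortem reading sessions
  Standardize
  Foster a sense of purpose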

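 17. Design Doc / Production Readiness Checklist
  • Standardize away careless mistakes

  • Review by multiple people makes it harder to pin a failure "on an individual"

  • When a mistake happens in an unreviewed, solo operation, the cause inevitably gets attributed to that individual

  Search for "Production Readiness スタディサプリ" (Production Readiness, Study Sapuri)!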

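 18. Incident response flow
  • The incident response flow and incident levels are defined

  • Incidents can be reported through a Slack workflow

  • We encourage reporting even when it is only "could this be an incident?"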

 19. Incident response flow
  Example report from the recent CircleCI incident

  The people responsible are mentioned automatically
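
The reporting flow on these two slides (defined levels, a low-friction report, automatic mentions) can be sketched roughly as follows. This is not the actual Slack workflow: the level names, the mention handles, and the message layout are all hypothetical, and no Slack API is called.

```python
from dataclasses import dataclass

# Hypothetical incident levels and mention targets; the deck does not list the real ones.
MENTIONS_BY_LEVEL = {
    "unknown": ["@sre-oncall"],
    "minor": ["@sre-oncall"],
    "major": ["@sre-oncall", "@engineering-managers"],
}


@dataclass
class IncidentReport:
    reporter: str
    summary: str
    # "Could this be an incident?" reports are welcome, so the level may be unknown.
    level: str = "unknown"


def render_report(report: IncidentReport) -> str:
    """Build the message text, starting with the mentions for the given level."""
    mentions = MENTIONS_BY_LEVEL.get(report.level, MENTIONS_BY_LEVEL["unknown"])
    return "\n".join([
        " ".join(mentions),
        f"Incident report (level: {report.level})",
        f"Reporter: {report.reporter}",
        f"Summary: {report.summary}",
    ])


print(render_report(IncidentReport("alice", "CI jobs are failing", "major")))
```

Defaulting the level to "unknown" mirrors the idea that reporting should not require the reporter to classify the incident first.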

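 20. Postmortem reading sessions
  • The SRE team holds a postmortem reading session as part of onboarding

  • Nobody can read them all (and they keep growing), so "recommended" postmortems are tracked with a label

    • Ones with many lessons to learn

    • Ones that help in understanding the current architecture

    • Ones that serve as a reference for how to act when an incident occurs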

 21. Eight recommended postmortems

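 22. Culture supporting postmortems: summary
  • Small mechanisms that lower the barrier

    • Issue Template, Slack custom response

  • Standardization

    • Production Readiness Checklist, incident response flow, level definitions

  • Fostering the sense that postmortems are "for learning"

    • At first, I think all you can do is keep saying it and keep writing

    • Searching past Slack messages shows that I often asked "Could you write one?" after incidents

    • chaspy has probably also written the most postmortems…

    • I think writing blog posts helped too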

 23. Technology supporting postmortems
  • Are you able to prevent major incidents in advance?

  • Are you able to take appropriate risks?

  • Is it easy to do a "just-in-case check"?

 24. Technology supporting postmortem operations
  • Load testing

  • Canary Release

  • E2E Test Automation

  • Database restore

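 25. Load testing
  The Production Readiness Checklist asks teams to identify performance risks and, where needed, points them to load testing

  A template for describing what the load test will cover

  Requirements are written down so that SRE and the development team share the same expectations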

 26. Load testing
  • An environment where tests can be run by writing Gatling code

  • Test results are posted to the PR

  • Load tests can be iterated on quickly
  Report generation
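
As an illustration of checking load test results against written-down requirements and producing text that could be posted to a pull request, here is a small sketch. It does not use Gatling or its report format. The thresholds, the sample data shape, and every function name are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    # Hypothetical thresholds agreed between SRE and the development team.
    max_p95_ms: int
    max_error_rate: float


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # ceiling division
    return ordered[rank - 1]


def summarize(samples, req):
    """samples: list of (latency_ms, succeeded) tuples, one per request."""
    latencies = [latency for latency, _ in samples]
    failures = sum(1 for _, ok in samples if not ok)
    p95 = percentile(latencies, 95)
    error_rate = failures / len(samples)
    passed = p95 <= req.max_p95_ms and error_rate <= req.max_error_rate
    return {"p95_ms": p95, "error_rate": error_rate, "passed": passed}


def to_markdown(summary):
    """Render a short comment body that could be pasted into a pull request."""
    verdict = "PASS" if summary["passed"] else "FAIL"
    return (
        f"Load test result: {verdict}\n"
        f"- p95 latency: {summary['p95_ms']} ms\n"
        f"- error rate: {summary['error_rate']:.2%}"
    )


# Synthetic data: 100 successful requests with latencies of 1..100 ms.
samples = [(ms, True) for ms in range(1, 101)]
print(to_markdown(summarize(samples, Requirements(max_p95_ms=100, max_error_rate=0.01))))
```

Keeping the requirements in a small structure like this makes the pass/fail decision explicit, which is the point of agreeing on them before the test runs.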

 27. Load testing
  • Preparing the environment is not exactly easy, but the barrier has been lowered

    • Database (restored from production; described later)

    • Application (created by opening a pull request)

    • EKS Node Group

    • Test code

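 28. Canary Release
  • We use Argo Rollouts

  • Used for changes that do not alter functionality but carry high risk, such as a Rails upgrade
  A nice try, right?

  By releasing to 1% first and rolling back as soon as errors appeared, we were able to keep the damage to a minimum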
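
The "start at 1%, roll back as soon as errors appear" idea can be sketched as a small decision function. This is not Argo Rollouts configuration or its API. Only the 1% starting point comes from the slide; the later steps and the error-rate threshold are hypothetical values.

```python
def next_canary_weight(current_weight, error_rate,
                       steps=(1, 10, 50, 100), max_error_rate=0.01):
    """Return the next traffic weight in percent, or 0 to roll back.

    steps and max_error_rate are assumed values for illustration.
    """
    if error_rate > max_error_rate:
        return 0  # errors observed: send all traffic back to the stable version
    for step in steps:
        if step > current_weight:
            return step  # healthy: advance to the next larger step
    return 100  # already fully rolled out


print(next_canary_weight(0, 0.0))    # first step of a new rollout
print(next_canary_weight(1, 0.05))   # canary at 1% is erroring
```

Because only a small share of traffic reaches the new version at the first step, a rollback at that point limits how many requests are affected.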

 29. E2E Test Automation
  • Please see the blog!

  • Search for "スタディサプリ E2E" (Study Sapuri E2E)

  • It catches a fair number of defects and prevents production incidents

 30. Database restore
  • For details on this one too, please see the blog!

  • Search for "スタディサプリ データベースリストア" (Study Sapuri database restore)

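 31. Summary
  • I introduced the culture and technology that support postmortem operations

  • On the process and culture side, standardization and fostering a sense of purpose are what matter

  • On the technology side, it is the accumulation of recurrence-prevention measures after each incident

  • Culture and technology work together

  • As the measures accumulate, "the same incident" becomes less likely to happen

  • "New incidents" become a chance to learn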

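 32. What I did not talk about today (I would be happy to cover it in the speaker talk)
  • Evaluating incidents and assigning levels

  • Measuring MTTR / MTTD

  • How to carry out follow-up tasks while continuing development

  • Incidents and SLI/SLO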

 33. Thank you!
  chaspy chaspy_
  Engineering Manager

  Site Reliability and Web Application Development

  at Recruit Co., Ltd.
  Takeshi Kondo
  https://chaspy.me