
ポストモーテム運用を支える文化と技術 / Culture and Technology Supporting Postmortem Operations

Takeshi Kondo
February 09, 2023

Transcript

  1. Culture and Technology Supporting Postmortem Operations
    Takeshi Kondo / @chaspy

    2023/02/07

    How Have We Responded to Incidents? Learning Postmortems Together: Lunch LT

  2. Who am I
    chaspy chaspy_
    Engineering Manager

    Site Reliability and Web Application Development

    at Recruit Co., Ltd.
    Takeshi Kondo
    https://chaspy.me


  3. Background: Product Introduction - スタディサプリ (Study Sapuri)

  4. What I Will Talk About Today

    The culture and technology that underpin "postmortem operations"

  5. What I Will Not Talk About Today

    Techniques for running "postmortem operations" themselves

  6. Outline
    • Current state of postmortem operations
    • History of postmortem operations
    • Culture supporting postmortem operations
    • Technology supporting postmortem operations
    • Summary

  7. Outline
    • Current state of postmortem operations
    • History of postmortem operations
    • Culture supporting postmortem operations
    • Technology supporting postmortem operations
    • Summary

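  8. Current State of Postmortem Operations
    • After an incident, someone says "let's write a postmortem"
    • The people involved get together and share it
    • Action items are filed as issues in each team's backlog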

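  9. Postmortems Being Held Casually
    Even minor incidents are treated as a "chance to learn"
    The purpose has taken root
    Our one deliberate trick: a Slack custom response surfaces the issue template, which probably lowers the barrier to writing…?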

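  10. An Article I Wrote Years Ago Is Still Being Cited
    Findy invited me to speak this time because they had seen this article 🙏

    2019…

    Search for "障害対応とポストモーテム スタディサプリ" (Incident Response and Postmortems, Study Sapuri)!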

  11. Outline
    • Current state of postmortem operations
    • History of postmortem operations
    • Culture supporting postmortem operations
    • Technology supporting postmortem operations
    • Summary

  12. History of Postmortem Operations
    • The first commit of the issue template was in May 2019
    • The template has hardly been updated since then

  13. History of Postmortem Operations
    • The template was borrowed from the SRE book
    • The first commit of the issue template was in May 2019

  14. History of Postmortem Operations
    • Added TTD/TTR
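
As an illustration of the TTD/TTR fields mentioned on this slide, here is a minimal sketch of how the two durations could be derived from an incident timeline. The deck does not show the template's actual fields or definitions, so the field names, the timestamps, and the definitions used here (TTD as start of impact to detection, TTR as detection to recovery) are all assumptions; some teams measure TTR from the start of impact instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IncidentTimeline:
    # Hypothetical field names; the real template fields are not shown in the deck.
    started_at: datetime    # when the impact began
    detected_at: datetime   # when an alert or a person noticed it
    recovered_at: datetime  # when the impact ended

    def time_to_detect(self) -> timedelta:
        # Assumed definition of TTD: start of impact -> detection.
        return self.detected_at - self.started_at

    def time_to_recover(self) -> timedelta:
        # Assumed definition of TTR: detection -> recovery.
        return self.recovered_at - self.detected_at


# Made-up timestamps, for illustration only.
timeline = IncidentTimeline(
    started_at=datetime(2023, 1, 5, 10, 0),
    detected_at=datetime(2023, 1, 5, 10, 12),
    recovered_at=datetime(2023, 1, 5, 10, 47),
)
print(timeline.time_to_detect())   # 0:12:00
print(timeline.time_to_recover())  # 0:35:00
```

Recording the three timestamps rather than the durations keeps the postmortem usable under either TTR definition.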

  15. Outline
    • Current state of postmortem operations
    • History of postmortem operations
    • Culture supporting postmortem operations
    • Technology supporting postmortem operations
    • Summary

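  16. Culture Supporting Postmortems
    • Make sure no single person ends up taking the blame
      • Design Doc
      • Production Readiness Checklist
    • Respond quickly, and together
      • Incident response flow
    • Learn from incidents
      • Postmortem sharing sessions
      • Postmortem reading group
    Standardize
    Foster a sense of purpose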

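  17. Design Doc / Production Readiness Checklist
    • Standardize to guard against "careless slips"
    • Having several people review makes it harder to treat a failure as "one person's fault"
    • If someone slips up in a solo operation with no review, the cause inevitably gets pinned on that individual

    Search for "Production Readiness スタディサプリ"!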

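  18. Incident Response Flow
    • The incident response flow and incident levels are defined
    • Incidents can be reported through a Slack workflow
    • We encourage reporting even when it is only "could this be an incident?"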

  19. Incident Response Flow
    Example report from the recent CircleCI incident

    The person in charge is mentioned automatically

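  20. Postmortem Reading Group
    • The SRE team holds a postmortem reading group as part of onboarding
    • Nobody can read them all (they keep piling up), so "recommended" postmortems are managed with a label
      • Ones with a lot to learn from
      • Ones that help in understanding the current architecture
      • Ones that are a useful reference for how to act when an incident occurs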

  21. 8 Recommended Postmortems

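  22. Culture Supporting Postmortems: Summary
    • Small mechanisms that lower the barrier
      • Issue Template, Slack custom response
    • Standardization
      • Production Readiness Checklist, incident response flow, level definitions
    • Fostering the sense of purpose that postmortems are "for learning"
      • At the start, I think all you can do is keep saying it and keep writing them
      • Searching past Slack messages, I found I often asked after incidents, "Could you write one up?"
      • chaspy has probably written the most postmortems, too…
      • I think writing the blog post also helped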

  23. Technology Supporting Postmortems
    • Are you set up to prevent major incidents before they happen?
    • Are you able to take appropriate risks?
    • Is it easy to do a "just-in-case check"?

  24. Technology Supporting Postmortem Operations
    • Load testing
    • Canary Release
    • E2E Test Automation
    • Database restore

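  25. Load Testing
    The Production Readiness Checklist asks teams to identify performance risks and, where needed, points them to load testing

    A template describing what the load test will cover

    Requirements are written down so that SRE and the development team share the same expectations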

  26. Load Testing
    • An environment where tests can be run just by writing Gatling code
    • Test results are posted to the PR
    • Load tests can be iterated on quickly
    Report generation
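
To make the idea of agreeing on requirements and posting results to the PR concrete, here is a rough sketch of checking measured results against written requirements and producing a one-line summary. This is not Gatling's API or the team's actual tooling; the `Requirement` fields, the thresholds, and the summary format are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Requirement:
    # Hypothetical fields; the contents of the real template are not shown in the deck.
    max_p95_ms: float
    max_error_rate: float


def percentile(values: list[float], pct: int) -> float:
    # Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it.
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms: list[float], errors: int, req: Requirement) -> str:
    # Assumes at least one sample was recorded.
    p95 = percentile(latencies_ms, 95)
    error_rate = errors / len(latencies_ms)
    passed = p95 <= req.max_p95_ms and error_rate <= req.max_error_rate
    verdict = "PASS" if passed else "FAIL"
    return (
        f"{verdict}: p95={p95:.0f}ms (limit {req.max_p95_ms:.0f}ms), "
        f"error rate={error_rate:.1%} (limit {req.max_error_rate:.1%})"
    )


# Made-up samples: one slow outlier pushes p95 over the limit.
samples = [120, 130, 110, 150, 900, 140, 125, 135, 145, 115]
print(summarize(samples, errors=0, req=Requirement(max_p95_ms=500, max_error_rate=0.01)))
```

A plain-text verdict like this is easy to paste or post into a pull request next to the full generated report.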

  27. Load Testing
    • Preparing the environment is not exactly easy, but the barrier has come down
      • Database (restored from production; covered later)
      • Application (available once you create a Pull Request)
      • EKS Node Group
      • Test code

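  28. Canary Release
    • We use Argo Rollouts
    • Used for changes such as a Rails upgrade that alter no functionality but carry high risk
    Nice try, wasn't it?

    By starting the release at 1% and rolling back as soon as errors appeared, we kept the damage to a minimum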
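
The decision logic behind "start at 1% and roll back as soon as errors appear" can be sketched as below. This only illustrates the idea and is not how Argo Rollouts is configured or invoked; the step schedule after 1%, the error-rate threshold, and the `error_rate_at` callback are hypothetical.

```python
from typing import Callable, Sequence


def run_canary(
    steps: Sequence[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float,
) -> tuple[str, int]:
    """Walk through traffic weights and roll back at the first unhealthy step."""
    for weight in steps:
        observed = error_rate_at(weight)
        if observed > max_error_rate:
            # Stop here: all traffic goes back to the stable version.
            return ("rolled_back", weight)
    return ("promoted", steps[-1])


# Hypothetical schedule: only the 1% starting point comes from the talk.
STEPS = [1, 10, 50, 100]

# Simulated measurements standing in for real metrics.
healthy = run_canary(STEPS, lambda weight: 0.001, max_error_rate=0.01)
broken = run_canary(STEPS, lambda weight: 0.2, max_error_rate=0.01)
print(healthy)  # ('promoted', 100)
print(broken)   # ('rolled_back', 1)
```

Because the unhealthy case stops at the first step, only the small initial share of traffic ever reaches the faulty version.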

  29. E2E Test Automation
    • Please see the blog!
    • Search: "スタディサプリ E2E"
    • It catches a fair number of defects and prevents production incidents

  30. Database Restore
    • For details on this as well, please see the blog!
    • Search: "スタディサプリ データベースリストア" (Study Sapuri database restore)

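  31. Summary
    • I introduced the culture and technology that support postmortem operations
    • On the process and culture side, standardization and fostering a sense of purpose matter most
    • On the technology side, it is the accumulation of recurrence-prevention measures taken after incidents
    • Culture and technology work hand in hand
    • As these measures accumulate, the "same incident" becomes less likely to happen again
    • A "new incident" becomes a chance to learn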

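  32. What I Did Not Cover Today (Happy to Discuss in the Speaker Talk)
    • Assessing incidents and assigning levels
    • Measuring MTTR / MTTD
    • How to carry out follow-up tasks alongside regular development
    • Incidents and SLI/SLO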

  33. Thank you!
    chaspy chaspy_
    Engineering Manager

    Site Reliability and Web Application Development

    at Recruit Co., Ltd.
    Takeshi Kondo
    https://chaspy.me
