All for One: Postmortem Operations and Practical Tips

Kuu
February 09, 2023

Transcript

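  1. What is Mercari?

    Mercari, the flea-market app, is a CtoC marketplace where individuals can easily buy and sell items they no longer need. Aiming to be a service where both sellers and buyers can enjoy safe and secure transactions, Mercari provides a trading environment built on escrow payments, in which Mercari temporarily holds the purchase amount, along with simple and affordable shipping options and a differentiated, unique customer experience. Many sellers take pleasure in seeing items they no longer need reach, and be used by, people who need them, while buyers enjoy a treasure-hunt-like shopping experience, finding bargains among a diverse and unique range of products. Beyond buying and selling, customers also communicate actively with each other through seller-buyer chat and the "Like!" feature.

    Basic specifications
    • Service launch: July 2013
    • Supported OS: Android, iOS (also available from a web browser)
    • Usage fee: free (commission when an item sells: 10% of the sale price)
    • Supported region and language: Japan, Japanese
    • Cumulative listings: over 3 billion items (November 2022)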
  2. Self-introduction - Kuu

    • Kuu @ Mercari, Inc.
    • Software Engineer
      ◦ Develops the Android version of Mercari
    • Hobbies
      ◦ Monster Hunter: World
      ◦ Skiing ⛷ & scuba diving 🤿
      ◦ Morning walks, and cooking and eating tasty food
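  3. Overview of postmortem operations at Mercari

    • In operation since around 2017
    • Run by people in every role, not only SRE
      ◦ An "All for One" style of postmortem operation
    • Operational flow
      ◦ An incident occurs and is handled
      ◦ The team holds a retrospective and writes the postmortem
      ◦ The Incident Management Committee reviews and publishes the postmortem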
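  4. History of postmortem operations

    • In operation since around 2017
      ◦ A blog post describes how it was run as of 2018
      ◦ The essentials have not changed
        ▪ One hour, once a week
        ▪ Held online on Slack
        ▪ Participation is optional (following the conversation without speaking is fine)
        ▪ Facilitation rotates among several people
    • Continues to evolve with circumstances, such as the growth of the organization
      ◦ Introduction of Slack commands and a bot
      ◦ Use of tools such as Blameless
    • Improved day by day to make the operation better
      ◦ Improving the postmortem template
      ◦ Improving onboarding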
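  5. Run by people in every role, not only SRE

    • Unexpected service interruptions or quality degradation that affect customers
      ◦ These are treated as incidents
    • Postmortems are run broadly, by development teams, the SRE team, and others
    • To enable more people to write postmortems
      ◦ Learning content created for in-house technical training
      ◦ Feedback given in postmortem reviews
    • Included in Mercari's Engineering Ladder
      ◦ A document that spells out the behavior expected of engineers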
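  6. Postmortem flow - Escalation

    • Incidents arise from a variety of triggers
      ◦ An alert fires
      ◦ A customer inquiry comes in
      ◦ A development team finds a software bug
    • Escalate whenever something suspicious occurs
      ◦ A Slack command lets anyone start the process quickly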
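  7. Postmortem flow - Retrospective

    • Use the template to run the retrospective
      ◦ Details are covered later
    • What we value
      ◦ Holding constructive discussions without blaming anyone
      ◦ Learning from incidents
      ◦ Applying permanent fixes and recurrence prevention so that incidents are not repeated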
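  8. Postmortem flow - Implementing permanent fixes and recurrence prevention

    • Carry out the improvements proposed in the retrospective
      ◦ Turn them into tickets so that they are reliably done
    • Incident Management Committee
      ◦ An internal committee whose purpose is to improve incident management
      ◦ Reviews and gives feedback on whether postmortems are being run appropriately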
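  9. Postmortem items

    • Overview: a summary of the incident, describing what happened
    • Analysis: the cause of the incident, written by digging down several times to find the root cause of the problem
    • Follow up action: turned into tickets so that the actions are reliably carried out
    • Occurrence time
    • Detection time: used to calculate MTTA (Mean Time To Acknowledge)
    • Resolution time: used to calculate MTTR (Mean Time To Resolve)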
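  10. Introducing a postmortem culture

    • Postmortem of the month
      ◦ Postmortem statistics and case studies are presented at All hands (an event where employees gather)
    • Build it into the workflow
      ◦ Use Blameless and Slack so that even people unfamiliar with the process can run it easily
    • Education
      ◦ Training through DevDojo (in-house technical training), which is also publicly available
      ◦ https://engineering.mercari.com/blog/entry/20221223-showcasing-devdojo-a-series-of-mercari-developed-learning-content-for-engineering/
      ◦ Setting up a sandbox environment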
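
The postmortem items on slide 9 record occurrence, detection, and resolution times so that MTTA and MTTR can be calculated. The slides do not say how the calculation is done, so the following is only a minimal sketch under stated assumptions: MTTA is taken as the mean of (detection time minus occurrence time), MTTR as the mean of (resolution time minus occurrence time), and the `Incident` class, its field names, and the sample timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    """Hypothetical record holding the three timestamps from the postmortem template."""
    occurred_at: datetime
    detected_at: datetime
    resolved_at: datetime


def mean_duration(durations):
    """Average a list of timedeltas by averaging their lengths in seconds."""
    return timedelta(seconds=mean(d.total_seconds() for d in durations))


def mtta(incidents):
    """Assumed definition: mean time from occurrence to detection."""
    return mean_duration([i.detected_at - i.occurred_at for i in incidents])


def mttr(incidents):
    """Assumed definition: mean time from occurrence to resolution."""
    return mean_duration([i.resolved_at - i.occurred_at for i in incidents])


# Made-up sample data, for illustration only.
incidents = [
    Incident(datetime(2023, 1, 10, 9, 0),
             datetime(2023, 1, 10, 9, 10),
             datetime(2023, 1, 10, 10, 0)),
    Incident(datetime(2023, 1, 20, 14, 0),
             datetime(2023, 1, 20, 14, 20),
             datetime(2023, 1, 20, 16, 0)),
]

print("MTTA:", mtta(incidents))  # MTTA: 0:15:00
print("MTTR:", mttr(incidents))  # MTTR: 1:30:00
```

Organizations differ on where each interval starts (for example, measuring MTTA from the first alert rather than from the occurrence), so the two helper functions would need adjusting to match whichever definition is actually in use.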