The Road to Separating Responsibilities from an SRE-like Team That Has Become a Jack-of-All-Trades: Building an On-Call Rotation in a Newly Formed Team / SRE Lounge #15

Slides for the talk given at https://sre-lounge.connpass.com/event/290455/.
The talk video and material that was not covered in the talk are available at https://tech.repro.io/entry/2023/09/28/101708.

Takeshi Arabiki

August 31, 2023

Transcript

  1. © 2023 Repro Inc. │ CONFIDENTIAL

    The Road to Separating Responsibilities from an SRE-like Team That Has Become a Jack-of-All-Trades: Building an On-Call Rotation in a Newly Formed Team
    Repro Inc., Development Div., Platform Team, Takeshi Arabiki (@a_bicky)
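  2. About me

    Takeshi Arabiki
    • X (formerly Twitter): @a_bicky
    • Blog: あらびき日記
    • Affiliation: Repro Inc. (since August 2017)
    • Manages a team that does SRE, data engineering, application development, and architect-like work. As an engineer, mainly handles code review, incident response, and stabilization, and occasionally develops features (would like to write more code).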
  3. Agenda

    • About Repro
    • Sys-Infra's areas of responsibility at the time
    • Creation of a new team and changes to the organization
    • Building an on-call rotation in the newly formed team
    • Remaining challenges
    • Summary
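  5. [Diagram: Repro product overview]

    Labels: data collection (Web, stores, apps); analysis and segmentation (analyze diverse data, segment users); delivering the best measure to each individual (pop-ups, speech balloons, email, in-app messaging, content embedding, push notifications, recommendations); real-time one-to-one communication.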
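  6. [Diagram: Repro product overview, with server-side components]

    Labels: data collection (Web, stores, apps); analysis and segmentation (analyze diverse data, segment users); delivering the best measure to each individual (pop-ups, speech balloons, email, in-app messaging, content embedding, push notifications, recommendations); real-time one-to-one communication.
    Server-side components: an API that handles a large volume of requests from the SDK; applications that process large volumes of data; applications that update target users in near real time.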
  7. Main technologies behind Repro's server side

    Amazon Corretto (Java), BigQuery, Kafka Streams, Rundeck, Amazon EC2, Amazon ECS, Amazon Aurora (MySQL Compatible), Amazon ElastiCache (Redis), Amazon S3
  8. Server-side team structure about a year ago

    Development Div.
    • Sys-Infra: the SRE-like team
    • Project ○○, Project △△, Project ✕✕: feature development teams (organized per project)
  9. Agenda

    • About Repro
    • Sys-Infra's areas of responsibility at the time
    • Creation of a new team and changes to the organization
    • Building an on-call rotation in the newly formed team
    • Remaining challenges
    • Summary
  10. [Diagram: Repro product overview, with server-side components]

    Server-side components: an API that handles a large volume of requests from the SDK; applications that process large volumes of data; applications that update target users in near real time.
  11. Development of some applications, and operation of everything else

    Amazon Corretto (Java), BigQuery, Kafka Streams, Rundeck, Amazon EC2, Amazon ECS, Amazon Aurora (MySQL Compatible), Amazon ElastiCache (Redis), Amazon S3
    Sys-Infra was deeply involved in almost everything except development of the SDK and the Rails front end.
  12. Day-to-day operations and version upgrades

    Amazon Corretto (Java), BigQuery, Kafka Streams, Rundeck, Amazon EC2, Amazon ECS, Amazon Aurora (MySQL Compatible), Amazon ElastiCache (Redis), Amazon S3
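  13. How did it come to this?

    • Ownership of each feature of the Rails application was unclear
      - Teams were scrapped and rebuilt every time development of a particular feature finished
      - Some unfamiliar applications were rarely touched by new feature development
    • Some application development was led by Sys-Infra because the schedule took priority
    • Platforms introduced by the Chief Architect (former CTO) mostly ended up with Sys-Infra
      - As cognitive load hit its limit, the feature development teams were starting to receive them too
    • Sys-Infra's responsibilities were unclear in the first place
      - Several attempts to sort them out stalled because no team could take the work over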
  14. Agenda

    • About Repro
    • Sys-Infra's areas of responsibility at the time
    • Creation of a new team and changes to the organization
    • Building an on-call rotation in the newly formed team
    • Remaining challenges
    • Summary
  15. Server-side team structure after the new team was launched

    Development Div.
    • Sys-Infra: the SRE-like team
    • Repro Core (New!!)
    • Feature Team 1, Feature Team 2: feature development teams (membership basically fixed)
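  16. Problems that creating Repro Core was meant to solve

    • Operation of the critical platforms behind Repro depended on specific individuals, and it was hard to focus on it
      - The operational load for the parts customers do not use directly was concentrated entirely on Sys-Infra
    • There was nobody to inherit the Chief Architect's knowledge
    • Feature development teams had to know a great deal, and had no generous support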
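  17. Problems that creating Repro Core was meant to solve

    • Operation of the critical platforms behind Repro depended on specific individuals, and it was hard to focus on it
      - Repro Core takes this on
    • There was nobody to inherit the Chief Architect's knowledge
      - Repro Core takes this on
    • Feature development teams had to know a great deal, and had no generous support
      - Repro Core and Sys-Infra take this on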
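  18. Sorting out the team roles

    • Repro Core
      - Does not deliver value to customers directly, but provides the common platforms that Feature Teams need in order to deliver value to customers
    • Sys-Infra
      - Provides platforms that, to put it in extreme terms, would work without any Repro-specific knowledge
    • Feature Team
      - Develops the features that deliver value directly to customers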
  19. The team roles, explained with a cooking analogy

    • Feature Team: cooks the dishes
    • Repro Core: develops convenient tools
    • Sys-Infra: supplies water and gas reliably
  20. [Diagram: Repro product overview, with examples of Repro Core components]

    Examples of Repro Core components: applications that process large volumes of data; applications that update target users in near real time.
  21. Agenda

    • About Repro
    • Sys-Infra's areas of responsibility at the time
    • Creation of a new team and changes to the organization
    • Building an on-call rotation in the newly formed team
    • Remaining challenges
    • Summary
  22. The on-call setup at the time

    • There were mainly 4 kinds of PagerDuty services, shared between Feature Team and Sys-Infra
      - Push notification services
      - Sys-Infra services
      - Kafka Broker and Kafka Streams application services
      - Rails application service
    • These services were managed by hand
    • Each service always had two people on call, a Primary and a Secondary
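  23. The on-call setup at the time

    • There were mainly 4 kinds of PagerDuty services, shared between Feature Team and Sys-Infra
      - Push notification services ← Repro Core will take over part of this
      - Sys-Infra services ← Repro Core will take over part of this
      - Kafka Broker and Kafka Streams application services ← Repro Core will take over part of this
      - Rails application service ← Repro Core will take over part of this
    • These services were managed by hand
    • Each service always had two people on call, a Primary and a Secondary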
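  24. What the on-call migration involves

    • Decide the granularity at which PagerDuty services are created
    • Decide the escalation policy approach for the migration period
    • Catch up on each component
    • Prepare documentation
    • Set the on-call rules (Working Agreement)
    • Change where alerts are routed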
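  26. PagerDuty terminology

    • Service
      - The unit that notifies on-call responders according to its configured escalation policy
    • Escalation policy
      - Determines which schedules are followed for notification when an incident is created on a service
    • Schedule
      - A mechanism that changes the on-call responder depending on the time of day
      - Usually there is one schedule per team, so "schedule" can be read as "team"
    [Screenshots: escalation policy, schedule]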
  27. What is a PagerDuty service?

    "You can call this something a microservice, a piece of functionality, a feature, a slice of a monolith, a component, a shared piece of infrastructure, or an internal tool. For the purposes of this guide, these somethings will all be labeled as a type of "service.""
    "Although the monolith shares the same code base, each logical source of functionality you identify can be considered its own service. That logical service can be represented as a different service for your documentation, runbooks, and wikis, along with your on-call ownership in PagerDuty."
    Sources: Introduction - PagerDuty Full-Service Ownership Documentation; Defining a Service - PagerDuty Full-Service Ownership Documentation
  28. What is a PagerDuty service?

    "You can call this something a microservice, a piece of functionality, a feature, a slice of a monolith, a component, a shared piece of infrastructure, or an internal tool. For the purposes of this guide, these somethings will all be labeled as a type of "service.""
    "Although the monolith shares the same code base, each logical source of functionality you identify can be considered its own service. That logical service can be represented as a different service for your documentation, runbooks, and wikis, along with your on-call ownership in PagerDuty."
    Sources: Introduction - PagerDuty Full-Service Ownership Documentation; Defining a Service - PagerDuty Full-Service Ownership Documentation
    Speaker's notes:
    • Things like microservices and components are called services
    • Even in a monolithic service that shares the same code, parts that provide different functionality can be treated as separate services
  29. The on-call setup at the time (recap)

    • There were mainly 4 kinds of PagerDuty services, shared between Feature Team and Sys-Infra
      - Push notification services
      - Sys-Infra services
      - Kafka Broker and Kafka Streams application services
      - Rails application service
    • These services were managed by hand
    • Each service always had two people on call, a Primary and a Secondary
  30. Splitting the PagerDuty services

    • There were mainly 4 kinds of PagerDuty services, shared between Feature Team and Sys-Infra
      - Push notification services → split into 5
      - Sys-Infra services → split into 3
      - Kafka Broker and Kafka Streams application services → split into 2
      - Rails application service → split into 4
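
    To illustrate what finer-grained services can look like once they are managed as code (the summary slide mentions Terraform), here is a minimal sketch that assumes the PagerDuty Terraform provider. The service keys and names and the `escalation_policy_id` variable are hypothetical placeholders, not Repro's actual configuration.

    ```hcl
    terraform {
      required_providers {
        pagerduty = {
          source = "PagerDuty/pagerduty"
        }
      }
    }

    # Hypothetical input: ID of an escalation policy that already exists.
    variable "escalation_policy_id" {
      type = string
    }

    locals {
      # Hypothetical example: one coarse "Kafka" service split into two.
      kafka_services = {
        "kafka-broker"  = "Kafka Broker"
        "kafka-streams" = "Kafka Streams applications"
      }
    }

    # One PagerDuty service per entry, so adding or splitting a service
    # is a one-line change to the map above.
    resource "pagerduty_service" "kafka" {
      for_each          = local.kafka_services
      name              = each.value
      escalation_policy = var.escalation_policy_id
    }
    ```

    Keeping the service list in a map is only one option. Separate resource blocks work just as well when each service needs its own escalation policy.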
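  31. PagerDuty terminology (recap)

    • Service
      - The unit that notifies on-call responders according to its configured escalation policy
    • Escalation policy
      - Determines which schedules are followed for notification when an incident is created on a service
    • Schedule
      - A mechanism that changes the on-call responder depending on the time of day
      - Usually there is one schedule per team, so "schedule" can be read as "team"
    [Screenshots: escalation policy, schedule]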
  32. PagerDuty escalation policy approach

    • Repro Core
      - For services where Repro Core goes on call by itself
    • Repro Core Trial with FT
      - For services where Repro Core goes on call together with a Feature Team
    • Repro Core Trial with Sys Infra
      - For services where Repro Core goes on call together with Sys-Infra
    • etc.
    [Screenshot: example of Repro Core Trial with Sys Infra]
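
    One possible shape for a trial policy such as Repro Core Trial with Sys Infra, sketched with the PagerDuty Terraform provider. The slide shows only a screenshot, so the rule layout, the delays, and the four schedule ID variables below are assumptions made for illustration.

    ```hcl
    terraform {
      required_providers {
        pagerduty = {
          source = "PagerDuty/pagerduty"
        }
      }
    }

    # Hypothetical inputs: IDs of schedules that already exist in PagerDuty.
    variable "repro_core_trial_primary_schedule_id" { type = string }
    variable "repro_core_trial_secondary_schedule_id" { type = string }
    variable "sys_infra_primary_schedule_id" { type = string }
    variable "sys_infra_secondary_schedule_id" { type = string }

    resource "pagerduty_escalation_policy" "repro_core_trial_with_sys_infra" {
      name      = "Repro Core Trial with Sys Infra"
      num_loops = 1

      # Level 1: notify both teams' primaries at the same time, so the
      # new team gains experience while the existing team still covers.
      rule {
        escalation_delay_in_minutes = 15

        target {
          type = "schedule_reference"
          id   = var.repro_core_trial_primary_schedule_id
        }
        target {
          type = "schedule_reference"
          id   = var.sys_infra_primary_schedule_id
        }
      }

      # Level 2: escalate to both secondaries if nobody acknowledges.
      rule {
        escalation_delay_in_minutes = 15

        target {
          type = "schedule_reference"
          id   = var.repro_core_trial_secondary_schedule_id
        }
        target {
          type = "schedule_reference"
          id   = var.sys_infra_secondary_schedule_id
        }
      }
    }
    ```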
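  33. PagerDuty schedule approach

    • Repro Core Trial Primary/Secondary
      - Schedules on which Repro Core stands by together with the other team's on-call responder
      - During weekday daytime, less experienced members are always on call to gain experience
      - Outside weekday daytime, either experienced members are on call or nobody is assigned
    • Repro Core Primary/Secondary
      - The policy on which only Repro Core is on call
      - During weekday daytime, less experienced members are always on call to gain experience
      - Outside weekday daytime, members are on call according to the rotation
      - Services start on Repro Core Trial Primary/Secondary and are switched over to this one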
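
    The weekday-daytime rule can be expressed as a schedule layer with time restrictions. The sketch below assumes the PagerDuty Terraform provider. The user ID variable, the 10:00 to 19:00 window, the start date, and the weekly rotation length are hypothetical values, not taken from the slides.

    ```hcl
    terraform {
      required_providers {
        pagerduty = {
          source = "PagerDuty/pagerduty"
        }
      }
    }

    # Hypothetical input: PagerDuty user IDs of the less experienced members.
    variable "trainee_user_ids" {
      type = list(string)
    }

    resource "pagerduty_schedule" "repro_core_trial_primary" {
      name      = "Repro Core Trial Primary"
      time_zone = "Asia/Tokyo"

      layer {
        name                         = "Weekday daytime"
        start                        = "2023-04-03T10:00:00+09:00"
        rotation_virtual_start       = "2023-04-03T10:00:00+09:00"
        rotation_turn_length_seconds = 604800 # rotate weekly
        users                        = var.trainee_user_ids

        # One restriction per weekday (1 = Monday ... 5 = Friday).
        # Outside these windows the layer has nobody on call.
        dynamic "restriction" {
          for_each = [1, 2, 3, 4, 5]
          content {
            type              = "weekly_restriction"
            start_day_of_week = restriction.value
            start_time_of_day = "10:00:00"
            duration_seconds  = 32400 # 9 hours
          }
        }
      }
    }
    ```

    Coverage by experienced members outside weekday daytime would need an additional layer or a separate schedule. Leaving it out corresponds to the "nobody is assigned" case.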
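  34. Mindset during the on-call migration period

    • For unfamiliar services, start with shadowing
      - Shadowing = retracing for yourself the work of the person who is responding
    • For services you are reasonably familiar with, respond as if you were on call alone
      - For anything you cannot handle, get help from the other team's on-call person
    • For every alert raised during your on-call period, pin down the following 2 points
      - What impact the alert could have on the service if it is left unattended
      - What work has to be done to resolve the alert
    • The following is desirable but not required
      - A technical understanding of why the alert fires, and eliminating its root cause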
  35. Agenda

    • About Repro
    • Sys-Infra's areas of responsibility at the time
    • Creation of a new team and changes to the organization
    • Building an on-call rotation in the newly formed team
    • Remaining challenges
    • Summary
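  36. Not enough people!!

    • Today's Repro Core carries a cognitive load comparable to that of the former Sys-Infra
    • We would like to split the team to fit within cognitive load limits, but it is not large enough to split
    • Taking on-call into account, we want at least 4 people per team
      - According to 入門 監視 (the Japanese edition of Practical Monitoring), at least 8 people are needed if on-call includes a secondary
    • Repro Core was formed from existing employees, so no hiring flow is in place
      - We have to build one from here…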
  37. Agenda

    • About Repro
    • Sys-Infra's areas of responsibility at the time
    • Creation of a new team and changes to the organization
    • Building an on-call rotation in the newly formed team
    • Remaining challenges
    • Summary
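  38. Summary

    • About a year ago, the SRE-like team at Repro had become a jack-of-all-trades
    • Responsibilities were sorted out along with the creation of a new team and the organizational change
      - Creating a new team is a chance to sort out responsibilities!
    • Techniques used to peel responsibilities off the existing teams and build an on-call rotation in the new team
      - Revisited the granularity of PagerDuty services
      - Managed them with Terraform so that changes are easier to make
      - Designed the time slots in which members go on call
      - Wrote down the mindset expected during the on-call migration period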
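  39. Catching up on existing components and required knowledge

    • Refactoring and upgrading existing components
    • Book clubs
      - Designing Data-Intensive Applications
      - Software Architecture: The Hard Parts
    • Created a skill map for each owned component
      - Fundamental skills are included in the component skills
    • Study sessions
      - Focused on skill-map items that are important and highly dependent on specific individuals
    • For some components, ran response drills by inducing simulated failures in the staging environment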
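  40. Team building with the GRPI model in mind

    • Goal: everyone understands the direction of the team
      - → Discussed the team's responsibilities, goals, and so on
    • Role: members understand what they expect of each other (roles)
      - → Aligned on one's own role and the roles expected of others
    • Process: everyone understands how to act day to day and how to achieve the goals
      - → Broke responsibilities and goals down into concrete tasks, and set a Working Agreement
    • Interaction: members understand each other's traits (personality, values, and so on)
      - → Held workshops using tools such as StrengthsFinder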
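  41. Repro Core's mission and responsibilities

    • Mission
      - Provide the platforms that are central to delivering Repro's features, and build an organization in which Feature Teams can concentrate on delivering customer value
    • Responsibilities
      - Store data from the SDK in the various data stores so that other services can use it
      - Develop the common platforms and libraries that Feature Teams use when developing features
      - Support development that uses those common platforms and libraries
      - Provide design consultation for Feature Teams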
  42. Main services integrated with PagerDuty

    • Datadog
      - Monitoring service for infrastructure and more
    • Rollbar
      - Service for managing application alerts
    • Rundeck
      - Job scheduler
    • Papertrail
      - Log management service
    • CloudWatch Logs
      - AWS monitoring service
  44. Changing the PagerDuty integration target of Datadog monitors (using variables)

    locals {
      # Every cluster that has been handed over to some other service.
      clusters = flatten(values(var.services).*.clusters)
    }

    # Default monitor: all clusters NOT listed in var.services page Sys-Infra.
    resource "datadog_monitor" "cpu" {
      ...
      query   = "avg(last_5m):cpu{cluster NOT IN (${join(",", local.clusters)})} ..."
      message = <<-EOF
        {{#is_alert}} @pagerduty-Sys-Infra {{/is_alert}}
        {{#is_recovery}} @pagerduty-resolve {{/is_recovery}}
      EOF
    }

    # One monitor per entry in var.services, notifying that entry's targets.
    resource "datadog_monitor" "cpu_others" {
      for_each = var.services
      ...
      query   = "avg(last_5m):cpu{cluster IN (${join(",", each.value.clusters)})} ..."
      message = <<-EOF
        {{#is_alert}} ${each.value.on_alert} {{/is_alert}}
        {{#is_recovery}} ${each.value.on_recovery} {{/is_recovery}}
      EOF
    }
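
    The slide does not show how `var.services` is declared. Judging from the attributes the monitors read (`clusters`, `on_alert`, `on_recovery`), a declaration along the following lines would fit. The map key and cluster names in the default are hypothetical, and the shape itself is an inference rather than the author's actual definition.

    ```hcl
    variable "services" {
      description = "Clusters owned by each PagerDuty service and the handles to notify"
      type = map(object({
        clusters    = list(string) # cluster tag values owned by this service
        on_alert    = string       # mention used when the monitor alerts
        on_recovery = string       # mention used when the monitor recovers
      }))

      # Hypothetical example entry.
      default = {
        repro_core = {
          clusters    = ["example-cluster-a", "example-cluster-b"]
          on_alert    = "@pagerduty-Repro-Core-Service-1"
          on_recovery = "@pagerduty-resolve"
        }
      }
    }
    ```

    With this default, `local.clusters` in the slide evaluates to `["example-cluster-a", "example-cluster-b"]`, so the Sys-Infra monitor excludes exactly the clusters that the per-service monitor covers.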
  45. Changing the PagerDuty integration target of Datadog monitors (using branching)

    resource "datadog_monitor" "consumer_lag" {
      ...
      query   = "min(last_10m):max:kafka.consumer_lag{*} by {consumer_group} > 1000"
      # Route by consumer group: the matching group pages Push-Service,
      # every other group pages Repro-Core-Service-1.
      message = <<-EOF
        {{#is_alert}}
          {{#is_match "consumer_group.name" "xxxxx" }}
            @pagerduty-Push-Service
          {{else}}
            @pagerduty-Repro-Core-Service-1
          {{/is_match}}
        {{/is_alert}}
      EOF
    }
  46. Main services integrated with PagerDuty

    • Datadog
      - Monitoring service for infrastructure and more
    • Rollbar
      - Service for managing application alerts
    • Rundeck
      - Job scheduler
    • Papertrail
      - Log management service
    • CloudWatch Logs
      - AWS monitoring service
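  47. Challenges in changing the PagerDuty integration target of application alerts

    • Repro uses Rollbar for the various integrations of application alerts
    • Repro's Rails application is a monorepo and shares a single Rollbar project
      - It is split into many ECS services
    • Splitting it into multiple Rollbar projects makes it harder to check alerts across projects
      - Alerts can be checked across multiple projects, though
    • Making Rollbar notification targets mutually exclusive is hard
      - If the error message contains A, create an incident on Service A in PagerDuty
      - If the error message contains B, create an incident on Service B in PagerDuty
      - → If the error message contains both A and B, incidents are created on both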