Redesigning Availability Through SLI/SLO Operations Closer to UX

nade
May 15, 2022

Transcript

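  1. About the mobile engineers on the SRE team
     • Mobile engineers (one iOS, one Android) belong to the SRE team as dedicated members
     • The aim is to work with high agility on areas that backend SREs alone find hard to reach, such as measurement closer to the user and promoting a BFF
     • They also acquire infrastructure and backend expertise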
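  2. Previous SLI/SLO operations: client
     • Use Firebase Performance and Crashlytics logs, with velocity alerts
     • Manage dashboards in Data Portal
     • On a violation, decide the action according to the scope of impact and the importance of the feature

     Impact   Action
     High     Pour in as many resources as possible across the whole team, even if that means pausing feature work; also consider a hotfix
     Middle   Assign an engineer and make sure it is fixed in the next release; pause feature work if necessary
     Low      Assign an engineer and fix it as development resources become available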
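  3. Previous SLI/SLO operations: issue ①
     The SLO is not the yardstick for judging the service's availability

     Current operation
     1. An emergency-level alert fires
     2. Each engineer investigates the scope of impact (features, number of users)
     3. Decide the action (in-app notice, Twitter announcement, stopping ads, entering maintenance)

     The ideal
     1. An emergency-level alert fires
     2. Check the scope of impact on the SLI/SLO dashboard
     3. Decide the next action from the affected SLIs and the error budget consumption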
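The "error budget consumption" in the ideal flow can be made concrete with a small calculation. The following is a minimal sketch, not the deck's implementation: the `SLO` type, the `message_view` name, the 99% target, and the event counts are all hypothetical values chosen for illustration.

```swift
import Foundation

/// Hypothetical SLO definition: a target ratio of good events over a window.
struct SLO {
    let name: String
    let target: Double  // e.g. 0.99 means 99% of events must be good
}

/// Fraction of the error budget consumed so far (1.0 means fully spent).
func errorBudgetConsumed(slo: SLO, goodEvents: Int, totalEvents: Int) -> Double {
    guard totalEvents > 0 else { return 0 }
    // The budget is the number of bad events the target still tolerates.
    let allowedBad = (1.0 - slo.target) * Double(totalEvents)
    let actualBad = Double(totalEvents - goodEvents)
    guard allowedBad > 0 else { return actualBad > 0 ? .infinity : 0 }
    return actualBad / allowedBad
}

// Illustrative numbers only: 50 failures out of 10,000 events against a 99% target.
let messageSLO = SLO(name: "message_view", target: 0.99)
let consumed = errorBudgetConsumed(slo: messageSLO, goodEvents: 9_950, totalEvents: 10_000)
print(String(format: "%.0f%% of the error budget consumed", consumed * 100))
```

With these numbers the target tolerates 100 bad events and 50 have occurred, so half of the budget is spent. A value at or above 1.0 would be the signal to escalate.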
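  4. Previous SLI/SLO operations: issue ②
     Metrics confined to the backend or to the client cannot, on their own, guarantee availability

     Current operation: even when the server SLO is met,
     • Users cannot see the other person's messages
       → A parse error on the client prevents the response from being received
     • Users cannot sign up
       → An outage at an external SNS sends the onboarding flow into an infinite loop

     The ideal
     • We can tell what ◦% of users who opened the message screen succeeded in viewing messages
     • We can tell what ◦% of users who started sign-up succeeded in signing up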
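The "◦% of users who started a flow also succeeded" style of SLI can be sketched as a per-feature success rate over client events. This is an illustrative sketch under assumptions: the `FeatureEvent` type, the feature names, and the sample events are hypothetical and do not come from the deck.

```swift
import Foundation

/// Hypothetical client event: a user started or completed a feature flow.
struct FeatureEvent {
    enum Kind { case started, succeeded }
    let userID: String
    let feature: String   // illustrative names such as "message" or "signup"
    let kind: Kind
}

/// Share of users who started `feature` and also succeeded at least once.
/// Returns nil when nobody started the flow.
func successRate(for feature: String, in events: [FeatureEvent]) -> Double? {
    let relevant = events.filter { $0.feature == feature }
    let started = Set(relevant.filter { $0.kind == .started }.map(\.userID))
    let succeeded = Set(relevant.filter { $0.kind == .succeeded }.map(\.userID))
    guard !started.isEmpty else { return nil }
    return Double(started.intersection(succeeded).count) / Double(started.count)
}

let events = [
    FeatureEvent(userID: "u1", feature: "signup", kind: .started),
    FeatureEvent(userID: "u1", feature: "signup", kind: .succeeded),
    FeatureEvent(userID: "u2", feature: "signup", kind: .started),   // never finishes onboarding
    FeatureEvent(userID: "u3", feature: "message", kind: .started),
    FeatureEvent(userID: "u3", feature: "message", kind: .succeeded),
]
let signupRate = successRate(for: "signup", in: events)
let messageRate = successRate(for: "message", in: events)
```

Because the rate is counted from the user's point of view, a client-side parse error or an external outage lowers it even while the server SLO is still met.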
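  5. Adopting the metrics provided by the platforms
     SLIs that Apple and Google provide to developers
     • The platforms' own definition of a good app
     • They can have effects at the app store or OS level
       ◦ ASO (App Store Optimization)
       ◦ The app is prioritized when users are prompted to uninstall
       ◦ How long the app survives in the background
     • Tapple adopts the following metrics as SLIs
       ◦ App size
       ◦ Overall crash rate (ANR)
       ◦ Launch time
     https://developers.cyberagent.co.jp/blog/archives/20354/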
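  6. Redefined SLIs
     Backend (continued)
     • Latency
     Client (= platform metrics)
     • Launch time, overall crash rate, app size
     Core features (= per-feature availability)
     • Login, sign-up, card flick, messages, Odekake (outings), Wish

     Set SLIs based on availability
     • Common: crash / ANR
     • Key metrics per feature
       ◦ Messages: loading speed
       ◦ Card flick: time until the next flick becomes possible
       ◦ ...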
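A timing-based key metric such as message loading speed is usually turned into an SLI either as a percentile or as the share of samples under a threshold. The sketch below shows both under assumptions: the sample values and the 1.0 second threshold are hypothetical, and the nearest-rank percentile is one common definition, not necessarily the one used in the deck.

```swift
import Foundation

/// Nearest-rank percentile of a list of samples (0 < p <= 100).
func percentile(_ p: Double, of samples: [Double]) -> Double? {
    guard !samples.isEmpty, p > 0, p <= 100 else { return nil }
    let sorted = samples.sorted()
    // Rank is the smallest position covering p percent of the samples.
    let rank = Int((p * Double(sorted.count) / 100).rounded(.up))
    return sorted[max(rank, 1) - 1]
}

/// Share of samples at or under the threshold, i.e. the "good event" ratio.
func ratioWithin(threshold: Double, samples: [Double]) -> Double? {
    guard !samples.isEmpty else { return nil }
    return Double(samples.filter { $0 <= threshold }.count) / Double(samples.count)
}

// Hypothetical message-loading times in seconds.
let loadTimes: [Double] = [0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0, 1.5, 3.0]
let p50 = percentile(50, of: loadTimes)
let p90 = percentile(90, of: loadTimes)
let within1s = ratioWithin(threshold: 1.0, samples: loadTimes)
```

The threshold form maps directly onto an SLO ("N% of loads finish within X seconds"), while the percentile form is convenient when the threshold itself is still being tuned.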
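  7. Measuring SLIs / SLOs on the client
     1. Start minimally, entirely within Firebase (+ BigQuery)
        • Verify that it is functionally sufficient
     2. Migrate monitoring and dashboards to DataDog (upcoming)
        • Increase the flexibility of alerts
        • Align with the service already used for other logs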
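  8. Evaluating client-side monitoring tools
     For now, with cost in mind, we use Firebase and follow a policy of "start operating → refine if issues come up"
     1. Firebase Crashlytics / Performance + BigQuery
        • Firebase itself is free, so the only cost is BigQuery queries and active storage
        • Crashlytics and Performance are already in use, so the adoption cost is low
        • Accumulating the data in BigQuery allows fairly flexible use
     2. DataDog RUM
        • DataDog is already the existing monitoring tool, so tooling could be unified
        • Pricing is per session, so we gave it up this time on cost grounds
     3. New Relic One
        • Given up because the backend already uses DataDog and because of cost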
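  9. A minimal measurement flow built on Firebase
     A minimal flow is achievable with Firebase (Crashlytics + Performance) alone
     • Export to BigQuery (real-time export is also possible)
     • Create intermediate tables in BigQuery for dashboard display
     • Display the dashboards in Data Portal
     [Diagram labels: Firebase Performance, Firebase Crashlytics, User, Crash, Key Metrics, Auto Export, BigQuery, Intermediate table, BigQuery, Google Data Portal]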
  10. Classifying crash logs
      https://firebase.google.com/docs/crashlytics/customize-crash-reports

          import UIKit
          import FirebaseCrashlytics

          final class LoginViewController: UIViewController {
              // Set a custom key in Crashlytics when the screen is about to appear
              override func viewWillAppear(_ animated: Bool) {
                  super.viewWillAppear(animated)
                  Crashlytics.crashlytics().setCustomValue("login", forKey: "domain")
              }
          }

      Implementation example for the login screen of the iOS app
      Set a custom key in Crashlytics whenever each feature's screen is displayed
      • Aggregate crashes per feature domain in BigQuery (Data Portal) based on the custom key
      • Alerts cannot be set per feature, so: Velocity Alert → check feature impact on the dashboard
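The per-domain aggregation described above is done in BigQuery in the deck. As a language-neutral illustration of the same grouping logic, here is a minimal Swift sketch. The `CrashRecord` shape is a hypothetical simplification and is not the actual Crashlytics export schema; only the `domain` key name is taken from the example above.

```swift
import Foundation

/// Hypothetical, simplified crash record carrying the custom keys set by the app.
struct CrashRecord {
    let customKeys: [String: String]
}

/// Counts crashes per value of the "domain" custom key.
/// Records without the key are grouped under "unknown".
func crashCountsByDomain(_ records: [CrashRecord]) -> [String: Int] {
    var counts: [String: Int] = [:]
    for record in records {
        counts[record.customKeys["domain"] ?? "unknown", default: 0] += 1
    }
    return counts
}

// Illustrative data only.
let records = [
    CrashRecord(customKeys: ["domain": "login"]),
    CrashRecord(customKeys: ["domain": "message"]),
    CrashRecord(customKeys: ["domain": "login"]),
    CrashRecord(customKeys: [:]),   // crashed before any screen set the key
]
let counts = crashCountsByDomain(records)
```

Keeping an explicit "unknown" bucket makes it visible when crashes happen on screens that do not yet set the key.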
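  11. How operations changed
      SLIs became a common language for backend and client when checking metrics after a release

      Previous operation
      • Backend and client each checked completely separately
      • Went through every available metric and dashboard exhaustively

      Current operation
      • Check the SLI / SLO dashboard shared by backend and client
        ◦ When investigating, communicate using SLIs as the common language
      • When there is a violation, check the related metrics and dashboards
        ◦ SLI / SLO work is done at the same time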
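  12. Future outlook
      • We have only just started operating this, so we will keep building up knowledge
      • Continuously refine the metrics themselves
        ◦ Metric thresholds and targets (90% → 50%)
        ◦ Error budgets
        ◦ Which key metrics to track per feature
      • Manage monitoring and dashboards end to end in DataDog
      • Spread SLIs / SLOs within the organization
        ◦ For now they have hardly spread beyond engineers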