
Practicing "SRE" to Raise Service Reliability Through Development and Operations / Mercari SRE in practice

kazeburo
September 01, 2017

Practicing "SRE" to Raise Service Reliability Through Development and Operations
Enterprise Development Conference



Transcript

  1. About me
    • Masahiro Nagano / 長野雅広
    • @kazeburo (twitter/github)
    • Mercari, Inc., Principal Engineer, Site Reliability Engineering (SRE) team
    • BASE, Inc, Technical Advisor
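  2. About me (career and activities)
    • Career
      • 2006: mixi, application operations team
      • 2010: livedoor (LINE), development support team
      • 2015: current position, SRE
      • More than 15 years of work supporting web services from the infrastructure side
    • Talks / writing
      • Spoke at AWS Dev Day Tokyo 2017
      • Wrote an article for WEB+DB PRESS Vol. 100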
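  3. Mercari
    • One of the largest flea-market apps in Japan
    • List an item easily in 3 minutes: 1) take a photo 2) enter the item information 3) press the list button
    • Safe and secure payments and transactions
      • Escrow
      • Mercari sits in the middle of every exchange of money
      • Anonymous shipping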
  4. Mercari system overview
    [Diagram; stock AWS icon labels omitted. Recoverable labels: "List!"; DB / Search; display; reflected in search; large volume of requests; request/response; DB / Search; "Purchase!"; "a few seconds to 30 seconds"; "a few seconds and up"; images, payments, AI, feedback.]
    Handles a large volume of transactions quickly and in parallel.
  5. Infrastructure
    • Regions: JP, US, UK (each connected to payment / logistics services)
    • DNS: Amazon Route53
    • CDN: Akamai, Fastly, ImageFlux
    • Storage: Amazon S3
    • Analysis: Google BigQuery
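  6. What is SRE?
    • Short for Site Reliability Engineering / Engineer
    • Proposed by Ben Treynor, who led Google's operations team, as a methodology for system administration and service operations
    • Spreading among companies that operate large-scale IT infrastructure, mainly in the US
    • There is no precise definition, but roughly: an engineer or team that "uses software engineering to improve the availability, performance, and security of the infrastructure and the service as a whole"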
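  7. Google SRE
    • 50% of working hours go to software engineering
      • Spent on automation (autonomy) and on improving reliability
      • If the 50% line is crossed, a major review of the workload is required
    • SLAs and error budgets reconcile the interests of developers and operations
      • Availability targets are set per service together with the development team
      • While the service is within its error budget, developers release aggressively; once the budget is exceeded, they are expected to focus on development that restores reliability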
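The error-budget rule on this slide can be made concrete with a little arithmetic. The following is a minimal sketch, not Google's or Mercari's tooling: the `ErrorBudget` class, the request-based accounting, and the 99.9% target are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Request-based error budget for one service (hypothetical model)."""
    slo: float             # availability target, e.g. 0.999 (assumed value)
    total_requests: int    # requests served in the measurement window
    failed_requests: int   # requests that violated the target

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the target permits to fail.
        return (1.0 - self.slo) * self.total_requests

    @property
    def remaining(self) -> float:
        return self.allowed_failures - self.failed_requests

    def can_release(self) -> bool:
        # Within budget: keep releasing. Over budget: focus on reliability.
        return self.remaining > 0


healthy = ErrorBudget(slo=0.999, total_requests=1_000_000, failed_requests=400)
exhausted = ErrorBudget(slo=0.999, total_requests=1_000_000, failed_requests=1_500)
print(healthy.can_release(), exhausted.can_release())
```

With a 99.9% target and one million requests, roughly 1,000 failures are allowed, so the first service may keep releasing and the second may not.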
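  8. Growing expectations for SRE
    • Books / magazines
      • O'Reilly, "SRE サイトリライアビリティエンジニアリング" (Site Reliability Engineering)
      • Nikkei BP, "日経SYSTEM 2017/7"
    • Feature articles on the web
      • ITPro: "SRE", a new method originating at Google, is spreading in Japan
        • http://itpro.nikkeibp.co.jp/atcl/column/14/346926/030600869/
      • @IT: Feature: "SRE", a new role expected of corporate IT departments
        • http://www.atmarkit.co.jp/ait/series/4503/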
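  9. Scope of Mercari SRE's work
    Operations / Software Eng.
    • Platform construction
    • OnCall (incident response)
    • Automation
    • Scalability and availability improvements
    • DBA, middleware setup
    • Application design review
    • Building and operating the log collection and analysis platform
    • Server provisioning and deployment, building out the microservice platform
    • Security / fraud detection
    The team is expected to build system operations up as a "mechanism".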
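  10. mackerel: a cloud-based monitoring service
    • A monitoring service provided by Hatena Co., Ltd.
      • Built on Hatena's own server operations know-how
      • Offers a range of APIs and fits well with DevOps
    • Monitoring items can be extended with plugins
      • We use more than 40 plugins developed by the SRE team
      • Beyond server state, it supports external (synthetic) monitoring, visualization of service-related numbers, and alert configuration
    • Alert notifications integrate with Slack and various other services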
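As an illustration of extending monitoring with a plugin, here is a minimal sketch of a custom metrics script. It assumes the tab-separated `name<TAB>value<TAB>epoch seconds` line format that mackerel-agent reads from a plugin's standard output; the metric names and values are hypothetical and this is not one of the SRE team's plugins.

```python
import time
from typing import Dict


def format_metric(name: str, value: float, timestamp: int) -> str:
    """Render one metric as a tab-separated line: name, value, epoch seconds."""
    return f"{name}\t{value}\t{timestamp}"


def collect_metrics() -> Dict[str, float]:
    # Hypothetical service-level numbers; a real plugin would query the
    # application, a database, or a log here.
    return {"orders.pending": 12.0, "orders.failed": 1.0}


if __name__ == "__main__":
    now = int(time.time())
    for metric_name, metric_value in collect_metrics().items():
        print(format_metric(metric_name, metric_value, now))
```

The agent would be configured to run such a script periodically; the exact configuration keys are left out here because they depend on the agent version in use.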
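  11. Notifications with PagerDuty
    • Notifications can be sent through a variety of channels, and they continue until someone responds
      • mail
      • SMS
      • App (iOS, Android)
      • Phone call
    • Operated under the rule "place one phone call once 10 minutes have passed"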
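The 10-minute phone rule can be expressed as a small predicate. This is only a sketch of the rule's logic under assumed names (`should_phone`, `PHONE_CALL_AFTER`); it does not call the PagerDuty API, and in practice such a rule would normally be set up in PagerDuty's notification settings rather than written as code.

```python
from datetime import datetime, timedelta
from typing import Optional

# Threshold taken from the slide: phone once 10 minutes have passed.
PHONE_CALL_AFTER = timedelta(minutes=10)


def should_phone(triggered_at: datetime,
                 acknowledged_at: Optional[datetime],
                 now: datetime) -> bool:
    """Return True when an alert is still unacknowledged after the threshold."""
    if acknowledged_at is not None:
        return False  # someone has already responded
    return now - triggered_at > PHONE_CALL_AFTER


start = datetime(2017, 9, 1, 12, 0)
print(should_phone(start, None, start + timedelta(minutes=11)))
```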
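  12. Large-scale password list attack: a case study
    • Source countries of the access in an attack that actually occurred in 2016
    • The attackers changed IPs one after another and made only a few login attempts from each IP, which makes automatic blocking difficult
    Share of access by country: India 30%, Brazil 10%, Viet Nam 6%, Taiwan 6%, Thailand 5%, Pakistan 5%, Nepal 3%, Indonesia 3%, Russia 2%, Japan 2%, Georgia 2%, Bahrain 2%, Azerbaijan 2%, Armenia 2%, Other 18%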
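To see why a few attempts per IP defeats automatic blocking, the sketch below compares a per-IP failure limit with an aggregate failure ratio on synthetic data. The function names, the limit of 5, and the traffic are assumptions made for illustration; this is not the detection method Mercari used.

```python
from collections import Counter
from typing import List, Tuple

Attempt = Tuple[str, bool]  # (source IP, login succeeded?)


def ips_over_limit(attempts: List[Attempt], per_ip_limit: int) -> List[str]:
    """IPs whose failed-login count exceeds a per-IP limit."""
    failures = Counter(ip for ip, ok in attempts if not ok)
    return sorted(ip for ip, count in failures.items() if count > per_ip_limit)


def failure_ratio(attempts: List[Attempt]) -> float:
    """Share of all login attempts in the window that failed."""
    if not attempts:
        return 0.0
    return sum(1 for _, ok in attempts if not ok) / len(attempts)


# Synthetic window: 500 distinct IPs with 3 failed logins each,
# plus 100 successful logins from ordinary traffic.
attack = [(f"10.0.{i // 250}.{i % 250}", False) for i in range(500) for _ in range(3)]
normal = [("192.0.2.1", True)] * 100
window = attack + normal

print(ips_over_limit(window, per_ip_limit=5))  # no single IP stands out
print(failure_ratio(window))                   # the aggregate signal does
```

No IP crosses the per-IP limit, while the overall failure ratio is far above what normal traffic would show, so a signal aggregated across IPs is needed to notice this kind of attack.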