SRE at XFLAG Studio / xflag-studio-sre

Session 76: SRE Compendium (SRE大全): XFLAG Studio edition

Isao Shimizu

August 24, 2017

Transcript

 1. SRE at XFLAG Studio
  XFLAG Business Division, Game Development Office, SRE Group
  Isao Shimizu @isaoshimizu
  hbstudy #76: SRE Compendium (SRE大全): XFLAG Studio edition
  XFLAG STUDIO

 2. Self-introduction

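 3. About me
  • Isao SHIMIZU @isaoshimizu
  • mixi, Inc., XFLAG Business Division, Game Development Office, SRE Group
  • About 8 years at a systems integrator: contract development, in-house product development, and operations
  • mixi, Inc.
    • Aug 2011: Operations Department, Application Operations Group. Operated the SNS.
      • Fedora 8 -> 17 upgrade, introduced systemd, introduced LXC, and other improvements
    • Apr 2014: Joined operations for Monster Strike.
    • Aug 2015: XFLAG Studio is founded.
    • Jul 2016: The SRE Group is established within XFLAG Studio.
  • Outside work: craft beer, playing an instrument (trombone), weekend cooking, games, and so on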

 4. About XFLAG Studio

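 5. XFLAG Studio
  • Smartphone games
    • Monster Strike (since Oct 2013)
    • Monster Strike Stadium (since Apr 2015)
    • Fight League (since Jun 2017)
  • Video
    • Monster Strike anime
      • Distributed on YouTube (passed 200 million cumulative views worldwide on 2017-06-14)
      • A theatrical film was also released at the end of last year
  • XFLAG STORE SHIBUYA (permanent shop), XFLAG STORE (online store)
  • Other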

 6. About SRE

 7. SRE at Google
  SRE has historically been operations work at its core, but it also takes on the software engineer's role of replacing manual work with automation.
  https://landing.google.com/sre/interview/ben-treynor.html
  "Fundamentally, it's what happens when you ask a software engineer to design an operations function." Ben Treynor Sloss, Vice President, Google Engineering, founder of Google SRE
  "Traditional software engineers tend to focus on one particular system, and understand it in great depth. Software engineers in Site Reliability Engineering tend to spread their focus across a broad range of systems." Nida Farrukh, Site Reliability Engineer, Zurich
  "Our work is like being part of the world's most intense pit crew. We change the tires of a race car as it's going 100 mph." Andrew Widdowson, Site Reliability Engineer, Mountain View
  (100 mph is roughly 160 km/h.)
  "SREs engineer services, instead of binaries. This is a shift in perspective that exploits unusual skills and creativity. SREs are specialists in making changes safely." John T. Reese, Site Reliability Engineer, San Francisco
  https://landing.google.com/sre/

 8. SRE at other companies
  • Searching Wantedly job listings for "SRE" returns 60 hits
    • https://www.wantedly.com/search?t=projects&q=SRE
  • Retty: "Not much different from conventional operations"; "a development productivity project"
    • http://itpro.nikkeibp.co.jp/atcl/column/14/346926/030600869/
  • freee: "The goal is to disband the infrastructure team"; "aim for 99.9% availability"
    • http://itpro.nikkeibp.co.jp/atcl/column/14/346926/030600869/
  • Mercari: "Both operations work and the software engineer role are required"
    • http://tech.mercari.com/entry/2015/11/18/153421
  • Cybozu: "Don't fix roles too rigidly"; "hone software development skills"; "eliminate toil"
    • http://blog.cybozu.io/entry/2016/09/01/080000
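
To make the 99.9% availability target quoted from freee concrete, the arithmetic can be sketched as below. This is only an illustration: the helper name `downtime_budget` and the 30-day and 365-day periods are assumptions, not anything from the deck or the cited article.

```python
from datetime import timedelta


def downtime_budget(availability_percent, period):
    """Return the downtime allowed within `period` at the given availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # Unavailable fraction = (100 - target) / 100, applied to the whole period.
    return period * ((100 - availability_percent) / 100)


# Assumed periods: a 30-day month and a 365-day year.
monthly = downtime_budget(99.9, timedelta(days=30))
yearly = downtime_budget(99.9, timedelta(days=365))
print("per 30 days:", monthly)
print("per 365 days:", yearly)
```

At 99.9% the budget works out to roughly 43 minutes per 30 days and a little under 9 hours per year, which is why the target is demanding for a small team.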

 9. SRE at XFLAG Studio

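 10. The SRE organization at XFLAG Studio
  • About one year since the SRE Group was formed (started July 2016)
  • 7 members (as of Aug 2017)
  • Varied backgrounds
    • Roughly half joined as new graduates and half as mid-career hires
    • Everyone has their own specialty; being full-stack is not required
  • No manager is assigned
    • Members find the work that needs doing themselves and carry it out proactively
    • Doing only assigned work does not earn recognition (work is rarely assigned in the first place)
  • Work communication centers on Slack
  • Work is published as GitHub Issues and Pull Requests, where discussion and review take place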

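 11. On-call rotation
  • Two people on call at a time, rotating weekly, Sunday through Saturday (each person's turn comes about once a month)
  • Mainly first-line response
  • Use of PagerDuty
    • Multiple notification channels (phone, email, push, and so on)
    • If the on-call engineers miss a notification, it escalates automatically to people who are off rotation
  • While on call, stay online (so alerts can be received) and be able to respond immediately when an alert fires
    • For example, avoid watching a movie or driving while on call
  • Required procedures are documented in the wiki; tooling keeps the steps simple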
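
The rotation described above (two people per week, Sunday through Saturday) can be sketched as a simple round-robin. This is a hypothetical illustration, not the team's actual scheduling, which the deck says is handled with PagerDuty; the function names and the placeholder member names `sre1` to `sre7` are assumptions.

```python
from datetime import date, timedelta


def week_start(day):
    """Return the Sunday that starts the on-call week containing `day`."""
    # date.weekday(): Monday is 0 and Sunday is 6.
    return day - timedelta(days=(day.weekday() + 1) % 7)


def build_rotation(members, first_day, weeks, pair_size=2):
    """Assign `pair_size` members to each Sunday-to-Saturday week, round-robin."""
    start = week_start(first_day)
    schedule = []
    for i in range(weeks):
        begin = start + timedelta(weeks=i)
        end = begin + timedelta(days=6)  # the Saturday that closes the week
        on_call = [members[(i * pair_size + k) % len(members)]
                   for k in range(pair_size)]
        schedule.append((begin, end, on_call))
    return schedule


# Placeholder names for a seven-person team.
members = [f"sre{i}" for i in range(1, 8)]
schedule = build_rotation(members, date(2017, 8, 24), weeks=7)
for begin, end, on_call in schedule:
    print(begin, "to", end, on_call)
```

With seven members and two slots per week, each person is on call twice in every seven weeks, which is close to the "about once a month" pace mentioned on the slide.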

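 12. What SRE is involved in
  • Load countermeasures
    • Code improvements, DB splitting, review
    • Hardware selection, scaling out and up, kernel and middleware tuning
  • Monitoring and on-call
    • Improving monitoring tools
    • Watching power consumption in addition to software metrics
  • Incident response
    • Cloud outages, hardware failures
  • Data import
    • Updating game data and resources
  • Deployment
    • Code deployment to staging and production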

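 13. What SRE is involved in
  • Website construction, CDN configuration
    • Building the CMS with Chef, configuring CloudFront
  • Maintenance
    • Version-upgrade maintenance
  • Development and CI environments
    • Maintaining development instances and CI tools (Jenkins and others)
  • Security measures
    • Requesting vulnerability assessments, updating vulnerable software, and so on
  • Tool development
    • Chatbots, CLI tools, and so on
  • Consultations of various kinds
    • For example, load consultations for new campaigns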

 14. The Monster Strike environment that SRE supports (latest version)

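 15. Infrastructure for the Japanese version of Monster Strike
  [Diagram] On-premises: DC1 (App/Batch/DB/Memcached/Redis) and DC2 (DB/Memcached/Redis backup). Cloud: GMO App Cloud (GMO アプリクラウド) (App) and AWS (App).
  Scale: roughly 1,400 servers
  DC1 to DC2 link: 40 Gbps
  Always running: 350 DB servers, around 10,000 App cores (scaled up or down as needed), and around 90 Memcached servers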

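 16. On-premises environment
  • Multiple DCs are used, with backups (replicas) of the data stores placed across them
  • App/DB servers are 24- to 56-core machines (Xeon E5-2670v4); Memcached runs on 8-core machines
  • OS installs and reinstalls are run remotely with Cobbler + Koan
    • The result is a server with a minimal base configuration
  • High-load DB servers use ioMemory SX350 1.3TB and ioDrive (NVMe SSDs are also under consideration)
  • To lower the failure rate, basically every server uses SSDs, and magnetic disk devices such as SAS drives are not used (SSDs still fail when they fail, of course)
  • Hardware RAID is not used; software RAID is hardly used
  • SREs rarely work on site in the DC
  • Servers are placed so that each rack's power usage is optimized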

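 17. Cloud environment
  • Connected to the on-premises environment over dedicated lines so traffic stays private
    • Latency between each DC and the cloud matters
  • Clouds with solid APIs and documentation (ideally sample code too) are easy to work with
    • Both AWS and GMO App Cloud (GMO アプリクラウド) are operated through in-house tools built on their APIs
    • Sometimes 100 servers are launched at once and put into service
  • GMO App Cloud is currently used only for App servers (mainly the 40-core type at present)
  • The TURN servers used for multiplayer run on AWS
  • Development and staging environments are unified on AWS
  • Any cloud is a candidate if the conditions fit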
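
Launching many servers at once through a cloud API, as mentioned above, usually means issuing the calls in parallel and collecting failures rather than stopping at the first error. The sketch below shows that pattern only. The deck does not show the in-house tools, so `launch_fleet`, `fake_launch`, and the simulated failure are all hypothetical; a real `launch_one` would wrap the provider's own API call.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def launch_fleet(launch_one, count, max_workers=10):
    """Call launch_one(index) `count` times in parallel.

    Returns (launched, failed): `launched` holds the return values of the
    successful calls, `failed` holds (index, exception) pairs.
    """
    launched, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(launch_one, i): i for i in range(count)}
        for future in as_completed(futures):
            index = futures[future]
            try:
                launched.append(future.result())
            except Exception as exc:  # keep going; report failures at the end
                failed.append((index, exc))
    return launched, failed


def fake_launch(index):
    """Stand-in for a cloud API call; fails for one index to show error handling."""
    if index == 42:
        raise RuntimeError("capacity error (simulated)")
    return f"app-{index:03d}"


launched, failed = launch_fleet(fake_launch, 100)
print(len(launched), "launched;", len(failed), "failed")
```

Collecting failures separately lets the operator retry only the missing servers before adding the batch to the load balancer.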

 18. Architecture (simplified)
  [Diagram components: API access, A10 Load Balancer, Unicorn, Fluentd, Redis, MariaDB, Memcached, Batch, Worker, Cron]

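 19. CDN
  • Data resources
    • Akamai and CloudFront are used together
    • The split between them is controlled by deploying server config
    • The split is occasionally changed depending on circumstances (very rare)
    • The origin is Amazon S3
  • Websites
    • Cached by CloudFront
    • The origin is Amazon EC2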
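
One way a config-driven split between two CDNs can work is weighted random selection per request. The sketch below is an assumption about the mechanism, not the deck's implementation: the deck only says that the ratio lives in server config, so the `CDN_WEIGHTS` values and the `pick_cdn` helper are hypothetical.

```python
import random

# Hypothetical weights; changing the ratio would mean deploying a new config.
CDN_WEIGHTS = {"akamai": 70, "cloudfront": 30}


def pick_cdn(weights, rng=random):
    """Pick a CDN name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Simulate many requests with a seeded generator to see the resulting split.
rng = random.Random(0)
picks = [pick_cdn(CDN_WEIGHTS, rng) for _ in range(10_000)]
share = picks.count("akamai") / len(picks)
print(f"akamai share: {share:.2f}")
```

Setting one weight to zero sends all traffic to the other CDN, which is the kind of rare adjustment the slide mentions.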

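 20. Provisioning
  • Chef as the baseline (Ansible in places)
  • In-house apt repository
    • Built on S3 using Aptly
    • In-house tools are packaged as debs
  • Challenges with Chef
    • Unmaintained cookbooks and a storm of deprecation warnings
    • Timing of Chef major version upgrades
    • Still looking for a good way forward
    • The cost of migrating to a different provisioning tool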

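 21. Log forwarding
  • Fluentd
    • Forwards logs to Amazon S3, where they serve as the source for analysis
    • Forwards logs to Elasticsearch, with Kibana as a lightweight log analysis tool
    • Migration to td-agent 3 (still waiting for a stable release)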

 22. What SRE has done (excerpt)

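 23. What SRE has done (excerpt)
  • An excerpt of work beyond regular operations and load countermeasures (details are covered in the next part and in the reference material)
  • DB sharding (with no service downtime)
  • Investigated and resolved a THP (Transparent Huge Page) problem
  • I/O improvements through kernel and MySQL tuning
  • Software updates and replacements
    • Memcached (1.4.37, using the modern option)
    • Ubuntu Server, Nginx, Elasticsearch, Kibana
    • Various hardware and instance types
  • References
    • https://www.slideshare.net/FumihiroIto/sre-78912803
    • https://speakerdeck.com/isaoshimizu/sregurupugadekitekofalseban-nian-jian-yatutekitakoto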
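
The deck does not say what the THP problem or its fix was. A common first step in any THP investigation on Linux is to check which mode is active, which the kernel exposes in `/sys/kernel/mm/transparent_hugepage/enabled` with the active value in square brackets. The helper names below are hypothetical, and this is a minimal sketch of that check only.

```python
import re
from pathlib import Path

# Standard sysfs location on Linux kernels built with THP support.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Extract the active mode from text such as 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"no bracketed mode found in {text!r}")
    return match.group(1)


def current_thp_mode(path=THP_ENABLED):
    """Return the active THP mode, or None if the sysfs file does not exist."""
    try:
        return parse_thp_mode(path.read_text())
    except FileNotFoundError:
        return None


print("THP mode:", current_thp_mode())
```

On a host without THP support (or on a non-Linux machine) the function returns None instead of raising, so it can run as part of a fleet-wide check.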

 24. Summary

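 25. Summary
  • XFLAG Studio's business is expanding day by day
  • As SRE there is a lot to do and there are many open challenges
  • Do not wall off work on the grounds that "this is what SRE is"
  • Engage actively with the business and contribute to it
  • Solve a wide range of problems through software engineering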

 26. We're hiring!!
  If you are interested in SRE at XFLAG Studio, feel free to get in touch
  https://xflag.com/recruit/engineer/404.html
  * The file name being 404 is a coincidence (lol)

 27. Thank you!
