The Present and Future of the Infrastructure Behind Monster Strike


dots. Conference Spring 2016
Behind the Scenes of Game Development
http://eventdots.jp/event/580344


Isao Shimizu

March 01, 2016

Transcript

  1. 2.

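    About Me
    • Isao Shimizu (@isaoshimizu)
    • At mixi, Inc. since August 2011
    • Currently in the System Development Department at XFLAG Studio
    • About two years ago, moved from operating the social networking service mixi to operating the smartphone game "Monster Strike"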
  2. 3.

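    Monster Strike
    • Officially released on October 10, 2013
    • Available on iOS and Android
    • More than 30 million players worldwide (not counting repeat downloads on the same device)
    • Available in Japan, Taiwan, South Korea, North America, and Hong Kong/Macau
    • Anime distributed on YouTube with 25 million views; the Nintendo 3DS version has shipped more than 1 million copies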
  3. 6.

    • Application: 300+ servers
    • Resque: 50 servers
    • Redis: 30 servers
    • Memcached: 200+ servers
    • MariaDB: 250+ servers
    • TURN: 40 servers
    • CDN traffic: max 270Gbps
    • API traffic: avg 1Gbps
    • Internal traffic: max 20Gbps
  4. 9.

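    Multiple DCs and Amazon VPC
    • Composed of DC 1 (main), DC 2 (main and backup), and AWS (main and development environment)
    • Each DC connects privately to Amazon VPC over Direct Connect, via TY2 and TY4
    • The links between the DCs are designed to carry up to 20Gbps even when one side of the redundant path has failed
    • Detailed write-up: "The story of making our sites redundant (internal traffic volume edition)"
      http://xflag.com/blog/infradb/internal_traffic_volume.html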
  5. 11.

    [Diagram: site layout. Column labels: DC1 (Main), DC2 (Main/Backup), and a Main/Dev site. Components shown: Application, Resque, MariaDB, Redis, Memcached, TURN Server, development instances.]
  6. 12.

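    Role of each site
    • Production Application and Resque run in both DC1 and DC2
    • Production Redis, MariaDB, and Memcached
        • DC1 holds a Master/Backup set
        • DC2 holds a Backup
        • Counting DC1 and DC2 together, there are two Backup sets
    • AWS runs the production TURN servers (for multiplayer), the development instances, and so on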
  7. 18.

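    Deployment
    • Capistrano 2
    • Deploy as much as needed, whenever needed
    • capistrano-s3copy-awscli https://github.com/bacter1a/capistrano-s3copy-awscli
        • Fetches the latest code from GitHub and builds a tarball for upload to S3
        • Each Application server downloads the tarball from S3 and unpacks it
        • Works around GitHub's connection limits (github.com refuses connections when a large number are requested at once)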
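The flow on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the capistrano-s3copy-awscli implementation: `FakeBucket` is a hypothetical in-memory stand-in for an S3 bucket, and `build_tarball`, `install_release`, and `deploy` are names invented here. The point it shows is that the source tree is packed once, and every application server then pulls the same object from the bucket instead of contacting GitHub itself.

```python
import io
import tarfile
import tempfile
from pathlib import Path


class FakeBucket:
    """Hypothetical in-memory stand-in for an S3 bucket."""

    def __init__(self):
        self.objects = {}
        self.downloads = 0

    def upload(self, key, data):
        self.objects[key] = data

    def download(self, key):
        self.downloads += 1
        return self.objects[key]


def build_tarball(source_dir):
    """Pack an already checked-out source tree into a gzip-compressed tarball."""
    source_dir = Path(source_dir)
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path in sorted(source_dir.rglob("*")):
            # Store paths relative to the checkout root.
            tar.add(str(path), arcname=str(path.relative_to(source_dir)),
                    recursive=False)
    return buf.getvalue()


def install_release(bucket, key, release_dir):
    """What each application server does: download the tarball and unpack it."""
    data = bucket.download(key)
    Path(release_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        tar.extractall(str(release_dir))


def deploy(checkout_dir, bucket, key, release_dirs):
    """Deploy server: one checkout, one upload, then one download per server."""
    bucket.upload(key, build_tarball(checkout_dir))
    for release_dir in release_dirs:
        install_release(bucket, key, release_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        checkout = Path(tmp, "checkout")
        checkout.mkdir()
        (checkout / "app.rb").write_text("puts 'hello'\n")
        servers = [Path(tmp, "app%d" % i) for i in range(3)]
        bucket = FakeBucket()
        deploy(checkout, bucket, "releases/example.tar.gz", servers)
        print(bucket.downloads, "downloads from the bucket, one checkout")
```

With hundreds of application servers, only the deploy server talks to GitHub, which is the connection-limit workaround the slide describes.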
  8. 21.

    [Diagram: deployment and provisioning flow across DC1 and DC2. Labels: Deploy Server, S3 Bucket, GitHub, ssh, Chef Cookbook, Ubuntu Package Mirror, Internet, deb Package.]
  9. 24.
  10. 27.

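    Handling load
    • Ongoing DB sharding and table partitioning
    • Adding Application servers
        • Base configuration: machines with two 12-20 core CPUs (24-40 cores counting Hyper-Threading)
    • Query improvements and caching
        • Made it possible to scale down from ioDrive (ioMemory) to SSD
    • Adding Memcached servers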
  11. 29.

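    Memcached configuration
    • Memcached runs as two pools
    • About 100 machines per pool
    • 26GB of memory allocated per machine (machines have 32GB installed)
    • About 2.6TB of capacity per pool
    • Writes go to both pools simultaneously
        • DoubleWriteCacheStores https://github.com/hirocaster/double_write_cache_stores
    • Pools can be switched without maintenance downtime, which also makes it possible to grow the pool size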
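A minimal sketch of the double-write idea, assuming plain in-memory dictionaries in place of real Memcached pools. It is not the API of the DoubleWriteCacheStores gem: `Pool`, `DoubleWriteCache`, and `rebuild_standby` are hypothetical names, and the order of steps in `rebuild_standby` is one reading of the slides (stop writes to the standby pool, clear and resize it, warm it with double writes, then switch reads).

```python
class Pool:
    """Hypothetical in-memory stand-in for one Memcached pool."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def flush_all(self):
        self.data.clear()


class DoubleWriteCache:
    """Writes to every active pool, reads from one designated pool."""

    def __init__(self, read_pool, *other_pools):
        self.read_pool = read_pool
        self.write_pools = [read_pool, *other_pools]

    def set(self, key, value):
        for pool in self.write_pools:
            pool.set(key, value)

    def get(self, key):
        return self.read_pool.get(key)

    def stop_writes_to(self, pool):
        self.write_pools = [p for p in self.write_pools if p is not pool]

    def resume_writes_to(self, pool):
        if not any(p is pool for p in self.write_pools):
            self.write_pools.append(pool)

    def switch_reads_to(self, pool):
        if not any(p is pool for p in self.write_pools):
            raise ValueError("cannot read from a pool that receives no writes")
        self.read_pool = pool


def rebuild_standby(cache, standby, warm_traffic):
    """Resize the standby pool and make it the read pool (assumed step order)."""
    cache.stop_writes_to(standby)    # stop set to the standby pool
    standby.flush_all()              # clear it; hosts would be added here
    cache.resume_writes_to(standby)  # double writes start warming it
    for key, value in warm_traffic:  # stands in for a period of live traffic
        cache.set(key, value)
    cache.switch_reads_to(standby)   # reads now come from the resized pool


if __name__ == "__main__":
    pool1, pool2 = Pool("pool1"), Pool("pool2")
    cache = DoubleWriteCache(pool1, pool2)
    cache.set("user:1", "before")
    rebuild_standby(cache, pool2, [("user:2", "during warm-up")])
    print(cache.read_pool.name, cache.get("user:2"), cache.get("user:1"))
```

Keys written only before the standby pool was cleared are misses after the switch, which is why the warm-up period matters. The capacity figure on the slide also follows from the other two numbers: about 100 machines at 26GB each is roughly 2.6TB per pool.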
  12. 31.

    [Diagram: Memcached pool maintenance, part 1. M = Memcached host. Normal operation: App servers set to Pool 1 and Pool 2 and get from Pool 1. Stop/expand: sets to Pool 2 are stopped, Pool 2 is cleared completely, and hosts are added.]
  13. 32.

    [Diagram: Memcached pool maintenance, part 2. Cache warming: App servers set to both Pool 1 and the enlarged Pool 2 and keep warming for a while. Switchover: gets move to the other pool while sets continue to both.]
  14. 34.

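    Load balancers
    • Appliance load balancers sit in front of the Application servers
    • A10 Networks AX2500
    • Can be operated through a RESTful API
    • An in-house CLI tool written in Go (a10-cli)
    • When Application servers are added or removed, a single command puts them into or takes them out of service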
  15. 35.
  16. 39.

    • Monitors every server
    • More than 12,000 monitored items
    • Custom plugins added
    • Integration with PagerDuty
        • On-call scheduling
        • Escalation policies
    • Notifications to Slack
    https://www.nagios.org/
  17. 40.

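    • nginx and Application logs are forwarded to Elasticsearch with Fluentd
    • Measures processing time and response time per API
    • Measures error frequency
    • Investigates access from specific hosts
    • Visualizes slow queries
    • Visualizes CloudTrail logs
    https://www.elastic.co/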
  18. 41.

    [Diagram: log forwarding flow. Labels: Application Servers (nginx, unicorn), LTSV Log, td-agent 2.3.x, Monitoring Agent (Response Time, Status Code), GrowthForecast, Amazon S3. "Max 500Mbps" is marked on two links.]
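To make the per-API measurements concrete, here is a small sketch that parses LTSV log lines and computes the average response time and error rate per path. In the setup described on the slides this aggregation happens in Elasticsearch; the code below only shows the same calculation locally. The field labels `path`, `status`, and `reqtime` are assumptions, since the slides do not list the actual log fields, and counting only 5xx responses as errors is also an assumption.

```python
from collections import defaultdict


def parse_ltsv(line):
    """Split one LTSV line ("label:value<TAB>label:value...") into a dict."""
    record = {}
    for field in line.rstrip("\n").split("\t"):
        if not field:
            continue
        # Only the first colon separates label and value.
        label, _, value = field.partition(":")
        record[label] = value
    return record


def summarize(lines):
    """Return request count, average response time, and error rate per path."""
    totals = defaultdict(lambda: {"count": 0, "time": 0.0, "errors": 0})
    for line in lines:
        rec = parse_ltsv(line)
        entry = totals[rec["path"]]          # assumed field label
        entry["count"] += 1
        entry["time"] += float(rec["reqtime"])  # assumed field label, seconds
        if rec["status"].startswith("5"):    # assumption: 5xx counts as error
            entry["errors"] += 1
    return {
        path: {
            "count": e["count"],
            "avg_time": e["time"] / e["count"],
            "error_rate": e["errors"] / e["count"],
        }
        for path, e in totals.items()
    }


if __name__ == "__main__":
    sample = [
        "path:/api/login\tstatus:200\treqtime:0.25",
        "path:/api/login\tstatus:500\treqtime:0.75",
        "path:/api/quest\tstatus:200\treqtime:1.0",
    ]
    for path, stats in sorted(summarize(sample).items()):
        print(path, stats)
```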
  19. 43.

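    • All communication is unified on Slack
    • ChatOps implemented with Hubot https://hubot.github.com
        • Deployments to the development environment, among other tasks, can be run from Slack
        • For details, see the ChatOps feature in the January 2016 issue of Software Design
          http://gihyo.jp/magazine/SD/archive/2016/201601
    • Application code, the Cookbooks for building infrastructure, and everything else are managed on GitHub
        • Pull-request-based development and review
        • Activity on GitHub is notified to Slack
    • Automated tests run when a pull request is created on GitHub
        • A Docker container is created, the tests run, and the container is discarded
        • Integration with Hubot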
  20. 46.