Looking back on the year and a half since moving Mackerel from on-premises to AWS / Hatena Engineer Seminar #11

astj
January 23, 2019

Transcript

  1. Looking back on the year and a half since moving Mackerel from on-premises to AWS. Hatena Engineer Seminar #11. id:astj

  2. id:astj (read "asatojē") • Application engineer at Hatena • Joined as a new graduate (2014/04) • Mackerel development team (2016/08~) • Mackerel tech lead (2018/05~)
  3. Mackerel

  4. None
  5. None
  6. None
  7. • A "server monitoring service" released in 2014 • Server side: Scala / Go • Machine learning part (in development!): Python • AWS Lambda: NodeJS
  8. Today's contents • How we migrated from on-premises to AWS • What we gained and what changed after the migration

  9. Background / Introduction

  10. [Diagram] Hatena's infrastructure layout: On-Premises DC, Office, AWS Account A (Main), AWS Account B, AWS Account C, Dedicated Hosting, …
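  11. Hatena's infrastructure layout [Diagram: On-Premises DC, Office, AWS Account A, AWS Account B, AWS Account C, Dedicated Hosting, …] • The on-premises DC and AWS (among others) are used side by side • For historical reasons, services share a common AWS account • Some services have their own independent account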
  12. [Diagram] Mackerel's architecture: reverse proxy (nginx), app (Scala), RDBMS (Postgres), Redis, Tsdb, sub-systems
  13. Time-series data. Example series (the same block is repeated several times on the slide): hatena.mackerel.host1.cpu.user • 2018/02/01T21:15:00Z 44.00 • 2018/02/01T21:16:00Z 6.00 • 2018/02/05T21:17:00Z 8.00 • …
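
The rows on slide 13 pair a metric name with timestamped values. As a rough illustration only (the `Point` type and `parse_point` helper are hypothetical names, not part of Mackerel or diamond), such rows could be loaded like this:

```python
from datetime import datetime, timezone
from typing import Dict, List, NamedTuple


class Point(NamedTuple):
    """One sample of a metric (hypothetical type, for illustration only)."""
    time: datetime
    value: float


def parse_point(row: str) -> Point:
    """Parse a 'YYYY/MM/DDTHH:MM:SSZ value' row as written on the slide."""
    stamp, raw_value = row.split()
    time = datetime.strptime(stamp, "%Y/%m/%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return Point(time, float(raw_value))


# A time series is a metric name mapped to its ordered list of points.
series: Dict[str, List[Point]] = {
    "hatena.mackerel.host1.cpu.user": [
        parse_point("2018/02/01T21:15:00Z 44.00"),
        parse_point("2018/02/01T21:16:00Z 6.00"),
        parse_point("2018/02/05T21:17:00Z 8.00"),
    ]
}
```

A time-series database stores very many such name-to-points mappings, which is why write throughput and capacity dominate the discussion in the later slides.
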
  14. Time-series database • On-premises era: Graphite • http://graphite.readthedocs.org/ • After the AWS migration: an in-house product • Code name "diamond"
  15. From on-premises to AWS

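  16. Mackerel's early architecture • At launch it ran on-premises • Mostly operated as VMs virtualized with Xen • For Tsdb and the RDB, high-spec physical servers were used without virtualization • ioDrive (~380k IOPS, large capacity, and expensive)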
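  17. Pain points • Problems emerged as the service grew • Operating Graphite was hard • Disks filled up rapidly, and the redundant setup had to be operated • Scaling out meant procuring physical servers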

  18. => AWS

  19. The "Next-Generation Mackerel Project"

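  20. AWS migration • More flexible capacity provisioning for Tsdb • A stronger foundation for security measures • Roughly one year from planning to migration • A large share of the time and effort went into validating and developing Tsdb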
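  21. Migrating Tsdb to AWS • Operating it the same way as on high-spec physical servers is not feasible • EBS cannot deliver the IOPS that were required • The operational burden already existed on-premises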

  22. Next-generation Tsdb: diamond • Developed a time-series database built mainly on managed services, with DynamoDB at its core • https://blog.yuuk.io/entry/the-rebuild-of-tsdb-on-cloud • https://itchyny.hatenablog.com/entry/2017/11/06/090000 • https://astj.hatenablog.com/entry/2018/02/06/175902

  23. http://blog.yuuk.io/entry/the-rebuild-of-tsdb-on-cloud

  24. The migration process

  25. "Things we want to do in the next generation" keep coming up without end

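  26. First, put everything on EC2 • Narrow the scope • (That said, Tsdb does move to diamond) • Split whatever can be split even further • Keep each release as small as possible • Same as software development (continuous delivery?)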
  27. The plan

  28. [Conceptual diagram] STEP0 (initial state). Areas: On-Premises DC, AWS. Components: nginx, app, db, redis, tsdb (Graphite), subsystem
  29. [Conceptual diagram] STEP1 (tsdb). Areas: On-Premises DC, AWS. Components: nginx, app, db, redis, tsdb (Graphite), subsystem, tsdb (diamond)
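  30. STEP1 • Write to both the old and the new time-series database • Check the differences between old and new in advance • Keep writing to both for a while afterwards • Be ready to roll back if anything goes wrong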
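
The dual-write approach described in STEP1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Mackerel's actual code: `InMemoryTsdb` and `DualWriter` are hypothetical stand-ins for the old store (Graphite) and the new one (diamond), and points are plain `(epoch_seconds, value)` tuples.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (epoch seconds, value); an assumed representation


class InMemoryTsdb:
    """Hypothetical stand-in for a time-series store."""

    def __init__(self) -> None:
        self._data: Dict[str, List[Point]] = {}

    def write(self, metric: str, point: Point) -> None:
        self._data.setdefault(metric, []).append(point)

    def read(self, metric: str) -> List[Point]:
        return sorted(self._data.get(metric, []))


class DualWriter:
    """Writes every point to both stores; the old store stays authoritative."""

    def __init__(self, old: InMemoryTsdb, new: InMemoryTsdb) -> None:
        self.old = old
        self.new = new
        self.new_failures = 0

    def write(self, metric: str, point: Point) -> None:
        # The old store is written first; a failure here still propagates.
        self.old.write(metric, point)
        try:
            self.new.write(metric, point)
        except Exception:
            # A failure in the new store is counted, never surfaced to users.
            self.new_failures += 1

    def diff(self, metric: str) -> List[Point]:
        """Points present in only one of the two stores."""
        return sorted(set(self.old.read(metric)) ^ set(self.new.read(metric)))


old, new = InMemoryTsdb(), InMemoryTsdb()
writer = DualWriter(old, new)
writer.write("hatena.mackerel.host1.cpu.user", (1517519700, 44.0))
writer.write("hatena.mackerel.host1.cpu.user", (1517519760, 6.0))
mismatches = writer.diff("hatena.mackerel.host1.cpu.user")
```

Because the old store keeps receiving every write, reads can be switched back to it at any time, which is what makes the rollback mentioned on the slide possible.
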

  31. [Conceptual diagram] STEP1 (tsdb). Areas: On-Premises DC, AWS. Components: nginx, app, db, redis, tsdb (Graphite), subsystem, tsdb (diamond)
  32. [Conceptual diagram] STEP2-1 (app etc). Areas: On-Premises DC, AWS. Components: tsdb (Graphite), subsystem, app, db, redis, tsdb (diamond), nginx. Link labeled HTTP
  33. [Conceptual diagram] STEP2-2 (app etc). Areas: On-Premises DC, AWS. Components: tsdb (Graphite), subsystem, nginx, app, db, redis, tsdb (diamond). Link labeled HTTP
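  34. STEP2 • The parts that communicate synchronously with the application (the majority) • The subsystem stays on-premises • It communicates asynchronously over HTTP, so there is some slack • diamond also made its debut at this point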

  35. [Conceptual diagram] STEP2 (app etc). Areas: On-Premises DC, AWS. Components: tsdb (Graphite), subsystem, nginx, app, db, redis, tsdb (diamond). Link labeled HTTP
  36. [Conceptual diagram] STEP3 (subsystem). Areas: On-Premises DC, AWS. Components: tsdb (Graphite), nginx, app, db, redis, tsdb (diamond), subsystem
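  37. STEP3 • Relocate the subsystems that communicate over HTTP • They include the external monitoring crawler, so the egress IP addresses changed • Announced in advance. Thank you for your cooperation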

  38. [Conceptual diagram] STEP3 (subsystem). Areas: On-Premises DC, AWS. Components: tsdb (Graphite), nginx, app, db, redis, tsdb (diamond), subsystem
  39. [Conceptual diagram] STEP4 (DONE!). Areas: On-Premises DC, AWS. Components: nginx, app, db, redis, tsdb (diamond), subsystem
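  40. Migration complete • Judged on performance, cost, and so on that no rollback was needed • Ended the parallel operation that had run for one month since STEP2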

  41. How the migration actually went

  42. STEP1 (tsdb) • Ran load tests and judged it ready for release • astj was worried about STEP2 • Newly developed = naturally no production track record • Would it perform as expected?
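  43. STEP2 (app etc) • Carried out a zero-downtime switchover • An instance failure during the switchover caused some data loss • The investigation results and recurrence-prevention measures were published later • https://mackerel.io/ja/blog/entry/2017/08/15/113803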
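  44. STEP2 (app etc) • Only a short period's worth, but data was lost all the same • We caused trouble for our users • On the other hand, diamond ran without problems • I was lost in emotion after getting home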
  45. STEP3 (subsystem) • Switched over without incident • (astj was on vacation that day…)

  46. Migration complete

  47. Migration complete?

  48. Migration complete? • Only at the "first, put it on EC2" stage • Room to move to managed services (e.g. RDS) • The next step up for "Next-Generation Mackerel"

  49. There probably won't be time to talk through the RDS case in detail

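  50. RDS migration • Right after the EC2 migration: PostgreSQL 9.3 (on EC2) • Lower DB operating costs by moving to a managed service • Also helps avoid failures like the one during the EC2 migration • Under consideration: Aurora for PostgreSQL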
  51. RDS migration • 9.3 (EC2) => 9.3 (RDS) => 9.6 (RDS) • Migrated in two steps • Each step involved a downtime maintenance window • Thank you for your understanding and cooperation
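  52. Looking back on the whole migration • Splitting the migration into steps was the right call • Operationally, the DB switchover was the heavy part • We apologize for the trouble caused by the data loss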

  53. What changed after the migration

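  54. Flexible capacity provisioning • Concerns about tsdb scalability resolved • Large-scale scale-out / scale-in • Virtual machines can be created and discarded casually • No more "favors" to ask of infrastructure engineers • Faster infrastructure validation and build-out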
  55. Less operational toil • Freed from operations that have to account for physical servers • Moving to managed services • A storage layer that is easy to handle • EBS snapshot

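  56. More freedom in architecture choices • Components that would (presumably) take real operational stamina can be used as managed services • Kinesis Data Streams, Lambda, DynamoDB, ... • They get adopted naturally wherever something new is built • AWS Batch for the machine learning feature, ...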
  57. Convenient

  58. That said

  59. We have not arrived in utopia

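  60. • Server and network failures still exist • Let's keep working at it • A virtual machine is still a virtual machine • Let's keep working at it • Infrastructure at a higher level of abstraction?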

  61. • Design and operations tailored to managed services • Technology selection • "Looks usable, but doesn't quite fit" • Operational design that uses the AWS API

  62. None
  63. Summary

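  64. We migrated to AWS • Migrated to AWS with capacity as the main requirement • Migrated in stages • Tsdb was built in-house around DynamoDB • Everything else went onto EC2 for the time being • Then gradually moved to managed services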
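  65. After migrating to AWS • Said goodbye to living in fear of Tsdb capacity • Able to adopt managed services actively • Reduce operational toil • Widen the architectural options for development • We have been able to raise the pace of change in the product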
  66. Thank you for listening