Performance Improvement of TSDB in Mackerel

Pepabo / Hatena Tech Conference: Infrastructure Technology Foundations @ Fukuoka

Yuuki Tsubouchi (yuuk1)

July 09, 2016

Transcript

  1. Performance Improvement of the Time-Series Database in Mackerel. Pepabo / Hatena Tech Conference: Infrastructure Technology Foundations @ Fukuoka. Hatena id:y_uuki

  2. id:y_uuki (yuuki). Web operations engineer @ Hatena, in roughly my third year at the company

  3. 07/02 @ Kyoto https://speakerdeck.com/yuukit/linux-network-performance-improvement-at-hatena

  4. Agenda: 1. Mackerel and time-series data 2. Graphite architecture and performance status 3. The disk thrashing problem and its solution 4. Summary

  5. Agenda: 1. Mackerel and time-series data 2. Graphite architecture and performance status 3. The disk thrashing problem and its solution 4. Summary

  6. https://mackerel.io

  7. Visualizing server metrics

  8. Mackerel architecture

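  9. Characteristics of Mackerel's time-series data
    • Agents post metrics from users' hosts every minute
    • 10,000+ active agents as of 2016/01
    • Up to 200 metrics per agent
    • Assuming an average of 100 metrics/agent, the total is 1,000,000+ metrics/min sent
    • A database that can withstand a high volume of metric writes is required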
  10. Graphite

  11. Agenda: 1. Mackerel and time-series data 2. Graphite architecture and performance status 3. The disk thrashing problem and its solution 4. Summary

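  12. What is Graphite?
    • Time-series database middleware written in Python
    • HTTP interface (writes use Graphite's own protocol)
    • Output format is a graph image or JSON
    [Diagram: (timestamp, name, value) -> Graphite; graph request -> Graphite -> image or JSON]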
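The slide notes that writes go through Graphite's own protocol rather than HTTP. As an illustration only, carbon's plaintext listener accepts one `name value timestamp` line per datapoint, and the sketch below formats such lines. The metric name is made up, and the host and port in `send_datapoints` are assumptions (2003 is the usual default for the plaintext listener); nothing here is taken from Mackerel's actual code.

```python
import socket
import time


def format_datapoint(name: str, value: float, timestamp: int) -> bytes:
    """Encode one datapoint as a carbon plaintext-protocol line."""
    return f"{name} {value} {timestamp}\n".encode("ascii")


def send_datapoints(lines, host="127.0.0.1", port=2003):
    """Send pre-formatted lines over TCP (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(b"".join(lines))


if __name__ == "__main__":
    now = int(time.time())
    # "example.host1.loadavg5" is a hypothetical metric name.
    print(format_datapoint("example.host1.loadavg5", 0.42, now))
```

Note that the wire order is name, value, timestamp, which differs from the (timestamp, name, value) tuple drawn on the slide.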
  13. Graphite architecture
    [Diagram: (timestamp, name, value) -> carbon -> write -> whisper files on the filesystem; graph request -> graphite-web -> read -> whisper files; graphite-web returns image or JSON]
  14. Graphite architecture (graphite-web)
    [Same diagram as slide 13]
    A web application that accepts read requests
  15. Graphite architecture (carbon)
    [Same diagram as slide 13]
    A daemon that accepts write requests
  16. Graphite architecture (whisper)
    [Same diagram as slide 13]
    A library for creating and updating time-series DB files. One file is created per metric
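  17. Whisper data structure
    • Storing every datapoint makes disk usage balloon
    • With timestamp: 4 bytes and value: 8 bytes, i.e. 12 bytes/datapoint, one year of per-minute data is about 6MB/metric
    • Older data is rolled up over a fixed interval (keeping the average or the maximum) to save disk space
    • e.g. 1-minute-resolution data is kept for only 1 day, while 5-minute-resolution data is kept for 1 week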
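The 6MB figure and the effect of roll-up can be checked with a little arithmetic. This is a rough sketch using only the 12 bytes/datapoint assumption from the slide; it ignores whisper's file and archive headers, and `archive_bytes` is a hypothetical helper, not a whisper function.

```python
BYTES_PER_POINT = 4 + 8  # timestamp (4 bytes) + value (8 bytes), per the slide


def archive_bytes(seconds_per_point: int, retention_seconds: int) -> int:
    """Size of one archive that keeps `retention_seconds` of data."""
    points = retention_seconds // seconds_per_point
    return points * BYTES_PER_POINT


MINUTE, DAY = 60, 86400

# Keeping every 1-minute point for a year: 525,600 points.
full_year = archive_bytes(MINUTE, 365 * DAY)

# The slide's example: 1-minute data for 1 day plus 5-minute data for 1 week.
rolled_up = archive_bytes(MINUTE, DAY) + archive_bytes(5 * MINUTE, 7 * DAY)

print(full_year, rolled_up)  # 6307200 41472
```

Under these assumptions the rolled-up layout needs about 41 kB per metric instead of about 6.3 MB.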
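  18. Graphite write performance characteristics (CPU utilization)
    • carbon runs as two cooperating threads
    • A network I/O thread that receives data
    • An I/O thread for file writes
    • An event-driven network server
    • Datapoints are handed from one thread to the other through a buffer
    • Problem: each thread is rate-limited by a single core
    • Run multiple carbon processes to spread the load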
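The two-thread hand-off described above can be sketched as follows. This is a toy model, not carbon's code (carbon's receiver is an event-driven server); the function names are hypothetical and the "write" is just an append to a list. It is only meant to show why each side is a single sequential loop and therefore cannot use more than one core by itself.

```python
import queue
import threading


def receiver(datapoints, buf):
    """Stand-in for the network I/O thread: push received datapoints."""
    for point in datapoints:
        buf.put(point)
    buf.put(None)  # sentinel: no more data


def writer(buf, written):
    """Stand-in for the file I/O thread: drain the buffer and 'write'."""
    while True:
        point = buf.get()
        if point is None:
            break
        written.append(point)  # a real writer would update a whisper file


def run(datapoints):
    """Run one receiver and one writer connected by a FIFO buffer."""
    buf = queue.Queue()
    written = []
    threads = [
        threading.Thread(target=receiver, args=(datapoints, buf)),
        threading.Thread(target=writer, args=(buf, written)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return written


if __name__ == "__main__":
    print(run([("example.metric", 1.0, 1467990000)]))
```

Scaling out then means starting several such receiver/writer pairs as separate processes, which is what the last bullet of the slide describes.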
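  19. Graphite write performance characteristics (disk I/O)
    • A small amount of data (about 12 bytes) is written to each of a huge number of files within one minute
    • Writes cannot be batched into neighboring blocks on the filesystem, so I/O efficiency is poor (writes scatter in every direction)
    • On the other hand, multiple threads never write to the same file at the same time, so I/O parallelism is easy to raise
    • Performance does not change even without a filesystem that excels at parallel I/O such as XFS (ext4, for example, is fine)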
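  20. Hardware configuration and resource usage
    • CPU: Xeon E5-2697 v3 @ 2.60GHz, 2 sockets, 28 cores
    • Memory: 126GB
    • Disk: Fusion ioMemory ioDrive2 6.4TB
    • So-called flash storage. The vendor's nominal figure is 300k write IOPS
    • Effective I/O performance: 50k ~ 100k write IOPS
    • An ordinary SSD would do well to reach 1/10 of that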
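  21. Graphite tuning
    • CPU resources are exhausted before the ioDrive's IOPS are used up, so the approach is to save CPU and spend it on I/O
    • Because the disk is fast and strong at random writes, carbon and the I/O scheduler are basically kept from doing unnecessary optimization
    • There are parameters for making I/O more efficient by sorting and for limiting I/O so that its resources are not used up
    • echo noop > /sys/block/fioa/queue/scheduler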
  22. Graphite cluster configuration
    [Diagram: (timestamp, name, value) datapoints go through load balancers (LB) to several groups of carbon processes; graphite-web reads from them]
  23. Details are on the blog: http://blog.yuuk.io/entry/high-performance-graphite

  24. Agenda: 1. Mackerel and time-series data 2. Graphite architecture and performance status 3. The disk thrashing problem and its solution 4. Summary

  25. [Graphs: write IOPS, read IOPS] A sudden increase in read load

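  26. What happened?
    • read IOPS kept increasing while write IOPS kept decreasing
    • No swap usage caused by memory shortage. OS memory usage was only about 1/3
    • No sudden increase in accesses to the service
    • sar -B revealed that the number of page-ins and page-outs per interval had grown abnormally
    • We will call this phenomenon disk thrashing
    • The cause was inferred from how the Linux page cache works and from Graphite's I/O pattern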
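For readers who want to watch the same signal without sar, Linux exposes cumulative `pgpgin` and `pgpgout` counters in /proc/vmstat. The sketch below computes per-second rates from two snapshots of that file. The sample numbers are made up, the helper names are hypothetical, and it is an assumption here that these counters correspond to what `sar -B` reports.

```python
def parse_vmstat(text: str) -> dict:
    """Parse /proc/vmstat-style `name value` lines into integers."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def paging_rates(before: str, after: str, interval_s: float) -> tuple:
    """Return (page-in rate, page-out rate) per second between two snapshots."""
    b, a = parse_vmstat(before), parse_vmstat(after)
    return (
        (a["pgpgin"] - b["pgpgin"]) / interval_s,
        (a["pgpgout"] - b["pgpgout"]) / interval_s,
    )


if __name__ == "__main__":
    # Made-up snapshots; on Linux, read /proc/vmstat twice instead.
    first = "pgpgin 1000\npgpgout 2000\n"
    second = "pgpgin 1600\npgpgout 2300\n"
    print(paging_rates(first, second, 60))  # (10.0, 5.0)
```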
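  27. The Linux page cache
    • Memory breakdown = used + buffers/caches + free
    • When data is read from or written to the filesystem, the OS loads the on-disk data into memory in page units so that later reads are fast
    • This is called the page cache
    • The page cache uses an LRU algorithm: recently referenced cache data is kept, and old cache data that is no longer referenced is discarded
    • The page cache is normally not counted as memory usage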
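  28. Graphite's I/O pattern
    • Every active whisper file is written within one minute, so writes land across a wide range of the disk
    • A whisper metric write involves not only write(2) but also read(2) calls for reading metadata and computing offsets
    • The page cache is used for writes as well as reads (except with Direct I/O)
    • A Graphite host holds a large amount of page cache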
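  29. Cause of the read IOPS increase
    • A high number of page-ins and page-outs means that old cache entries are being evicted by LRU
    • The page cache stopped being effective for the reads done during whisper writes, and as a result read IOPS increased
    [Diagram: Memory = used + page cache, with pages moving in (page in) and out (page out)]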
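Why an LRU cache can collapse so abruptly is easy to reproduce with a toy model. The sketch below replays a cyclic access pattern (every file touched once per round, as Graphite does once per minute) against a simple LRU cache. It is a simplified illustration, not the kernel's actual page replacement logic, and all names and sizes are hypothetical.

```python
from collections import OrderedDict


def lru_hit_rate(capacity: int, accesses) -> float:
    """Replay `accesses` against an LRU cache and return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for key in accesses:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / total if total else 0.0


def cyclic(n_pages: int, rounds: int):
    """Touch pages 0..n_pages-1 in order, `rounds` times."""
    return [p for _ in range(rounds) for p in range(n_pages)]


if __name__ == "__main__":
    print(lru_hit_rate(100, cyclic(100, 10)))  # working set fits: 0.9
    print(lru_hit_rate(100, cyclic(101, 10)))  # one page too many: 0.0
```

Once the cyclic working set exceeds the cache by even one page, every access evicts exactly the page that will be needed soonest, so the hit rate drops to zero and every read goes to disk.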
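  30. Saving page cache
    • Adding more memory would solve it for the time being, but the host already has 126GB RAM, so we want to cut wasted page cache instead
    • Written data is not necessarily read back soon, so keep written data out of the cache => Direct I/O
    • However, Direct I/O requires buffers to be memory-aligned to the block size => very tedious to do in Python (malloc => posix_memalign)
    • Solved with posix_fadvise(2)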
  31. posix_fadvise(2)
    • A process tells the kernel its access pattern for file data
    • The kernel optimizes according to the declared access pattern so that I/O performance improves
    • Access patterns
    • POSIX_FADV_SEQUENTIAL: doubles readahead
    • POSIX_FADV_RANDOM: disables readahead
    • POSIX_FADV_DONTNEED: releases cached pages
    • etc.
    int posix_fadvise(int fd, off_t offset, off_t len, int advice);
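Python 3 exposes this system call directly as `os.posix_fadvise` on platforms that support it. The following is a minimal sketch of issuing POSIX_FADV_RANDOM for a whole file; `advise_random` is a hypothetical helper, and the temporary file merely stands in for a real data file.

```python
import os
import tempfile


def advise_random(fd: int) -> bool:
    """Ask the kernel to disable readahead for the whole file behind `fd`.

    Returns False on platforms where os.posix_fadvise is unavailable.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    # offset=0 and len=0 mean "from the start to the end of the file".
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryFile() as fh:
        fh.write(b"\0" * 4096)
        fh.flush()
        print(advise_random(fh.fileno()))
```

The advice is only a hint: the call succeeding does not by itself prove that the kernel changed its caching behavior, which is why the talk verifies the effect separately.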
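  32. Applying posix_fadvise(2) to Graphite
    • At first, the option that drops page cache looked promising
    • whisper's write logic is fairly complex, so dropping only the page cache created by writes is difficult
    • Instead, POSIX_FADV_RANDOM is used so that nothing is read ahead and only the pages actually needed are cached
    • whisper's write path never scans a file sequentially
    • The wasted page cache created by readahead went down
    /proc/meminfo, before & after:
      Active(file): 5387160 kB   Inactive(file): 37566804 kB
      Active(file): 32252136 kB  Inactive(file): 7231020 kB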
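The two /proc/meminfo figures quoted on the slide can be pulled out programmatically. This sketch parses meminfo-style text and returns the file-backed active and inactive sizes; the sample text reuses the slide's numbers in the order they appear, and the helper names are hypothetical.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value} (values in kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def file_cache_kb(text: str) -> tuple:
    """Return (Active(file), Inactive(file)) in kB."""
    info = parse_meminfo(text)
    return info["Active(file)"], info["Inactive(file)"]


if __name__ == "__main__":
    # The two snapshots from the slide; on Linux, read /proc/meminfo instead.
    first = "Active(file):    5387160 kB\nInactive(file): 37566804 kB\n"
    second = "Active(file):   32252136 kB\nInactive(file):  7231020 kB\n"
    for snapshot in (first, second):
        active, inactive = file_cache_kb(snapshot)
        print(active, inactive, active + inactive)
```

Between the two snapshots Inactive(file) falls from about 37.6 GB to about 7.2 GB while Active(file) grows, so far more of the cache consists of pages that are actually being referenced.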
  33. Pull Request to Graphite

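  34. Contents of the Pull Request
    • The change itself is not very difficult
    • Uses the fadvise module
    • If posix_fadvise shows up in strace output, it is working
    • It is unclear whether calling fadvise is always a good idea, so it can be switched on or off in the configuration file (off by default)
    • It took about a month to get merged

    with open(path, 'r+b') as fh:
        if CAN_FADVISE and FADVISE_RANDOM:
            posix_fadvise(fh.fileno(), 0, 0, POSIX_FADV_RANDOM)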
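  35. Verification with a test script https://gist.github.com/yuuki/8d5d386115b0f01b5371
    • Uses whisper's write function to check whether the amount of page cache actually decreases
    • A script that writes 100 datapoints to each of 100 whisper files
    • Looks at read_bytes in /proc/<pid>/io (the number of bytes actually read from disk)
    • With the POSIX_FADV_RANDOM option, the amount of page cache was halved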
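The read_bytes measurement can be sketched like this: take the contents of /proc/<pid>/io before and after the workload and subtract. This is not the linked gist, just an illustration of the measurement; the helper names are hypothetical and the sample counter values are made up.

```python
def parse_proc_io(text: str) -> dict:
    """Parse the `key: value` lines of /proc/<pid>/io into integers."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def read_bytes_delta(before: str, after: str) -> int:
    """Bytes fetched from the storage layer between two snapshots."""
    return parse_proc_io(after)["read_bytes"] - parse_proc_io(before)["read_bytes"]


if __name__ == "__main__":
    # Made-up snapshots; on Linux, read /proc/self/io around the workload.
    before = "rchar: 1000\nwchar: 500\nread_bytes: 4096\nwrite_bytes: 0\n"
    after = "rchar: 9000\nwchar: 700\nread_bytes: 12288\nwrite_bytes: 4096\n"
    print(read_bytes_delta(before, after))  # 8192
```

Unlike `rchar`, which counts bytes requested through read calls, `read_bytes` only grows when data really had to come from disk, so reads served from the page cache do not inflate it.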
  36. Agenda: 1. Mackerel and time-series data 2. Graphite architecture and performance status 3. The disk thrashing problem and its solution 4. Summary

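  37. Summary
    • Mackerel has to handle 1,000,000+ metrics/min of metric writes
    • Graphite was chosen as the time-series database
    • Tuning assumes ioDrive and leaves the rest to the OS
    • The disk thrashing problem was solved with a patch that uses posix_fadvise to disable the page cache produced by writeback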
  38. None
  39. Metrics at sub-minute granularity. Long-term retention without losing granularity. Real-time anomaly detection

  40. We want to replace this with a next-generation time-series database

  41. http://hatenacorp.jp/recruit/fresh/operation-engineer For people who love technology

  42. These slides use Zebra by shoya140 (http://shoya.io/blog/zebra/) as their Keynote template, with thanks