10x Faster Node.js Parallel Programming


A talk about parallelizing a Node.js program to make it run 10x faster.
Lightning talk (LT) at Node学園 (Node Gakuen), session 31.
https://nodejs.connpass.com/event/90936/

shigeru. nakajima

June 29, 2018

Transcript

  1. 10x Faster Node.js Parallel Programming. Shigeru Nakajima a.k.a. @ledsun, Luxiar co., ltd. Node Gakuen LT, 2018/06/29
  2. The Node scraping guy. Search Google for "node スクレイピング" ("node scraping")

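  3. PANQ: I am building a service called panq, http://www.panq.jp/. It rates Qiita articles by reference count. The Qiita API does not expose reference counts, so I get them by scraping. I collected 8,000 articles.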

  4. Node and scraping. Scraping = HTML download + HTML parse = network IO + CPU work
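  5. Scraping fits the asynchronous programming model well

        const request = require('request')
        const libxmljs = require('libxmljs')

        // Download (network IO), then parse in the callback (CPU work)
        request('http://example.com/', (e, res, body) => {
          const doc = libxmljs.parseHtmlString(body)
          const data = doc.get('(//dl[@class="newsList"])[1]/dt[1]/text()')
        })

    Written the ordinary way, it runs efficiently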
  6. Re-parsing
    • I wanted to change the evaluation criteria
    • Re-parse 8,000 HTML files

  7. 20min

  8. I want it even faster

  9. C++ Addon: libxmljs
    • Speed up the parsing
    • Switch from parse5 to libxmljs
    • 1200s (20min) => 160s
    • 7.5x faster
    • parse5: pure JS
    • libxmljs: a wrapper around libxml2 (C)
  10. I want it even faster

  11. Analyzing the work
    • File reads are fast on an SSD
    • There is almost no IO wait
    • CPU work takes up most of the time
    • Speeding up the CPU work itself is hard

  12. Parallel programming. Modern PCs have multiple cores. Run the CPU work in parallel on multiple cores => parallel programming

  13. Today's topic: parallelize a Node.js program to improve its performance

  14. Parallelization
    • Use multiple cores of a single machine to run multiple computations at the same time
    • Multiple machines are out of scope

  15. Processes
    • Parallel programming usually means threads
    • Node.js has no threads
    • So use processes

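  16. child_process.fork(): start the worker processes

        const { fork } = require('child_process')

        // number (worker count) and program (worker script path) are defined elsewhere
        const processes = []
        for (let i = 0; i < number; i++) {
          // stdin and stdout ignored, stderr shared with the parent, plus an IPC channel
          processes.push(fork(program, [], { stdio: ['ignore', 'ignore', process.stderr, 'ipc'] }))
        }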
  17. Distribute the work to each process

        const readdirp = require('readdirp')

        // Get the list of files in the folder
        const dir = `${process.cwd()}/data/public/cache`
        const stream = readdirp({ root: dir, fileFilter: '*.html' })

        // Parse in the child processes
        let count = 0
        stream.on('data', (data) => {
          // Hand out files round-robin
          count++;
          processes[count % processes.length].send(data)
        })
  18. Result: 80s! A clean 2x speedup

  19. I want it even faster

  20. I said Node.js has no threads. That was a lie. https://github.com/xk/node-threads-a-gogo
    • Starts native threads from a C++ Addon

  21. Problem: it runs on Node 6.x
    • It only runs on Node 6.x
    • Node 6.x by itself is about 30% slower
    • Tried to make node-threads-a-gogo work on Node 10
    • Gave up after 4 hours of effort
  22. I said Node.js has no threads. That was a lie (2). https://nodejs.org/api/worker_threads.html
    • Available since v10.5.0 (June 20)
    • Experimental
    • --experimental-worker
    • The module is only loaded when the flag is set
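  23. worker_threads.Worker: much the same as child_process.fork()

        const { Worker } = require('worker_threads')

        // number (worker count) and program (worker script path) are defined elsewhere
        const workers = []
        for (let i = 0; i < number; i++) {
          workers.push(new Worker(program))
        }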
  24. Distribute the work to threads

        const readdirp = require('readdirp')

        // Get the list of files in the folder
        const dir = `${process.cwd()}/data/public/cache`
        const stream = readdirp({ root: dir, fileFilter: '*.html' })

        // Parse in the workers
        let count = 0
        stream.on('data', (data) => {
          // Hand out files round-robin
          count++;
          workers[count % workers.length].postMessage(data)
        })
  25. Problem: C++ Addons do not work
    • libxmljs cannot be used
    • The 7.5x gain is lost, so it gets slower
    • https://github.com/nodejs/node/issues/21481

  26. I want it even faster

  27. More cores, more parallelism
    • If 2x on 2 cores is the limit, use a machine with many cores
    • AWS EC2 c4.8xlarge
    • 36 cores => 80s becomes 2.2s!?
    • 2.016 USD/hour
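The 2.2s figure reads like a back-of-the-envelope estimate that assumes perfectly linear scaling: 80s divided across 36 cores. A small sketch of that arithmetic, plus one way the number of workers could be taken from the machine. `idealSeconds` and `workerCount` are hypothetical names, not from the talk.

```javascript
const os = require('os')

// Ideal (perfectly linear) scaling: time shrinks in proportion to core count.
function idealSeconds(baselineSeconds, cores) {
  return baselineSeconds / cores
}

// One worker per logical CPU reported by the OS (at least one).
const workerCount = Math.max(1, os.cpus().length)

console.log(idealSeconds(80, 36).toFixed(1)) // "2.2"
console.log(`workers on this machine: ${workerCount}`)
```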
  28. None
  29. c4.8xlarge
    • The CPU is impressive to begin with
    • 1 worker: 90s (x1.8)
    • 12 workers: 8s
    • That is the limit (cannot use up all 36 cores)

  30. Bottleneck unknown
    • Is read IO the bottleneck?
    • The trend does not change even on an SSD machine (i3.4xlarge)
    • A limit of processes?
    • I wanted to try threads

  31. Today's summary
    • libxmljs: 7.5x
    • A $2/h machine: 1.8x
    • Parallel programming: 11.3x
    • Migrating to worker threads is easy
    Parallel programming pays off
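The multipliers in this summary appear to follow from the timings on the earlier slides. As a quick check (the `speedup` helper is hypothetical):

```javascript
// Speedup = time before / time after, rounded to one decimal place.
function speedup(beforeSeconds, afterSeconds) {
  return Math.round((beforeSeconds / afterSeconds) * 10) / 10
}

console.log(speedup(1200, 160)) // parse5 -> libxmljs: 7.5
console.log(speedup(160, 90))   // same job, 1 worker on c4.8xlarge: 1.8
console.log(speedup(90, 8))     // 1 worker -> 12 workers on c4.8xlarge: 11.3
console.log(speedup(1200, 8))   // end to end, 20min -> 8s: 150
```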
  32. Let's do parallel programming. Let's look forward to worker threads