Trying to Speed Up Library Code with AssemblyScript

FUJI Goro

July 24, 2019

Transcript

  1. Trying to Speed Up Library Code with AssemblyScript
    Emscripten & WebAssembly night!! #8, in Mercari, Inc.
    2019/07/24 by FUJI Goro (@__gfx__)
  2. About me
    • FUJI Goro / @__gfx__
    • Front-end, Web APIs, and the like
    • Technologies I have been watching lately: GraphQL, TypeScript, and WebAssembly
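  3. WebAssembly as a universal executable binary
    • Opinion: I have high hopes for WebAssembly as a universal binary
    • That is, platform-independent bytecode
    • As an experiment, I built zopfli (a C library) to wasm and published it as @gfx/zopfli
    • It is slower than the native-binding version, but it is easy to install, so it has come to see a fair amount of use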
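  4. WASM in interpreters
    • In the future, interpreters such as Ruby / Perl / Python may come to ship with a WASM execution engine
    • wasmer is an attempt in this direction, and it already has bindings for the major languages
    • I would like to see WASM become a mainstream format for prebuilt binaries
    • A fast wasm runtime is required (I have not yet looked into how wasmer fares)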
  5. Back to the main topic

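  6. Uses of WebAssembly
    • Compiling existing C / C++ / Rust / Go products to JS
      • Reasonably stable and simply useful
    • Speeding up part of a new product with WebAssembly
      • This talk is about this one
      • Isn't the performance of this use case still an unknown?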
  7. I tried it

  8. Language: AssemblyScript
    • The candidates were C / C++ / Rust / Go / AssemblyScript
    • This time I chose AssemblyScript because its runtime looked the smallest
    • (I wanted to try the other languages too, but ran out of time)
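  9. AssemblyScript
    • A completely new programming language that uses a subset of TypeScript as its syntax
    • Writing it feels closer to C/C++ than to TypeScript
    • Or rather, it is a language that keeps only the skin of TypeScript, hollows out the inside, and refills it with C/C++
    • That said, it has more features than C and fewer than C++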
  10. AssemblyScript's optimizer
    • AssemblyScript's backend is Binaryen
    • A proven backend that Emscripten has used for a long time
    • (Emscripten now favors the LLVM backend)
    • Binaryen takes care of basic optimizations such as inlining small functions
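  11. Theme: MessagePack codec
    • WebAssembly can only handle byte sequences (ArrayBuffer) in the first place, so the areas where it shines are probably limited
    • Among the code I had at hand, MessagePack is a binary serializer, so I figured there was room to speed it up with WASM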
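  12. Repository
    • https://github.com/msgpack/msgpack-javascript
    • This talk describes the state as of v1.6.0
    • Installable with npm install @msgpack/msgpack
    • However, the WASM version is currently not used in the web build
    • The WASM-related code may eventually be removed or rewritten in Rust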
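  13. Implementation
    • I implemented one part of the MessagePack decoder, string decoding, in JS, WASM (AS), and native code
    • What it does is convert an array of UTF-8 bytes into an array of UTF-16 code units
    • The AssemblyScript port is about 100 lines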
  14. Code of the JS version (excerpt)
      export function utf8DecodeJs(bytes: Uint8Array, inputOffset: number, byteLength: number): string {
        let offset = inputOffset;
        const end = offset + byteLength;
        // UTF-16 code units collected while scanning the UTF-8 input
        const units: Array<number> = [];
        while (offset < end) {
          const byte1 = bytes[offset++];
          if ((byte1 & 0x80) === 0) {
            // 1 byte
            units.push(byte1);
          }
          // ...
        }
        return String.fromCharCode(...units);
      }
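The excerpt above elides the multi-byte branches. As a rough illustration of what a complete version could look like, here is a minimal sketch. The function name `utf8DecodeSketch`, the multi-byte branches, and the chunked `String.fromCharCode` call are assumptions derived from the UTF-8 encoding rules, not the code shown on the slide, and the sketch does not validate continuation bytes.

```typescript
// Sketch only: not the library's implementation. Assumes mostly well-formed UTF-8.
function utf8DecodeSketch(bytes: Uint8Array, inputOffset: number, byteLength: number): string {
  let offset = inputOffset;
  const end = offset + byteLength;
  const units: number[] = [];
  while (offset < end) {
    const byte1 = bytes[offset++];
    if ((byte1 & 0x80) === 0) {
      // 1 byte: 0xxxxxxx
      units.push(byte1);
    } else if ((byte1 & 0xe0) === 0xc0) {
      // 2 bytes: 110xxxxx 10xxxxxx
      const byte2 = bytes[offset++] & 0x3f;
      units.push(((byte1 & 0x1f) << 6) | byte2);
    } else if ((byte1 & 0xf0) === 0xe0) {
      // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      const byte2 = bytes[offset++] & 0x3f;
      const byte3 = bytes[offset++] & 0x3f;
      units.push(((byte1 & 0x0f) << 12) | (byte2 << 6) | byte3);
    } else if ((byte1 & 0xf8) === 0xf0) {
      // 4 bytes: the code point is above U+FFFF, so emit a surrogate pair
      const byte2 = bytes[offset++] & 0x3f;
      const byte3 = bytes[offset++] & 0x3f;
      const byte4 = bytes[offset++] & 0x3f;
      const codePoint = (((byte1 & 0x07) << 18) | (byte2 << 12) | (byte3 << 6) | byte4) - 0x10000;
      units.push(0xd800 | (codePoint >>> 10), 0xdc00 | (codePoint & 0x3ff));
    } else {
      // Invalid lead byte: emit U+FFFD (a choice made for this sketch)
      units.push(0xfffd);
    }
  }
  // Convert in chunks so that very long inputs do not exceed the argument limit
  let result = "";
  const CHUNK = 0x1000;
  for (let i = 0; i < units.length; i += CHUNK) {
    result += String.fromCharCode(...units.slice(i, i + CHUNK));
  }
  return result;
}
```

Calling `utf8DecodeSketch(bytes, 0, bytes.length)` decodes a whole buffer; the offset and length parameters mirror the signature on the slide so that a string embedded in a larger MessagePack payload can be decoded in place.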
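  15. Code of the native-code version
      // A single shared decoder instance avoids re-creating it on every call
      const sharedTextDecoder = new TextDecoder();

      export function utf8DecodeTD(bytes: Uint8Array, inputOffset: number, byteLength: number): string {
        // subarray() creates a view on the same memory; no bytes are copied
        const stringBytes = bytes.subarray(inputOffset, inputOffset + byteLength);
        return sharedTextDecoder!.decode(stringBytes);
      }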
  16. Code of the AS version (excerpt, AS 0.6)
      export function utf8DecodeToUint16Array(outputPtr: usize, inputPtr: usize, byteLength: usize): usize {
        let inputOffset = inputPtr;
        let outputOffset = outputPtr;
        let inputOffsetEnd = inputOffset + byteLength;
        const u16s = sizeof<u16>();
        while (inputOffset < inputOffsetEnd) {
          let byte1: u16 = load<u8>(inputOffset++);
          if ((byte1 & 0x80) === 0) {
            // 1 byte
            store<u16>(outputOffset, byte1);
            outputOffset += u16s;
          }
          // (multi-byte sequences are not shown in this excerpt)
        }
        // number of UTF-16 code units written
        return (outputOffset - outputPtr) / u16s;
      }
  17. Code of the AS version (JS side)
      // wm = InstantiatedWasmModule.exports
      type pointer = number; // 32-bit integer

      export function utf8DecodeWasm(bytes: Uint8Array, inputOffset: number, byteLength: number): string {
        const inputPtr: pointer = wm.malloc(byteLength);
        // each input byte yields at most one 2-byte UTF-16 code unit
        const outputPtr: pointer = wm.malloc(byteLength * 2);
        try {
          // copy the input into the module's memory
          setMemoryU8(inputPtr, bytes.subarray(inputOffset, inputOffset + byteLength), byteLength);
          const outputArraySize = wm.utf8DecodeToUint16Array(outputPtr, inputPtr, byteLength);
          // view the decoded code units in place
          const units = new Uint16Array(wm.memory.buffer, outputPtr, outputArraySize);
          return String.fromCharCode(...units);
        } finally {
          wm.free(inputPtr);
          wm.free(outputPtr);
        }
      }
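  18. Input to a WASM function
    • There are two kinds of values that can be passed to WASM
      • (1) Any number of integers or floating-point numbers, as arguments of the WASM function call
      • (2) Values written into the WASM module's buffer: ArrayBuffer. Anything that can be written to an ArrayBuffer is OK
    • The offset of where the value was written is passed as an argument of (1)
    • In C terms, this is a pointer
    • In AS, it can be read with the load<T>(offset) function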
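To make the "write into the buffer, then pass the offset" idea concrete, here is a minimal sketch that uses a plain `ArrayBuffer` as a stand-in for the module's memory. The names `linearMemory`, `setMemoryU8Sketch`, and `loadU8` are hypothetical; in particular `setMemoryU8Sketch` is only a guess at what a helper like `setMemoryU8` might do, and a real module would hand out offsets through its own allocator.

```typescript
// A plain ArrayBuffer stands in for wm.memory.buffer (assumption for illustration).
const linearMemory = new ArrayBuffer(64 * 1024);

// Hypothetical helper: copy the input bytes into memory starting at ptr.
function setMemoryU8Sketch(buffer: ArrayBuffer, ptr: number, src: Uint8Array): void {
  new Uint8Array(buffer, ptr, src.length).set(src);
}

// Rough JS-side analogue of AssemblyScript's load<u8>(offset).
function loadU8(buffer: ArrayBuffer, offset: number): number {
  return new DataView(buffer).getUint8(offset);
}

// An arbitrary offset chosen for the sketch; this number is the "pointer"
// that would be passed to the WASM function as an ordinary integer argument.
const sketchInputPtr = 16;
setMemoryU8Sketch(linearMemory, sketchInputPtr, new Uint8Array([0x41, 0x42, 0x43]));
```

After this, `loadU8(linearMemory, sketchInputPtr)` reads back the first byte, which is what the module-side code does with the offset it receives.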
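  19. Output from a WASM function
    • Only a single integer or floating-point number
    • (Although WASM itself allows returning any number of values)
    • This time I pass outputPtr (an offset) as an input value; the WASM function writes its output at the position of outputPtr and returns the size it wrote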
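The output convention described here can be sketched without a real module. In the following, `decodeAsciiToU16` is a hypothetical stand-in for the exported WASM function that handles only 1-byte sequences, and `callWithOutputPtr` plays the role of the JS caller; the buffer layout is an assumption made for the example.

```typescript
// Stand-in for the WASM function: writes u16 values at outputPtr and
// returns how many it wrote. Handles ASCII only (assumption for brevity).
function decodeAsciiToU16(buffer: ArrayBuffer, outputPtr: number, inputPtr: number, byteLength: number): number {
  const view = new DataView(buffer);
  let outputOffset = outputPtr;
  for (let i = 0; i < byteLength; i++) {
    // true = little-endian, matching how store<u16>() is specified
    view.setUint16(outputOffset, view.getUint8(inputPtr + i), true);
    outputOffset += 2;
  }
  return (outputOffset - outputPtr) / 2;
}

// Caller side: reserve space for input and output, call, then read the result.
function callWithOutputPtr(input: Uint8Array): string {
  const buffer = new ArrayBuffer(input.length * 3);
  const outPtr = 0;                // output region: 2 bytes per input byte
  const inPtr = input.length * 2;  // input region follows the output region
  new Uint8Array(buffer, inPtr, input.length).set(input);
  const written = decodeAsciiToU16(buffer, outPtr, inPtr, input.length);
  const view = new DataView(buffer);
  let result = "";
  for (let i = 0; i < written; i++) {
    result += String.fromCharCode(view.getUint16(outPtr + i * 2, true));
  }
  return result;
}
```

The return value carries only the length; the payload itself travels through memory, which is why the caller must allocate the output region before the call.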
  20. Signature of the WASM function
    • Shown again with comments, it looks like this
      export function utf8DecodeToUint16Array(
        outputPtr: usize,  // output offset
        inputPtr: usize,   // input offset
        byteLength: usize, // input length
      ): usize;            // output length
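  21. Pitfalls of AS (0.6)
    • There is no pointer type; everything is usize (uint32_t)
    • load<u16>() / store<u16>() and friends are specified as little-endian, but typed arrays on the JS side (Uint16Array) use the host's endianness, so strictly speaking they are not compatible
    • However, machines these days are little-endian, so it works anyway (this is not AS's problem, though)
    • Because it looks like TypeScript, it is easy to be misled into frequent careless mistakes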
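One way to sidestep the endianness mismatch is to read the output with a `DataView` and an explicit little-endian flag instead of a `Uint16Array` view. The following is a minimal sketch of that idea; `isHostLittleEndian` and `readU16ArrayLE` are hypothetical helper names, not part of msgpack-javascript, and the extra copy has a cost that the slides do not measure.

```typescript
// Detect the host byte order by inspecting how a u16 is laid out in memory.
function isHostLittleEndian(): boolean {
  const probe = new Uint16Array([0x0102]);
  return new Uint8Array(probe.buffer)[0] === 0x02;
}

// Read `length` u16 values starting at `ptr`, always as little-endian,
// regardless of the host byte order.
function readU16ArrayLE(buffer: ArrayBuffer, ptr: number, length: number): Uint16Array {
  const view = new DataView(buffer);
  const out = new Uint16Array(length);
  for (let i = 0; i < length; i++) {
    out[i] = view.getUint16(ptr + i * 2, true);
  }
  return out;
}
```

On a little-endian host a direct `Uint16Array` view gives the same values without copying, so a caller could check `isHostLittleEndian()` once and fall back to `readU16ArrayLE` only when it returns false.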
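  22. Pitfalls of AS (0.7)
    • Note that the current version (0.7) added a reference-counting runtime, which made interoperation with JS harder
    • msgpack-javascript does not support AS 0.7 yet; in fact, I am thinking of dropping AS altogether and moving to Rust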
  23. Benchmark

  24. Environment
    • macOS 10.14
    • NodeJS 12.6.0
    • v8 7.5 (equivalent to Chrome 75)
    • This time I benchmarked on NodeJS only
  25. Benchmark code
    • https://gist.github.com/gfx/e3e33c80848f734a81dbd030fca16230
    • Take the data "A".repeat(N) (N is the data size), encode it as UTF-8, and decode the resulting byte sequence into a string with the JS, WASM, and native-code functions
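The measurement described here can be approximated with a small timing loop. The sketch below is not the code from the gist: `opsPerSec`, `asciiInput`, and `decodeAsciiBaseline` are hypothetical names, the timing method is a simple `Date.now()` loop, and the input is built by filling a `Uint8Array` with 0x41, which equals the UTF-8 encoding of "A".repeat(N).

```typescript
// Run fn repeatedly for roughly durationMs and report operations per second.
function opsPerSec(fn: () => void, durationMs: number): number {
  let count = 0;
  const start = Date.now();
  let elapsed = 0;
  do {
    fn();
    count++;
    elapsed = Date.now() - start;
  } while (elapsed < durationMs);
  return (count * 1000) / Math.max(elapsed, 1);
}

// UTF-8 bytes of "A".repeat(n): every byte is 0x41.
function asciiInput(n: number): Uint8Array {
  return new Uint8Array(n).fill(0x41);
}

// A deliberately simple decoder used only as something to measure.
function decodeAsciiBaseline(bytes: Uint8Array): string {
  let result = "";
  for (let i = 0; i < bytes.length; i++) {
    result += String.fromCharCode(bytes[i]);
  }
  return result;
}

// Illustrative data sizes; compare results only within the same size.
for (const size of [100, 1000, 10000]) {
  const input = asciiInput(size);
  const ops = opsPerSec(() => { decodeAsciiBaseline(input); }, 50);
  console.log(`size=${size} ops/sec=${Math.round(ops)}`);
}
```

To compare implementations, the same loop would be run once per decoder with the same input, swapping `decodeAsciiBaseline` for each function under test.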
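  26. How to read the charts
    • The vertical axis is log10 (ops per sec)
      • The raw values are hard to read, so they are shown on a log scale
      • Larger values mean better performance
    • The horizontal axis is the data size
      • Compare only results with the same data size
      • Comparing results across different data sizes is meaningless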
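Because the vertical axis is logarithmic, a gap of 1 between two bars means a tenfold difference in throughput. A small sketch of the conversion, using the hypothetical helper names `toChartValue`, `fromChartValue`, and `speedRatio`:

```typescript
// Convert raw throughput to the value plotted on the vertical axis.
function toChartValue(opsPerSecond: number): number {
  return Math.log10(opsPerSecond);
}

// Convert a value read off the chart back to raw throughput.
function fromChartValue(chartValue: number): number {
  return Math.pow(10, chartValue);
}

// How many times faster A is than B, given their chart values.
function speedRatio(chartValueA: number, chartValueB: number): number {
  return Math.pow(10, chartValueA - chartValueB);
}
```

For example, chart values of 6 and 5 at the same data size correspond to roughly one million and one hundred thousand operations per second, a ratio of 10.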
  27. Benchmark results
    [Chart: log10 (ops per sec) by data size for utf8DecodeJs, utf8DecodeWasm, and TextDecoder; default, NodeJS/v12.6.0, v8/7.5]
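  28. What the chart tells us
    • When the data size is small, the JS version is the fastest
      • The WASM and native-code versions are fast at the processing itself, but their call overhead is large
    • As the data size grows: native-code version >> WASM version > JS version
    • In the first place, there is not much difference between the JS version and the WASM version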
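  29. No difference between the JS and WASM versions?
    • Admittedly, as the data size grows, the WASM version is consistently faster, if only by a little
    • But the gap is a few tens of percent at most, and considering the development effort, the WASM version is poor value for the cost
    • AssemblyScript being painful is part of it, but maintaining a bridge across languages is technically difficult to begin with
    • In the end, V8's optimizing JIT compiler is so strong that learning how to write fast JS code is the better value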
  30. End of Part 1

  31. Part 2

  32. ~ The v8 --no-opt edition ~

  33. V8's architecture (2017)

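  34. The optimizing JIT compiler TurboFan
    • It is true that V8 is blazingly fast once TurboFan kicks in
    • But between the first load of a web page and the point where it becomes interactive, TurboFan's optimizations may not have kicked in yet
    • In other words, a benchmark that measures TurboFan's performance does not necessarily match your case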
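  35. v8 --no-opt
    • An option that runs v8 with optimization disabled
    • This option is also available in nodejs
    • Benchmarking with this option lets you emulate environments such as a command-line tool that finishes in an instant, or the initialization code of a web page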
  36. Measuring again with nodejs --no-opt

  37. Benchmark results (--no-opt)
    [Chart: log10 (ops per sec) by data size for utf8DecodeJs, utf8DecodeWasm, and TextDecoder; --no-opt, NodeJS v12.6.0, v8 7.5 on macOS]
  38. Benchmark results (default)
    [Chart: log10 (ops per sec) by data size for utf8DecodeJs, utf8DecodeWasm, and TextDecoder; default, NodeJS/v12.6.0, v8/7.5]
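  39. What the charts tell us
    • When optimization does not kick in, WASM version >> JS version even at fairly small data sizes
    • Even when optimization does kick in, the WASM version is slightly faster than the JS version
    • If the bottleneck at startup is a task that WASM is likely to be good at, moving it to WASM seems worth considering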
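  40. Summary: "Still too early"
    • "Speeding up library code with AssemblyScript" is possible, but it is poor value for the cost
    • WASM is fast even when JIT optimization has not kicked in, so it is likely to pay off in some situations
    • WASM itself is not mature yet, so for now my conclusion is "still too early"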
  41. Appendix

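  42. Call overhead
    • Most of the overhead of calling WASM comes from copying the input into the ArrayBuffer and copying the output out of the ArrayBuffer
    • reference-types, currently proposed for WASM, is a specification that lets JS objects be passed directly to WASM through a type called anyref
    • An anyref itself is a value you cannot operate on, but in this case, for example, exposing a function like readU8FromUint8Array(u8array: anyref, offset: usize): u8 to the WASM module should make it possible to reference the input without copying it into the ArrayBuffer (in theory)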
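The JS half of that idea is small enough to sketch. In the following, the accessor takes the `Uint8Array` that the module would hold as an opaque reference; the import namespace `env`, the object name `importsSketch`, and the out-of-range behavior are assumptions for illustration, and wiring this up to a real module depends on reference-types support in the toolchain and the engine.

```typescript
// JS-side accessor the module could call with an opaque reference to the input.
// Returns 0 for out-of-range offsets (a choice made for this sketch).
function readU8FromUint8Array(u8array: Uint8Array, offset: number): number {
  return offset >= 0 && offset < u8array.length ? u8array[offset] : 0;
}

// Hypothetical import object that would be handed to the module at instantiation.
const importsSketch = {
  env: { readU8FromUint8Array },
};
```

With this shape the input never has to be copied into the module's memory; the trade-off is one JS call per byte read, whose cost the slides do not measure.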
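  43. The future of performance
    • WASM is, after all, bytecode, so its execution speed is unlikely to change dramatically from here on
    • At present it runs at about 30%-50% of the speed of native code
    • However, in environments where native code cannot be used freely (browsers), features coming to WASM (SIMD, for example) may dramatically speed up some tasks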