WebDNN: A Fast Execution Framework for Deep Learning Models in Web Browsers

Kiikurage
September 03, 2017


Slides from a talk given at the Deep Learning Acceleration study meeting at DeNA on 2017/09/03.


Transcript

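  1. WebDNN: a fast execution framework for deep learning models in web browsers
    Yuichiro Kikura (@Kiikurage), master's student, Harada-Ushiku Laboratory, Department of Creative Informatics, Graduate School of Information Science and Technology, The University of Tokyo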

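  2. Introduction
    Yuichiro Kikura (木倉 悠一郎), master's student, Harada-Ushiku Laboratory, Department of Creative Informatics, Graduate School of Information Science and Technology, The University of Tokyo
    An introduction to WebDNN, a fast execution framework for deep learning models in web browsers:
    • What WebDNN is
    • How to use WebDNN
    • The speed-up approaches WebDNN takes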
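  3. WebDNN: a fast execution framework for deep learning models in web browsers
    Optimizes trained models for the inference phase and runs them in the web browser.
    Masatoshi Hidaka, Yuichiro Kikura, Yoshitaka Ushiku, Tatsuya Harada (Harada-Ushiku Laboratory, Machine Intelligence Lab (MIL), Graduate School of Information Science and Technology, The University of Tokyo)
    This work was partially supported by JST CREST Grant Number JPMJCR…, Japan. This work was also partially supported by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) as "Seminal Issue on Post-K Computer".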
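  4. WebDNN: a fast execution framework for deep learning models in web browsers
    Optimizes trained models for the inference phase and runs them in the web browser.
    Pipeline: model definition and trained parameters → Graph Transpiler (convert, optimize, generate) → Graph Descriptor and parameters → Descriptor Runner (GPU backend / CPU backend)
    Use cases: rapid publication of research results; demos of applications that use DNNs
    https://mil-tokyo.github.io/webdnn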
  5. A WebDNN use case: MakeGirls.moe [Y. Jin and J. Zhang], http://make.girls.moe
    Generation of 2D anime-style character images with a GAN.

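  6. Deep learning in the web browser: challenges in applying DNNs to real applications
    Server-side computing
    • Scaling out with the number of clients is not easy
    • Privacy-sensitive data is hard to handle
    Client-side computing
    • Setting up the compute environment is difficult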

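  7. Deep learning in the web browser: challenges in applying DNNs to real applications
    Web applications
    • Computation runs on the client, so scaling out is easy
    • No data has to be uploaded to a server, so anonymity is easy to guarantee
    • No setup is required: visiting the website is enough to use the application
    Aixile (@namaniku), author of MakeGirls.moe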
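  8. Machine learning in the web browser
    • MILJS (http://mil-tokyo.github.io/miljs.html): a family of JavaScript-based machine learning libraries developed in our lab
    • Sushi: a matrix computation library that runs in JavaScript [K. Miura] [M. Hidaka]
    • Sukiyaki: a machine learning library that uses Sushi as its compute backend [K. Miura] [M. Hidaka]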
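  9. Features of WebDNN
    • Fast compute backends built on the latest web standards: fast execution on both CPU and GPU via WebGPU, WebGL and WebAssembly
    • Faster computation through computation graph optimization: optimizations specialized for the inference phase reduce the amount of computation and the model parameter size
    • Broad support for existing deep learning frameworks: Keras, Caffe, Chainer and TensorFlow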
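  10. Features of WebDNN: fast compute backends built on the latest web standards
    • Several kinds of compute backend are implemented so that a wide variety of environments can be covered
    • The most suitable backend is selected automatically according to the APIs the browser implements
    • GPU hardware acceleration is available in almost every environment
    Backends: WebGPU Backend, WebGL Backend, WebAssembly Backend, Fallback Backend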
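The automatic selection described on this slide can be pictured as a simple fallback chain. The sketch below is only an illustration in Python, not WebDNN's actual code: the priority order is assumed from the order in which the slide lists the backends, and `select_backend` and the availability sets are hypothetical names.

```python
# Assumed priority: the order in which the slide lists the backends.
BACKEND_PRIORITY = ["webgpu", "webgl", "webassembly", "fallback"]


def select_backend(available, priority=BACKEND_PRIORITY):
    """Return the first backend in `priority` that the environment supports."""
    for name in priority:
        if name in available:
            return name
    raise RuntimeError("no usable backend")


# Hypothetical environments, described by the set of backends they can run.
browser_without_webgpu = {"webgl", "webassembly", "fallback"}
minimal_browser = {"fallback"}

print(select_backend(browser_without_webgpu))  # webgl
print(select_backend(minimal_browser))         # fallback
```

Keeping a pure-JavaScript fallback at the end of the chain is what lets the same page run everywhere, even if slowly.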
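  11. Features of WebDNN: faster computation through computation graph optimization
    • Specializing in the inference phase allows more aggressive optimization of the computation graph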

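  12. Features of WebDNN: broad support for existing deep learning frameworks
    • Trained models from Keras, Caffe, Chainer and TensorFlow can be converted
    • Supporting custom functions (by extending the transpiler) is also easy
    Input accepted per framework: Chainer: chainer.Chain; Keras: keras.Model; Caffe: .proto; TensorFlow: tf.Session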
  13. Benchmark: ResNet [He], time required for inference on images
    • Input size: (1, 224, 224, 3)
    • Compared against: Keras.js (https://github.com/transcranial/keras-js), a library that can run trained Keras models and uses the GPU via WebGL
  14. Benchmark: ResNet, time required for inference on images (PC)
    [Figure: bar charts of time per image (ms/image) for Keras.js and WebDNN in Chrome, Firefox and Safari TP, with CPU and GPU panels. Machine: MacBook Pro with an Intel Core i-series CPU and an Intel Iris Graphics GPU.]
  15. Benchmark: ResNet, time required for inference on images (smartphone)
    [Figure: bar charts of time per image (ms/image) for Keras.js and WebDNN, with CPU and GPU panels. Devices: XPERIA XZ (Android, Snapdragon CPU, Adreno GPU) with Chrome and Firefox; iPhone (iOS, Apple A-series chip) with Safari. Missing bars mean the configuration could not run or is unsupported.]
  16. Usage: converting the pretrained ResNet models bundled with Keras and Chainer
    https://mil-tokyo.github.io/webdnn/docs/tutorial/keras.html
    https://mil-tokyo.github.io/webdnn/docs/tutorial/chainer.html

  17. Usage: converting a trained model (Keras, Python)

        from keras.applications import resnet50
        from webdnn.frontend.keras import KerasConverter
        from webdnn.backend import generate_descriptor

        # (1) keras.Model -> WebDNN IR
        graph = KerasConverter().convert(resnet50.ResNet50())
        # (2) WebDNN IR -> GraphDescriptor for the WebGL backend, saved under 'model'
        generate_descriptor('webgl', graph).save('model')

  18. Usage: converting a trained model (Chainer, Python), step 1

        import numpy as np
        from chainer import Variable, links as L

        model = L.ResNet50Layers()

        # Feed dummy data through the model to build the computation graph
        x = Variable(np.zeros((1, 3, 224, 224), dtype=np.float32))
        y = model(x, layers=['prob'])['prob']

  19. Usage: converting a trained model (Chainer, Python), step 2

        from webdnn.frontend.chainer import ChainerConverter
        from webdnn.backend import generate_descriptor

        # (1) chainer.Chain (through its input and output variables) -> WebDNN IR
        graph = ChainerConverter().convert([x], [y])
        # (2) WebDNN IR -> GraphDescriptor for the WebGL backend, saved under 'model'
        generate_descriptor('webgl', graph).save('model')

  20. Usage: loading the converted model in the browser and running it (JavaScript)

        // Run inside an async function.
        let runner = await WebDNN.load('model');
        let x = runner.getInputViews()[0];
        let y = runner.getOutputViews()[0];

        // A support library for basic pre- and post-processing is included
        x.set(await WebDNN.Image.getImageArray('cat.jpg', { dstW: 224, dstH: 224 }));

        await runner.run(); // run inference
        console.log(y.toActual());

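  21. Speed-up: factors that limit performance in web browser computing
    • JavaScript overhead
    • Shortage of compute cores
    • Shortage of memory bandwidth

  22. Speed-up: factors that limit performance in web browser computing (JavaScript overhead, shortage of compute cores, shortage of memory bandwidth)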

  23. JavaScript overhead
    • JavaScript itself executes slowly
    • APIs that give access to lower layers have been increasing
      • Vanilla JavaScript → WebAssembly
      • WebGL 1 → WebGL 2 → WebGPU
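  24. WebAssembly: a specification that lets the web browser handle assembler-level instructions
    • Removes the overhead of executing JavaScript through an interpreter
    • Enables many optimizations that were not possible in vanilla JavaScript
    • SIMD and multithreading also become possible (likely to ship in Chrome and Firefox from autumn to winter)
    Flow: C / C++ / Rust → compile → .wasm → the browser loads the binary directly and executes it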
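  25. WebGL
    • Web browsers have no API dedicated to GPGPU
    • WebGL 1
      • It is an API for image processing, and compute shaders are not supported
      • With some effort, something resembling GPGPU is possible
      • Various restrictions make the overhead large
    • WebGL 2
      • Adds features that make GPGPU easier
      • The overhead can be reduced considerably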
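  26. WebGPU
    • https://webkit.org/wp-content/uploads/webgpu-api-proposal.html
    • A next-generation API under discussion, led mainly by Apple
    • Supports compute shaders, giving a dramatic performance improvement
    • Implemented in WebKit
    • Usable from Safari on macOS High Sierra and iOS
    Judging from the working group minutes, there is an ongoing discussion along the lines of "maybe compute shaders are not needed after all", so the outlook is somewhat uncertain.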
  27. Speed-up: factors that limit performance in web browser computing (JavaScript overhead, shortage of compute cores, shortage of memory bandwidth)

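  28. Countermeasure for memory bandwidth shortage: Kernel Merging
    • Reduce wasted reads and writes and raise arithmetic density
    • Merge multiple kernels into one
    • The overhead of issuing kernels is reduced as well

    Before merging (two kernels; 3 reads, 2 writes per element):

        float x1 = X1[i];   // Read
        float x2 = X2[i];   // Read
        H[i] = x1 + x2;     // Write

        float h = H[i];     // Read
        Y[i] = h>0?h:0;     // Write

    After merging (one kernel; 2 reads, 1 write per element):

        float x1 = X1[i];   // Read
        float x2 = X2[i];   // Read
        float h = x1 + x2;
        Y[i] = h>0?h:0;     // Write
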
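To make the read/write counts on this slide concrete, here is a small Python sketch that counts element accesses for the separate and the merged version of add followed by ReLU. It is a plain-Python model of the idea, not GPU code; `CountingBuffer`, `traffic` and the two kernel functions are hypothetical names introduced for illustration.

```python
class CountingBuffer:
    """A list wrapper that counts element reads and writes."""

    def __init__(self, data):
        self.data = list(data)
        self.reads = 0
        self.writes = 0

    def __getitem__(self, i):
        self.reads += 1
        return self.data[i]

    def __setitem__(self, i, value):
        self.writes += 1
        self.data[i] = value


def add_then_relu_separate(X1, X2, H, Y):
    # Kernel 1: elementwise add, written to the intermediate buffer H.
    for i in range(len(Y.data)):
        H[i] = X1[i] + X2[i]
    # Kernel 2: ReLU, which has to read H back from memory.
    for i in range(len(Y.data)):
        h = H[i]
        Y[i] = h if h > 0 else 0


def add_then_relu_merged(X1, X2, Y):
    # One kernel: the sum stays in a local variable, H is never materialized.
    for i in range(len(Y.data)):
        h = X1[i] + X2[i]
        Y[i] = h if h > 0 else 0


def traffic(*buffers):
    """Total (reads, writes) over the given buffers."""
    return sum(b.reads for b in buffers), sum(b.writes for b in buffers)


a, b = [1.0, -2.0, 3.0], [0.5, 1.0, -4.0]

X1, X2 = CountingBuffer(a), CountingBuffer(b)
H, Y = CountingBuffer([0.0] * 3), CountingBuffer([0.0] * 3)
add_then_relu_separate(X1, X2, H, Y)
print(Y.data, traffic(X1, X2, H, Y))   # [1.5, 0, 0] (9, 6)

X1, X2, Y = CountingBuffer(a), CountingBuffer(b), CountingBuffer([0.0] * 3)
add_then_relu_merged(X1, X2, Y)
print(Y.data, traffic(X1, X2, Y))      # [1.5, 0, 0] (6, 3)
```

Both versions produce the same output; the merged one simply never touches the intermediate buffer.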
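  29. Graph structure optimization: the graph is transformed according to patterns registered in advance
    Constant Folding pattern: Add(Const1, Const2) → a single constant (Const1 + Const2)
    Legend: constant, variable

  30. Graph structure optimization: example before transformation
    x1 = Add(c1, c2); x3 = Add(x1, x2)

  31. Graph structure optimization: example after Constant Folding
    x3 = Add(x2, c1 + c2)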
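A minimal Python sketch of the constant folding pattern follows. The tuple-based graph representation (`("const", value)`, `("var", name)`, `("add", lhs, rhs)`) and the function name `fold_constants` are assumptions made for illustration; WebDNN's own intermediate representation is not shown in the slides.

```python
def fold_constants(node):
    """Recursively replace Add(Const, Const) with a single Const node."""
    if node[0] != "add":
        return node  # variables and constants are left untouched
    lhs = fold_constants(node[1])
    rhs = fold_constants(node[2])
    if lhs[0] == "const" and rhs[0] == "const":
        # Both inputs are known at conversion time, so compute the sum now.
        return ("const", lhs[1] + rhs[1])
    return ("add", lhs, rhs)


# x3 = x2 + (c1 + c2) with c1 = 2 and c2 = 3
graph = ("add", ("var", "x2"), ("add", ("const", 2), ("const", 3)))
print(fold_constants(graph))
# ('add', ('var', 'x2'), ('const', 5))
```

The folded graph has one Add fewer, and that addition never has to run in the browser.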
  32. Expression Simplification: constant terms are gathered and expressions simplified so that other optimizations become easier
    (x1 + c1) + (x2 + c2)

  33. Expression Simplification: the additions are reordered
    (x1 + c1) + (x2 + c2) = (x1 + x2) + (c1 + c2)

  34. Expression Simplification: Constant Folding is then applied to Add(c1, c2)

  35. Expression Simplification: result
    Add(Add(x1, x2), c1 + c2), where c1 + c2 is a single constant
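The reordering step can be sketched in Python as follows. This is a hedged illustration under simplifying assumptions: the graph contains only additions, nodes are tuples of the form `("const", value)`, `("var", name)` or `("add", lhs, rhs)`, and the names `flatten_add` and `simplify_add` are hypothetical.

```python
def flatten_add(node):
    """Collect the leaves of a tree of Add nodes, left to right."""
    if node[0] == "add":
        return flatten_add(node[1]) + flatten_add(node[2])
    return [node]


def simplify_add(node):
    """Rewrite a sum so that non-constant terms come first and all
    constant terms are gathered into one constant at the end."""
    leaves = flatten_add(node)
    variables = [n for n in leaves if n[0] != "const"]
    constants = [n[1] for n in leaves if n[0] == "const"]

    result = None
    for v in variables:
        result = v if result is None else ("add", result, v)
    if constants:
        folded = ("const", sum(constants))  # constant folding of c1 + c2 + ...
        result = folded if result is None else ("add", result, folded)
    return result


# (x1 + c1) + (x2 + c2) with c1 = 2 and c2 = 3
expr = ("add",
        ("add", ("var", "x1"), ("const", 2)),
        ("add", ("var", "x2"), ("const", 3)))
print(simplify_add(expr))
# ('add', ('add', ('var', 'x1'), ('var', 'x2')), ('const', 5))
```

Three additions at inference time become two, which is only valid because addition is associative and commutative.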

  36. Expression Simplification: a convolution followed by normalization
    x2 = Conv2D(x1, w); x3 = Mul(x2, s); output = Add(x3, b)

  37. Expression Simplification is applied to this graph

  38. Expression Simplification: the multiplication by s is moved onto the weight
    x4 = Mul(w, s); x3 = Conv2D(x1, x4); output = Add(x3, b)

  39. Expression Simplification: Constant Folding is applied to Mul(w, s), since both are constants

  40. Expression Simplification: result
    x3 = Conv2D(x1, w * s); output = Add(x3, b)
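Why the scale can be moved onto the weight: a convolution is linear in its weights, so scaling every output of a filter by s gives the same result as scaling that filter's weights by s. The Python sketch below checks this on a tiny example. It uses a 1-D, single-input-channel convolution for brevity, which is a simplification of the Conv2D on the slides; all function names are hypothetical.

```python
def conv1d(x, w):
    """Valid 1-D convolution (cross-correlation); one output row per filter in w."""
    k = len(w[0])
    return [[sum(f[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]
            for f in w]


def conv_scale_bias(x, w, s, b):
    # Original graph: Conv -> Mul(s) -> Add(b), three operators at inference time.
    y = conv1d(x, w)
    return [[v * s[o] + b[o] for v in row] for o, row in enumerate(y)]


def fold_scale_into_weights(w, s):
    # Done once at conversion time, because w and s are both constants.
    return [[c * s[o] for c in f] for o, f in enumerate(w)]


def conv_bias(x, w_folded, b):
    # Optimized graph: Conv(w * s) -> Add(b).
    y = conv1d(x, w_folded)
    return [[v + b[o] for v in row] for o, row in enumerate(y)]


x = [1.0, 2.0, 3.0, 4.0]
w = [[1.0, -1.0], [0.5, 0.5]]   # two filters of width 2
s = [2.0, 4.0]                  # one scale per output channel
b = [0.5, -1.0]                 # one bias per output channel

before = conv_scale_bias(x, w, s, b)
after = conv_bias(x, fold_scale_into_weights(w, s), b)
print(before)
print(after)
```

With these values both graphs give identical outputs; with arbitrary floating-point data the two may differ by rounding error, so a tolerance-based comparison would be appropriate.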

  41. Kernel Merging: elementwise kernels are merged into one, reducing the overhead of memory writes and kernel dispatch
    Example graph: Conv2D(x1, w1) and Conv2D(x5, w2), each followed by Mul (scales s1, s2) and Add (biases b1, b2); the two branches are summed by Add and passed through Relu to produce x10

  42. Kernel Merging: the elementwise operators in the example graph are highlighted

  43. Kernel Merging: every elementwise operator reads its inputs from memory and writes its output back (the slide counts the Read/Write operations of the unoptimized graph)

  44. Kernel Merging: Expression Simplification is applied first
  45. Kernel Merging: after Expression Simplification the scales are attached to the weights
    Conv2D(x1, w1 * s1) and Conv2D(x5, w2 * s2); the Mul operators are gone

  46. Kernel Merging: the same graph, with the Expression Simplification result marked

  47. Kernel Merging: Constant Folding is applied next; the biases b1 and b2 are brought together

  48. Kernel Merging: after Constant Folding a single bias b1 + b2 remains

  49. Kernel Merging: Elementwise Kernel Merging is applied to the remaining Add, Add and Relu

  50. Kernel Merging: result
    x3 = Conv2D(x1, w1 * s1); x7 = Conv2D(x5, w2 * s2); x10 = MergedElementwise(x3, x7, b1 + b2)
    The number of Read/Write operations drops compared with the original graph.
  51. Kernel Merging: effect of the optimization
    [Figure: ResNet execution time (ms/image) in Safari TP on a MacBook Pro, comparing "WebGPU backend, no optimization" with "WebGPU backend + Kernel Merging".]
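  52. Countermeasure for memory bandwidth shortage: Data Format Optimization
    Optimizes how the variables in the computation graph are laid out in memory.
    • Data Order
      The suitable data order depends on the kind of operator, the shapes of its input and output variables, and so on.
      • One operator may want to handle data in NHWC while another wants NCHW
      • The data order of each input and output variable is decided by also taking the preceding and following operators into account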
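The difference between the two data orders is only in how a 4-D index maps to a flat memory offset. The Python sketch below shows both mappings and a conversion between them; the function names are hypothetical and the shape is always given as (N, H, W, C).

```python
def offset_nhwc(n, h, w, c, shape):
    """Flat offset of element (n, h, w, c) when channels are innermost."""
    N, H, W, C = shape
    return ((n * H + h) * W + w) * C + c


def offset_nchw(n, h, w, c, shape):
    """Flat offset of the same element when the width axis is innermost."""
    N, H, W, C = shape
    return ((n * C + c) * H + h) * W + w


def nhwc_to_nchw(flat, shape):
    """Reorder a flat NHWC buffer into NCHW order."""
    N, H, W, C = shape
    out = [None] * len(flat)
    for n in range(N):
        for h in range(H):
            for w in range(W):
                for c in range(C):
                    out[offset_nchw(n, h, w, c, shape)] = flat[offset_nhwc(n, h, w, c, shape)]
    return out


shape = (1, 2, 2, 3)            # N=1, H=2, W=2, C=3
nhwc = list(range(12))          # element values equal their NHWC offsets
print(nhwc_to_nchw(nhwc, shape))
# [0, 3, 6, 9, 1, 4, 7, 10, 2, 5, 8, 11]
```

Each conversion is a full pass over the buffer, which is why it pays to choose data orders for neighbouring operators together so that as few conversions as possible are needed.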
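  53. Countermeasure for memory bandwidth shortage: Data Format Optimization
    Optimizes how the variables in the computation graph are laid out in memory.
    • Alignment and Padding
      Matrix sizes and offsets are aligned so that SIMD instructions can be used and the work inside a GPU kernel can easily be divided and assigned in tiles (tiling).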

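  54. Countermeasure for memory bandwidth shortage: example, GEMM (C = AB)
    The matrices are padded in advance so that their sizes are exact multiples of the tile size.
    [Figure: matrices A, B and C divided into tiles.]

  55. Countermeasure for memory bandwidth shortage: example, GEMM (C = AB)
    The matrices are padded in advance so that their sizes are exact multiples of the tile size.
    [Figure: matrices A, B and C with the padding region added.]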
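A plain-Python sketch of padding for a tiled GEMM follows. It illustrates the idea only: the tile size, the list-of-lists matrix representation and all function names are assumptions, and a real GPU kernel would assign tiles to thread groups instead of looping over them.

```python
def round_up(n, tile):
    """Smallest multiple of `tile` that is >= n."""
    return -(-n // tile) * tile


def pad_matrix(m, rows, cols):
    """Zero-pad a list-of-lists matrix to rows x cols."""
    padded = [row + [0] * (cols - len(row)) for row in m]
    padded += [[0] * cols for _ in range(rows - len(m))]
    return padded


def tiled_gemm(a, b, tile):
    """C = A @ B computed tile by tile; every dimension must be a multiple of `tile`."""
    M, K, N = len(a), len(b), len(b[0])
    assert M % tile == 0 and K % tile == 0 and N % tile == 0
    c = [[0] * N for _ in range(M)]
    for i0 in range(0, M, tile):
        for j0 in range(0, N, tile):
            for k0 in range(0, K, tile):
                # No boundary checks are needed inside a tile.
                for i in range(i0, i0 + tile):
                    for j in range(j0, j0 + tile):
                        c[i][j] += sum(a[i][k] * b[k][j] for k in range(k0, k0 + tile))
    return c


def padded_gemm(a, b, tile):
    """Pad A and B up to tile multiples, multiply, then crop the result."""
    M, K, N = len(a), len(b), len(b[0])
    Mp, Kp, Np = round_up(M, tile), round_up(K, tile), round_up(N, tile)
    c = tiled_gemm(pad_matrix(a, Mp, Kp), pad_matrix(b, Kp, Np), tile)
    return [row[:N] for row in c[:M]]


a = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
b = [[1, 0],
     [0, 1],
     [1, 1]]             # 3 x 2
print(padded_gemm(a, b, tile=2))
# [[4, 5], [10, 11]]
```

The zero padding does not change the product, and because constant matrices can be padded at conversion time, the inner loops run without any edge-case branches.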

  56. Data Format Optimization: effect of the optimization
    [Figure: ResNet execution time (ms/image) in Safari TP on a MacBook Pro, comparing "WebGPU backend, no optimization", "WebGPU backend + Elementwise Kernel Merging" and "WebGPU backend + Elementwise Kernel Merging + Data Format Optimization".]
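  57. Summary
    WebDNN: a fast execution framework for deep learning models in web browsers
    https://mil-tokyo.github.io/webdnn
    • Fast compute backends built on the latest web standards: fast execution on both CPU and GPU via WebGPU, WebGL and WebAssembly
    • Faster computation through computation graph optimization: optimizations specialized for the inference phase reduce the amount of computation and the model parameter size
    • Broad support for existing deep learning frameworks: Keras, Caffe, Chainer and TensorFlow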