Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
ウェブブラウザ向け深層学習モデル高速実行フレームワーク「WebDNN」
Search
Kiikurage
September 03, 2017
Programming
6
16k
ウェブブラウザ向け深層学習モデル高速実行フレームワーク「WebDNN」
2017/09/03 Deep Learning Acceleration勉強会@DeNAでの発表資料です。
Kiikurage
September 03, 2017
Tweet
Share
Other Decks in Programming
See All in Programming
『毎日の移動』を支えるGoバックエンド内製開発
yutautsugi
2
270
Things You Thought You Didn’t Need To Care About That Have a Big Impact On Your Job
hollycummins
0
250
O Que É e Como Funciona o PHP-FPM?
marcelgsantos
0
110
Writing Better Go: Lessons from 10 Code Reviews
konradreiche
2
5.2k
なぜGoのジェネリクスはこの形なのか? - Featherweight Goが明かす設計の核心
qualiarts
0
220
Range on Rails ―「多重範囲型」という新たな選択肢が、複雑ロジックを劇的にシンプルにしたワケ
rizap_tech
0
6.7k
ALL CODE BASE ARE BELONG TO STUDY
uzulla
26
6.6k
SODA - FACT BOOK(JP)
sodainc
1
8.6k
bootcamp2025_バックエンド研修_WebAPIサーバ作成.pdf
geniee_inc
0
120
overlayPreferenceValue で実現する ピュア SwiftUI な AdMob ネイティブ広告
uhucream
0
200
Go言語の特性を活かした公式MCP SDKの設計
hond0413
1
400
CSC305 Lecture 08
javiergs
PRO
0
270
Featured
See All Featured
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
Code Review Best Practice
trishagee
72
19k
How To Stay Up To Date on Web Technology
chriscoyier
791
250k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
10
880
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.5k
It's Worth the Effort
3n
187
28k
Site-Speed That Sticks
csswizardry
13
920
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
Agile that works and the tools we love
rasmusluckow
331
21k
A Tale of Four Properties
chriscoyier
161
23k
Facilitating Awesome Meetings
lara
56
6.6k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
16k
Transcript
Σϒϒϥβ͚ਂֶशϞσϧߴ࣮ߦϑϨʔϜϫʔΫ WebDNN ౦ژେֶ େֶӃใཧֶܥݚڀՊ ใֶઐ߈ ݪాɾڇٱݚڀࣨ म࢜ ༔Ұ !,JJLVSBHF
༔Ұ:VJDIJSP,JLVSB ౦ژେֶ େֶӃใཧֶܥݚڀՊ ใֶઐ߈ ݪాɾڇٱݚڀࣨ म࢜ Σϒϒϥβ͚ਂֶशϞσϧߴ࣮ߦϑϨʔϜϫʔΫʮ8FC%//ʯͷ͝հ • 8FC%//ͱ
• 8FC%//ͷ͍ํ • 8FC%//͕ߦ͍ͬͯΔߴԽΞϓϩʔν ͡Ίʹ
8FC%// Σϒϒϥβ͚ਂֶशϞσϧߴ࣮ߦϑϨʔϜϫʔΫ ֶशࡁΈϞσϧΛਪϑΣʔζʹ࠷దԽ͠Σϒϒϥβ্Ͱ࣮ߦ 5IJTXPSLXBTQBSUJBMMZTVQQPSUFECZ+45 $3&45 (SBOU/VNCFS+1.+$3 +BQBO5IJTXPSLXBTBMTP QBSUJBMMZTVQQPSUFECZUIF.JOJTUSZPG&EVDBUJPO $VMUVSF 4QPSUT
4DJFODFBOE5FDIOPMPHZ .&95 BT l4FNJOBM*TTVFPO1PTU,$PNQVUFSz .BTBUPTIJ)JEBLB :VJDIJSP,JLVSB :PTIJUBLB6TIJLV 5BUTVZB)BSBEB ౦ژେֶ ใཧֶܥݚڀՊ ݪాɾڇٱݚڀࣨ .BDIJOF*OUFMMJHFODF-BC.*-
8FC%// Σϒϒϥβ͚ਂֶशϞσϧߴ࣮ߦϑϨʔϜϫʔΫ ֶशࡁΈϞσϧΛਪϑΣʔζʹ࠷దԽ͠Σϒϒϥβ্Ͱ࣮ߦ Ϟσϧఆٛ ม ࠷ ద Խ ੜ
(16#BDLFOE ֶशࡁΈ ύϥϝʔλ ύϥϝʔλ (SBQI5SBOTQJMFS ݚڀՌͷਝͳެ։ %//Λ༻͍ͨΞϓϦͷσϞ ར༻ྫ IUUQTNJMUPLZPHJUIVCJPXFCEOO %FTDSJQUPS3VOOFS $16#BDLFOE (SBQI %FTDSJQUPS
8FC%//ͷར༻ྫ .BLF(JSMT.PF <:+JOBOE+;IBOH> IUUQNBLFHJSMTNPF ("/ʹΑΔೋ࣍ݩΩϟϥը૾ੜ
ΣϒϒϥβͰͷਂֶश %//Λ࣮ΞϓϦέʔγϣϯԠ༻͢Δࡍͷ՝ αʔόʔαΠυίϯϐϡʔςΟϯά • ΫϥΠΞϯτʹର͢ΔεέʔϧΞτ͕༰қͰͳ͍ • ϓϥΠόγʔʹ৮͢ΔσʔλΛѻ͍ͮΒ͍ ΫϥΠΞϯταΠυίϯϐϡʔςΟϯά • ܭࢉڥηοτΞοϓ͕͍͠
ΣϒϒϥβͰͷਂֶश %//Λ࣮ΞϓϦέʔγϣϯԠ༻͢Δࡍͷ՝ ΣϒΞϓϦέʔγϣϯ • ΫϥΠΞϯτଆͰܭࢉ͢ΔͨΊεέʔϧΞτ͕༰қ • αʔόʔͷσʔλΞοϓϩʔυ͕ෆཁͳͨΊಗ໊ੑͷ୲อ͕༰қ • ηοτΞοϓෆཁ ΣϒαΠτʹΞΫηε͢Δ͚ͩͰར༻Մೳ
"JYJMF͞Μ !OBNBOJLV .BLF(JSTU.PFͷ࡞ऀ
ΣϒϒϥβͰͷػցֶश • .*-+4 IUUQNJMUPLZPHJUIVCJPNJMKTIUNM ݚڀࣨͰ։ൃ͍ͯ͠Δ+BWB4DSJQUϕʔεػցֶशϥΠϒϥϦ܈ • 4VTIJ +BWB4DSJQUͰಈ࡞͢ΔߦྻԋࢉϥΠϒϥϦ <,.JVSB ><.)JEBLB
> • 4VLJZBLJ 4VTIJΛܭࢉόοΫΤϯυͱ͢ΔػցֶशϥΠϒϥϦ <,.JVSB ><.)JEBLB >
8FC%//ͷಛ Σϒ࠷৽༷Λ༻͍ͨߴͳܭࢉόοΫΤϯυ 8FC(16ɾ8FC(-ɾ8FC"TTFNCMZʹΑΔ$16(16྆ํͰͷߴͳ࣮ߦ ܭࢉάϥϑ࠷దԽʹΑΔܭࢉͷߴԽ ਪϑΣʔζʹಛԽͨ͠࠷దԽʹΑΓܭࢉྔɾϞσϧύϥϝʔλαΠζΛݮ طଘͷਂֶशϑϨʔϜϫʔΫͷൣғͳαϙʔτ ,FSBT
$BGGF $IBJOFS 5FOTPS'MPXʹରԠ
8FC%//ͷಛ Σϒ࠷৽༷Λ༻͍ͨߴͳܭࢉόοΫΤϯυ • ༷ʑͳڥʹରԠՄೳͳछྨͷܭࢉόοΫΤϯυΛ࣮ • ϒϥβʹ࣮͞Ε͍ͯΔ"1*ʹԠͯࣗ͡ಈతʹ࠷దͳόοΫΤϯυΛબ • ΄΅શͯͷڥͰ(16ʹΑΔϋʔυΣΞΞΫηϥϨʔγϣϯ͕ར༻Մೳ 8FC(16
#BDLFOE 8FC(- #BDLFOE 8FC"TTFNCMZ #BDLFOE 'BMMCBDL #BDLFOE
8FC%//ͷಛ ܭࢉάϥϑ࠷దԽʹΑΔܭࢉͷߴԽ • ਪϑΣʔζʹಛԽ͢Δ͜ͱͰɺΑΓੵۃతͳܭࢉάϥϑͷ࠷దԽΛ࣮ࢪ
8FC%//ͷಛ طଘͷਂֶशϑϨʔϜϫʔΫͷൣғͳαϙʔτ • ,FSBT $BGGF $IBJOFS 5FOTPS'MPXͷֶशࡁΈϞσϧ͕มՄೳ • ಠ࣮ࣗͨؔ͠ͷରԠʢτϥϯεύΠϥͷ֦ுʣ༰қ
$IBJOFS chainer.Chain ,FSBT keras.Model $BGGF .proto 5FOTPS'MPX tf.Session
ϕϯνϚʔΫ 3FT/FU<)F > ը૾ຕͷਪʹཁ͢Δ࣌ؒ • ೖྗαΠζ • ൺֱର • ,FSBTKT
IUUQTHJUIVCDPNUSBOTDSBOJBMLFSBTKT LFSBTͷֶशࡁΈϞσϧΛ࣮ߦՄೳͳϥΠϒϥϦɻ8FC(-ʹΑΔ(16ར༻ɻ (1,224,224,3)
ϕϯνϚʔΫ 3FT/FUը૾ຕͷਪʹཁ͢Δ࣌ؒ ,FSBTKT 8FC%//
ॴཁ࣌ؒ <NTຕ> $ISPNF 'JSF'PY 4BGBSJ51 ,FSBTKT 8FC%// ॴཁ࣌ؒ<NTຕ> $ISPNF 'JSF'PY 4BGBSJ51 8FC%// 8FC%// .BD#PPL1SP&BSMZ()[*OUFM$PSFJ$16*OUFM*SJT(SBQIJDT(16 $16 (16 1$
ϕϯνϚʔΫ 3FT/FUը૾ຕͷਪʹཁ͢Δ࣌ؒ
,FSBTKT 8FC%// ॴཁ࣌ؒ <NTຕ> $ISPNF 'JSF'PY 4BGBSJ ,FSBTKT 8FC%// ॴཁ࣌ؒ <NTຕ> $ISPNF 'JSF'PY 4BGBSJ 8FC%// 8FC%// ˞ܽଛՕॴ࣮ߦෆՄೳɾαϙʔτ֎ 91&3*"9;"OESPJEY4OBQESBHPO()[Y$16"ESFOP(16 $ISPNF 'JSF'PY J1IPOFJ04"QQMF" 4BGBSJ εϚʔτϑΥϯ $16 (16
༻ํ๏ ,FSBTɾ$IBJOFSଐͷ3FT/FUֶशࡁΈϞσϧͷม IUUQTNJMUPLZPHJUIVCJPXFCEOOEPDTUVUPSJBMLFSBTIUNM IUUQTNJMUPLZPHJUIVCJPXFCEOOEPDTUVUPSJBMDIBJOFSIUNM
༻ํ๏ ֶशࡁΈϞσϧΛม͢Δ,FSBT from keras.applications import resnet50 from webdnn.frontend.keras import KerasConverter
from webdnn.backend import generate_descriptor graph = KerasConverter().convert(resnet50.ResNet50()) # (1) generate_descriptor('webgl', graph).save('model') # (2) ,FSBT keras.Model (SBQI%FTDSJQUPS GPS8FC(- #BDLFOE 8FC%//*3 1ZUIPO
༻ํ๏ ֶशࡁΈϞσϧΛม͢Δ$IBJOFS import numpy as np from chainer import Variable,
links as L model = L.ResNet50Layers() ܭࢉάϥϑΛ࡞ΔͨΊʹμϛʔσʔλΛྲྀ͢ x = Variable(np.zeros((1, 3, 224, 224), dtype=np.float32)) y = model(x, layers=['prob'])['prob'] $IBJOFS chainer.Chain (SBQI%FTDSJQUPS GPS8FC(- #BDLFOE 8FC%//*3 1ZUIPO
༻ํ๏ ֶशࡁΈϞσϧΛม͢Δ$IBJOFS from webdnn.frontend.chainer import ChainerConverter from webdnn.backend import generate_descriptor
graph = ChainerConverter().convert([x], [y]) # (1) generate_descriptor('webgl', graph).save('model') # (2) $IBJOFS chainer.Chain (SBQI%FTDSJQUPS GPS8FC(- #BDLFOE 8FC%//*3 1ZUIPO
༻ํ๏ ม݁ՌΛϒϥβ͔ΒಡΈࠐΈ࣮ߦ͢Δ let runner = await WebDNN.load('model'); let x =
runner.getInputViews()[0]; let y = runner.getOutputViews()[0]; جຊతͳલɾޙॲཧʹؔ͢ΔαϙʔτϥΠϒϥϦ͕ଐ x.set(await WebDNN.Image.getImageArray('cat.jpg', { dstW: 224, dstH: 224 })); await runner.run(); ࣮ߦ console.log(y.toActual()); JavaScript +BWB4DSJQU
ߴԽ ΣϒϒϥβίϯϐϡʔςΟϯάʹ͓͍ͯͱͳΔཁҼ +BWB4DSJQUͷΦʔόʔϔου ԋࢉίΞෆ ϝϞϦଳҬෆ
ߴԽ ΣϒϒϥβίϯϐϡʔςΟϯάʹ͓͍ͯͱͳΔཁҼ +BWB4DSJQUͷΦʔόʔϔου ԋࢉίΞෆ ϝϞϦଳҬෆ
+BWB4DSJQUͷΦʔόʔϔου • +BWB4DSJQUͦͷͷͷ࣮ߦ͕͍ • ϨΠϠΛ৮ΕΔ"1*͕૿͖͑ͯͨ • 7BOJMMB+BWB4DSJQUˠ8FC"TTFNCMZ • 8FC(-ˠ 8FC(-ˠ
8FC(16
8FC"TTFNCMZ Σϒϒϥβ͔ΒΞηϯϒϥϨϕϧͰͷ໋ྩΛѻ͑Δ༷ • +BWB4DSJQUͷΠϯλϓϦλ࣮ߦʹΑΔΦʔόʔϔου͕ͳͤ͘Δ • 7BOJMMB+BWB4DSJQUͰͰ͖ͳ͔༷ͬͨʑͳ࠷దԽ͕Մೳ • 4*.%ϚϧνεϨουՄೳʹ $ISPNF'JSF'PYͰळʙౙTIJQ͞Εͦ͏ C
C++ Rust .wasm ίϯύΠϧ ϒϥβ͔ΒόΠφϦΛ ಡΈࠐΈ࣮ߦ ϒϥβ
8FC(- • Σϒϒϥβʹ(1(16༻"1*͕ଘࡏ͠ͳ͍ • 8FC(- • ը૾ॲཧ༻"1*Ͱ͋Γ$PNQVUJOH4IBEFSαϙʔτ͞Εͯͳ͍ • ؤுΕ(1(16Ͳ͖Մೳ •
༷ʑͳ੍ͷ͍ͤͰΦʔόʔϔου͕େ͖͍ • 8FC(- • (1(16͘͢͠ͳΔػೳ͕Ճ • Φʔόʔϔου͕͔ͳΓݮΒͤΔ
8FC(16 • IUUQTXFCLJUPSHXQDPOUFOUVQMPBETXFCHQVBQJQSPQPTBMIUNM • "QQMF͕த৺ͱͳͬͯٞதͷ࣍ظ"1* • $PNQVUJOH4IBEFSΛαϙʔτɺඈ༂తͳύϑΥʔϚϯε্ • 8FC,JUʹ࣮ͨ͠ •
NBD04)JHI4JFSSBJ04ͷTBGBSJ͔Β༻Մೳ 8(ͷٞࣄΛݟ͍ͯΔͱ ʮͬͺ$PNQVUJOH4IBEFSෆཁͰʁʯ ͱ͍͏͕ٞͳ͞Ε͓ͯΓɺएׯӢߦ͖͕ո͍͠
ߴԽ ΣϒϒϥβίϯϐϡʔςΟϯάʹ͓͍ͯͱͳΔཁҼ +BWB4DSJQUͷΦʔόʔϔου ԋࢉίΞෆ ϝϞϦଳҬෆ
ϝϞϦଳҬෆͷରࡦ ,FSOFM.FSHJOH • ແବͳಡΈॻ͖ΛݮΒ͠ԋࢉີΛߴΊ • ෳͷΧʔωϧΛ̍ͭʹϚʔδ͢Δ • ൃߦͷΦʔόʔϔουݮΒͤΔ float x1
= X1[i]; float x2 = X2[i]; H[i] = x1 + x2; float h = H[i]; Y[i] = h>0?h:0; float x1 = X1[i]; float x2 = X2[i]; float h = x1 + x2; Y[i] = h>0?h:0; 3FBE 3FBE 3FBE 8SJUF 8SJUF 3FBE 3FBE 8SJUF
άϥϑߏ࠷దԽ ࣄલʹొ͞ΕͨύλʔϯʹԊͬͯάϥϑΛม $POTUBOU'PMEJOH Add Var Const1 Const2 Const1 + Const2
ύλʔϯ ม ఆ ม
άϥϑߏ࠷దԽ ࣄલʹొ͞ΕͨύλʔϯʹԊͬͯάϥϑΛม Add x1 Add x3 x2 c1 c2
άϥϑߏ࠷దԽ ࣄલʹొ͞ΕͨύλʔϯʹԊͬͯάϥϑΛม Add x1 Add x3 x2 c1 c2 $POTUBOU'PMEJOH
Add x3 x2 c1 + c2
&YQSFTTJPO4JNQMJGJDBUJPO ఆ߲Λ·ͱΊɺࣜΛ୯७Խ͠ଞͷ࠷దԽΛ༰қʹ͢Δ x1 c1 c2 Add Add x2 Add (x1
+ c1) + (x2 + c2)
&YQSFTTJPO4JNQMJGJDBUJPO ఆ߲Λ·ͱΊɺࣜΛ୯७Խ͠ଞͷ࠷దԽΛ༰қʹ͢Δ x1 x2 c2 Add Add c1 Add (x1
+ c1) + (x2 + c2) = (x1 + x2) + (c1 + c2)
&YQSFTTJPO4JNQMJGJDBUJPO ఆ߲Λ·ͱΊɺࣜΛ୯७Խ͠ଞͷ࠷దԽΛ༰қʹ͢Δ x1 x2 c2 Add Add c1 Add $POTUBOU'PMEJOH
&YQSFTTJPO4JNQMJGJDBUJPO ఆ߲Λ·ͱΊɺࣜΛ୯७Խ͠ଞͷ࠷దԽΛ༰қʹ͢Δ x1 Add Add c1 + c2 x2
/PSNBMJ[BUJPO &YQSFTTJPO4JNQMJGJDBUJPO ఆ߲Λ·ͱΊɺࣜΛ୯७Խ͠ଞͷ࠷దԽΛ༰қʹ͢Δ x1 w Conv2D x2 s Mul x3
b Add
&YQSFTTJPO4JNQMJGJDBUJPO ఆ߲Λ·ͱΊɺࣜΛ୯७Խ͠ଞͷ࠷దԽΛ༰қʹ͢Δ x1 w Conv2D x2 s Mul x3 b
Add &YQSFTTJPO4JNQMJGJDBUJPO
&YQSFTTJPO4JNQMJGJDBUJPO ఆ߲Λ·ͱΊɺࣜΛ୯७Խ͠ଞͷ࠷దԽΛ༰қʹ͢Δ x1 Conv2D x3 b Add x4 s Mul
w
&YQSFTTJPO4JNQMJGJDBUJPO ఆ߲Λ·ͱΊɺࣜΛ୯७Խ͠ଞͷ࠷దԽΛ༰қʹ͢Δ x1 Conv2D x3 b Add x4 s Mul
w $POTUBOU'PMEJOH
&YQSFTTJPO4JNQMJGJDBUJPO ఆ߲Λ·ͱΊɺࣜΛ୯७Խ͠ଞͷ࠷దԽΛ༰қʹ͢Δ x1 Conv2D x3 b Add w * s
,FSOFM.FSHJOH &MFNFOUXJTFͳΧʔωϧΛ̍ͭʹ·ͱΊɺϝϞϦͷॻ͖ࠐΈ ΧʔωϧͷσΟεύονʹΑΔΦʔόʔϔουΛݮΒ͢ Conv2D x2 Add Mul Add x3 x4
x9 Conv2D x6 Mul Add x7 x8 x10 Relu s1 b1 b2 s2 x1 w1 x5 w2
,FSOFM.FSHJOH &MFNFOUXJTFͳΧʔωϧΛ̍ͭʹ·ͱΊɺϝϞϦͷॻ͖ࠐΈ ΧʔωϧͷσΟεύονʹΑΔΦʔόʔϔουΛݮΒ͢ Conv2D x2 Add Mul Add x3 x4
x9 Conv2D x6 Mul Add x7 x8 x10 Relu s1 b1 b2 s2 x1 w1 x5 w2 &MFNFOUXJTF
&MFNFOUXJTF ,FSOFM.FSHJOH &MFNFOUXJTFͳΧʔωϧΛ̍ͭʹ·ͱΊɺϝϞϦͷॻ͖ࠐΈ ΧʔωϧͷσΟεύονʹΑΔΦʔόʔϔουΛݮΒ͢ Conv2D x2 Add Mul Add x3
x4 x9 Conv2D x6 Mul Add x7 x8 x10 Relu s1 b1 b2 s2 x1 w1 x5 w2 ճ 3FBEɾ8SJUF
&YQSFTTJPO4JNQMJGJDBUJPO ,FSOFM.FSHJOH &MFNFOUXJTFͳΧʔωϧΛ̍ͭʹ·ͱΊɺϝϞϦͷॻ͖ࠐΈ ΧʔωϧͷσΟεύονʹΑΔΦʔόʔϔουΛݮΒ͢ Conv2D x2 Add Mul Add x3
x4 x9 Conv2D x6 Mul Add x7 x8 x10 Relu s1 b1 b2 s2 x1 w1 x5 w2 &YQSFTTJPO4JNQMJGJDBUJPO
,FSOFM.FSHJOH &MFNFOUXJTFͳΧʔωϧΛ̍ͭʹ·ͱΊɺϝϞϦͷॻ͖ࠐΈ ΧʔωϧͷσΟεύονʹΑΔΦʔόʔϔουΛݮΒ͢ Conv2D Add Add x3 x4 x9 Conv2D
Add x7 x8 x10 Relu b1 b2 x1 w1 * s1 x5 w2 * s2
&YQSFTTJPO4JNQMJGJDBUJPO ,FSOFM.FSHJOH &MFNFOUXJTFͳΧʔωϧΛ̍ͭʹ·ͱΊɺϝϞϦͷॻ͖ࠐΈ ΧʔωϧͷσΟεύονʹΑΔΦʔόʔϔουΛݮΒ͢ Conv2D Add Add x3 x4 x9
Conv2D Add x7 x8 x10 Relu b1 b2 x1 w1 * s1 x5 w2 * s2
$POTUBOU'PMEJOH ,FSOFM.FSHJOH &MFNFOUXJTFͳΧʔωϧΛ̍ͭʹ·ͱΊɺϝϞϦͷॻ͖ࠐΈ ΧʔωϧͷσΟεύονʹΑΔΦʔόʔϔουΛݮΒ͢ Conv2D Add Add x3 x4 x9
Conv2D Add x8 x10 Relu b2 x1 w1 * s1 x5 w2 * s2 x7 b1
,FSOFM.FSHJOH &MFNFOUXJTFͳΧʔωϧΛ̍ͭʹ·ͱΊɺϝϞϦͷॻ͖ࠐΈ ΧʔωϧͷσΟεύονʹΑΔΦʔόʔϔουΛݮΒ͢ Conv2D Add Add x3 x4 x9 Conv2D
x10 Relu x1 w1 * s1 x5 w2 * s2 x7 b1 + b2
,FSOFM.FSHJOH &MFNFOUXJTFͳΧʔωϧΛ̍ͭʹ·ͱΊɺϝϞϦͷॻ͖ࠐΈ ΧʔωϧͷσΟεύονʹΑΔΦʔόʔϔουΛݮΒ͢ Conv2D Add Add x3 x4 x9 Conv2D
x10 Relu x1 w1 * s1 x5 w2 * s2 x7 b1 + b2 &MFNFOUXJTF,FSOFM.FSHJOH
,FSOFM.FSHJOH &MFNFOUXJTFͳΧʔωϧΛ̍ͭʹ·ͱΊɺϝϞϦͷॻ͖ࠐΈ ΧʔωϧͷσΟεύονʹΑΔΦʔόʔϔουΛݮΒ͢ Conv2D MergedElementwise x3 Conv2D x10 x1 w1
* s1 x5 w2 * s2 x7 b1 + b2 ճ ˠճ 3FBEɾ8SJUF
,FSOFM.FSHJOH ,FSOFM.FSHJOHʹΑΔ࠷దԽͷޮՌ 4BGBSJ51.BD#PPL1SP&BSMZ 3FT/FU࣮ߦ࣌ؒ <NTJNBHF> WebGPU backend 最適化なし WebGPU
backend ,FSOFM.FSHJOH
ϝϞϦଳҬෆͷରࡦ %BUB'PSNBU0QUJNJ[BUJPO ܭࢉάϥϑதͷมΛɺϝϞϦ্ʹͲͷΑ͏ʹ֨ೲ͢Δ͔Λ࠷దԽ • %BUB0SEFS 0QFSBUPSͷछྨɾೖग़ྗมͷܗঢ়ͳͲʹΑΓదͨ͠σʔλΦʔμʔҟͳ Δ • ͋Δ0QFSBUPS/)8$Ͱɺผͷ0QFSBUPS/$)8ͰσʔλΛѻ͍͍ͨɺ ͱ͍ͬͨ͜ͱ͕͋Δ
• લޙͷ0QFSBUPSՃຯͯ͠ೖग़ྗมͷ%BUB 0SEFSΛܾΊΔ
ϝϞϦଳҬෆͷରࡦ %BUB'PSNBU0QUJNJ[BUJPO ܭࢉάϥϑதͷมΛɺϝϞϦ্ʹͲͷΑ͏ʹ֨ೲ͢Δ͔Λ࠷దԽ • "MJHONFOUBOE1BEEJOH 4*.%໋ྩ(16ΧʔωϧதͰͷॲཧൣғͷׂׂΓͯʢ5JMJOHʣΛ༰қʹ ͢ΔͨΊʹߦྻͷαΠζɾΦϑηοτΛἧ͑Δ
ϝϞϦଳҬෆͷରࡦ ྫ (&..$"# ߦྻͷαΠζ͕λΠϧαΠζͰ ៉ྷʹׂΓΕΔΑ͏ ༧ΊύσΟϯά͓ͯ͘͠ " # $ 5JMF
ϝϞϦଳҬෆͷରࡦ ྫ (&..$"# ߦྻͷαΠζ͕λΠϧαΠζͰ ៉ྷʹׂΓΕΔΑ͏ ༧ΊύσΟϯά͓ͯ͘͠ " 1BEEJOH # $
%BUB'PSNBU 0QUJNJ[BUJPO %BUB'PSNBU 0QUJNJ[BUJPOʹΑΔ࠷దԽͷޮՌ 4BGBSJ51.BD#PPL1SP&BSMZ 3FT/FU࣮ߦ࣌ؒ <NTJNBHF> WebGPU backend 最適化なし
WebGPU backend + Elementwise Kernel Merging WebGPU backend + Elementwise Kernel Merging %BUB'PSNBU 0QUJNJ[BUJPO
·ͱΊ 8FC%//Σϒϒϥβ͚ਂֶशϞσϧߴ࣮ߦϑϨʔϜϫʔΫ IUUQTNJMUPLZPHJUIVCJPXFCEOO Σϒ࠷৽༷Λ༻͍ͨߴͳܭࢉόοΫΤϯυ 8FC(16ɾ8FC(-ɾ8FC"TTFNCMZʹΑΔ$16(16྆ํͰͷߴͳ࣮ߦ ܭࢉάϥϑ࠷దԽʹΑΔܭࢉͷߴԽ ਪϑΣʔζʹಛԽͨ͠࠷దԽʹΑΓܭࢉྔɾϞσϧύϥϝʔλαΠζΛݮ
طଘͷਂֶशϑϨʔϜϫʔΫͷൣғͳαϙʔτ ,FSBT $BGGF $IBJOFS 5FOTPS'MPXʹରԠ