WebDNN: A Fast Deep Learning Model Execution Framework for Web Browsers

Kiikurage
September 03, 2017

Slides presented at the Deep Learning Acceleration study group @ DeNA on 2017/09/03.


Transcript

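  1. WebDNN: A Fast Deep Learning Model Execution Framework for Web Browsers
    Optimizes trained models for the inference phase and runs them in the web browser.
    Masatoshi Hidaka, Yuichiro Kikura, Yoshitaka Ushiku, Tatsuya Harada
    The University of Tokyo, Graduate School of Information Science and Technology, Harada-Ushiku Laboratory, Machine Intelligence Lab (MIL)
    This work was partially supported by JST CREST Grant Number JPMJCR, Japan. This work was also partially supported by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) as "Seminal Issue on Post-K Computer".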
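  2. WebDNN: A Fast Deep Learning Model Execution Framework for Web Browsers
    Optimizes trained models for the inference phase and runs them in the web browser.
    Pipeline: model definition and trained parameters → Graph Transpiler (convert, optimize, generate) → Graph Descriptor and parameters → Descriptor Runner (GPU Backend, CPU Backend)
    Use cases:
    • Rapid publication of research results
    • Demos of applications that use DNNs
    https://mil-tokyo.github.io/webdnn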
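  3. Benchmark: ResNet50 [He et al.], time required for inference on one image
    • Input size: (1, 224, 224, 3)
    • Compared against: Keras.js (https://github.com/transcranial/keras-js), a library that can run trained Keras models and uses the GPU via WebGL.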
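  4. Benchmark: ResNet50, time required for inference on one image (PC)
    [Bar charts: elapsed time [ms/image] for Keras.js and WebDNN on Chrome, Firefox, and Safari TP, one chart for the CPU and one for the GPU. Machine: MacBook Pro (Early model) with an Intel Core CPU and an Intel Iris Graphics GPU. The measured values and exact model numbers are not preserved in the transcript.]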
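  5. Benchmark: ResNet50, time required for inference on one image (smartphone)
    [Bar charts: elapsed time [ms/image] for Keras.js and WebDNN, one chart for the CPU and one for the GPU. Devices: XPERIA XZ (Android, Snapdragon CPU, Adreno GPU) with Chrome and Firefox, and iPhone (iOS, Apple A-series chip) with Safari. Missing bars mean the configuration could not run or is unsupported. The measured values and exact model numbers are not preserved in the transcript.]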
  6. Usage: converting a trained model (Keras)
    Flow: Keras keras.Model → (1) → WebDNN IR → (2) → GraphDescriptor for WebGL Backend
    Python:

        from keras.applications import resnet50
        from webdnn.frontend.keras import KerasConverter
        from webdnn.backend import generate_descriptor

        graph = KerasConverter().convert(resnet50.ResNet50())  # (1)
        generate_descriptor('webgl', graph).save('model')      # (2)
  7. Usage: converting a trained model (Chainer)
    Flow: Chainer chainer.Chain → WebDNN IR → GraphDescriptor for WebGL Backend
    Python:

        import numpy as np
        from chainer import Variable, links as L

        model = L.ResNet50Layers()

        # Feed dummy data through the model to build the computation graph
        x = Variable(np.zeros((1, 3, 224, 224), dtype=np.float32))
        y = model(x, layers=['prob'])['prob']
  8. Usage: converting a trained model (Chainer, continued)
    Flow: Chainer chainer.Chain → (1) → WebDNN IR → (2) → GraphDescriptor for WebGL Backend
    Python:

        from webdnn.frontend.chainer import ChainerConverter
        from webdnn.backend import generate_descriptor

        graph = ChainerConverter().convert([x], [y])        # (1)
        generate_descriptor('webgl', graph).save('model')   # (2)
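The Chainer dummy input is shaped (1, 3, 224, 224), with channels first, whereas the benchmark slide gives the Keras input size as (1, 224, 224, 3), with channels last. When the same image data is fed to models converted from both frameworks, the element order has to match what each model expects. The following is a minimal pure-Python sketch of that reordering. `nhwc_to_nchw` is a hypothetical helper written for illustration and is not part of WebDNN, Keras, or Chainer.

```python
def nhwc_to_nchw(flat, n, h, w, c):
    """Reorder a flat channels-last (NHWC) buffer into channels-first (NCHW).

    Hypothetical illustration helper, not a WebDNN API.
    """
    out = [0.0] * (n * c * h * w)
    for b in range(n):
        for y in range(h):
            for x in range(w):
                for ch in range(c):
                    src = ((b * h + y) * w + x) * c + ch   # NHWC offset
                    dst = ((b * c + ch) * h + y) * w + x   # NCHW offset
                    out[dst] = flat[src]
    return out


# A 2x2 image with 3 channels; each pixel stores its channels contiguously.
pixels = list(range(12))
planes = nhwc_to_nchw(pixels, n=1, h=2, w=2, c=3)
print(planes)
```

After the conversion, each channel occupies one contiguous plane, which is the order the (1, 3, 224, 224) input uses.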
  9. Usage: loading and running the converted model in the browser
    JavaScript:

        let runner = await WebDNN.load('model');
        let x = runner.getInputViews()[0];
        let y = runner.getOutputViews()[0];

        // A support library for basic pre- and post-processing is bundled
        x.set(await WebDNN.Image.getImageArray('cat.jpg', { dstW: 224, dstH: 224 }));

        await runner.run(); // run inference

        console.log(y.toActual());
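  10. WebGL
    • Web browsers have no API intended for GPGPU
    • WebGL 1
      • An API for image processing; compute shaders are not supported
      • With enough effort, something resembling GPGPU is possible
      • Overhead is large because of various restrictions
    • WebGL 2
      • Adds features that make GPGPU easier
      • Overhead can be reduced considerably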
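  11. WebGPU
    • https://webkit.org/wp-content/uploads/webgpu-api-proposal.html
    • A next-generation API under discussion, led by Apple
    • Supports compute shaders, giving a dramatic performance improvement
    • Implemented in WebKit
    • Available in Safari on macOS High Sierra and iOS
    According to the WG meeting minutes, there is an ongoing discussion along the lines of "maybe compute shaders are unnecessary after all?", so the outlook is somewhat uncertain.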
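  12. Countering insufficient memory bandwidth: Kernel Merging
    • Reduces wasteful reads and writes and raises arithmetic intensity
    • Merges multiple kernels into one
    • Also reduces kernel dispatch overhead
    Before merging (two kernels: Read, Read, Write, then Read, Write):

        float x1 = X1[i];
        float x2 = X2[i];
        H[i] = x1 + x2;

        float h = H[i];
        Y[i] = h>0?h:0;

    After merging (one kernel: Read, Read, Write):

        float x1 = X1[i];
        float x2 = X2[i];
        float h = x1 + x2;
        Y[i] = h>0?h:0;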
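The saving from kernel merging can be checked by counting memory accesses. The sketch below is a plain Python simulation, not WebDNN code: `CountingBuffer`, `run_separate`, and `run_merged` are hypothetical names, and a Python list stands in for GPU memory. It runs the add-then-ReLU example both ways and tallies element reads and writes.

```python
class CountingBuffer:
    """List-backed buffer that counts element reads and writes."""

    def __init__(self, data):
        self.data = list(data)
        self.reads = 0
        self.writes = 0

    def __getitem__(self, i):
        self.reads += 1
        return self.data[i]

    def __setitem__(self, i, value):
        self.writes += 1
        self.data[i] = value


def run_separate(x1_values, x2_values):
    """Two kernels: the intermediate H is written to and re-read from memory."""
    n = len(x1_values)
    X1, X2 = CountingBuffer(x1_values), CountingBuffer(x2_values)
    H, Y = CountingBuffer([0.0] * n), CountingBuffer([0.0] * n)
    for i in range(n):  # kernel 1: elementwise add
        H[i] = X1[i] + X2[i]
    for i in range(n):  # kernel 2: ReLU
        h = H[i]
        Y[i] = h if h > 0 else 0.0
    buffers = (X1, X2, H, Y)
    return Y.data, sum(b.reads for b in buffers), sum(b.writes for b in buffers)


def run_merged(x1_values, x2_values):
    """One merged kernel: the intermediate stays in a local variable."""
    n = len(x1_values)
    X1, X2 = CountingBuffer(x1_values), CountingBuffer(x2_values)
    Y = CountingBuffer([0.0] * n)
    for i in range(n):
        h = X1[i] + X2[i]
        Y[i] = h if h > 0 else 0.0
    buffers = (X1, X2, Y)
    return Y.data, sum(b.reads for b in buffers), sum(b.writes for b in buffers)


x1 = [1.0, -2.0, 3.0]
x2 = [0.5, 0.5, -4.0]
print(run_separate(x1, x2))  # 3 reads and 2 writes per element
print(run_merged(x1, x2))    # 2 reads and 1 write per element
```

Both versions produce the same output, but the merged one never touches the intermediate buffer, which matches the Read/Write counts annotated on the slide.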
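  13. Data Format Optimization
    Effect of Data Format Optimization (Safari TP, MacBook Pro Early model)
    [Bar chart: ResNet50 execution time [ms/image] for three configurations. The measured values are not preserved in the transcript.]
    • WebGPU backend, no optimization
    • WebGPU backend + Elementwise Kernel Merging
    • WebGPU backend + Elementwise Kernel Merging + Data Format Optimization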