Speeding Up PyTorch with Tensor Cores
fam_taro
April 08, 2019
Technology
Transcript
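Faster Python Meetup LT session
Speeding Up PyTorch with Tensor Cores
@fam_taro

Agenda
- What are Tensor Cores?
- How to use Tensor Cores in PyTorch
- Trying them for inference (prediction) with M2Det
- When you want to use Tensor Cores for training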
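
What are Tensor Cores?
- Compute cores specialized for matrix operations on 16-bit floating point (FP16).
  - Deep learning frameworks such as PyTorch perform most computation in 32-bit floating point (FP32) by default.
- Included in some NVIDIA GPUs of the Volta generation and later.
  - Even within Turing, only RTX cards have them; the GTX Ti cards do not.
  - Pascal, Maxwell, and Kepler have none (merciless).
- Example GPUs
  - TITAN V
  - GeForce RTX series (including Ti models and TITAN RTX)
  - Tesla V100 (can be tried on GCP)
- Personally, I expect them to become more widespread.
- Reference links
  - https://wikiwiki.jp/nvidiavga/… (GPU specification list)
  - https://ja.wikipedia.org/wiki/NVIDIA_GeForce
  - https://www.nvidia.com/content/apac/gtc/ja/pdf/2017/1040.pdf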
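
What are Tensor Cores? (continued)
- Without Tensor Cores: switching to FP16 gives only a limited speedup.
- With Tensor Cores: switching to FP16 gives a far larger speedup!
- However, there are many constraints on using them...
  - CUDA 9 or later
  - cuDNN 7 or later
  - FP16 must be requested explicitly (through the library's own commands)
  - Conv input and output channels must be multiples of 8
  - Not every operation is supported (in fact, only a subset is)
- This is only the situation at present, so the constraints may be relaxed in the future...
- https://www.slideshare.net/NVIDIAJapan/chainer-tensor-fp16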
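
The channel constraint above can be checked before building a model. The following is a minimal sketch in plain Python; `round_up_channels` and `is_tensor_core_friendly` are hypothetical helper names introduced only for illustration, not part of PyTorch or cuDNN, and the default multiple of 8 is taken from the constraint listed on the slide.

```python
def round_up_channels(channels: int, multiple: int = 8) -> int:
    """Round a channel count up to the nearest multiple (hypothetical helper)."""
    if channels <= 0:
        raise ValueError("channels must be positive")
    return ((channels + multiple - 1) // multiple) * multiple


def is_tensor_core_friendly(in_channels: int, out_channels: int, multiple: int = 8) -> bool:
    """Return True when both Conv channel counts are multiples of `multiple`."""
    return in_channels % multiple == 0 and out_channels % multiple == 0


# An RGB input has 3 channels, so a first Conv layer fed with raw images
# does not satisfy the constraint unless the input is padded to 8 channels.
print(round_up_channels(3))               # 8
print(is_tensor_core_friendly(3, 64))     # False
print(is_tensor_core_friendly(64, 128))   # True
```

A check like this only covers the channel rule; it says nothing about whether the GPU, CUDA, and cuDNN versions meet the other constraints.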
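
How to use Tensor Cores in PyTorch
- If you explicitly request FP16, the framework will often use Tensor Cores on its own.
  - You do need a recent version, though...
- In PyTorch...
  - Call .half() on the model and on the input.
    - "half" means half precision, i.e. FP16.
    - Note that the output is not necessarily FP16.
  - Make the Conv input and output channels multiples of 8.
    - This is a quietly tough constraint.
    - With a pretrained model, there are cases where it cannot be satisfied.
  - There seems to be no way to explicitly force the use of Tensor Cores (?)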
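
The `.half()` step described above might look like the sketch below. The toy network, tensor shapes, and variable names are assumptions made for illustration and are not taken from the talk; the conversion to FP16 is applied only when a CUDA device is present, because half-precision convolution on CPU is not supported by every PyTorch version. Whether Tensor Cores are actually used still depends on the GPU, CUDA, and cuDNN.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy model (an assumption): input and output channels are multiples of 8.
model = nn.Sequential(
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
).to(device).eval()

x = torch.randn(1, 8, 32, 32, device=device)

if device.type == "cuda":
    # Convert both the model parameters and the input to FP16.
    model = model.half()
    x = x.half()

with torch.no_grad():
    y = model(x)

# The output dtype follows the computation, so convert explicitly
# before handing it to code that expects FP32.
out = y.float()
print(y.dtype, out.dtype, tuple(out.shape))
```

Converting the output back with `.float()` keeps downstream post-processing independent of the precision used inside the model.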
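
Trying them for inference (prediction) with M2Det
- What is M2Det?
  - Paper: https://qijiezhao.github.io/imgs/m2det.pdf
  - Implementation: https://github.com/qijiezhao/M2Det
  - A recently published detection model.
  - Reportedly faster and more accurate than YOLOv3.
  - I have not managed to train it on my own data at all, though!
- Experimental conditions
  - COCO dataset (one of the major datasets for detection).
  - Used the pretrained model.
  - Used test.py from the official repository.
  - Measured the time for model(input) and the time for post-processing (NMS).
  - Computed FPS from the measured times (higher is better).
  - Ran inference (prediction) over a set of images and took the average time (the original paper also averaged over a set of images).
  - Model accuracy is reported as mAP.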
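
The measurement procedure above (per-image model time, per-image post-processing time, FPS from the total) can be sketched as follows. This is not the code from test.py: `benchmark` is a hypothetical function, and the `detect` and `postprocess` callables in the usage lines are stand-ins based on `time.sleep`, used here in place of a real model and NMS.

```python
import time
from typing import Callable, Dict, Iterable


def benchmark(detect: Callable, postprocess: Callable, images: Iterable) -> Dict[str, float]:
    """Average detect and post-processing time per image, and the resulting FPS."""
    detect_s = 0.0
    post_s = 0.0
    count = 0
    for image in images:
        t0 = time.perf_counter()
        raw = detect(image)          # corresponds to model(input)
        t1 = time.perf_counter()
        postprocess(raw)             # corresponds to NMS
        t2 = time.perf_counter()
        detect_s += t1 - t0
        post_s += t2 - t1
        count += 1
    if count == 0:
        raise ValueError("no images given")
    detect_ms = 1000.0 * detect_s / count
    nms_ms = 1000.0 * post_s / count
    total_ms = detect_ms + nms_ms
    fps = 1000.0 / total_ms if total_ms > 0 else float("inf")
    return {"detect_ms": detect_ms, "nms_ms": nms_ms, "total_ms": total_ms, "fps": fps}


# Stand-in workloads (assumptions): sleep instead of running a detector.
result = benchmark(lambda img: time.sleep(0.002), lambda raw: time.sleep(0.001), range(5))
print(result)
```

With a real GPU model, the detect call returns before the GPU has finished, so the wall-clock numbers are only meaningful if the device is synchronized before each timestamp.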
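
Trying them for inference (prediction) with M2Det: measurement tools
Commands used to measure execution time and to check whether Tensor Cores are used:
- time.time()
  - Because the GPU is involved, torch.cuda.Event(enable_timing=True) would probably be the better choice, but this time I measured with time.time().
- python -m torch.utils.bottleneck test.py ~~~
  - The profiler provided by PyTorch.
  - Used to check that detection post-processing (NMS in this case) is not the bottleneck.
- nvprof python ~~~
  - The profiler command provided by NVIDIA. It shows whether Tensor Cores are being used.
- nvvp ~~~ (not used)
  - Reportedly displays nvprof results in a richer GUI. However, the application is large (on the order of gigabytes).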
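
The `torch.cuda.Event(enable_timing=True)` approach mentioned above could be used roughly as in this sketch. `time_ms` is a hypothetical helper, and the fallback to `time.perf_counter` on machines without CUDA is an addition for portability, not something the talk describes.

```python
import time
import torch


def time_ms(fn, n_iters: int = 10) -> float:
    """Average time of fn() in milliseconds, using CUDA events when available."""
    if torch.cuda.is_available():
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        torch.cuda.synchronize()      # finish pending GPU work first
        start.record()
        for _ in range(n_iters):
            fn()
        end.record()
        torch.cuda.synchronize()      # wait until the end event has completed
        return start.elapsed_time(end) / n_iters
    # CPU fallback (an assumption): plain wall-clock timing.
    t0 = time.perf_counter()
    for _ in range(n_iters):
        fn()
    return (time.perf_counter() - t0) * 1000.0 / n_iters


device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(256, 256, device=device)
print(time_ms(lambda: a @ a))
```

CUDA events are recorded on the GPU stream, so they measure the kernels themselves instead of the time taken to launch them.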

Trying them for inference (prediction) with M2Det: results
[Table columns: GPU, mAP (accuracy), Detect time per image [ms], NMS time per image [ms], Total [ms], FPS]
[Table rows: original paper (values from the repository), Titan X (Pascal); PyTorch at the officially specified version, Titan V; the same version with FP16, Titan V; a newer PyTorch .post release, Titan V; the same newer release with FP16, Titan V]
- Combining the newer PyTorch release with FP16 is the fastest, a multiple of the speed of the first PyTorch configuration.
- It barely exceeded the FPS of the original paper.
- For inference only, switching to FP16 does not change accuracy.
- Upgrading PyTorch makes it faster, and switching to FP16 makes it faster as well.
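
When you want to use Tensor Cores for training
- Just convert everything to FP16, right? → No!
  - It strongly affects gradient computation (gradients can even vanish).
  - → Accuracy can drop significantly.
- Mixed Precision Training
  - Reference link: https://www.slideshare.net/NVIDIAJapan/chainer-tensor-fp16
  - A training method that mixes FP16 and FP32 computation.
  - Concretely, it needs techniques such as the following:
    - Loss scaling: scale the loss at key points to mitigate vanishing gradients.
    - FP32 weight update: use FP16 for forward and backward, and FP32 for the update.
- Doing all of the above yourself is hard! (Though I think you would learn from it...)
- → With apex it becomes comparatively easy.
  - https://github.com/NVIDIA/apex
  - An Automatic Mixed Precision (AMP) tool for PyTorch provided by NVIDIA.
  - Reportedly enables Mixed Precision Training by adding only a few lines to the existing code.
  - However, you must pay attention to the CUDA and PyTorch versions when installing it.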
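
Why loss scaling and FP32 weight updates are needed can be seen with a small numeric experiment. The sketch below emulates FP16 and FP32 rounding with the standard library `struct` module instead of a GPU; the gradient value, the loss scale of 1024, and the step size are illustrative assumptions, and `to_fp16` / `to_fp32` are hypothetical helper names.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 1) Loss scaling: a tiny gradient underflows to zero in FP16.
true_grad = 1e-8
loss_scale = 1024.0
unscaled = to_fp16(true_grad)                 # stored directly in FP16: becomes 0.0
scaled = to_fp16(true_grad * loss_scale)      # scaled first: survives in FP16
recovered = to_fp32(scaled) / loss_scale      # unscaled again in higher precision

# 2) FP32 weight update: a small step is lost when the weight is kept in FP16.
weight, step = 1.0, 1e-4
fp16_updated = to_fp16(to_fp16(weight) - to_fp16(step))   # rounds back to 1.0
fp32_updated = to_fp32(to_fp32(weight) - to_fp32(step))   # keeps the update

print("unscaled FP16 gradient:", unscaled)
print("recovered gradient:    ", recovered)
print("FP16 weight after step:", fp16_updated)
print("FP32 weight after step:", fp32_updated)
```

Tools such as apex automate both steps, typically with a loss scale that is adjusted during training instead of a fixed constant.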

References
- Thinking about the State of the Art in machine learning hardware: with CPU, GPU, and TPU (Qiita), https://qiita.com/…
- Training Neural Networks with Mixed Precision (NVIDIA, GTC Taiwan, Carilli), http://on-demand.gputechconf.com/gtc-taiwan/…
- Mastering Tensor Cores (fp16) with Chainer, https://www.slideshare.net/NVIDIAJapan/chainer-tensor-fp16
- Speeding up deep learning in Chainer, https://www.nvidia.com/content/apac/gtc/ja/pdf/…
- VOLTA AND TURING: ARCHITECTURE AND PERFORMANCE OPTIMIZATION (NVIDIA), https://www.nvidia.com/content/apac/gtc/ja/pdf/…
- Training with Mixed Precision, Deep Learning SDK Documentation, https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html#pytorch
- GPU specification list (NVIDIA GeForce Wiki), https://wikiwiki.jp/nvidiavga/…

The End
Thank you very much.

Appendix: nvprof results for PyTorch (two slides)