Speeding Up PyTorch with Tensor Cores
fam_taro
April 08, 2019
Transcript
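Faster Python Meetup LT session
Speeding Up PyTorch with Tensor Cores
@fam_taro

Agenda
- What is a Tensor core?
- How to use Tensor cores in PyTorch
- Trying it for inference (prediction) with M2Det
- When you want to use Tensor cores for training
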
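What is a Tensor core?
- A compute core specialized for matrix operations on 16-bit floating point (FP16).
  - Deep learning frameworks such as PyTorch run most computation in 32-bit floating point (FP32) by default.
- Found on some NVIDIA GPUs of the Volta generation and later.
  - On Turing, only RTX cards have them (the GTX … Ti does not).
  - Pascal, Maxwell, and Kepler have none. Merciless.
- Example GPUs: TITAN V, GeForce RTX series (… Ti), TITAN RTX, Tesla V100 (can be tried on GCP).
- Personally, I expect them to become more widespread.
- Reference links:
  - https://wikiwiki.jp/nvidiavga/GPU%E4%BB%95%E6%A7%98%E4%B8%80%E8%A6%A7%E8%A1%A8
  - https://ja.wikipedia.org/wiki/NVIDIA_GeForce
  - https://www.nvidia.com/content/apac/gtc/ja/pdf/2017/1040.pdf
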
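What is a Tensor core?
- Without Tensor cores, switching to FP16 is only about …x faster.
- With Tensor cores, switching to FP16 is …x faster or more!
- However, there are many constraints on using them:
  - CUDA 9 or later
  - cuDNN 7 or later
  - FP16 must be requested explicitly (through the library's own commands)
  - Conv input and output channel counts must be multiples of 8
  - Not every operation is supported (in fact only a subset is)
- These are the constraints as of now, so they may be relaxed in the future.
- Reference: https://www.slideshare.net/NVIDIAJapan/chainer-tensor-fp16
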
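How to use Tensor cores in PyTorch
- If you state explicitly that FP16 should be used, the framework will (in most cases) use Tensor cores on its own.
  - This requires a recent framework version, though.
- In PyTorch:
  - Call ".half()" on both the model and the input.
    - It means "convert to half precision", that is, FP16.
    - Note that the output is not necessarily FP16.
  - Make the Conv input and output channel counts multiples of 8.
    - This is a quietly strict constraint.
    - With pretrained models there are cases where Tensor cores cannot be used effectively.
  - There appears to be no way to request Tensor cores explicitly.
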
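The channel constraint above can be checked before a model is built. The following is a minimal sketch, assuming the required multiple is 8 (the usual NVIDIA guidance for FP16); the helper names and the layer widths are hypothetical and not taken from the slides.

```python
def round_up_to_multiple(channels: int, multiple: int = 8) -> int:
    """Return the smallest multiple of `multiple` that is >= channels."""
    if channels <= 0 or multiple <= 0:
        raise ValueError("channels and multiple must be positive")
    return ((channels + multiple - 1) // multiple) * multiple


def is_tensor_core_friendly(in_ch: int, out_ch: int, multiple: int = 8) -> bool:
    """True when both channel counts satisfy the multiple-of-8 assumption."""
    return in_ch % multiple == 0 and out_ch % multiple == 0


# Hypothetical (in_channels, out_channels) pairs for three conv layers.
layers = [(3, 64), (64, 100), (104, 256)]

for in_ch, out_ch in layers:
    ok = is_tensor_core_friendly(in_ch, out_ch)
    suggestion = (round_up_to_multiple(in_ch), round_up_to_multiple(out_ch))
    print(f"{in_ch:>3} -> {out_ch:<3} friendly={ok} padded={suggestion}")
```

The first layer of an image model typically takes 3 input channels, so it fails this check unless the input is padded; this is one reason pretrained models do not always benefit.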
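Trying it for inference (prediction) with M2Det
- What is M2Det?
  - Paper: https://qijiezhao.github.io/imgs/m2det.pdf
  - Implementation: https://github.com/qijiezhao/M2Det
  - A recently published detection model, reportedly faster and more accurate than YOLOv3.
  - I have not managed to train it on my own data at all!
- Experimental conditions
  - COCO dataset (a major benchmark for detection)
  - Pretrained model (input size … x …)
  - test.py from the official repository
  - Measured the time for model(input) and the time for post-processing (NMS)
    - FPS is computed from the sum of the two (higher is better)
  - Inference is run on … images and the times are averaged
    - The original paper used … images
  - Model accuracy: mAP
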
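The FPS figure described above is the reciprocal of the averaged per-image total time. A minimal sketch of that arithmetic follows; the function name and the timing values are made up for illustration and are not the measurements from the talk.

```python
from statistics import mean


def fps_from_timings(detect_ms, nms_ms):
    """Return (average total ms per image, FPS) from per-image timings."""
    if not detect_ms or len(detect_ms) != len(nms_ms):
        raise ValueError("need two non-empty lists of equal length")
    # Total per image = model forward time + NMS post-processing time.
    total_ms = mean(d + n for d, n in zip(detect_ms, nms_ms))
    return total_ms, 1000.0 / total_ms


# Made-up per-image timings in milliseconds.
detect_ms = [20.0, 22.0, 18.0]
nms_ms = [5.0, 5.0, 5.0]

total_ms, fps = fps_from_timings(detect_ms, nms_ms)
print(f"total {total_ms:.1f} ms per image, {fps:.1f} FPS")
```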
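Trying it for inference (prediction) with M2Det
Commands used to measure run time and to check whether Tensor cores are used
- time.time() function
  - Because a GPU is involved, the following would probably be better, but this time I measured with time.time():
  - torch.cuda.Event(enable_timing=True)
- profiler
  - python -m torch.utils.bottleneck test.py ...
  - The profiler provided by PyTorch
  - Used to check that the detection post-processing (NMS in this case) was not the bottleneck
- nvprof python ...
  - The profiler command provided by NVIDIA; it shows whether Tensor cores are being used
- nvvp ... (not used)
  - Reportedly shows nvprof output in a rich GUI, but the application is large (… GB)
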
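GPU kernels run asynchronously, which is why plain wall-clock timing can under-report unless the device is synchronized before each clock read. The sketch below shows that pattern with an optional `sync` hook; `time_call` and `fake_model` are hypothetical names, and `time.perf_counter` is used here in place of the `time.time()` mentioned on the slide. With PyTorch on a GPU, `torch.cuda.synchronize` could be passed as `sync`.

```python
import time


def time_call(fn, *args, sync=None, repeats=10):
    """Return (last result, average milliseconds per call) for fn(*args).

    `sync` is an optional zero-argument callable that blocks until pending
    device work has finished; it is called before each clock read.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    if sync is not None:
        sync()  # do not count work queued before the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        result = fn(*args)
    if sync is not None:
        sync()  # wait for the timed work to actually finish
    elapsed_ms = (time.perf_counter() - start) * 1000.0 / repeats
    return result, elapsed_ms


def fake_model(x):
    """Stand-in for model(input); sleeps to simulate about 2 ms of work."""
    time.sleep(0.002)
    return x * 2


result, ms = time_call(fake_model, 21, repeats=5)
print(f"result={result}, {ms:.2f} ms per call")
```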
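Trying it for inference (prediction) with M2Det
Results (the numeric values did not survive in the transcript)
Columns: configuration, GPU, mAP (accuracy), detect time per image [ms], NMS time per image [ms], total [ms], FPS
Rows:
- Original paper (figures from the repository): Titan X (Pascal)
- PyTorch … (version specified by the official repository): Titan V
- PyTorch … + FP16: Titan V
- PyTorch ….post…: Titan V
- PyTorch ….post… + FP16: Titan V
Takeaways:
- Upgrading PyTorch and using FP16 together was the fastest, about …x the first measurement.
  - This barely exceeded the FPS of the original paper.
- For inference only, switching to FP16 does not change accuracy.
- Upgrading PyTorch makes it faster, and so does switching to FP16.
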
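When you want to use Tensor cores for training
- Just convert everything to FP16, right? → No!
  - It strongly affects gradient computation, and gradients can vanish.
  - → Accuracy can drop significantly.
- Mixed-Precision Training (reference: https://www.slideshare.net/NVIDIAJapan/chainer-tensor-fp16)
  - A training method that mixes FP16 and FP32 computation.
  - Concretely, it needs techniques such as the following:
    - Loss scaling: scale the loss at key points to mitigate vanishing gradients.
    - FP32 weight update: Forward and Backward run in FP16, and Update uses FP32.
- Doing all of this yourself is a lot of work! (Though you would learn from it...)
- → apex makes it comparatively easy: https://github.com/NVIDIA/apex
  - An Automatic Mixed Precision (AMP) tool for PyTorch provided by NVIDIA.
  - Reportedly enables Mixed-Precision Training by adding only … lines to the original code.
  - However, you need to watch the CUDA and PyTorch versions when installing.
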
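Both techniques can be seen numerically without a GPU. The sketch below uses Python's `struct` module, whose `e` and `f` formats round to IEEE 754 half and single precision, to show a small gradient vanishing in FP16, surviving with loss scaling, and a small weight update being lost unless the update is done in FP32. The gradient value, the loss scale of 1024, and the update size are illustrative assumptions, not values from the talk or from apex.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (FP32) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Loss scaling: a tiny gradient underflows to zero when stored in FP16.
grad = 1e-8                # made-up gradient magnitude
loss_scale = 1024.0        # assumed scale factor
naive = to_fp16(grad)                   # vanishes
scaled = to_fp16(grad * loss_scale)     # survives as an FP16 value
recovered = to_fp32(scaled / loss_scale)  # unscale in FP32 before the update

# FP32 weight update: a small step is rounded away in FP16.
weight = 1.0
step = 1e-4                # made-up learning_rate * gradient
w16 = to_fp16(weight - step)  # rounds back to 1.0, the update is lost
w32 = to_fp32(weight - step)  # the update is kept

print(f"fp16 grad without scaling: {naive}")
print(f"fp16 grad with scaling, unscaled in fp32: {recovered:.3e}")
print(f"fp16 weight after update: {w16}")
print(f"fp32 weight after update: {w32:.6f}")
```

Tools such as apex automate these steps; the sketch only illustrates why they are needed.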
References
- [… edition: Thinking about the state of the art in machine learning hardware, with CPU, GPU, and TPU - Qiita] https://qiita.com/arutema…/items/…
- [Training Neural Networks with Mixed Precision - NVIDIA] http://on-demand.gputechconf.com/gtc-taiwan/…/pdf/…_Internal…Carilli_PDF For Sharing.pdf
- [Making full use of Tensor cores (fp16) in Chainer] https://www.slideshare.net/NVIDIAJapan/chainer-tensor-fp16
- [Speeding up deep learning in Chainer] https://www.nvidia.com/content/apac/gtc/ja/pdf/…/….pdf
- [VOLTA AND TURING: ARCHITECTURE AND PERFORMANCE OPTIMIZATION - NVIDIA] https://www.nvidia.com/content/apac/gtc/ja/pdf/…/….pdf
- [Training with Mixed Precision, Deep Learning SDK Documentation] https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html#pytorch
- [GPU specification list - NVIDIA GeForce Wiki] https://wikiwiki.jp/nvidiavga/GPU%E4%BB%95%E6%A7%98%E4%B8%80%E8%A6%A7%E8%A1%A8

The end. Thank you very much.

Appendix: nvprof results (PyTorch …)
Appendix: nvprof results (PyTorch …)