[With video] A walkthrough of the Transformer paper
数理の弾丸
July 16, 2024
Technology
These are the slides used in the YouTube video below.
https://youtu.be/6tcjwdanedU
Transcript
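Today's contents
• The first page
• Walkthrough of the content (what do I think about when reading a paper?)
• Subsequent developments
The aim is not just to understand this one paper, but also to learn how to read papers and where this paper sits in the field.

Reading the paper that proposed the Transformer
Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017).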

Why is this paper important?
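
Where the Transformer is applied
• Transformer: proposed with text translation as its main target
• BERT: text classification, etc.
• GPT: text generation
• Adoption as an internal architecture of other systems
• Application to images: ViT (image classification, etc.), Diffusion Transformer (image generation)
• Examples: ChatGPT, Llama, Stable Diffusion, Sora, Google Search (※), CLIP
It serves as a foundational technology across an extremely wide range of areas.
※: https://blog.google/products/search/search-language-understanding-bert/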

Angles for reading an AI paper
• What are the competing approaches? (Introduction, Background)
• What is the main proposal? (Model Architecture, Why Self-Attention)
• Evaluation: What / Result (Training, Results)

How a paper is structured
• Abstract: grasp the overview
• Introduction: grasp the claims
• Related work, proposed method, experimental setup / results / discussion, conclusion: read selectively, giving each part a different weight

Sequence modeling
Anything that can be regarded as an ordered succession of elements is called a sequence, and modeling that targets sequences is called sequence modeling.
• Example: translation. Source: "Attention is all you need" → target: 「注意機構さえあれば十分」
• Example: text generation. Source: 「よろしく頼む」 (roughly "I'm counting on you") → target: 「どうぞよろしくお願いします。何かご質問やリクエストがあれば教えてください。」 ("Pleased to meet you. Please let me know if you have any questions or requests.")
• Example: syntactic parsing. Source: 「よろしく頼む」 → target: (S (NP よろしく) (VP 頼む))
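
Key challenge: understanding long-range dependencies
Example: 「彼が書いたその本を、私は一度も読んだことがありません。」 ("I have never once read that book he wrote."), with the object and the subject marked.
A long-range dependency is a dependency between sequence elements that are far apart in position.
Without capturing long-range dependencies, most tasks cannot be solved.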

What are the competing approaches? (Introduction, Background)
• Recurrent neural networks: LSTM [Hochreiter], GRU [Chung], etc.
  The elements of the sequence are fed in one after another, so the computation cannot be parallelized.
• Convolutional neural networks: ByteNet [Kalchbrenner], ConvS2S [Gehring], etc.
  The amount of computation grows markedly with the distance between elements, so long-range dependencies are hard to learn.
The Transformer is proposed as a model that can be parallelized and can also learn long-range dependencies.

What it inherits from prior work
• Encoder-decoder mechanism: input → encoder → decoder → output, a two-stage setup of feature extraction followed by sequence generation.
• Self-attention mechanism: weighting among the elements of one and the same sequence, for example among the tokens of 「私 は 吉田」 ("I am Yoshida").
The Transformer is the first model that keeps the usefulness of the encoder-decoder setup while relying entirely on self-attention.
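
Summary so far
• Competing approaches: recurrent networks, convolutional networks
• Main proposal: an encoder-decoder built entirely on self-attention (easy to parallelize; captures long-range dependencies)
• Still to read: Model Architecture, Why Self-Attention, Training, Results

Reading this far already reveals the focus of the paper: what kind of mechanism can be parallelized and still capture long-range dependencies?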

What is the main proposal? (Model Architecture, Why Self-Attention)

Architecture of the proposed model
[Figure: redrawn from Figure 1 of Vaswani et al. (2017)]
• Encoder (block repeated N times): input text → positional encoding → multi-head self-attention → add & normalize → feed-forward network → add & normalize
• Decoder (block repeated N times): output text → positional encoding → masked multi-head self-attention → add & normalize → multi-head attention over the encoder output → add & normalize → feed-forward network → add & normalize → linear transformation → softmax function → probabilities
• Autoregression: the decoder's output at time $t$ becomes its input at time $t + 1$.

Note on the architecture figure: the processing from the attention layers onward cannot, by itself, recognize the word order of the input text. This is why positional encoding is added to the input.
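
Embedding and positional encoding
• Tokenization: the text is split into tokens, for example 私 / は / 吉田.
• Embedding: each token is mapped to a $d_{\mathrm{model}}$-dimensional vector $e_1, e_2, e_3$.
• Positional encoding: each position gets a vector $p_1, p_2, p_3$ defined by
\[
p_{pos}[2i] = \sin\left(\frac{pos}{10000^{2i/d_{\mathrm{model}}}}\right), \qquad
p_{pos}[2i+1] = \cos\left(\frac{pos}{10000^{2i/d_{\mathrm{model}}}}\right)
\]
• Input to the model: the scaled embedding plus the positional encoding,
\[
x_1 = \sqrt{d_{\mathrm{model}}}\, e_1 + p_1, \qquad
x_2 = \sqrt{d_{\mathrm{model}}}\, e_2 + p_2, \qquad
x_3 = \sqrt{d_{\mathrm{model}}}\, e_3 + p_3
\]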
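
The positional-encoding formulas can be checked with a small sketch. The function names `positional_encoding` and `model_input` are illustrative, not from the slides, and whether positions start at 0 or 1 is left to the caller.

```python
import math


def positional_encoding(pos, d_model):
    """Return the d_model-dimensional sinusoidal encoding p_pos for position pos."""
    p = [0.0] * d_model
    for k in range(d_model):
        i = k // 2  # dimensions 2i and 2i + 1 share one frequency
        angle = pos / (10000 ** (2 * i / d_model))
        p[k] = math.sin(angle) if k % 2 == 0 else math.cos(angle)
    return p


def model_input(e, pos):
    """x = sqrt(d_model) * e + p_pos for one token embedding e at position pos."""
    d_model = len(e)
    p = positional_encoding(pos, d_model)
    scale = math.sqrt(d_model)
    return [scale * e_k + p_k for e_k, p_k in zip(e, p)]


print(positional_encoding(0, 4))
print(model_input([1.0, 0.0, 0.0, 0.0], pos=0))
```

Each pair of dimensions oscillates at a different frequency, so every position receives a distinct vector that is added to the scaled embedding.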

Self-attention
• Transforms a sequence of vectors while taking context into account
• The most important component of the Transformer

Self-attention: the computation
Each input vector $x_i$ ($i = 1, 2, 3$) is projected into a query, a key and a value:
\[
q_i = W_q x_i,\; W_q \in \mathbb{R}^{d_k \times d_{\mathrm{model}}}, \qquad
k_i = W_k x_i,\; W_k \in \mathbb{R}^{d_k \times d_{\mathrm{model}}}, \qquad
v_i = W_v x_i,\; W_v \in \mathbb{R}^{d_v \times d_{\mathrm{model}}}
\]
These are stacked as rows:
\[
Q = \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix}, \qquad
K = \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix}, \qquad
V = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}
\]
What we want to compute is
\[
h_i = \bigcirc\, v_1 + \bigcirc\, v_2 + \bigcirc\, v_3 \qquad (i = 1, 2, 3)
\]
where the values in the circles decide how strongly the surrounding context is mixed in. They are obtained from $Q$ and $K$:
\[
QK^\top =
\begin{bmatrix}
q_1 \cdot k_1 & q_1 \cdot k_2 & q_1 \cdot k_3 \\
q_2 \cdot k_1 & q_2 \cdot k_2 & q_2 \cdot k_3 \\
q_3 \cdot k_1 & q_3 \cdot k_2 & q_3 \cdot k_3
\end{bmatrix}
\]
For the tokens 私 / は / 吉田, the third row reads: $q_3 \cdot k_1$ is the importance of 私 as seen from 吉田, $q_3 \cdot k_2$ that of は as seen from 吉田, and $q_3 \cdot k_3$ that of 吉田 as seen from 吉田.
Scaling by $\sqrt{d_k}$ and applying the softmax function to each row gives
\[
\mathrm{softmax}\left(\frac{QK^\top}{\sqrt{d_k}}\right) =
\begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{bmatrix}
\]
Supplement: this matrix is called the attention matrix, and each $a_{ij}$ is called an attention weight.
The final output of self-attention is
\[
h_1 = a_{11} v_1 + a_{12} v_2 + a_{13} v_3, \qquad
h_2 = a_{21} v_1 + a_{22} v_2 + a_{23} v_3, \qquad
h_3 = a_{31} v_1 + a_{32} v_2 + a_{33} v_3
\]
Self-attention = a mechanism that takes the surrounding context into account.
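
The whole computation fits in a short pure-Python sketch. It is a minimal illustration, not an efficient implementation: the helper names (`softmax`, `matvec`, `dot`, `self_attention`) and the toy weight matrices are assumptions made for the example.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    top = max(row)
    exps = [math.exp(s - top) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Matrix-vector product W x, with W given as a list of rows."""
    return [sum(w * x_j for w, x_j in zip(w_row, x)) for w_row in W]


def dot(a, b):
    return sum(a_k * b_k for a_k, b_k in zip(a, b))


def self_attention(xs, W_q, W_k, W_v):
    """Return (attention matrix, outputs h_i) for the input vectors xs."""
    qs = [matvec(W_q, x) for x in xs]
    ks = [matvec(W_k, x) for x in xs]
    vs = [matvec(W_v, x) for x in xs]
    d_k = len(ks[0])
    # a_ij = softmax over j of (q_i . k_j) / sqrt(d_k)
    attn = [softmax([dot(q, k) / math.sqrt(d_k) for k in ks]) for q in qs]
    # h_i = sum_j a_ij v_j
    hs = [[sum(a * v[c] for a, v in zip(a_row, vs)) for c in range(len(vs[0]))]
          for a_row in attn]
    return attn, hs


identity = [[1.0, 0.0], [0.0, 1.0]]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attn, hs = self_attention(xs, identity, identity, identity)
print(attn)
print(hs)
```

Every row of the attention matrix sums to 1, so each $h_i$ is a weighted average of the value vectors.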

Masked self-attention
Start from the attention matrix
\[
\mathrm{softmax}\left(\frac{QK^\top}{\sqrt{d_k}}\right) =
\begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{bmatrix}
\]
and take its element-wise product with the mask matrix:
\[
\begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{bmatrix}
\odot
\begin{bmatrix}
1 & 0 & 0 \\
1 & 1 & 0 \\
1 & 1 & 1
\end{bmatrix}
=
\begin{bmatrix}
a_{11} & 0 & 0 \\
a_{21} & a_{22} & 0 \\
a_{31} & a_{32} & a_{33}
\end{bmatrix}
\]
The final output of masked self-attention is then
\[
h_1 = a_{11} v_1 + 0\, v_2 + 0\, v_3, \qquad
h_2 = a_{21} v_1 + a_{22} v_2 + 0\, v_3, \qquad
h_3 = a_{31} v_1 + a_{32} v_2 + a_{33} v_3
\]
so each position mixes in only itself and the positions before it.
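
A small sketch of the masking step, with hypothetical helper names. `mask_after_softmax` follows the slides literally (element-wise product with the mask). `masked_softmax` shows the variant described in the original paper, where masked scores are set to negative infinity before the softmax so that each row still sums to 1.

```python
import math


def causal_mask(n):
    """Lower-triangular mask: position i may look at positions j <= i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def mask_after_softmax(attn):
    """Element-wise product of an attention matrix with the causal mask."""
    mask = causal_mask(len(attn))
    return [[a * m for a, m in zip(a_row, m_row)]
            for a_row, m_row in zip(attn, mask)]


def masked_softmax(scores):
    """Mask a square score matrix with -inf first, then apply softmax per row."""
    mask = causal_mask(len(scores))
    out = []
    for s_row, m_row in zip(scores, mask):
        masked = [s if m else float("-inf") for s, m in zip(s_row, m_row)]
        top = max(masked)  # the diagonal entry is never masked, so top is finite
        exps = [math.exp(s - top) for s in masked]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


uniform_scores = [[0.0] * 3 for _ in range(3)]
print(causal_mask(3))
print(masked_softmax(uniform_scores))
```

With the second variant the weights of the visible positions are renormalized, whereas multiplying after the softmax leaves rows that sum to less than 1.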
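
Recap of self-attention
What we wanted to compute: $h_i = \bigcirc\, v_1 + \bigcirc\, v_2 + \bigcirc\, v_3$ for $i = 1, 2, 3$.
The weights come from the query-key dot products:
\[
QK^\top =
\begin{bmatrix}
q_1 \cdot k_1 & q_1 \cdot k_2 & q_1 \cdot k_3 \\
q_2 \cdot k_1 & q_2 \cdot k_2 & q_2 \cdot k_3 \\
q_3 \cdot k_1 & q_3 \cdot k_2 & q_3 \cdot k_3
\end{bmatrix},
\qquad
\mathrm{softmax}\left(\frac{QK^\top}{\sqrt{d_k}}\right) =
\begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{bmatrix}
\]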
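
Multi-head self-attention (excerpt from Fig. 2 of the paper)
• Split $Q$, $K$, $V$ into $h$ parts and process them in parallel
• Concatenate the outputs into one
• The paper experiments with $h = 1, 4, 8, 16, 32$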
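
The split-and-concatenate idea can be sketched as follows. This is a simplification: it slices the feature dimension into $h$ equal chunks and omits the learned per-head projections and the final output projection that the paper uses. All function names are illustrative.

```python
import math


def softmax(row):
    top = max(row)
    exps = [math.exp(s - top) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention; Q, K, V are lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        w = softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K])
        out.append([sum(w_j * v[c] for w_j, v in zip(w, V)) for c in range(len(V[0]))])
    return out


def split_heads(X, h):
    """Split each row of X into h equal chunks; returns one matrix per head."""
    d = len(X[0])
    assert d % h == 0, "the feature dimension must be divisible by h"
    size = d // h
    return [[row[i * size:(i + 1) * size] for row in X] for i in range(h)]


def multi_head_attention(Q, K, V, h):
    """Run attention independently per head, then concatenate along features."""
    heads = [attention(q, k, v)
             for q, k, v in zip(split_heads(Q, h), split_heads(K, h), split_heads(V, h))]
    return [[value for head in heads for value in head[t]] for t in range(len(Q))]


Q = K = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
V = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
print(multi_head_attention(Q, K, V, h=2))
```

The heads do not depend on one another, which is what allows them to be computed in parallel.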

Position-wise Feed-Forward Networks
Self-attention turns the tokens 私 / は / 吉田 into contextualized embeddings $h_1, h_2, h_3$. The same feed-forward network is then applied to each vector separately:
\[
\mathrm{ReLU}(h_1 W_1 + b_1) W_2 + b_2, \qquad
\mathrm{ReLU}(h_2 W_1 + b_1) W_2 + b_2, \qquad
\mathrm{ReLU}(h_3 W_1 + b_1) W_2 + b_2
\]
The role of the FFN in the Transformer has been the subject of various discussions since then:
• Geva, Mor, et al. "Transformer feed-forward layers are key-value memories." arXiv preprint.
• Zhang, Zhengyan, et al. "MoEfication: Transformer feed-forward layers are mixtures of experts." arXiv preprint.
• Geva, Mor, et al. "Transformer feed-forward layers build predictions by promoting concepts in the vocabulary space." arXiv preprint.
• etc.
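
A minimal sketch of the position-wise feed-forward step, using the row-vector convention $h W_1$ from the formula above. The helper names and the tiny identity weights are assumptions chosen so the result is easy to verify by hand.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def vecmat(h, W):
    """Row vector times matrix: h has len(W) entries, result has len(W[0])."""
    return [sum(h[r] * W[r][c] for r in range(len(h))) for c in range(len(W[0]))]


def add(a, b):
    return [a_k + b_k for a_k, b_k in zip(a, b)]


def ffn(h, W1, b1, W2, b2):
    """ReLU(h W1 + b1) W2 + b2 for a single position."""
    hidden = relu(add(vecmat(h, W1), b1))
    return add(vecmat(hidden, W2), b2)


def position_wise_ffn(hs, W1, b1, W2, b2):
    """Apply the same FFN, with the same weights, to every position."""
    return [ffn(h, W1, b1, W2, b2) for h in hs]


identity = [[1.0, 0.0], [0.0, 1.0]]
hs = [[1.0, -2.0], [-1.0, 3.0]]
print(position_wise_ffn(hs, identity, [0.0, 0.0], identity, [0.5, 0.5]))
```

Unlike self-attention, this step never mixes information between positions; each vector is transformed on its own.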

Summary so far
• Competing approaches: recurrent networks, convolutional networks
• Main proposal: an encoder-decoder built entirely on self-attention (easy to parallelize; captures long-range dependencies), covering the mechanisms of the Transformer and positional encoding
• Still to read: Training, Results
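
Evaluation: What / Result (Training, Results)

What
• Task: machine translation
• Dataset: WMT 2014, a training and evaluation dataset for news translation
  • en-de: 4.5 million sentence pairs
  • en-fr: 36 million sentence pairs
• Evaluation metrics: BLEU (translation quality), FLOPs (amount of computation)

Result
Performance on par with other models is achieved at a lower training cost.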

Summary
• Competing approaches: recurrent networks, convolutional networks
• Main proposal: an encoder-decoder built entirely on self-attention (easy to parallelize; captures long-range dependencies), covering the mechanisms of the Transformer and positional encoding
• Evaluation: What: machine translation. Result: SOTA; performance on par with other models at a lower training cost.
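
Subsequent developments

The birth of derived models
The encoder and the decoder play different roles:
• Encoder: contextualization of the input sequence. Used as a feature extractor (BERT, ViT, etc.).
• Decoder: autoregressive sequence generation. Used as a generative model (GPT, Llama, etc.).
New models based on each of the two have emerged.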

Changes in the learning paradigm (from the past to the present)
• Training from scratch: a model specialized for a particular task is trained from randomly initialized parameters. Examples: recurrent networks, convolutional networks, Transformer.
• Fine-tuning: a pretrained model is fine-tuned for each individual task. Examples: BERT, GPT, ResNet.
• In-context learning: the model itself is not adjusted; it handles a variety of tasks by following instructions. Examples: GPT, Llama, PaLM.
Fine-tuning with BERT and in-context learning with GPT were especially epoch-making.
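
What knowledge is represented in self-attention?
Research that investigates what kind of knowledge a model holds is called probing.
• Dependency structure can largely be extracted from the embeddings that BERT produces (black: the gold structure; blue: the dependency structure extracted from BERT) [figure from Hewitt].
• Models may acquire knowledge that they were never explicitly trained on [figure from Clark].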
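
Success beyond text
• Vision Transformer [figure from Dosovitskiy]
• CLIP [figure from Radford]
Follow-up studies exploring the potential of self-attention have appeared in large numbers.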