[Video included] Transformer Paper Explained
数理の弾丸
July 16, 2024
These are the slides used in the YouTube video below.
https://youtu.be/6tcjwdanedU
Transcript
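Today's contents
• The first page
• Walkthrough of the content
• What do I think about when reading a paper?
• Subsequent developments
Goal: not only to understand this one paper, but also to learn how to read papers and where this one sits in its field.
Reading the paper that proposed the Transformer: Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems.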
Why is this paper important?
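Range of Transformer applications
The Transformer was proposed with text translation as its main target.
• Text: BERT (text classification, etc.), GPT (text generation)
• Images: ViT (image classification, etc.), Diffusion Transformer (image generation)
• Adopted as an internal architecture: ChatGPT, Llama, Stable Diffusion, Sora, Google Search (※), CLIP, etc.
It now serves as a foundational technology across an extremely wide range of applications.
※: https://blog.google/products/search/search-language-understanding-bert/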
Angles for reading an AI paper
• What are the competing approaches? (Introduction, Background)
• What is the main proposal? (Model Architecture, Why Self-Attention)
• What was evaluated, and what was the result? (Training, Results)
Structure of a paper
• Abstract: grasp the overview.
• Introduction: grasp the claims.
• Related work, proposed method, experimental setup / results / discussion, conclusion: read with weighting, spending effort where it matters.
Sequence modeling
Anything that can be viewed as a chain of ordered elements is called a sequence, and modeling that targets sequences is called sequence modeling. Each task maps a source sequence to a target sequence.
• Example (translation): source "注意機構さえあれば十分" → target "Attention is all you need"
• Example (text generation): source "どうぞよろしくお願いします。" ("Pleased to meet you.") → target "何かご質問やリクエストがあれば教えてください。" ("Please let me know if you have any questions or requests.")
• Example (syntactic parsing): source "よろしく頼む" ("I'm counting on you") → target parse tree (S (NP よろしく) (VP 頼む))
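Key challenge: understanding long-range dependencies
Example: "彼が書いたその本を、私は一度も読んだことがありません。" ("I have never once read that book he wrote."), with the object and the subject marked on the slide.
A long-range dependency is a dependency between sequence elements that are far apart in position. A model that cannot capture long-range dependencies cannot solve most tasks.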
What are the competing approaches? (Introduction, Background)
• Recurrent neural networks: LSTM [Hochreiter], GRU [Chung], etc. Elements of the sequence are fed in one after another, so the computation cannot be parallelized.
• Convolutional neural networks: ByteNet [Kalchbrenner], ConvS2S [Gehring], etc. The amount of computation grows markedly with the distance between elements, so long-range dependencies are hard to learn.
The paper proposes the Transformer as a model that can be parallelized and can also learn long-range dependencies.
What is inherited from existing research
• Encoder-decoder mechanism: a two-stage design in which the encoder extracts features from the input and the decoder generates the output sequence.
• Self-attention mechanism: weighting between elements of the same sequence, for example between the tokens "私", "は", "吉田".
The Transformer is the first model that keeps the usefulness of the encoder-decoder design while being built entirely on self-attention.
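Summary so far
• What are the competing approaches? Recurrent networks and convolutional networks.
• What is the main proposal? An encoder-decoder built entirely on self-attention, which is easy to parallelize and captures long-range dependencies. (Model Architecture, Why Self-Attention)
• What was evaluated, and what was the result? (Training, Results)
Reading this far is enough to see the paper's main point. So what is this mechanism that can be parallelized and still captures long-range dependencies?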
What is the main proposal? (Model Architecture, Why Self-Attention)
Architecture of the proposed model
[Figure: architecture diagram, redrawn from the architecture figure of Vaswani et al.]
• Encoder (stacked N×): input text → positional encoding → multi-head self-attention → add & normalize → feed-forward network → add & normalize.
• Decoder (stacked N×): output text → positional encoding → masked multi-head self-attention → add & normalize → multi-head attention (over the encoder output) → add & normalize → feed-forward network → add & normalize → linear transformation → softmax function → probabilities.
• The decoder is autoregressive: the output at time $t$ becomes the input at time $t + 1$.
• Positional encoding is required because the processing that follows it cannot recognize the word order of the input text.
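The autoregressive behaviour of the decoder (the output at time t becomes the input at time t + 1) can be sketched as a simple loop. This is only an illustration: `greedy_decode` and `toy_next_token` are hypothetical names, and `toy_next_token` is a stand-in that replays a fixed sentence rather than a trained decoder.

```python
from typing import Callable, List


def greedy_decode(next_token: Callable[[List[str]], str],
                  bos: str = "<bos>", eos: str = "<eos>",
                  max_len: int = 10) -> List[str]:
    """Generate tokens one at a time, feeding each output back in as input."""
    generated = [bos]
    for _ in range(max_len):
        token = next_token(generated)  # output at step t ...
        generated.append(token)        # ... is part of the input at step t + 1
        if token == eos:
            break
    return generated[1:]  # drop the start symbol


def toy_next_token(prefix: List[str]) -> str:
    """Hypothetical stand-in for a decoder: replays a canned sentence."""
    canned = ["Attention", "is", "all", "you", "need", "<eos>"]
    return canned[len(prefix) - 1]


result = greedy_decode(toy_next_token)
print(result)
```

Because each step depends on the tokens produced so far, generation is inherently sequential even though training can process all positions in parallel.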
Embedding and positional encoding
Text → tokenization → "私", "は", "吉田", ... → embeddings $e_1, e_2, e_3, \dots$, each a $d_{\text{model}}$-dimensional vector.
Positional encodings $p_1, p_2, p_3, \dots$ are defined as
$$p_{pos}[2i] = \sin\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right), \qquad p_{pos}[2i+1] = \cos\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)$$
The model inputs are the scaled embeddings plus the positional encodings (with $d = d_{\text{model}}$):
$$x_1 = \sqrt{d}\,e_1 + p_1, \qquad x_2 = \sqrt{d}\,e_2 + p_2, \qquad x_3 = \sqrt{d}\,e_3 + p_3$$
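As a minimal sketch of the formulas above, the following computes the sinusoidal positional encoding and the model inputs $x_i = \sqrt{d_{\text{model}}}\,e_i + p_i$ in plain Python. The function names are hypothetical, positions are assumed to start at 0, and the embeddings are made-up numbers, not learned values.

```python
import math
from typing import List


def positional_encoding(pos: int, d_model: int) -> List[float]:
    """Sinusoidal encoding p_pos: sin on even indices, cos on odd indices."""
    p = [0.0] * d_model
    for even in range(0, d_model, 2):  # `even` plays the role of 2i in the formula
        angle = pos / (10000 ** (even / d_model))
        p[even] = math.sin(angle)
        if even + 1 < d_model:
            p[even + 1] = math.cos(angle)
    return p


def encode_inputs(embeddings: List[List[float]]) -> List[List[float]]:
    """x_i = sqrt(d_model) * e_i + p_i for every position i (0-based here)."""
    d_model = len(embeddings[0])
    scale = math.sqrt(d_model)
    return [
        [scale * e + p for e, p in zip(e_i, positional_encoding(pos, d_model))]
        for pos, e_i in enumerate(embeddings)
    ]


# Made-up 4-dimensional embeddings for three tokens.
toy_embeddings = [[0.1, 0.2, 0.3, 0.4],
                  [0.5, 0.6, 0.7, 0.8],
                  [0.9, 1.0, 1.1, 1.2]]
x = encode_inputs(toy_embeddings)
print(x[0])
```

Two identical tokens at different positions receive different inputs, which is how the later layers can tell word order apart.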
Self-attention
• Transforms a sequence of vectors while taking context into account.
• The most important part of the Transformer.
Self-attention: computation
Each input vector $x_i$ is projected into a query, a key, and a value:
$$q_i = W_q x_i,\ W_q \in \mathbb{R}^{d_k \times d_{\text{model}}}, \qquad k_i = W_k x_i,\ W_k \in \mathbb{R}^{d_k \times d_{\text{model}}}, \qquad v_i = W_v x_i,\ W_v \in \mathbb{R}^{d_v \times d_{\text{model}}}$$
Stacking these vectors as rows gives the matrices $Q$, $K$, and $V$.
What we want to compute is, for each position, a mixture of the value vectors:
$$h_i = \bigcirc\, v_1 + \bigcirc\, v_2 + \bigcirc\, v_3, \qquad i = 1, 2, 3$$
The values in the circles decide how strongly the surrounding context is mixed in, and they are obtained from $Q$ and $K$:
$$QK^\top = \begin{bmatrix} q_1 \cdot k_1 & q_1 \cdot k_2 & q_1 \cdot k_3 \\ q_2 \cdot k_1 & q_2 \cdot k_2 & q_2 \cdot k_3 \\ q_3 \cdot k_1 & q_3 \cdot k_2 & q_3 \cdot k_3 \end{bmatrix}$$
For example, the third row holds the importance of "私", "は", and "吉田" as seen from "吉田".
$$\operatorname{softmax}\left(\frac{QK^\top}{\sqrt{d_k}}\right) = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
Dividing by $\sqrt{d_k}$ is a scaling correction. This matrix is called the attention matrix, and each $a_{ij}$ is called an attention weight.
Final output of self-attention:
$$h_1 = a_{11} v_1 + a_{12} v_2 + a_{13} v_3, \qquad h_2 = a_{21} v_1 + a_{22} v_2 + a_{23} v_3, \qquad h_3 = a_{31} v_1 + a_{32} v_2 + a_{33} v_3$$
Self-attention = a mechanism that takes the surrounding context into account.
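The computation above can be sketched in plain Python. This is an illustrative single-head implementation under stated assumptions: the helper names are hypothetical, rows of `x` are the vectors $x_i$, the projection matrices have shape $d_k \times d_{\text{model}}$ as in the slide's notation, and the toy weights are identity matrices rather than learned parameters.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def softmax(row: List[float]) -> List[float]:
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: Matrix, w_q: Matrix, w_k: Matrix, w_v: Matrix) -> Tuple[Matrix, Matrix]:
    """Return (attention matrix A, outputs H) with h_i = sum_j a_ij * v_j."""
    q = matmul(x, transpose(w_q))  # row i is q_i = W_q x_i
    k = matmul(x, transpose(w_k))
    v = matmul(x, transpose(w_v))
    d_k = len(q[0])
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, transpose(k))]
    a = [softmax(row) for row in scores]  # each row sums to 1
    return a, matmul(a, v)


identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three made-up token vectors
a, h = self_attention(x, identity, identity, identity)
for row in a:
    print([round(w, 3) for w in row])
```

Every row of the attention matrix is computed independently of the others, which is what makes the operation easy to parallelize across positions.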
Masked self-attention
Let $A = \operatorname{softmax}\left(\frac{QK^\top}{\sqrt{d_k}}\right)$ be the attention matrix with entries $a_{ij}$, and define the mask matrix
$$M = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}$$
Taking the element-wise product removes the weights on later positions:
$$A \odot M = \begin{bmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
Final output of masked self-attention:
$$h_1 = a_{11} v_1 + 0\,v_2 + 0\,v_3, \qquad h_2 = a_{21} v_1 + a_{22} v_2 + 0\,v_3, \qquad h_3 = a_{31} v_1 + a_{32} v_2 + a_{33} v_3$$
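The sketch below contrasts two ways of applying the causal mask. `mask_after_softmax` follows the slide literally (element-wise product with the attention matrix). `mask_before_softmax` follows the description in the Transformer paper, where masked scores are set to $-\infty$ before the softmax so that each row still sums to 1. The function names and the all-zero toy scores are assumptions for illustration.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    m = max(row)
    exps = [math.exp(v - m) for v in row]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def causal_mask(n: int) -> List[List[int]]:
    """Lower-triangular mask: position i may attend to positions j <= i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def mask_after_softmax(scores: Matrix) -> Matrix:
    """Slide version: A ⊙ M. Rows no longer sum to 1 in general."""
    a = [softmax(row) for row in scores]
    m = causal_mask(len(scores))
    return [[w * keep for w, keep in zip(a_row, m_row)] for a_row, m_row in zip(a, m)]


def mask_before_softmax(scores: Matrix) -> Matrix:
    """Paper version: masked scores become -inf, then softmax renormalizes."""
    m = causal_mask(len(scores))
    masked = [[s if keep else -math.inf for s, keep in zip(s_row, m_row)]
              for s_row, m_row in zip(scores, m)]
    return [softmax(row) for row in masked]


scores = [[0.0, 0.0, 0.0] for _ in range(3)]  # made-up scaled scores QK^T / sqrt(d_k)
print(mask_after_softmax(scores))
print(mask_before_softmax(scores))
```

Both variants give zero weight to future positions; they differ only in whether the remaining weights are renormalized.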
Recap of self-attention
$$h_i = \sum_{j=1}^{3} a_{ij} v_j, \qquad \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \operatorname{softmax}\left(\frac{QK^\top}{\sqrt{d_k}}\right), \qquad (QK^\top)_{ij} = q_i \cdot k_j$$
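Multi-head self-attention (figure excerpted from the paper)
• Split $Q, K, V$ into $h$ parts and process them in parallel.
• Concatenate the outputs into one.
• The paper experiments with $h = 1, 4, 8, 16, 32$.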
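A simplified sketch of the split-and-concatenate idea: $Q$, $K$, $V$ are split along the feature dimension into $h$ chunks, attention runs on each chunk independently, and the per-head outputs are concatenated. The function names are hypothetical. In the paper each head additionally has its own learned projections and the concatenated output passes through a final linear layer; both are omitted here for brevity.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Scaled dot-product attention for a single head."""
    d_k = len(q[0])
    out = []
    for q_i in q:
        weights = softmax([sum(a * b for a, b in zip(q_i, k_j)) / math.sqrt(d_k)
                           for k_j in k])
        out.append([sum(w * v_j[c] for w, v_j in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


def split_heads(m: Matrix, h: int) -> List[Matrix]:
    """Split the columns of m into h equally sized chunks."""
    d = len(m[0])
    assert d % h == 0, "feature dimension must be divisible by h"
    size = d // h
    return [[row[i * size:(i + 1) * size] for row in m] for i in range(h)]


def multi_head_attention(q: Matrix, k: Matrix, v: Matrix, h: int) -> Matrix:
    heads = [attention(qh, kh, vh)
             for qh, kh, vh in zip(split_heads(q, h), split_heads(k, h), split_heads(v, h))]
    # Concatenate the h outputs for each position back into one vector.
    return [sum((head[i] for head in heads), []) for i in range(len(q))]


toy = [[1.0, 0.0, 0.5, 0.5],
       [0.0, 1.0, 0.5, -0.5],
       [1.0, 1.0, 0.0, 0.0]]  # made-up values used as Q, K and V alike
out = multi_head_attention(toy, toy, toy, h=2)
print(out)
```

Each head sees only its own slice of the features, so different heads can produce different attention patterns over the same tokens.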
Position-wise Feed-Forward Networks
Self-attention turns the tokens "私", "は", "吉田", ... into contextualized embeddings $h_1, h_2, h_3, \dots$. The same feed-forward network is then applied to each vector separately:
$$\mathrm{FFN}(h_i) = \mathrm{ReLU}(h_i W_1 + b_1)\,W_2 + b_2, \qquad i = 1, 2, 3$$
The role of the FFN in the Transformer has been the subject of much later discussion:
• Geva, Mor, et al. "Transformer feed-forward layers are key-value memories." arXiv preprint.
• Zhang, Zhengyan, et al. "MoEfication: Transformer feed-forward layers are mixtures of experts." arXiv preprint.
• Geva, Mor, et al. "Transformer feed-forward layers build predictions by promoting concepts in the vocabulary space." arXiv preprint.
• etc.
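A minimal sketch of the position-wise feed-forward step, $\mathrm{ReLU}(h_i W_1 + b_1) W_2 + b_2$, applied to each vector independently with shared weights. The function names and the tiny weight matrices are made up for illustration; real models use much larger learned parameters.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    return [max(0.0, x) for x in v]


def affine(v: Vector, w: Matrix, b: Vector) -> Vector:
    """Compute v W + b, where w has shape (len(v), len(b))."""
    return [sum(v_i * w_row[j] for v_i, w_row in zip(v, w)) + b[j]
            for j in range(len(b))]


def ffn(h_i: Vector, w1: Matrix, b1: Vector, w2: Matrix, b2: Vector) -> Vector:
    return affine(relu(affine(h_i, w1, b1)), w2, b2)


def position_wise_ffn(hs: List[Vector], w1: Matrix, b1: Vector,
                      w2: Matrix, b2: Vector) -> List[Vector]:
    """Apply the same FFN to every position; positions do not interact here."""
    return [ffn(h_i, w1, b1, w2, b2) for h_i in hs]


# Made-up parameters: 2 -> 2 hidden units -> 1 output.
w1 = [[1.0, -1.0], [0.0, 1.0]]
b1 = [0.0, -5.0]
w2 = [[2.0], [3.0]]
b2 = [0.5]

outputs = position_wise_ffn([[1.0, 2.0], [1.0, 2.0], [0.0, 0.0]], w1, b1, w2, b2)
print(outputs)
```

Identical input vectors yield identical outputs regardless of position, which is why mixing information across positions is left entirely to self-attention.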
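Summary so far
• Competing approaches: recurrent networks and convolutional networks.
• Main proposal: an encoder-decoder built entirely on self-attention, which is easy to parallelize and captures long-range dependencies. Mechanisms of the Transformer covered so far include positional encoding.
• What was evaluated, and what was the result? (Training, Results)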
What was evaluated, and what was the result? (Training, Results)
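What
• Task: machine translation
• Dataset: WMT, a training and evaluation dataset for news translation, with en-de and en-fr corpora on the order of millions of sentence pairs each
• Metrics: BLEU (translation quality), FLOPs (computational cost)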
Result: performance comparable to other models, achieved at a lower training cost.
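Summary
• Competing approaches: recurrent networks and convolutional networks.
• Main proposal: an encoder-decoder built entirely on self-attention, which is easy to parallelize and captures long-range dependencies. Mechanisms of the Transformer include positional encoding.
• Evaluation: the task (What) is machine translation; the Result is SOTA, with performance comparable to other models at a lower training cost.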
Subsequent developments
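Birth of derived models
The encoder and the decoder play different roles.
• Encoder: contextualization of the input sequence. Used as a feature extractor: BERT, ViT, etc.
• Decoder: autoregressive sequence generation. Used as a generative model: GPT, Llama, etc.
New models based on each of the two emerged.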
Shifts in the learning paradigm (from past to present)
• Training from scratch: a model specialized for a particular task is trained starting from randomly initialized parameters. Examples: recurrent networks, convolutional networks, Transformer.
• Fine-tuning: a pretrained model is fine-tuned for an individual task. Examples: BERT, GPT, ResNet.
• In-context learning: the model itself is not adjusted; it handles a variety of tasks by following instructions. Examples: GPT, Llama, PaLM.
Fine-tuning with BERT and in-context learning with GPT were especially epoch-making.
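What knowledge is represented in self-attention?
Research that investigates what kind of knowledge a model holds is called probing.
• Dependency structure can largely be extracted from the embeddings that BERT produces (in the figure, black is the gold structure and blue is the dependency structure extracted from BERT) [Hewitt, Fig.].
• Models may acquire knowledge they were not explicitly trained on [Clark, Fig.].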
Success beyond text
• Vision Transformer [Dosovitskiy, Fig.]
• CLIP [Radford, Fig.]
Many follow-up studies explore the potential of self-attention.