Visualizing and Measuring the Geometry of BERT
Asei Sugiyama
September 04, 2019
Technology
Slides for my talk at NN論文を肴に酒を飲む会 #9 (https://tfug-tokyo.connpass.com/event/143283/).
Transcript
Visualizing and Measuring the Geometry of BERT NN論文を肴に酒を飲む会 #9
About me • Asei Sugiyama • Software Engineer @Repro • Machine learning, statistics, software development, and so on • TensorFlow Docs translation & review • Co-author of 機械学習図鑑
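Abstract • A paper introduced by Google PAIR • Networks with Transformer-like architectures are extremely useful in natural language processing • The goal is to clarify how such networks hold linguistic features internally • The paper carries out quantitative and qualitative analyses of BERT • The results suggest that BERT learns both semantic and syntactic information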
Contents 1.Context & related works <- 2.Geometry of syntax 3.Geometry of word senses • Measurement of word sense disambiguation capability • Embedding distance and context: a concatenation experiment 4.Conclusion
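Context & related works • This paper is an answer to "A Structural Probe for Finding Syntax in Word Representations" (2019) • It is structured so that almost nothing makes sense without that paper • A good deal: one paper that gets you to read two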
A Structural Probe for Finding Syntax in Word Representations NN論文を肴に酒を飲む会 #9
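Abstract • A paper from Stanford University • Analysis of word representations has progressed, but whether representations of syntax are learned had not been verified • This work proposes a method called the structural probe • It evaluates whether parse trees are embedded in a linearly transformed space of a neural network's word representations • The results suggest that ELMo and BERT learn parse trees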
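Goal of the research • Answer the question of whether deep models learn syntax What this paper explains • How to find syntax in word representations • How to recover syntactic information from a projection of the word representations and evaluate it, with concrete examples (ELMo, BERT)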
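Idea behind the method • Consider embedding a graph into a vector space while preserving the distances between nodes • If that works, finding the neighbors of a node is the same as a nearest-neighbor search • Also, if the model learns the structure correctly, it probably uses only part of its representation space • So it suffices to look for a subspace of the representation space in which the structural distances are preserved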
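In other words? • The figure in the explanatory article (see footnote 1) makes this easy to see • The space on the left is the word representation space • The gray plane in the left figure is the subspace that represents the structure • The right side is the recovered structure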
1 https://nlp.stanford.edu//~johnhew//structural-probe.html
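The structural probe • $w_i^\ell$, $h_i^\ell$: the $i$-th word of the $\ell$-th sentence and its vector • $d_{T^\ell}(w_i^\ell, w_j^\ell)$: distance between nodes in the parse tree • $d_B(h_i^\ell, h_j^\ell)$: distance in the subspace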
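To make the quantities on this slide concrete, here is a minimal sketch of the probe distance and its training objective, assuming the formulation in the structural probe paper: the squared distance is ||B(h_i - h_j)||^2, and the loss is the absolute gap to the tree distance averaged over word pairs. The function names and the toy sentence are hypothetical and were not part of the talk.

```python
def project(B, h):
    """Apply the linear map B (a list of rows) to the vector h."""
    return [sum(b * x for b, x in zip(row, h)) for row in B]

def probe_sq_distance(B, h_i, h_j):
    """Squared probe distance ||B(h_i - h_j)||^2 between two word vectors."""
    diff = [a - b for a, b in zip(h_i, h_j)]
    return sum(v * v for v in project(B, diff))

def probe_loss(B, H, tree_dist):
    """Average absolute gap between tree distances and squared probe
    distances over all word pairs of one sentence."""
    n = len(H)
    total = sum(abs(tree_dist[i][j] - probe_sq_distance(B, H[i], H[j]))
                for i in range(n) for j in range(n))
    return total / n ** 2

# Hypothetical toy sentence: three words whose parse tree is a chain w0 - w1 - w2.
tree_dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
# Made-up word vectors; the third coordinate carries no syntactic information.
H = [[0.0, 0.0, 5.0], [1.0, 0.0, 5.0], [1.0, 1.0, 5.0]]
# A probe matrix that keeps only the first two coordinates (the "subspace").
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

loss = probe_loss(B, H, tree_dist)
print(loss)
```

In a real probe, B would be learned by minimizing this loss over a treebank; here it is fixed by hand so that the subspace reproduces the tree distances.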
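Results (Table 1) • Models that take context into account (bottom four) reproduce the syntax better than models that do not (top four) (see note 2) 2 Dependency structure is evaluated with relation types and directions ignored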
Results (Figure 2)
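Results (Figure 4) • Left: distances between words computed from the parse tree • Right: distances between words computed from layer 16 of BERT (large) • The overall structure appears to be reproduced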
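Future works • Experiments showed that it is important to use the squared distance rather than the distance itself • Why the squared distance works better was not well understood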
That covers the Context
Contents 1.Context & related works 2.Geometry of syntax <- 3.Geometry of word senses • Measurement of word sense disambiguation capability • Embedding distance and context: a concatenation experiment 4.Conclusion
Geometry of syntax • What BERT has learned is analyzed from two angles 1.Does it learn useful representations in the first place? 2.Does it learn parse trees?
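Attention probes and dependency representations • Quantitative evaluation of what BERT has learned (preliminary experiment) • Task: using Penn Treebank data, decide the dependency relation between two words • A weak model (linear classifier + L2 regularization) is trained on top of BERT's outputs • The accuracy was 85.8%, so the authors judge that it is reasonable to move on to the next step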
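As a rough illustration of what a "weak model (linear classifier + L2 regularization)" means here, the sketch below trains a binary logistic regression with an L2 penalty by plain gradient descent. Everything in it is an assumption made for illustration: the features are random numbers standing in for values taken from BERT, the labels are synthetic, and the hyperparameters are arbitrary.

```python
import math
import random

def sigmoid(z):
    # Clamp the input so math.exp never overflows.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train_l2_logreg(X, y, l2=0.01, lr=0.5, steps=300):
    """Binary logistic regression trained by gradient descent with an L2 penalty."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w = [l2 * wk for wk in w]   # gradient of the L2 penalty term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wk * xk for wk, xk in zip(w, xi)) + b) - yi
            for k in range(dim):
                grad_w[k] += err * xi[k] / n
            grad_b += err / n
        w = [wk - lr * gk for wk, gk in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

rng = random.Random(0)
dim = 8
# Hypothetical stand-in for per-word-pair features extracted from BERT.
X = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(200)]
true_w = [rng.gauss(0, 1) for _ in range(dim)]
# Synthetic label: 1.0 means "a dependency relation exists" for this pair.
y = [1.0 if sum(t * x for t, x in zip(true_w, xi)) > 0 else 0.0 for xi in X]

w, b = train_l2_logreg(X, y)
pred = [1.0 if sum(wk * xk for wk, xk in zip(w, xi)) + b > 0 else 0.0 for xi in X]
accuracy = sum(p == t for p, t in zip(pred, y)) / len(y)
print(accuracy)
```

The point of keeping the probe this simple is that a high score then says something about the features themselves rather than about the capacity of the classifier.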
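Mathematics of embedding trees in Euclidean space • It was proved mathematically that a tree with $n$ nodes can be embedded in $\mathbb{R}^{n-1}$ while preserving distances (in a squared sense) • It was also shown that, if the distance itself is used, there are cases where no distance-preserving embedding exists • The authors say this settles the open question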
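The embedding result can be checked on a small example. The sketch below follows the construction I understand the paper's proof to use: place the root at the origin and give every edge its own orthogonal unit axis, so that each node sits at its parent's position plus one new basis vector. The `parent` array encoding and the function names are hypothetical, and the code assumes every parent index is smaller than its child's index.

```python
from itertools import combinations

def pythagorean_embedding(parent):
    """Embed a tree with n nodes into R^(n-1).
    parent[i] is the parent of node i, parent[0] == -1 marks the root,
    and every parent index is assumed to be smaller than its child's index."""
    n = len(parent)
    emb = [[0.0] * (n - 1) for _ in range(n)]
    for node in range(1, n):
        emb[node] = list(emb[parent[node]])  # start from the parent's position
        emb[node][node - 1] = 1.0            # this edge gets its own axis
    return emb

def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def tree_distance(parent, a, b):
    """Number of edges on the path between nodes a and b."""
    def path_to_root(x):
        path = [x]
        while parent[x] != -1:
            x = parent[x]
            path.append(x)
        return path
    pa, pb = path_to_root(a), path_to_root(b)
    common = set(pa) & set(pb)
    # The first shared node on each upward path is the lowest common ancestor.
    steps_a = next(i for i, x in enumerate(pa) if x in common)
    steps_b = next(i for i, x in enumerate(pb) if x in common)
    return steps_a + steps_b

# Hypothetical 5-node tree: node 0 is the root, 1 and 2 are its children,
# 3 and 4 are children of node 1.
parent = [-1, 0, 0, 1, 1]
emb = pythagorean_embedding(parent)
all_match = all(sq_dist(emb[a], emb[b]) == tree_distance(parent, a, b)
                for a, b in combinations(range(len(parent)), 2))
print(all_match)
```

Because the edges on a path contribute orthogonal unit steps, the squared Euclidean distance counts the edges. A star with three leaves shows why the plain distance cannot work in general: each pair of leaves would need the center as its midpoint, which forces two leaves onto the same point.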
In other words? • The blog post explains this in detail, so it is a good place to start if you want to know more • https://pair-code.github.io/interpretability/bert-tree/
Visualization of parse tree embeddings • An embedding that preserves parse tree distances and the result from BERT look similar
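Visualization of parse tree embeddings • Distances are compared between the embedded parse tree and what BERT has learned • The ratio is shown as color • The ratio shown is BERT / true parse tree • Lines are also drawn for word pairs that are not connected in the parse tree but end up close in what BERT has learned • Pairs that are better treated as a unit, such as part/of and sale/of, are close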
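Visualization of parse tree embeddings • The distribution of the distance ratio between the embedded parse tree and what BERT has learned is examined • The figure on the right aggregates the results per dependency relation • The values vary widely across relations, from 1.2 to 2.5 • This suggests that BERT adds a quantitative aspect to the dependency relations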
Contents 1.Context & related works 2.Geometry of syntax 3.Geometry of word senses <- • Measurement of word sense disambiguation capability • Embedding distance and context: a concatenation experiment 4.Conclusion
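Geometry of word senses • Examines whether BERT captures word meaning as well as syntax • Experiment: can a subspace that represents meaning be obtained? • Apparently it can • Experiment: can the context be adjusted artificially? • Not only did this fail, it made things worse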
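Measurement of word sense disambiguation capability • BERT's outputs are visualized with UMAP • For the same word "die", clusters form that correspond to its different meanings • Running a word sense disambiguation task with kNN gives an accuracy of 71.1% (SOTA)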
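The kNN step can be sketched as follows, under the assumption that each training example is a context embedding of a word paired with a sense label. The 2-D points, the two sense labels for "die", and the function name are all made up for illustration; real BERT embeddings have hundreds of dimensions.

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority sense label among the k training embeddings
    closest to the query embedding (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

rng = random.Random(0)

def cluster(center, label, size=20, spread=0.1):
    """Hypothetical stand-in for embeddings of one word sense."""
    return [([rng.gauss(c, spread) for c in center], label) for _ in range(size)]

# Two made-up sense clusters for the word "die".
train = cluster([0.0, 0.0], "die (stop living)") + cluster([3.0, 3.0], "die (game cube)")

pred = knn_predict(train, [2.9, 3.1])
print(pred)
```

This only works when embeddings of the same sense sit close together, which is the property the UMAP visualization on this slide makes visible.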
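Word-sense information • A subspace that represents meaning is extracted in the same way as the "structural probe" • Instead of the gap to parse tree distances, cosine similarity between word senses is used (details unclear) • Accuracy before dimensionality reduction is 71.1% • It rises slightly after dimensionality reduction • There seems to be such a thing as a semantic subspace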
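Embedding distance and context: a concatenation experiment • Experiment: can better results be obtained by deliberately manipulating the context? • Find a representative sentence that uses a word in a particular sense, and concatenate it to a sentence that uses the same word in the same sense • If "I went to Edo" is the representative sentence, it is appended to "He went to Edo" to make "He went to Edo and I went to Edo"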
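Embedding distance and context: a concatenation experiment • Horizontal axis: BERT layer • Vertical axis: a ratio-like measure of the distance to the center of the cluster for a different sense (larger is better) • One might expect that adding a representative sentence would help classify the word's sense, but it did not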
Contents 1.Context & related works 2.Geometry of syntax 3.Geometry of word senses • Measurement of word sense disambiguation capability • Embedding distance and context: a concatenation experiment 4.Conclusion <-
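Conclusion • The paper gives the "structural probe" a mathematical justification • Comparing parse tree embeddings with what BERT has learned gives results suggesting that BERT learns syntax • Apart from the space that encodes syntax, there appears to be a space that encodes meaning • Whether other subspaces that matter linguistically exist is left for future research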
Finally • Recap of this event • TensorFlow User Group Tokyo • NN論文を肴に酒を飲む会
TensorFlow User Group Tokyo NN論文を肴に酒を飲む会 #9