Yasufumi Taniguchi
December 09, 2018
Pathologies of Neural Models Make Interpretations Difficult
Transcript
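Presenter: Yasufumi Taniguchi
Pathologies = abnormal behavior

Pathological behavior
Even when the question is reduced to the single word "did", the model's output stays the same, and its confidence remains high.

Overview
• Proposes an analysis method for neural models in NLP
• The method extracts the words that matter to the model when solving a task
• The extracted words are meaningless to humans
• Yet the model still predicts correctly from the extracted words alone (a pathology)
• Proposes a regularization term based on the analysis results
• The regularization term improves the model's interpretability

Contents
• Existing methods for model analysis
• Proposed method
• Experiments
• Summary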
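
1. Existing methods for model analysis

Existing methods for model analysis
Adversarial examples: samples that make a model behave in ways that contradict human intuition.
In NLP tasks (mainly QA tasks), there are two patterns:
• A change that is meaningless to a human drastically changes the model's output
• A change that is obvious to a human leaves the model's output unchanged

Case where the output changes drastically (Jia et al., 2017)
• The question asks about a quarterback's age
• A sentence about a quarterback's jersey number is added to the document
• The model answers incorrectly

Case where the output does not change (Mudrakarta et al., 2018)
• Original question: "Are the white bricks of the building symmetrical?"
• Modified question, using "spherical": "Are the white bricks of the building spherical?"
• The meaning of the question changes, but the model's prediction does not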
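
2. Proposed method

Input Reduction
• Remove unimportant words from the input and analyze the model's behavior
• Find the minimal set of words the model needs to produce the correct output (the important words)
• Adversarial examples focus on the words that are important to the model

Input Reduction: definitions
• $f(\cdot)$: the neural model
• $x$: the input sequence (a sentence or a document)
• $y$: the model's prediction
• $x_i$: an element of the input sequence (a word)
• $x_{-i}$: the input with the $i$-th word removed
• The importance $g$ of a word $x_i$ is defined as $g(x_i \mid x) = f(y \mid x) - f(y \mid x_{-i})$

Input Reduction: example
$g(x_{\text{contest}} \mid x) = f(y \mid x) - f(y \mid x_{-\text{contest}})$
• $x$: What company won free advertisement due to QuickBooks contest ?
• $x_{-\text{contest}}$: What company won free advertisement due to QuickBooks ?
• If $g$ is large, "contest" is an important word, because it contributes heavily to the model's output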
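
The importance score g can be illustrated with a minimal leave-one-out sketch. The names `leave_one_out_importance` and `toy_confidence` are hypothetical, and `toy_confidence` is only a stand-in for f(y|x); a real model would supply its own confidence for the originally predicted answer.

```python
from typing import Callable, List, Sequence


def leave_one_out_importance(
    words: Sequence[str],
    confidence: Callable[[Sequence[str]], float],
) -> List[float]:
    """Return g(x_i | x) = f(y | x) - f(y | x_{-i}) for every word x_i."""
    words = list(words)
    base = confidence(words)  # f(y | x) on the full input
    scores = []
    for i in range(len(words)):
        without_i = words[:i] + words[i + 1:]  # x_{-i}: input with word i removed
        scores.append(base - confidence(without_i))
    return scores


def toy_confidence(words: Sequence[str]) -> float:
    """Hypothetical stand-in for f(y | x): confident only if 'contest' is present."""
    return 0.9 if "contest" in words else 0.3


question = "What company won free advertisement due to QuickBooks contest ?".split()
scores = leave_one_out_importance(question, toy_confidence)
most_important = question[max(range(len(question)), key=scores.__getitem__)]
print(most_important)
```

With this toy scorer, removing any word other than "contest" leaves the confidence unchanged, so "contest" is the only word with a non-zero importance.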
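
Input Reduction: procedure
$g(x_i \mid x) = f(y \mid x) - f(y \mid x_{-i})$
• Remove words of low importance
• Repeatedly remove the word $x_i$ with the smallest $g$, as long as the prediction $y$ does not change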
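
The removal procedure can be sketched as a greedy loop. This is a simplified illustration under assumptions, not the presenter's or the original authors' implementation: `input_reduction` and `toy_predict` are hypothetical names, and `toy_predict` stands in for a real model that returns its predicted label together with the confidence for that label.

```python
from typing import Callable, List, Sequence, Tuple

Prediction = Tuple[str, float]  # (predicted label, confidence for that label)


def input_reduction(
    words: Sequence[str],
    predict: Callable[[Sequence[str]], Prediction],
) -> List[str]:
    """Greedily drop the least important word while the prediction stays the same."""
    current = list(words)
    label, _ = predict(current)  # the original prediction y
    while len(current) > 1:
        _, base = predict(current)  # f(y | x) for the current input
        candidates = []
        for i in range(len(current)):
            reduced = current[:i] + current[i + 1:]
            new_label, new_conf = predict(reduced)
            if new_label == label:  # only removals that keep y unchanged
                candidates.append((base - new_conf, i))  # (g, position)
        if not candidates:
            break  # every remaining word is needed to keep the prediction
        _, drop = min(candidates)  # smallest g; ties go to the earliest word
        current = current[:drop] + current[drop + 1:]
    return current


def toy_predict(words: Sequence[str]) -> Prediction:
    """Hypothetical model: its answer depends only on the word 'contest'."""
    if "contest" in words:
        return ("answer_A", 0.9)
    return ("answer_B", 0.5)


question = "What company won free advertisement due to QuickBooks contest ?".split()
print(input_reduction(question, toy_predict))
```

For this toy model the loop strips the question down to the single word "contest" without changing the prediction, which mirrors the kind of reduced input that is uninformative to a human reader.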
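
3. Experiments

Tasks analyzed
1. SQuAD
• Given a document and a question; Input Reduction is applied to the question
• The task is to extract the answer from the document
2. SNLI
• Given two sentences; Input Reduction is applied to one of them
• The task is to infer the relation between the two sentences
3. VQA
• Given an image and a question; Input Reduction is applied to the question
• The task is to generate the answer

Experimental setup
Input Reduction
• Experiments use samples for which the model produces the correct output
• Human evaluation of inputs after Input Reduction (Reduced)
• Evaluation of the difference between Reduced and inputs with words dropped at random (Random)
Regularization on Reduced Inputs
• Introduce a regularization term (described later) that mitigates the pathological behavior exposed by Input Reduction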
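
Human evaluation on Reduced inputs
• Measures human accuracy on Reduced inputs
• Uses samples that the model answers correctly
• Result: humans cannot predict correctly from Reduced inputs

Human evaluation on Reduced inputs
• Asks which of Reduced and Random is the more natural sentence
• "vs. Random" is the proportion of "fifty-fifty" answers
• Result: to humans, Reduced inputs are no different from Random ones

Example of a Reduced input
One can tell that the question asks "where did they practice?", but not "which team".

Average number of words in Reduced inputs
In all three tasks, the average number of words needed to answer correctly is […].

Model confidence on Reduced inputs
• The model's confidence barely changes before and after Input Reduction
• The cause is that the model learns a sharply peaked distribution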
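
Introducing the regularization term
$\sum_{(x,y)\in(X,Y)} \log f(y \mid x) + \lambda \sum_{\bar{x}\in\bar{X}} H(f(y \mid \bar{x}))$
• First term: the usual objective function
• Second term: makes the model less likely to output the correct $y$ for Reduced inputs
• The Reduced samples are generated with a model trained using the usual objective function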
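
The objective above can be evaluated numerically with a small sketch. It assumes the objective is maximized, as its log-likelihood form suggests; the function names, the argument layout, and the example probabilities are all hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def entropy(dist: Sequence[float]) -> float:
    """Shannon entropy H(p) = -sum_k p_k log p_k (zero-probability terms are skipped)."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def regularized_objective(
    correct_probs: Sequence[float],
    reduced_dists: Sequence[Sequence[float]],
    lam: float,
) -> float:
    """Log-likelihood on regular examples plus lam times entropy on reduced inputs.

    correct_probs: f(y | x) for the correct label of each regular example.
    reduced_dists: the full output distribution f(. | x_bar) for each reduced input.
    """
    log_likelihood = sum(math.log(p) for p in correct_probs)
    entropy_term = sum(entropy(d) for d in reduced_dists)
    return log_likelihood + lam * entropy_term


# Hypothetical numbers: the same regular examples, with a confident
# versus an uncertain output distribution on one reduced input.
peaked = regularized_objective([0.9, 0.8], [[0.99, 0.01]], lam=1.0)
flat = regularized_objective([0.9, 0.8], [[0.5, 0.5]], lam=1.0)
print(peaked < flat)
```

Because entropy is largest for a uniform distribution, the second term rewards a model that is uncertain on reduced inputs, which is how it discourages confident predictions from inputs that carry little meaning.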
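
Effect of the regularization term
• The model's accuracy increases slightly
• The number of words needed for a correct answer increases

Effect of the regularization term
• Accuracy in the human evaluation improves
• Inputs produced by Input Reduction become more interpretable

Example from the regularized model
Inputs produced by Input Reduction became interpretable to humans as well.

Summary
Proposed method
• Proposes Input Reduction as an analysis method for neural models in NLP
• Removes words that do not contribute to the prediction from the input in order to analyze the model
Experimental results
• Inputs produced by the proposed method are meaningless to humans
• Yet the neural model still predicts correctly
• Introducing the regularization term improves the model's behavior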