Yasufumi Taniguchi
December 09, 2018
Pathologies of Neural Models Make Interpretations Difficult
Transcript
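Presenter: Yasufumi Taniguchi

Pathological behavior
• Even when the question is reduced to just "did", the model's output stays the same
• The model's confidence also remains high

Overview
• Proposes an analysis method for neural models in NLP
• The method extracts the words that matter for the model to solve the task
• To humans, the extracted words are meaningless
• The model nevertheless predicts correctly from the extracted words alone (the pathology)
• Proposes a regularization term based on the analysis results
• The regularization term improves the model's interpretability

Contents
• Existing methods for model analysis
• Proposed method
• Experiments
• Summary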
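Existing methods for model analysis

Adversarial examples
• Samples that make a model behave contrary to human intuition
• In NLP tasks (mainly QA tasks) there are two patterns:
  • A change that is meaningless to humans drastically changes the model's output
  • A change that is obvious to humans leaves the model's output unchanged

Case where the output changes drastically (Jia et al., 2017)
• The question asks about a quarterback's age
• A sentence about a quarterback's jersey number is added to the document
• The model answers incorrectly

Case where the output does not change (Mudrakarta et al., 2018)
• Original question: are the white bricks of the building symmetrical?
• With "spherical" substituted: are the white bricks of the building spherical?
• The meaning of the question changes, but the model's prediction does not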
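2. Proposed method

Input Reduction
• Remove unimportant words from the input and analyze the model's behavior
• Find the minimal set of words the model needs to produce the correct output (the important words)
• Adversarial examples focus on the words that are important to the model

Input Reduction (notation)
• x: the input sequence (a sentence or a document)
• y: the model's prediction
• f(·): the neural model
• x_i: one element of the input sequence (a word)
• x_{-i}: the input with the i-th word removed
• The importance g of a word x_i is defined as g(x_i | x) = f(y | x) - f(y | x_{-i})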
Input Reduction (example)
• Question: What company won free advertisement due to QuickBooks contest ?
• Importance of "contest": g(x_contest | x) = f(y | x) - f(y | x_{-contest}), where x_{-contest} is the question with "contest" removed
• If g is large, "contest" is an important word, because it contributes strongly to the model's output
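The leave-one-out importance above can be sketched in a few lines of Python. This is only an illustration: `toy_confidence` is a hypothetical stand-in for f(y | x) with made-up weights, not the model used in the paper, and the function names are assumptions introduced here.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for f(y | x): the confidence assigned to a fixed
# prediction y given the input tokens. The weights are invented for the demo.
def toy_confidence(tokens: Sequence[str]) -> float:
    weights = {"contest": 0.5, "company": 0.25, "QuickBooks": 0.125}
    return sum(weights.get(t, 0.0) for t in tokens)

def importance(f: Callable[[Sequence[str]], float],
               tokens: Sequence[str]) -> List[float]:
    """Return g(x_i | x) = f(y | x) - f(y | x_{-i}) for every position i."""
    full = f(tokens)
    scores = []
    for i in range(len(tokens)):
        without_i = list(tokens[:i]) + list(tokens[i + 1:])  # x_{-i}
        scores.append(full - f(without_i))
    return scores

question = "What company won free advertisement due to QuickBooks contest ?".split()
scores = importance(toy_confidence, question)
for token, g in zip(question, scores):
    print(f"{token:15s} g = {g:.3f}")
```

With these toy weights, "contest" receives the largest g, which mirrors the slide's reading that a large g marks an important word.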
Input Reduction (procedure)
• g(x_i | x) = f(y | x) - f(y | x_{-i})
• Remove words with low importance
• Repeatedly remove the word i with the smallest g, as long as y does not change
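The removal procedure can be sketched as a greedy loop. The sketch below assumes a model exposed as a function from tokens to a label-probability dictionary and recomputes leave-one-out importance at each step, as on the slide. `toy_model` and its labels "A" and "B" are hypothetical, and the original paper may compute importance and search differently.

```python
from typing import Callable, Dict, List

Model = Callable[[List[str]], Dict[str, float]]

def toy_model(tokens: List[str]) -> Dict[str, float]:
    # Hypothetical classifier: predicts label "A" whenever "contest" is present.
    p_a = 0.9 if "contest" in tokens else 0.2
    return {"A": p_a, "B": 1.0 - p_a}

def argmax_label(probs: Dict[str, float]) -> str:
    return max(probs, key=probs.get)

def input_reduction(model: Model, tokens: List[str]) -> List[str]:
    """Greedily drop the least important word while the prediction y is unchanged."""
    tokens = list(tokens)
    y = argmax_label(model(tokens))          # original prediction
    while len(tokens) > 1:
        base = model(tokens)[y]              # f(y | x) for the current input
        best_i, best_g = None, None
        for i in range(len(tokens)):
            reduced = tokens[:i] + tokens[i + 1:]
            probs = model(reduced)
            if argmax_label(probs) != y:     # removal would change y: not allowed
                continue
            g = base - probs[y]              # g(x_i | x)
            if best_g is None or g < best_g:
                best_i, best_g = i, g
        if best_i is None:                   # every removal changes the prediction
            break
        del tokens[best_i]
    return tokens

question = "What company won free advertisement due to QuickBooks contest ?".split()
print(input_reduction(toy_model, question))
```

Ties are broken by position here; that choice is arbitrary and only matters for which of several equally unimportant words is dropped first.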
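3. Experiments

Tasks analyzed
1. SQuAD
• A document and a question are given; Input Reduction is applied to the question
• The task is to extract the answer from the document
2. SNLI
• Two sentences are given; Input Reduction is applied to one of them
• The task is to infer the relation between the two sentences
3. VQA
• An image and a question are given; Input Reduction is applied to the question
• The task is to generate the answer

Experimental setup
Input Reduction
• Experiments use samples for which the model produces the correct output
• Human evaluation of inputs after Input Reduction (Reduced)
• Evaluation of the difference between Reduced and inputs with randomly dropped words (Random)
Regularization on Reduced Inputs
• Introduce a regularization term (described later) that mitigates the pathological behavior exposed by Input Reduction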
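A Random baseline of the kind mentioned in the setup could be built as follows. The slides do not say how the Random inputs were constructed, so this sketch assumes they keep the same number of words as the Reduced input and preserve word order. The function name and the seed are illustrative only.

```python
import random
from typing import List

def random_reduction(tokens: List[str], target_len: int,
                     rng: random.Random) -> List[str]:
    """Keep `target_len` randomly chosen words, preserving their original order.

    Assumption: Random inputs are matched in length to the Reduced input.
    """
    if not 0 <= target_len <= len(tokens):
        raise ValueError("target_len must be between 0 and len(tokens)")
    keep = sorted(rng.sample(range(len(tokens)), target_len))
    return [tokens[i] for i in keep]

question = "What company won free advertisement due to QuickBooks contest ?".split()
rng = random.Random(0)  # fixed seed so the baseline is reproducible
print(random_reduction(question, 2, rng))
```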
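Human evaluation of Reduced
• Measures human accuracy on Reduced inputs
• Uses samples that the model answers correctly
• Result: humans cannot predict correctly from Reduced inputs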
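Human evaluation of Reduced (vs. Random)
• Annotators judge which of Reduced and Random is the more natural sentence
• "vs. Random" is the proportion of fifty-fifty answers
• Result: to humans, Reduced is no different from Random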
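Example of Reduced
• One can tell that the question asks "where did they practice", but not "which team"

Average number of words in Reduced
• The slide reports, for each task, the average number of words needed to answer correctly (the figures did not survive in this transcript)

Model confidence on Reduced
• The model's confidence barely changes before and after Input Reduction
• The cause is that the model has learned a sharply peaked distribution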
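Introducing the regularization term
• Objective: \sum_{(x,y) \in (X,Y)} \log(f(y | x)) + \lambda \sum_{\bar{x} \in \bar{X}} H(f(y | \bar{x}))
• First term: the usual objective function
• Second term: makes the model less likely to output the correct y for Reduced inputs
• Reduced samples are generated with a model trained on the usual objective function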
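To make the two terms of the objective concrete, here is a minimal sketch that evaluates it for given probabilities. It assumes the objective is maximized and that H is the Shannon entropy of the model's output distribution on a reduced input. The function names and the numbers are illustrative and are not taken from the paper.

```python
import math
from typing import List, Sequence

def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy H(p) in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def objective(gold_probs: Sequence[float],
              reduced_dists: Sequence[Sequence[float]],
              lam: float) -> float:
    """sum log f(y|x) over original samples + lam * sum H(f(.|x_bar)) over reduced ones."""
    log_likelihood = sum(math.log(p) for p in gold_probs)
    regularizer = sum(entropy(d) for d in reduced_dists)
    return log_likelihood + lam * regularizer

gold = [0.9, 0.8]                   # f(y | x) on two original samples
confident: List[List[float]] = [[0.99, 0.01]]   # peaked output on a reduced input
uncertain: List[List[float]] = [[0.5, 0.5]]     # near-uniform output on a reduced input
print(objective(gold, confident, lam=1.0))
print(objective(gold, uncertain, lam=1.0))
```

The second value is larger: the entropy term rewards a model that is uncertain on reduced inputs, which is the stated purpose of the regularizer.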
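Effect of the regularization term
• The model's accuracy increases slightly
• The number of words needed for a correct answer increases

Effect of the regularization term (human evaluation)
• Accuracy in the human evaluation improves
• Inputs after Input Reduction become more interpretable

Example from the regularized model
• The input after Input Reduction has become interpretable to humans

Summary
Proposed method
• Proposes Input Reduction as an analysis method for neural NLP models
• Removes words that do not contribute to the prediction from the input in order to analyze the model
Experimental results
• Inputs produced by the proposed method are meaningless to humans
• The neural model nevertheless makes correct predictions on them
• Introducing the regularization term improves the model's behavior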