Adversarial Filters of Dataset Biases
Scatter Lab Inc.
September 04, 2020
Transcript
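Adversarial Filters of Dataset Biases
ML Research Scientist, Pingpong

Contents
1. Research background
2. AFLite
   1. Example: the WinoGrande dataset
   2. Generalized algorithm
3. Experiments
   1. Synthetic Data
   2. NLP
   3. Vision

Research Background: High Performance = Problem Solved?
"If a model reaches high performance on a benchmark dataset, can we say the problem has been solved?"
• Models do well on in-distribution test sets but are weak on out-of-distribution data and adversarial samples.
• The cause is unintended spurious correlations between inputs and outputs.
• A system can only be evaluated properly on a dataset that has been cleaned of these correlations.

Research Background: Previous Approaches
Previous work identifies and classifies domain-specific spurious patterns, then removes them.
• These approaches depend on domain-specific knowledge and intuition.
• Biases that the algorithm designer did not anticipate cannot be covered.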
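AFLite

AFLite: Winograd Schema Challenge (WSC)
• The task is to identify what a pronoun in a sentence refers to.
• SOTA accuracy is about 90% → could the models be exploiting spurious correlations?
• Examples (3) and (4) can be solved by word association alone, because the correct answer is very likely to be related to the surrounding words.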
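AFLite: WinoGrande Dataset
• When people build a dataset, bias from this kind of annotation artifact is hard to avoid.
• On the WinoGrande dataset filtered with AFLite, model accuracy is lower, and the dataset also transfers well to other benchmarks.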
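AFLite: AFLite in WinoGrande
1. Fine-tune RoBERTa on a portion of the data only.
2. Train linear classifiers on the RoBERTa features, using a different split each time.
3. On each split's test set, check whether the answer can be predicted easily from the embedding alone, and add the prediction to the ensemble set of each instance.
4. Among the instances for which the fraction of linear classifiers that got the answer right exceeds the threshold, remove the top k from the final dataset.
5. Repeat steps 2 to 4 until fewer than k instances are removed or the dataset reaches the desired size.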
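The filtering loop in steps 2 to 5 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `aflite`, its parameters, and the default values are hypothetical, the features are assumed to be precomputed (step 1 is skipped), and repeated k-fold splits are swapped in for the "different splits" so that every instance is scored the same number of times. The classifier is passed in as `fit_predict(train_x, train_y, test_x)`, which should return one predicted label per test item.

```python
import random


def aflite(features, labels, fit_predict, n_repeats=4, n_folds=5,
           threshold=0.75, k=10, target_size=0, seed=0):
    """Return the indices that survive filtering (illustrative sketch only)."""
    rng = random.Random(seed)
    keep = list(range(len(features)))
    while len(keep) > target_size:
        correct = {i: 0 for i in keep}
        for _ in range(n_repeats):
            order = keep[:]
            rng.shuffle(order)
            for f in range(n_folds):
                # Assumption: strided k-fold split; each instance is held out
                # exactly once per repeat.
                test = order[f::n_folds]
                held_out = set(test)
                train = [i for i in order if i not in held_out]
                if not test or not train:
                    continue
                preds = fit_predict([features[i] for i in train],
                                    [labels[i] for i in train],
                                    [features[i] for i in test])
                for i, pred in zip(test, preds):
                    correct[i] += int(pred == labels[i])
        # Fraction of classifiers that answered each instance correctly.
        score = {i: correct[i] / n_repeats for i in keep}
        easy = sorted((i for i in keep if score[i] > threshold),
                      key=lambda i: score[i], reverse=True)
        # Remove at most k instances, without dropping below the target size.
        removed = set(easy[:min(k, len(keep) - target_size)])
        keep = [i for i in keep if i not in removed]
        if len(removed) < k:
            break  # step 5: stop once fewer than k instances are removed
    return keep
```

Any model can be plugged in through `fit_predict`, for example a logistic regression trained on frozen RoBERTa embeddings. Keeping the classifier behind a callable is a design choice made for this sketch so that the loop itself stays independent of any particular library.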
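AFLite: Filtered Examples
• AFLite filters out problems that can be solved from the polarity of a word alone.
• This bias is structural rather than token-level, so it is hard to catch with lexical-level heuristics.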
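AFLite: Adversarial Filters of Dataset Biases
• Extends AFLite to multiple domains and generalizes it to be model-agnostic.
• Contributions:
1. Shows that AFOpt, which is ideal but intractable, can be approximated by AFLite. (Skip)
2. Supports the effectiveness of AFLite with experiments on several datasets in vision and NLP.
3. Shows experimentally that models trained on a dataset with the bias removed generalize well.
4. Shows that filtering with AFLite can produce more challenging benchmark datasets.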
AFLite: AFLite (Generalized)
Φ: any feature extractor
M: a family of classification models
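As a hedged illustration of how Φ and M fit together, the per-instance score that the filtering relies on (the fraction of classifiers that answer an instance correctly) can be written as below. The symbols $E(i)$, $y_i$, $\hat{y}$, and $\tau$ are notation assumed here for the sketch: $E(i)$ collects the predictions made for instance $i$ by models from M that were trained on Φ features of splits not containing $i$.

```latex
\tilde{p}(i) = \frac{\left|\{\hat{y} \in E(i) : \hat{y} = y_i\}\right|}{\left|E(i)\right|},
\qquad
\text{instance } i \text{ is a removal candidate when } \tilde{p}(i) > \tau .
```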
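Experiments

Experiments: Synthetic Data
Biasing Dataset
• A class-specific artificial feature is inserted into 75% of the data; the rest receives a random feature.
• Some of the biased samples have their labels changed.
Results
• Even a linear classifier reaches high performance on the biased data.
• After applying AFLite, performance returns to a sensible level.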
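The biasing procedure above can be sketched as follows. This is an illustrative construction, not the setup used in the experiments: the function name `make_biased_dataset`, the two Gaussian base features, the binary labels, and the 10% label-flip rate are all assumptions introduced here. Only the 75% share of biased samples comes from the slide.

```python
import random


def make_biased_dataset(n=1000, biased_frac=0.75, flip_frac=0.1, seed=0):
    """Build a toy binary dataset with a shortcut feature (hypothetical setup)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randrange(2)
        # Assumption: two uninformative base features stand in for the real task.
        base = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        is_biased = rng.random() < biased_frac
        if is_biased:
            shortcut = float(label)             # class-specific artificial feature
        else:
            shortcut = float(rng.randrange(2))  # random feature for the rest
        # Assumption: flip the label of a small share of the biased samples.
        flipped = is_biased and rng.random() < flip_frac
        if flipped:
            label = 1 - label
        rows.append({"x": base + [shortcut], "y": label,
                     "biased": is_biased, "flipped": flipped})
    return rows
```

On data like this, a linear classifier can score well by reading only the last feature, which is the kind of shortcut a filtering step is meant to expose.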
Experiments: NLP: Out-of-distribution Generalization
• Evaluation targets: three out-of-distribution datasets that avoid the annotation artifacts of SNLI.
• Zero-shot tests are run per type of non-entailment problem.
Experiments: In-distribution Benchmark Re-estimation: SNLI
On the dataset filtered with AFLite, performance drops substantially for every model.

Experiments: In-distribution Benchmark Re-estimation: MultiNLI & QNLI
Experiments: Vision: Adversarial Image Classification
• Φ: features from an EfficientNet-B7 trained on 20% of the ImageNet dataset.
• When evaluated on ImageNet-A, models trained on the AFLite-filtered dataset perform better.
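Experiments: In-distribution Image Classification
When the ImageNet dev set is filtered and then used for evaluation, the drop in performance is larger.

Experiments: Filtered Examples
The filtered examples are in line with previously reported issues, such as bias toward particular poses and classification by texture alone rather than shape.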
References
• Adversarial Filtering
  SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference [EMNLP'18]
  HellaSwag: Can a Machine Really Finish Your Sentence? [ACL'19]
• AFLite
  WinoGrande: An Adversarial Winograd Schema Challenge at Scale [arXiv'19]
  Adversarial Filters of Dataset Biases [ICML'20]

Thank you! ✌
If you have further questions or anything you are curious about, feel free to get in touch at any time using the contact below.
ML Research Scientist, Pingpong
[email protected]