Adversarial Filters of Dataset Biases

장성보 (ML Research Scientist, Pingpong)

Contents
1. Research background
2. AFLite
   1. Example: the WinoGrande dataset
   2. The generalized algorithm
3. Experiments
   1. Synthetic Data
   2. NLP
   3. Vision

Research background

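Research background: High Performance = Problem Solved?
"Can we say a problem is solved just because a model reaches high performance on a benchmark dataset?"
• Models do well on in-distribution test sets but are weak against out-of-distribution adversarial samples.
• The cause is unintended spurious correlations between inputs and outputs.
• Systems can only be evaluated properly on datasets that remove these correlations.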

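Research background: Previous Approaches
Researchers manually categorize and define domain-specific spurious patterns, then remove them.
• This relies on the researcher's domain-specific knowledge and intuition.
• Biases that the algorithm designer did not anticipate cannot be covered.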

AFLite

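AFLite: Winograd Schema Challenge (WSC)
• The task is to identify what a pronoun in a sentence refers to.
• SOTA accuracy is about 90%, which raises the question of whether models are exploiting spurious correlations.
• In examples (3) and (4), the underlined word is likely to be associated with the answer, so the problem can be solved by word association alone.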

AFLite: WinoGrande Dataset
• When people write a dataset by hand, bias from such annotation artifacts is hard to avoid.
• On the AFLite-filtered WinoGrande dataset, model accuracy is lower, and the dataset transfers well to other benchmarks.

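AFLite: AFLite in WinoGrande
1. Fine-tune RoBERTa on only a portion of the data.
2. Train linear classifiers on RoBERTa features, using a different split each time.
3. On each split's test set, test whether the answer can be found easily from the embeddings alone, and add the result to each instance's ensemble set.
4. Among the instances for which the fraction of linear classifiers that answered correctly exceeds the threshold, exclude the top k from the final dataset.
5. Repeat steps 2-4 until fewer than k instances are excluded or the dataset reaches the desired size.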
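The filtering loop in steps 2-5 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the features are already computed (step 1 is skipped), it swaps in a nearest-centroid classifier as a stand-in for the linear classifier so that only the standard library is needed, and the function name aflite and the parameters n_partitions, train_frac, k, and tau are hypothetical.

```python
import random


def nearest_centroid_fit(X, y):
    """Fit per-class mean vectors (a simple stand-in for a linear classifier)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def nearest_centroid_predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(centroids[c], x))
    return min(centroids, key=sq_dist)


def aflite(features, labels, target_size, n_partitions=16,
           train_frac=0.5, k=10, tau=0.75, seed=0):
    """Return the indices kept after filtering out easily predictable instances."""
    rng = random.Random(seed)
    kept = list(range(len(features)))
    while len(kept) > target_size:
        correct = {i: 0 for i in kept}
        total = {i: 0 for i in kept}
        n_train = max(1, int(train_frac * len(kept)))
        for _ in range(n_partitions):
            # Step 2: a fresh random split and a fresh classifier each time.
            shuffled = kept[:]
            rng.shuffle(shuffled)
            train, held_out = shuffled[:n_train], shuffled[n_train:]
            model = nearest_centroid_fit([features[i] for i in train],
                                         [labels[i] for i in train])
            # Step 3: record each held-out prediction per instance.
            for i in held_out:
                total[i] += 1
                if nearest_centroid_predict(model, features[i]) == labels[i]:
                    correct[i] += 1
        # Step 4: score = fraction of correct held-out predictions.
        scores = {i: correct[i] / total[i] for i in kept if total[i] > 0}
        candidates = sorted((i for i in scores if scores[i] >= tau),
                            key=lambda i: scores[i], reverse=True)
        budget = min(k, len(kept) - target_size)
        removed = set(candidates[:budget])
        kept = [i for i in kept if i not in removed]
        # Step 5: stop early once fewer than k instances were removed.
        if len(removed) < k:
            break
    return kept
```

Calling aflite(features, labels, target_size=20, k=5) on a list of feature vectors and a list of labels returns the indices of the instances that survive filtering. Setting tau above 1.0 disables removal entirely, which is a convenient sanity check.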

AFLite: Filtered Examples
• AFLite filters out problems that can be solved from word polarity alone.
• This bias is structural rather than token-level, so it is hard to catch with lexical-level heuristics.

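AFLite: Adversarial Filters of Dataset Biases
• Extends AFLite to multiple domains and generalizes it to be model-agnostic.
• Contributions:
  1. Shows that the ideal but intractable AFOpt can be approximated by AFLite. (Skip)
  2. Supports the effectiveness of AFLite with experiments on several vision and NLP datasets.
  3. Shows empirically that models trained on the debiased dataset generalize well.
  4. Shows that filtering with AFLite yields more challenging benchmark datasets.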

AFLite: AFLite (Generalized)
Φ: any feature extractor
M: a family of classification models
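One way to write down the per-instance score that the filter thresholds on is given below, as a sketch in assumed notation: E(i) is the set of held-out predictions collected for instance i from classifiers in M trained on Φ features, y_i is the gold label, and τ is the threshold. None of these symbols appear on the slide itself.

```latex
\tilde{p}(i) = \frac{\left|\{\hat{y} \in E(i) : \hat{y} = y_i\}\right|}{\left|E(i)\right|},
\qquad
\text{instance } i \text{ is a removal candidate only if } \tilde{p}(i) \ge \tau
```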

Experiments

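Experiments: Synthetic Data
Biasing Dataset
• A class-specific artificial feature is injected into 75% of the data; the rest receives a random feature.
• The labels of some biased samples are flipped.
Results
• Even a linear classifier reaches high performance.
• After applying AFLite, performance returns to a sensible level.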
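The biasing procedure could look something like the sketch below. Only the 75% figure comes from the slide. The base task (two concentric circles), the ±1 encoding of the bias feature, the 10% flip rate, and the name make_biased_dataset are all assumptions made for illustration.

```python
import math
import random


def make_biased_dataset(n=1000, biased_frac=0.75, flip_frac=0.1, seed=0):
    """Build a toy dataset whose extra 'bias' feature leaks the label."""
    rng = random.Random(seed)
    biased_ids = set(rng.sample(range(n), int(biased_frac * n)))
    rows = []
    for i in range(n):
        label = i % 2
        # Assumed base task: class 0 on the inner circle, class 1 on the outer one.
        radius = 1.0 if label == 0 else 2.0
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x1, x2 = radius * math.cos(angle), radius * math.sin(angle)
        biased = i in biased_ids
        if biased:
            # Class-specific feature: its sign reveals the label.
            bias = 1.0 if label == 1 else -1.0
        else:
            # Random feature: carries no information about the label.
            bias = rng.choice([-1.0, 1.0])
        # Flip the label of some biased samples so the shortcut is imperfect.
        flipped = biased and rng.random() < flip_frac
        if flipped:
            label = 1 - label
        rows.append({"x1": x1, "x2": x2, "bias": bias, "label": label,
                     "biased": biased, "flipped": flipped})
    return rows
```

A linear classifier cannot separate the two circles from x1 and x2 alone, so any high accuracy it reaches on this data has to come from the bias feature, which is the behavior the slide describes.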

Experiments: NLP: Out-of-distribution Generalization
• Evaluation targets: three out-of-distribution datasets that avoid SNLI's annotation artifacts.
• Zero-shot tests by problem type among the non-entailment cases.

Experiments: In-distribution Benchmark Re-estimation: SNLI
On the AFLite-filtered dataset, performance drops sharply for every model.

Experiments: In-distribution Benchmark Re-estimation: MultiNLI & QNLI

Experiments: Vision: Adversarial Image Classification
• Φ: EfficientNet-B7 features trained on 20% of the ImageNet dataset.
• Evaluated on ImageNet-A, models trained on the AFLite-filtered dataset perform better.

Experiments: In-distribution Image Classification
When the ImageNet dev set is filtered and then used for evaluation, the performance drop is larger.

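Experiments: Filtered Examples
The filtered examples are in line with previously reported issues, such as bias toward particular poses and classification by texture rather than shape.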

References
• Adversarial Filtering
  SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference [EMNLP'18]
  HellaSwag: Can a Machine Really Finish Your Sentence? [ACL'19]
• AFLite
  WinoGrande: An Adversarial Winograd Schema Challenge at Scale [arXiv'19]
  Adversarial Filters of Dataset Biases [ICML'20]

Thank you ✌
If you have further questions or anything you are curious about, feel free to reach out via the contact below.
장성보 (ML Research Scientist, Pingpong) [email protected]


Scatter Lab Inc.
