The Natural Language Decathlon: Multitask Learning as Question Answering
Scatter Lab Inc.
July 10, 2019
Transcript
ScatterLab (스캐터랩): Everyday Conversation AI
Scatterlab ML Technical Seminar, Session 2 (QA)
The Natural Language Decathlon: Multitask Learning as Question Answering (McCann et al., Salesforce Research)
Presenter: 백영민, Machine Learning Engineer
#1. Concept
Multitask Learning: let's train on several tasks together!
Transfer Learning
• Method: apply a model or representation trained with a particular objective to a different downstream task.
  • Typically the objective, such as language modeling, is one that can learn general properties of natural language.
• Advantages:
  • Gives better results than starting from random initialization and enables faster convergence.
• Approaches:
  • Representation: use fixed representations such as Word2Vec and GloVe, or context-aware intermediate representations such as ULMFiT and ELMo, in the downstream task (a separate task model exists).
  • Model: take the model used in pre-training, such as BERT or GPT, and "fine-tune" it on the downstream task.
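Multitask Learning
• Method: train several tasks simultaneously with a single model.
  • For example, train NLP tasks such as chunking, POS tagging, NER, SRL, dependency parsing, and NLI simultaneously with the same model.
• Advantages:
  • Several tasks can be modeled at once.
  • When training goes well, the model can obtain a "General Representation" that is not limited to one specific task.
  • Potential use in zero-shot learning, meta-learning, and similar settings.
• A field of active recent research.
  • It does not yet show performance comparable to single-task learning, but a lot of challenging research is under way.
  • Image classification + NLP
  • "MQAN", from the paper presented here, shows performance that is at least somewhat comparable to single-task learning (the paper predates BERT).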
#2. Method
Approach: let's fit every task into the QA format!
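Question Answering
• Solves the problem of finding the answer to a question "within the context" (assuming the answer is contained in the context).
• Ex) SQuAD, RACE…
• The usual approach is to find the start index and the end index of the answer.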
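The start/end index formulation can be illustrated with a minimal sketch. It assumes a model has already produced one start score and one end score per context token; the function name `best_span`, the `max_len` cap, and the scores themselves are made up for illustration and do not come from the slides.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 30) -> Tuple[int, int]:
    """Return the (start, end) pair with the highest combined score.

    Only spans with start <= end and at most max_len tokens are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Hypothetical per-token scores for an 8-token context.
context = "The Broncos won Super Bowl 50 in 2016".split()
start_scores = [0.1, 0.3, 0.0, 2.5, 0.2, 0.1, 0.0, 0.3]
end_scores = [0.0, 0.2, 0.1, 0.0, 0.4, 1.9, 0.0, 0.4]

s, e = best_span(start_scores, end_scores)
answer = " ".join(context[s:e + 1])
print(answer)
```

Because the answer is read off as a slice of the context, this formulation can only return text that literally appears in the context.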
Idea: recast every task in "QA format" and then do "Multitask Learning"!
QA, Translation, Summary, NLI, Sentiment Analysis
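One way to picture the "QA format" idea is a single (question, context, answer) record shared by every task. The sketch below is an assumption-laden illustration: the `QAExample` type, the `as_qa` helper, and the question wording are hypothetical and are not quoted from the slides or from any dataset.

```python
from typing import Dict, NamedTuple


class QAExample(NamedTuple):
    question: str
    context: str
    answer: str


# Hypothetical question templates, one per task (illustrative wording only).
QUESTION_TEMPLATES: Dict[str, str] = {
    "sentiment": "Is this review positive or negative?",
    "translation": "What is the translation from English to German?",
    "summary": "What is the summary?",
}


def as_qa(task: str, text: str, target: str) -> QAExample:
    """Wrap a task-specific (input, target) pair as a QA triple."""
    return QAExample(QUESTION_TEMPLATES[task], text, target)


examples = [
    as_qa("sentiment", "A warm, funny, engaging film.", "positive"),
    as_qa("translation", "Thank you very much.", "Vielen Dank."),
    as_qa("summary", "A long news article ...", "A short summary ..."),
]
for ex in examples:
    print(ex.question, "|", ex.context, "->", ex.answer)
```

With every task in the same shape, one model can be trained on all of them without task-specific heads; only the question tells the model which task it is solving.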
Decathlon
• Multitask learning over a total of 10 tasks.
Model Architecture: Multitask Question Answering Network (MQAN)
Architecture: Overview
Architecture: I/O
• Input:
  • Q: Question Sentences
  • C: Context Sentences
  • A: Answer Sentences (for generation, autoregressive)
• Output:
  • General QA: find the start and end indices in the context.
  • Select from among Q (Question Sentence) + C (Context Sentences) + Outer Vocabulary (Generation).
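Selecting an output token from the question, the context, or an outer vocabulary can be sketched as a mixture of three distributions. This is a simplified illustration, not the network from the slides: it assumes the attention weights and the three mixing weights are already given and each sums to one, and the name `mix_output_distribution` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def mix_output_distribution(
    vocab_probs: Dict[str, float],
    context_tokens: List[str], context_attn: List[float],
    question_tokens: List[str], question_attn: List[float],
    switch: Tuple[float, float, float],
) -> Dict[str, float]:
    """Blend three token distributions into one.

    switch = (w_vocab, w_context, w_question) and is assumed to sum to 1.
    Attention mass on repeated tokens is accumulated under the same key.
    """
    w_vocab, w_context, w_question = switch
    out: Dict[str, float] = defaultdict(float)
    for tok, p in vocab_probs.items():
        out[tok] += w_vocab * p
    for tok, a in zip(context_tokens, context_attn):
        out[tok] += w_context * a
    for tok, a in zip(question_tokens, question_attn):
        out[tok] += w_question * a
    return dict(out)


# Hypothetical numbers for a sentiment question: the answer word is
# mostly copied from the question itself.
dist = mix_output_distribution(
    vocab_probs={"positive": 0.5, "negative": 0.5},
    context_tokens=["great", "movie"], context_attn=[0.9, 0.1],
    question_tokens=["positive", "or", "negative"],
    question_attn=[0.8, 0.1, 0.1],
    switch=(0.1, 0.1, 0.8),
)
prediction = max(dist, key=dist.get)
print(prediction)
```

In this toy case most of the mass comes from pointing at the question, which is how a classification label can be produced without a dedicated output layer.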
Architecture: Feature - Input Representation
Architecture: Feature - Alignment <why dummy data is inserted>
Architecture: Feature - Dual Coattention
Architecture: Feature - Compression & Self-Attention
Architecture: Feature - Answer Representation
Architecture: Feature - Answer
Training Strategy: Curriculum learning
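Multitask Learning Strategy
• Round-robin Algorithm
  • One of the CPU scheduling methods; it gives every process a fair opportunity to use the computer's resources.
  • Each process is allocated a fixed time slice; when the slice is used up, that process is put on hold for a while and another process gets its turn.
  • Applied to training: each task is allocated a fixed number of batches; when those batches are used up, that task is put on hold for a while and another task gets its turn.
• Fully Joint
  • Fix the order of the tasks and sample batches with the round-robin algorithm.
  • Tasks that converged in few iterations under single-task training do well, but the remaining tasks fall short of the performance they reach when trained alone.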
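The fully joint round-robin schedule can be sketched as a generator that visits tasks in a fixed order and takes a fixed number of batches from each before moving on. The function name, the `batches_per_turn` parameter, and the toy batch labels are assumptions for illustration; every task is assumed to have at least one batch.

```python
from itertools import cycle, islice
from typing import Dict, Iterator, List, Tuple


def round_robin_batches(task_batches: Dict[str, List[str]],
                        batches_per_turn: int = 1) -> Iterator[Tuple[str, str]]:
    """Yield (task, batch) pairs forever, visiting tasks in a fixed order.

    Each task's batch list is cycled, so small datasets are simply revisited.
    """
    streams = {task: cycle(batches) for task, batches in task_batches.items()}
    for task in cycle(task_batches):  # dicts keep insertion order
        for _ in range(batches_per_turn):
            yield task, next(streams[task])


# Toy stand-ins for real batches.
tasks = {"squad": ["s0", "s1"], "sst": ["t0"], "mnli": ["m0", "m1", "m2"]}
schedule = list(islice(round_robin_batches(tasks), 7))
for task, batch in schedule:
    print(task, batch)
```

Note that with equal turns a small dataset is revisited far more often than a large one.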
• Curriculum Strategy
  • Build a curriculum and train the model with it!
  • First Phase: train the relatively easy tasks first -> Second Phase: train the hard tasks.
  • First Phase (SST, QA-SRL, QA-ZRE, WOZ, WikiSQL, MWSC) -> Second Phase (Others)
• Anti-Curriculum Strategy
  • A methodology that goes "against (anti)" the curriculum (Curriculum Learning, Bengio et al. [2009]).
  • Easy tasks cannot learn useful representations that help the other tasks!
  • First Phase: train the hard tasks first -> Second Phase: train the easy tasks.
  • First Phase (SQuAD, IWSLT, CNN/DM, MNLI) -> Second Phase (Others)
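The two phase orderings can be sketched as a task schedule that round-robins over one group for a fixed number of steps and then switches to the other group. The task names are taken from the slides; the function name `two_phase_schedule` and the step counts are hypothetical, and how long the first phase should last is left open here.

```python
from itertools import cycle, islice
from typing import Iterator, List

EASY = ["SST", "QA-SRL", "QA-ZRE", "WOZ", "WikiSQL", "MWSC"]
HARD = ["SQuAD", "IWSLT", "CNN/DM", "MNLI"]


def two_phase_schedule(first_phase: List[str], second_phase: List[str],
                       first_phase_steps: int) -> Iterator[str]:
    """Yield the task to train on at each step.

    Round-robin over first_phase for first_phase_steps steps, then
    round-robin over second_phase indefinitely.
    """
    yield from islice(cycle(first_phase), first_phase_steps)
    yield from cycle(second_phase)


# Curriculum: easy tasks first. Anti-curriculum: hard tasks first.
curriculum = list(islice(two_phase_schedule(EASY, HARD, 6), 8))
anti_curriculum = list(islice(two_phase_schedule(HARD, EASY, 4), 6))
print(curriculum)
print(anti_curriculum)
```

Swapping the two argument lists is the only difference between the curriculum and anti-curriculum variants in this sketch.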
#3. Result
Result: who did best?
Single vs Multitask Training
Training Strategy
Pointer weight distribution
Q & A
Thank you.