The Natural Language Decathlon: Multitask Learning as Question Answering

Scatter Lab Inc.

July 10, 2019

Transcript

  1. ScatterLab (스캐터랩), Everyday Conversation AI. ScatterLab ML Technical Seminar, Session 2 (QA):

    The Natural Language Decathlon: Multitask Learning as Question Answering, McCann et al., Salesforce Research. Presented by 백영민, Machine Learning Engineer.
  2. #1 Concept: Transfer Learning

    • Method: take a model or representation trained with one objective and apply it to a different downstream task.
    • The pre-training objective is usually one that can capture the general properties of natural language, such as language modeling.
    • Advantages:
      • Better results than starting from random initialization, and faster convergence.
    • Approaches:
      • Representation: use fixed representations (Word2Vec, GloVe) or context-aware intermediate representations (ULMFiT, ELMo) in the downstream task, which has its own separate model.
      • Model: take the model used for pre-training (BERT, GPT) and "fine-tune" it on the downstream task.
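  3. #1 Concept: Multitask Learning

    • Method: train several tasks simultaneously with a single model.
      • NLP tasks such as chunking, POS tagging, NER, SRL, dependency parsing, and NLI are learned together in one shared model.
    • Advantages:
      • Several tasks can be modeled at the same time.
      • When training goes well, the model learns a "general representation" that is not limited to one specific task.
      • Potentially applicable to zero-shot learning, meta-learning, and similar settings.
    • An area of active recent research.
      • Performance is not yet comparable to single-task learning, but a lot of challenging work is under way.
      • Image classification + NLP
    • "MQAN", from the paper introduced here, comes reasonably close to single-task learning performance (the paper predates BERT).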
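  4. #2 Approach: Question Answering

    • Solves the problem of finding the answer to a given question "within the context" (assuming the answer is contained in the context).
    • Ex) SQuAD, RACE…
    • The usual approach is to find the start index and end index of the answer.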
  5. #2 Approach: Question Answering

    Idea: cast every task in "QA form" and try "multitask learning"!
    QA, Translation, Summary, NLI, Sentiment Analysis
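One way to picture "every task in QA form" is to wrap each example as a (question, context, answer) triple, with the question naming the task. The sketch below is illustrative only: the `QAExample` type, the helper names, and the exact question wordings are assumptions for this example and may differ from the paper's.

```python
from typing import Dict, NamedTuple


class QAExample(NamedTuple):
    question: str
    context: str
    answer: str


# Hypothetical question templates; the wording is an assumption.
TEMPLATES: Dict[str, str] = {
    "translation": "What is the translation from English to German?",
    "summary": "What is the summary?",
    "sentiment": "Is this sentence positive or negative?",
}


def to_qa(task: str, source: str, target: str) -> QAExample:
    """Wrap a (source, target) pair of a templated task as a QA triple."""
    return QAExample(TEMPLATES[task], source, target)


def nli_to_qa(premise: str, hypothesis: str, label: str) -> QAExample:
    """NLI puts the hypothesis in the question and the premise in the context."""
    question = f"Hypothesis: {hypothesis} -- entailment, neutral, or contradiction?"
    return QAExample(question, premise, label)


examples = [
    to_qa("sentiment", "A stirring, funny and finally transporting film.", "positive"),
    to_qa("translation", "Thank you very much.", "Vielen Dank."),
    nli_to_qa("A man is playing a guitar.", "A person makes music.", "entailment"),
]
for ex in examples:
    print(ex.question, "|", ex.context, "->", ex.answer)
```

Because every task now shares one input/output shape, a single model can be trained on all of them without task-specific heads.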
  6. #2 Architecture: I/O

    • Input:
      • Q: Question Sentences
      • C: Context Sentences
      • A: Answer Sentences (for generation, autoregressive)
    • Output:
      • General QA: finds the start and end index of the answer in the context.
      • Chooses among Q (Question Sentence) + C (Context Sentences) + Outer Vocabulary (Generation).
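Choosing each output token among the question, the context, and an outer vocabulary can be sketched as a weighted mixture of three token distributions. This is a simplified illustration under stated assumptions, not the paper's exact formulation: the function `mix_output`, the three explicit weights, and all numbers below are made up for the example. In a trained model the attention weights and the mixture weights would be predicted at each decoding step.

```python
from collections import defaultdict
from typing import Dict, Sequence


def mix_output(question: Sequence[str], q_attn: Sequence[float],
               context: Sequence[str], c_attn: Sequence[float],
               vocab_probs: Dict[str, float],
               w_question: float, w_context: float,
               w_vocab: float) -> Dict[str, float]:
    """Combine three token distributions into one using mixture weights."""
    if abs(w_question + w_context + w_vocab - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    out: Dict[str, float] = defaultdict(float)
    # Copying from the question: attention mass lands on the attended tokens.
    for tok, p in zip(question, q_attn):
        out[tok] += w_question * p
    # Copying from the context works the same way.
    for tok, p in zip(context, c_attn):
        out[tok] += w_context * p
    # Generating from the outer vocabulary.
    for tok, p in vocab_probs.items():
        out[tok] += w_vocab * p
    return dict(out)


# Toy sentiment example: the answer word appears in the question itself.
dist = mix_output(
    question=["positive", "or", "negative", "?"], q_attn=[0.6, 0.05, 0.3, 0.05],
    context=["a", "great", "movie"], c_attn=[0.2, 0.5, 0.3],
    vocab_probs={"positive": 0.1, "the": 0.9},
    w_question=0.7, w_context=0.2, w_vocab=0.1,
)
prediction = max(dist, key=dist.get)
print(prediction)
```

The example shows why a question pointer is useful for classification-style tasks: the label can be copied from the question instead of being generated from scratch.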
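  7. #3 Training Strategy: Multitask Learning Strategy

    • Round-robin Algorithm
      • A CPU scheduling method that gives every process a "fair" chance to use the computer's resources.
      • Each process is allotted a fixed time slice; when the slice expires, the process is suspended briefly and another process gets its turn.
      • Applied to training: each task is allotted a fixed number of batches; once those batches are used, the task is suspended briefly and another task gets its turn.
    • Fully Joint
      • Fix the order of the tasks and sample batches with the round-robin algorithm.
      • Tasks that converged in few iterations under single-task training do well, but the rest do not reach the performance they get when trained as a single task.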
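The round-robin batch sampling described on this slide can be sketched with the standard library. This is a minimal sketch: the function name `round_robin_batches`, the `batches_per_turn` parameter, and the placeholder batch names are assumptions for illustration. Tasks are visited in a fixed order, and each task restarts from its first batch once it runs out.

```python
from itertools import cycle, islice
from typing import Dict, Iterator, List, Tuple


def round_robin_batches(task_batches: Dict[str, List[str]],
                        batches_per_turn: int = 1) -> Iterator[Tuple[str, str]]:
    """Yield (task, batch) pairs forever, visiting tasks in a fixed cyclic order."""
    if not task_batches or any(not b for b in task_batches.values()):
        raise ValueError("every task needs at least one batch")
    # One endless iterator per task, so small datasets simply wrap around.
    iters = {task: cycle(batches) for task, batches in task_batches.items()}
    for task in cycle(task_batches):  # dict order fixes the task order
        for batch in islice(iters[task], batches_per_turn):
            yield task, batch


schedule = round_robin_batches({
    "SQuAD": ["s0", "s1"],
    "SST": ["t0"],
    "MNLI": ["m0", "m1"],
})
first_six = list(islice(schedule, 6))
print(first_six)
```

Every task gets the same number of turns regardless of dataset size, which is one possible reason small, quickly converging tasks fare well under this scheme while larger ones lag behind.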
  8. #3 Training Strategy: Multitask Learning Strategy

    • Curriculum Strategy
      • Build a curriculum and train with it!
      • First Phase: learn the relatively easy tasks first -> Second Phase: learn the hard tasks
      • First Phase (SST, QA-SRL, QA-ZRE, WOZ, WikiSQL, MWSC) -> Second Phase (Others)
  9. #3 Training Strategy: Multitask Learning Strategy

    • Anti-Curriculum Strategy
      • A strategy that goes "against (anti)" the curriculum. Curriculum Learning: Bengio et al. [2009]
      • Easy tasks cannot learn useful representations that help the other tasks!
      • First Phase: learn the hard tasks first -> Second Phase: learn the easy tasks
      • First Phase (SQuAD, IWSLT, CNN/DM, MNLI) -> Second Phase (Others)
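The two phased strategies differ only in which task group goes first, so both can be sketched with one helper. The sketch below rests on assumptions: the function `active_tasks`, the `phase_one_steps` cutoff, and the `joint_second_phase` flag are hypothetical names, and "Others" is read as the remaining tasks of the ten. By default the second phase trains only the remaining tasks, as the slides state; the flag shows the alternative of continuing with all tasks.

```python
from typing import List

# First-phase task groups as listed on the slides.
CURRICULUM_FIRST = ["SST", "QA-SRL", "QA-ZRE", "WOZ", "WikiSQL", "MWSC"]
ANTI_CURRICULUM_FIRST = ["SQuAD", "IWSLT", "CNN/DM", "MNLI"]
# Assumption: "Others" means the rest of these ten tasks.
ALL_TASKS = ANTI_CURRICULUM_FIRST + CURRICULUM_FIRST


def active_tasks(step: int, first_phase: List[str], phase_one_steps: int,
                 joint_second_phase: bool = False) -> List[str]:
    """Return the tasks to sample batches from at a given training step."""
    if step < phase_one_steps:
        return list(first_phase)
    if joint_second_phase:
        # Alternative reading: keep the first-phase tasks and add the rest.
        return list(ALL_TASKS)
    return [t for t in ALL_TASKS if t not in first_phase]


# Curriculum: easy tasks first; anti-curriculum: hard tasks first.
print(active_tasks(0, CURRICULUM_FIRST, phase_one_steps=1000))
print(active_tasks(0, ANTI_CURRICULUM_FIRST, phase_one_steps=1000))
print(active_tasks(1000, ANTI_CURRICULUM_FIRST, phase_one_steps=1000))
```

The returned task list could then be fed to a batch sampler such as a round-robin scheduler, so the phase logic stays separate from the sampling logic.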