Deep contextualized word representations

Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2227--2237
Presenter: 勝田哲弘, Natural Language Processing Laboratory

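Abstract
• Proposes a new model that captures complex characteristics of word use (syntax, semantics) and how usage varies with context (polysemy)
• Word vectors are derived from the internal states of a bidirectional language model (biLM) pre-trained on a large text corpus
• Easy to add to existing models, with improvements observed across a variety of NLP tasks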
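One common way to add contextual vectors to an existing model is to concatenate them with the token embeddings the model already uses. The sketch below assumes that approach. The function name `add_elmo_to_input` and the toy vectors are hypothetical, and the ELMo vectors are assumed to have been computed elsewhere.

```python
def add_elmo_to_input(token_embeddings, elmo_vectors):
    """Concatenate each token's existing embedding with its ELMo vector.

    Both arguments are lists with one vector (list of floats) per token.
    """
    if len(token_embeddings) != len(elmo_vectors):
        raise ValueError("one ELMo vector is needed per token")
    return [list(x) + list(e) for x, e in zip(token_embeddings, elmo_vectors)]


# Toy two-token sentence: 2-dim static embeddings, 3-dim contextual vectors.
static = [[0.1, 0.2], [0.3, 0.4]]
contextual = [[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]]
enhanced = add_elmo_to_input(static, contextual)
```

The downstream model then consumes the wider vectors in place of the original embeddings, so only its input dimension has to change.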

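Introduction
• Unlike conventional word-type embedding methods, the proposed model computes each word's representation from the input sentence
  ◦ ELMo (Embeddings from Language Models)
  ◦ Learns a linear combination of the LSTM layers
• Results indicate that lower layers capture syntax, while higher layers capture context-dependent aspects of word meaning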

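ELMo: Embeddings from Language Models
• Word embeddings are computed as a weighted combination of the layers of a 2-layer biLM
[Figure: stacked biLSTM layers over the input tokens t1, t2, ..., tN, combined into the ELMo vector. Legend: per-layer weights (softmax-normalized); an overall scaling weight.]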
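The weighting on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the raw per-layer weights are softmax-normalized, and the mixed vector is multiplied by one overall scale. The names `softmax`, `elmo_vector`, and `gamma` and the toy vectors are assumptions for illustration. Counting the token layer alongside the two biLSTM layers (three vectors per token) is also an assumption beyond what the slide states.

```python
import math


def softmax(weights):
    """Normalize raw layer weights so they are positive and sum to 1."""
    m = max(weights)  # subtract the max for numerical stability
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def elmo_vector(layer_states, raw_layer_weights, gamma=1.0):
    """Mix one token's per-layer biLM vectors into a single vector.

    layer_states: one vector per layer, all of the same dimension.
    raw_layer_weights: one unnormalized scalar per layer (task-specific).
    gamma: overall scalar scale (task-specific).
    """
    if len(layer_states) != len(raw_layer_weights):
        raise ValueError("need exactly one weight per layer")
    s = softmax(raw_layer_weights)
    dim = len(layer_states[0])
    mixed = [0.0] * dim
    for weight, state in zip(s, layer_states):
        for i in range(dim):
            mixed[i] += weight * state[i]
    return [gamma * x for x in mixed]


# Toy example: 3 layers, 4-dim vectors, equal raw weights, scale of 2.
layers = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
vec = elmo_vector(layers, raw_layer_weights=[0.0, 0.0, 0.0], gamma=2.0)
```

With equal raw weights each layer contributes one third, so the first three components of `vec` are 2/3 and the last is 0. In a real model the raw weights and `gamma` would be trained together with the downstream task.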

Evaluation

Question answering
• Stanford Question Answering Dataset (SQuAD) (Rajpurkar et al., 2016)
• 100K+ crowdsourced question-answer pairs
• Answers are taken from a given Wikipedia paragraph

Textual entailment
• The task of determining whether a hypothesis is true
• Stanford Natural Language Inference (SNLI) corpus (Bowman et al., 2015)
• Approximately 550K hypothesis/premise pairs

Semantic role labeling
• Predicate-argument structure analysis
• Often framed as answering “Who did what to whom”
• OntoNotes benchmark (Pradhan et al., 2013)
• The state-of-the-art system models the task as BIO tagging (He et al., 2017)
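To make the BIO formulation concrete, the sketch below decodes a BIO tag sequence into labeled spans. It is a generic illustration, not the system of He et al. (2017). The function name `bio_to_spans`, the example sentence, and the role labels (`ARG0`, `V`, `ARG1`, `ARG2`) are assumptions chosen for illustration.

```python
def bio_to_spans(tokens, tags):
    """Convert a BIO tag sequence into a list of (label, text) spans."""
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span, closing any open one.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with the same label extends the open span.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span,
            # closes the open span; the token itself is left unlabeled.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


tokens = ["Mary", "sold", "the", "book", "to", "John"]
tags = ["B-ARG0", "B-V", "B-ARG1", "I-ARG1", "B-ARG2", "I-ARG2"]
spans = bio_to_spans(tokens, tags)
```

Here `spans` answers "who did what to whom" directly: Mary (`ARG0`) sold (`V`) the book (`ARG1`) to John (`ARG2`).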

Coreference resolution
• Anaphora resolution
• OntoNotes coreference annotations from the CoNLL 2012 shared task (Pradhan et al., 2012)

Named entity extraction
• Named entity recognition (NER)
• CoNLL 2003 NER task (Sang and Meulder, 2003)
• Reuters RCV1 corpus tagged with four entity types (PER, LOC, ORG, MISC)

Sentiment analysis
• Stanford Sentiment Treebank (SST-5; Socher et al., 2013)
• Classification into five labels (from very negative to very positive)

Evaluation

Analysis
• Alternate layer weighting schemes

Where to include ELMo?

What information is captured by the biLM’s representations?

What information is captured by the biLM’s representations?

Sample efficiency

Visualization of learned weights

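Conclusion
• Proposes a model that learns context-dependent representations
• Using ELMo improves performance on a variety of NLP tasks
• Confirms that the individual biLM layers learn different kinds of information (syntactic and semantic)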
