EMNLP 2018 Paper Reading Group: Deriving Machine Attention from Human Rationales

Yuta Koreeda
December 09, 2018

Transcript

  1. Deriving Machine Attention from Human Rationales @koreyou 2018/12/9 EMNLP 2018 Paper Reading Group @ CyberAgent, Tokyo
  2. Who am I? Yuta Koreeda, researcher at an electronics company. Research history: • up to 2015: healthcare + robotics (university) • up to 2016: robotics + applied machine learning • since then: applied machine learning + natural language processing. koreyou koreyou_
  4. tl;dr Bao et al. 2018. Deriving Machine Attention from Human Rationales. EMNLP. Goal: improve classification accuracy in low-resource domains using data that marks the passages justifying each classification. Method: learn a domain-independent mapping from rationales to attention. • Rationale: a human-annotated passage of text that justifies the classification. Result: outperformed baselines on transfer across aspects and across domains in aspect-based sentiment analysis.
  5. Table of contents 1. Background 2. Proposed method 3. Experiments 4. Summary
  6. Deriving Machine Attention from Human Rationales Yujia Bao1, Shiyu Chang2, Mo Yu2, Regina Barzilay1 1Computer Science and Artificial Intelligence Lab, MIT 2MIT-IBM Watson AI Lab, IBM Research {yujia, regina}@csail.mit.edu, shiyu.chang@ibm.com, yum@us.ibm.com
  7. Background

  8. Aspect-based sentiment analysis. This study uses aspect-based sentiment analysis as its vehicle, but the approach is applicable to natural language tasks in general. • A concrete task is introduced first for clarity. Aspect-based sentiment analysis (ABSA): • classify whether the input text is positive or negative about each aspect. In this talk, read "domain" as "aspect". Example: "a nice and clean hotel to stay for business and leisure . but the location is not good if you need public transport . it took too long for transport and waiting for bus . but the swimming pool looks good ." with the aspects Location and Cleanliness.
  9. Rationale. Rationale-providing AI is drawing attention. Rationale = the passage of text that justifies a classification. • It explains why the model made its prediction, making the AI interpretable. Research on providing rationales is attracting attention [1, 2]. Example: in the hotel review above, different passages serve as rationales for the Location and Cleanliness labels.
  10. Research goal: use rationale data to improve accuracy in low-resource domains. Can we achieve high classification accuracy from a small amount of data by training not only on labeled classifications but also on the rationales for why those classifications should be made?

  11. Proposed method

  12. Using the attention mechanism for text classification. An attention mechanism improves text classification accuracy. Attention as pooling (a sketch follows below): • compute an attention value (a real number) from each word representation • take a weighted sum of the word representations using those values. Task: hotel location, label: negative, over the hotel review example above.
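
    A minimal sketch of attention pooling in PyTorch (the module name and the single-linear-layer scorer are illustrative assumptions, not the authors' exact architecture):

        import torch
        import torch.nn as nn

        class AttentionPooling(nn.Module):
            def __init__(self, hidden_dim):
                super().__init__()
                self.scorer = nn.Linear(hidden_dim, 1)  # one scalar score per word

            def forward(self, word_reprs, mask):
                # word_reprs: (batch, seq_len, hidden_dim), e.g. BiLSTM outputs
                # mask: (batch, seq_len); 1 for real tokens, 0 for padding
                scores = self.scorer(word_reprs).squeeze(-1)  # (batch, seq_len)
                scores = scores.masked_fill(mask == 0, -1e9)  # never attend to padding
                attention = torch.softmax(scores, dim=-1)     # sums to 1 per sentence
                pooled = (attention.unsqueeze(-1) * word_reprs).sum(dim=1)
                return pooled, attention  # pooled feeds the classification head

    The pooled vector plays the role of max/average pooling in front of the classifier, and the returned attention is the quantity the following slides supervise and transfer.
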
  13. Attention vs. Rationale. Convert rationales into attention-like values. Attention ≠ Rationale: • attention is continuous (it has degrees of strength) while rationales are binary • attention is optimized to maximize classification accuracy. Rather than feeding rationales into training directly, might it be better to first convert them into attention? ⇒ R2A (rationale to attention). • The attention best suited to learning the classifier is called the oracle attention. Task: hotel location, label: negative; in the example review, the attention and the rationale highlight different spans.
  14. Key point of the proposed method: transfer the rationale-to-oracle-attention mapping across domains. Oracle attention can be obtained by training the classifier on a large amount of labeled data, so it cannot be obtained in domains (aspects) with little labeled data. Hypothesis: the mapping from rationales to oracle attention is shared across domains • e.g. within a rationale, content words receive stronger attention. Transfer the mapping learned on data-rich domains.
  15. Overview of the proposed method [Pipeline figure: on the source domains (aspects), where data is plentiful, a classifier is trained from texts and labels to produce oracle attention, and R2A is trained to map rationales to that attention; on the target domain (aspect), where data is scarce, the trained R2A generates oracle attention from rationales, and the target classifier is trained from texts, labels, and the generated attention.]
  16. Problem setting (data)

      Data split                  | Labeled examples | Rationale annotations
      ----------------------------|------------------|-------------------------
      Source domains (training)   | many             | many (pseudo-generated)
      Target domain (training)    | few              | few
      Target domain (evaluation)  | none             | none
  17. Proposed method: overall architecture. (a) A classifier that produces the oracle attention. (b) Map every domain into a shared space. (c) Generate attention from rationales.

  18. Proposed method (a): on the source domains, jointly learn attention generation and classification. Source domains (one or more): word sequence ⇒ attention ⇒ classification label; the generated attention is passed on as the supervision data for R2A in the later step.

  19. Proposed method (b): map every domain into a shared space • a language-modeling loss so that enc extracts rich features • a Wasserstein-distance loss so that h_inv lies in a shared space. The language-modeling loss also covers the target domain.

  20. Proposed method (b), continued: the Wasserstein distance pulls the source and target representations into the same feature space.
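
    A minimal sketch of the Wasserstein-distance loss in its dual (critic) form, WGAN-style (how the Lipschitz constraint is enforced and how updates are scheduled are not on this slide, so treat those details as assumptions):

        import torch
        import torch.nn as nn

        class Critic(nn.Module):
            """Scores representations; trained to tell source from target apart."""
            def __init__(self, hidden_dim):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                    nn.Linear(hidden_dim, 1))

            def forward(self, h):
                return self.net(h).squeeze(-1)

        def wasserstein_loss(critic, h_src, h_tgt):
            # Dual form: sup over 1-Lipschitz f of E[f(h_src)] - E[f(h_tgt)].
            # The critic maximizes this estimate (weight clipping or a gradient
            # penalty would enforce the Lipschitz constraint); the encoder
            # minimizes it, pushing both domains into the same feature space.
            return critic(h_src).mean() - critic(h_tgt).mean()
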

  21. Proposed method (c): learn the mapping from rationales to oracle attention. The gold rationale is concatenated to the shared representation, and the module is trained against the oracle attention.
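
    A minimal sketch of such an R2A module (the recurrent scorer and the layer sizes are illustrative assumptions; the paper specifies the actual architecture):

        import torch
        import torch.nn as nn

        class R2A(nn.Module):
            def __init__(self, hidden_dim):
                super().__init__()
                # +1 input feature: the binary rationale bit for each word
                self.rnn = nn.LSTM(hidden_dim + 1, hidden_dim,
                                   batch_first=True, bidirectional=True)
                self.scorer = nn.Linear(2 * hidden_dim, 1)

            def forward(self, h_inv, rationale, mask):
                # h_inv: (batch, seq_len, hidden_dim) domain-invariant representations
                # rationale: (batch, seq_len) binary 0/1 annotations
                x = torch.cat([h_inv, rationale.unsqueeze(-1).float()], dim=-1)
                out, _ = self.rnn(x)
                scores = self.scorer(out).squeeze(-1).masked_fill(mask == 0, -1e9)
                return torch.softmax(scores, dim=-1)  # trained to match the oracle attention
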

  22. Proposed method: end-to-end training. On the source domains, the model is trained with multi-task learning over all of the above objectives.
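
    Schematically, the source-side objective ties the pieces above together (the λ weighting shown here is an assumption; the paper defines the exact objective):

        L_total = L_classification + λ_lm · L_language_model + λ_wd · L_wasserstein + λ_a · L_R2A

    That is, classification, language modeling, adversarial domain alignment, and rationale-to-attention regression are optimized jointly: the four-task multi-task learning referred to in the closing comments.
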

  23. Proposed method: transferring attention. Using the model trained on the source domains, generate oracle attention from the gold rationales on the target domain.

  24. Proposed method: training the classifier. On the target domain, train the classifier with multi-task learning from the generated oracle attention and the classification labels.
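
    A minimal sketch of that target-side multi-task loss, supervising the classifier's attention with the R2A output (cosine distance is the similarity the paper reports in Table 6; using it as the training loss, and the weight lam, are assumptions):

        import torch
        import torch.nn.functional as F

        def target_loss(logits, labels, model_attention, r2a_attention, lam=1.0):
            # Usual supervised classification loss on the few target labels.
            label_loss = F.cross_entropy(logits, labels)
            # Pull the classifier's attention toward the R2A-generated attention.
            cos = F.cosine_similarity(model_attention, r2a_attention, dim=-1)
            attention_loss = (1.0 - cos).mean()
            return label_loss + lam * attention_loss
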

  25. Experiments

  26. Transfer between aspects: the proposed method is effective. Adding rationales improves classification accuracy, and the proposed method, which goes through oracle attention, improves it further. • This confirms attention ≠ rationale.

      Source            | Target      | SVM   | RA-SVM‡ | RA-CNN‡ | TRANS† | RA-TRANS‡† | OURS‡† | ORACLE†
      ------------------|-------------|-------|---------|---------|--------|------------|--------|--------
      Beer aroma+palate | Beer look   | 74.41 | 74.83   | 74.94   | 72.75  | 76.41      | 79.53  | 80.29
      Beer look+palate  | Beer aroma  | 68.57 | 69.23   | 67.55   | 69.92  | 76.45      | 77.94  | 78.11
      Beer look+aroma   | Beer palate | 63.88 | 67.82   | 65.72   | 74.66  | 73.40      | 75.24  | 75.50

      Table 3: Accuracy of transferring between aspects. Models with † use labeled data from source aspects. Models with ‡ use human rationales on the target aspect. (RA-SVM/RA-CNN: baselines that use rationales in training; TRANS: transfer learning without rationales; RA-TRANS: learns attention from rationales; ORACLE: uses the target domain's true oracle attention.)
  27. Transfer between domains: the proposed method is effective here too; the same trend holds for cross-domain transfer. ORACLE, which uses the target domain's true oracle attention, performs better still.

      Source                               | Target            | SVM   | RA-SVM‡ | RA-CNN‡ | TRANS† | RA-TRANS‡† | OURS‡† | ORACLE†
      -------------------------------------|-------------------|-------|---------|---------|--------|------------|--------|--------
      Beer look + Beer aroma + Beer palate | Hotel location    | 78.65 | 79.09   | 79.28   | 80.42  | 82.10      | 84.52  | 85.43
                                           | Hotel cleanliness | 86.44 | 86.68   | 89.01   | 86.95  | 87.15      | 90.66  | 92.09
                                           | Hotel service     | 85.34 | 86.61   | 87.91   | 87.37  | 86.40      | 89.93  | 92.42

      Table 4: Accuracy of transferring between domains. Models with † use labeled data from source domains and unlabeled data from the target domain. Models with ‡ use human rationales on the target task.
  28. Evaluating each component. In the model trained with the Wasserstein-distance loss, h_inv does end up in a shared space. The attention generated by R2A is closer to the oracle attention than the raw rationales are.

      [Figure 5: t-SNE visualization of the learned hidden representations for beer reviews (blue circles) and hotel reviews (orange triangles): (a) OURS, (b) OURS w/o L_wd.]

      Target      | Human rationales | R2A-generated attention
      ------------|------------------|------------------------
      Location    | 0.5185           | 0.2371
      Cleanliness | 0.5948           | 0.3141
      Service     | 0.5833           | 0.2871

      Table 6: Avg. cosine distance to the oracle attention over the target training set. The R2A is trained on beer reviews with unlabeled hotel reviews.
  29. Efficiency of using rationales: annotating rationales is more efficient than adding data. One might object that instead of creating rationale data we could simply label more examples. With rationales, however, the same classification accuracy is reached with only 6.5%-50% of the data.

      [Figure 7: Learning curve of an attention-based classifier on three tasks: hotel location (left), hotel cleanliness (center), hotel service (right). The performance of the proposed approach trained on 200 examples with human rationales (84.52, 90.66, and 89.93 respectively) is shown as a reference.]
  30. Summary

  31. Summary. Bao et al. 2018. Deriving Machine Attention from Human Rationales. EMNLP. Goal: improve classification accuracy in low-resource domains using data that marks the passages justifying each classification. Method: learn a domain-independent mapping from rationales to attention. • Rationale: a human-annotated passage of text that justifies the classification. Result: outperformed baselines on transfer across aspects and across domains in aspect-based sentiment analysis.
  32. Comments. The paper poses various interesting hypotheses about the relationship between rationales and attention. Multi-task training over four tasks, including adversarial learning, must have been hellish to get working... I would have liked an ablation study showing which parts of the model actually mattered.

  33. Get the presentation slides from: https://bit.ly/2Us6tFn https://github.com/koreyou/emnlp2018-meetup.git This presentation is licensed under CC0 1.0 Universal (CC0 1.0) Public Domain Dedication (except figures derived from the original paper [3]).
  34. References
      [1] Tao Lei, Regina Barzilay, and Tommi Jaakkola. "Rationalizing Neural Predictions". In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016, pp. 107–117.
      [2] Wang Ling et al. "Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems". In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017, pp. 158–167.
      [3] Yujia Bao et al. "Deriving Machine Attention from Human Rationales". In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2018. URL: http://www.aclweb.org/anthology/D18-1216.