Slide 1

Slide 1 text

Deriving Machine Attention from Human Rationales @koreyou 2018/12/9 EMNLP 2018 Reading Group @ CyberAgent, Tokyo

Slide 2

Slide 2 text

Who am I? Yuta Koreeda (是枝祐太), researcher at an electronics company. Research history • until 2015: medical + robotics (university) • until 2016: robotics + applied machine learning • present: applied machine learning + natural language processing. koreyou koreyou_


Slide 4

Slide 4 text

tl;dr Bao et al. 2018. Deriving Machine Attention from Human Rationales. EMNLP. Goal: improve classification accuracy in low-resource domains using data that marks the evidence behind each classification. Method: learn a domain-independent rationale-to-attention mapping • Rationale: a human-annotated text span that justifies a classification. Result: outperformed baselines on aspect transfer and domain transfer in aspect-based sentiment analysis.

Slide 5

Slide 5 text

Table of contents 1. Background 2. Proposed method 3. Experiments 4. Summary

Slide 6

Slide 6 text

Deriving Machine Attention from Human Rationales Yujia Bao1, Shiyu Chang2, Mo Yu2, Regina Barzilay1 1Computer Science and Artificial Intelligence Lab, MIT 2MIT-IBM Watson AI Lab, IBM Research {yujia, regina} @csail.mit.edu, [email protected], [email protected]

Slide 7

Slide 7 text

Background

Slide 8

Slide 8 text

Aspect-based sentiment analysis — This work uses aspect-based sentiment analysis as its vehicle, but the approach applies to natural-language tasks in general • a concrete task is introduced first for clarity. Aspect-based sentiment analysis (ABSA) • classifies whether the input text is positive or negative about each aspect. In this talk, read "domain" as "aspect". Example: "a nice and clean hotel to stay for business and leisure . but the location is not good if you need public transport . it took too long for transport and waiting for bus . but the swimming pool looks good ." (aspects: Location, Cleanliness)

Slide 9

Slide 9 text

Rationales — AI that presents its rationales is attracting attention. A rationale is the text span that serves as evidence for a classification • it offers an interpretation of why the prediction was made, making the AI explainable. Research on presenting rationales is drawing attention [1, 2]. Example: "a nice and clean hotel to stay for business and leisure . but the location is not good if you need public transport . it took too long for transport and waiting for bus . but the swimming pool looks good ." (spans marked for Location and Cleanliness)

Slide 10

Slide 10 text

Research goal — Use rationale data to improve accuracy in low-resource domains. Can high classification accuracy be achieved from a small amount of data by adding, on top of the labeled classification data, rationales for why each example should be classified that way?

Slide 11

Slide 11 text

Proposed method

Slide 12

Slide 12 text

Using the attention mechanism for text classification — Attention improves text-classification accuracy. Attention as pooling • compute an attention value (a real number) from each word representation • take the attention-weighted sum of the word representations. Task: Hotel location, label: negative. Example: "a nice and clean hotel to stay for business and leisure . but the location is not good if you need public transport . it took too long for transport and waiting for bus . but the swimming pool looks good ."
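The pooling step can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' architecture: the scoring vector `w`, the softmax scoring function, and the dimensions are all assumptions.

```python
import numpy as np

def attention_pool(H, w):
    """Attention as pooling: H is (seq_len, dim) word representations,
    w is a (dim,) scoring vector. Returns the attention weights and the
    attention-weighted sum of the word vectors."""
    scores = H @ w                           # one real-valued score per word
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights = weights / weights.sum()        # attention distribution over words
    pooled = weights @ H                     # weighted sum -> sentence vector
    return weights, pooled

rng = np.random.default_rng(0)
H = rng.standard_normal((6, 4))  # 6 words, 4-dim representations (toy sizes)
w = rng.standard_normal(4)       # toy scoring vector; learned in a real model
weights, pooled = attention_pool(H, w)
```

The pooled vector then feeds a classifier head; the weights are what get compared against rationales later in the talk.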

Slide 13

Slide 13 text

Attention vs. rationale — Convert rationales into attention-like weights. Attention ≠ rationale • attention is continuous (graded) while a rationale is binary • attention is optimized to maximize classification accuracy. Rather than training on rationales directly, might it be better to first convert them into attention-like weights? ⇒ R2A (rationale to attention) • the attention best suited for classification training is called the oracle attention. Task: Hotel location, label: negative. Example: "a nice and clean hotel to stay for business and leisure . but the location is not good if you need public transport . it took too long for transport and waiting for bus . but the swimming pool looks good ." (attention and rationale shown over the text)
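The contrast can be made concrete with toy values (all numbers below are made up for illustration): a rationale is a binary mask over tokens, while attention is a graded distribution over the same tokens.

```python
import numpy as np

tokens = ["the", "location", "is", "not", "good"]

# Rationale: a human marks *where* the evidence is -- a binary mask.
rationale = np.array([0, 1, 1, 1, 1])

# Attention: a classifier grades *how much* each word contributes,
# as a continuous distribution summing to 1.
attention = np.array([0.02, 0.30, 0.03, 0.35, 0.30])
```

R2A's job, discussed next, is to bridge exactly this gap.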

Slide 14

Slide 14 text

Key point of the proposed method — Transfer the rationale-to-oracle-attention mapping across domains. Oracle attention can be obtained by training a classifier on a large amount of labeled data; in a domain (aspect) with little labeled data it cannot. Hypothesis: the rationale-to-oracle-attention mapping is common across domains • e.g., within a rationale, attention falls most strongly on content words. Therefore, transfer the mapping learned in data-rich domains.

Slide 15

Slide 15 text

Flow of the proposed method — (Figure: pipeline diagram. Source domains (aspects), with abundant data: documents, labels, and rationales train a classifier, which yields oracle attention; rationales and oracle attention then train R2A. Target domain (aspect), with scarce data: R2A generates attention from documents and rationales, and a classifier is trained from that attention and the labels. Sample review shown in the figure: "you get what you pay for . not the cleanest rooms but bed was clean and so was bathroom . bring your own towels though as very thin . service was excellent , ......")

Slide 16

Slide 16 text

Problem setup (data)

Data split | Labeled examples | Rationale annotations
Source domains (training) | many | many (pseudo-generated)
Target domain (training) | few | few
Target domain (evaluation) | none | none

Slide 17

Slide 17 text

Proposed method: overall architecture — (a) a classifier that produces the oracle attention, (b) mapping each domain into a shared space, (c) generating attention from rationales.

Slide 18

Slide 18 text

Proposed method (a) — Learn attention generation and classification on the source domains; the generated attention becomes the supervision for R2A. Source domains (one or more): word sequence → attention (⇒ used as supervision for the later task) → classification label.

Slide 19

Slide 19 text

Proposed method (b) — Map each domain into a shared space • a language-modeling objective so that enc extracts rich features • a Wasserstein-distance loss so that h_inv forms a shared space. (Figure: language-model loss; target domain.)

Slide 20

Slide 20 text

Proposed method (b), continued — Map each domain into a shared space • a language-modeling objective so that enc extracts rich features • a Wasserstein-distance loss so that h_inv forms a shared space. (Figure: the Wasserstein distance pulls both domains into the same feature space.)
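To give intuition for the Wasserstein loss, here is a toy sketch using the closed-form empirical Wasserstein-1 distance between two one-dimensional samples. The paper instead trains an adversarial critic over high-dimensional encodings; this sketch only illustrates the quantity being driven down so that the two domains overlap.

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples:
    sort both and average the coordinate-wise gaps."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

rng = np.random.default_rng(0)
source_feats = rng.normal(0.0, 1.0, 1000)  # stand-in for source-domain encodings
target_feats = rng.normal(2.0, 1.0, 1000)  # stand-in for target-domain encodings

# A large distance means the two domains occupy different regions of the
# feature space; minimizing it is what aligns h_inv across domains.
gap = wasserstein_1d(source_feats, target_feats)
```

When the two samples are drawn from the same distribution the distance shrinks toward zero, which is the regime the shared space aims for.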

Slide 21

Slide 21 text

Proposed method (c) — Learn the conversion from rationales to the oracle attention. (Figure: gold rationale, oracle attention, concat.)
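A minimal caricature of what R2A learns: grade a binary rationale into a continuous, attention-like distribution. The per-token scores below are hand-picked stand-ins; the real model predicts this grading from contextual encodings of the text and the rationale.

```python
import numpy as np

def rationale_to_attention(rationale, content_score):
    """Turn a binary rationale mask into a graded distribution:
    keep mass inside the rationale span, weighted by per-token scores."""
    masked = rationale * np.exp(content_score)
    return masked / masked.sum()

tokens = ["the", "location", "is", "not", "good"]
rationale = np.array([0.0, 1.0, 1.0, 1.0, 1.0])      # human-marked span
content_score = np.array([0.0, 2.0, 0.1, 1.5, 1.8])  # assumed, not learned

att = rationale_to_attention(rationale, content_score)
# Content words inside the span ("location", "not", "good") get most of
# the weight; tokens outside the rationale get none.
```

This matches the hypothesis on the earlier slide: within a rationale, attention concentrates on content words.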

Slide 22

Slide 22 text

Proposed method: end-to-end training — The source-domain model is trained with multi-task learning.

Slide 23

Slide 23 text

Proposed method: transferring attention — Using the model trained on the source domains, generate oracle-like attention from the rationales in the target domain. (Figure: gold rationale → generated attention.)

Slide 24

Slide 24 text

Proposed method: training the classifier — In the target domain, train the classifier with multi-task learning on the oracle attention and the classification labels.

Slide 25

Slide 25 text

Experiments

Slide 26

Slide 26 text

Transfer between aspects — Confirms the effectiveness of the proposed method: adding rationales improves classification accuracy, and routing them through the oracle attention (the proposed method) improves it further • confirming attention ≠ rationale.

Source | Target | SVM | RA-SVM‡ | RA-CNN‡ | TRANS† | RA-TRANS‡† | OURS‡† | ORACLE†
Beer aroma+palate | Beer look | 74.41 | 74.83 | 74.94 | 72.75 | 76.41 | 79.53 | 80.29
Beer look+palate | Beer aroma | 68.57 | 69.23 | 67.55 | 69.92 | 76.45 | 77.94 | 78.11
Beer look+aroma | Beer palate | 63.88 | 67.82 | 65.72 | 74.66 | 73.40 | 75.24 | 75.50

Table 3: Accuracy of transferring between aspects. Models with † use labeled data from source aspects. Models with ‡ use human rationales on the target aspect. (RA-SVM/RA-CNN: baselines that use rationales in training; TRANS: transfer learning without rationales; RA-TRANS: learns attention from rationales; ORACLE: uses the target domain's oracle attention.)

Slide 27

Slide 27 text

Transfer between domains — The same trend holds for cross-domain transfer; the ORACLE model, which uses the target domain's oracle attention, performs even better.

Source | Target | SVM | RA-SVM‡ | RA-CNN‡ | TRANS† | RA-TRANS‡† | OURS‡† | ORACLE†
Beer look + aroma + palate | Hotel location | 78.65 | 79.09 | 79.28 | 80.42 | 82.10 | 84.52 | 85.43
Beer look + aroma + palate | Hotel cleanliness | 86.44 | 86.68 | 89.01 | 86.95 | 87.15 | 90.66 | 92.09
Beer look + aroma + palate | Hotel service | 85.34 | 86.61 | 87.91 | 87.37 | 86.40 | 89.93 | 92.42

Table 4: Accuracy of transferring between domains. Models with † use labeled data from source domains and unlabeled data from the target domain. Models with ‡ use human rationales on the target task.

Slide 28

Slide 28 text

Evaluating each component — With the Wasserstein-distance loss, h_inv does form a shared space, and R2A-generated attention is closer to the oracle attention than the raw rationales are.

Figure 5: t-SNE visualization of the learned hidden representations for beer reviews (blue circles) and hotel reviews (orange triangles): (a) OURS, (b) OURS w/o L_wd.

Target | Human rationales | R2A-generated attention
Location | 0.5185 | 0.2371
Cleanliness | 0.5948 | 0.3141
Service | 0.5833 | 0.2871

Table 6: Average cosine distance to the oracle attention over the target training set. The R2A is trained on beer reviews with unlabeled hotel reviews.

Slide 29

Slide 29 text

Efficiency of using rationales — Annotating rationales beats simply adding labeled data. One might object: rather than creating rationale data, why not just label more examples? With rationales, the same classification accuracy is reached with only 6.5%–50% of the data.

Figure 7: Learning curves of an attention-based classifier on three tasks: hotel location (left), hotel cleanliness (center), hotel service (right). The proposed approach trained on 200 examples with human rationales (reaching 84.52, 90.66, and 89.93 respectively) is shown as a reference.

Slide 30

Slide 30 text

Summary

Slide 31

Slide 31 text

Summary — Bao et al. 2018. Deriving Machine Attention from Human Rationales. EMNLP. Goal: improve classification accuracy in low-resource domains using data that marks the evidence behind each classification. Method: learn a domain-independent rationale-to-attention mapping • Rationale: a human-annotated text span that justifies a classification. Result: outperformed baselines on aspect transfer and domain transfer in aspect-based sentiment analysis.

Slide 32

Slide 32 text

Comments — The paper poses various interesting hypotheses about the relationship between rationales and attention. Multi-task training over four tasks, including adversarial learning, must have been hellish... An ablation study showing which modeling choices actually helped would have been welcome.

Slide 33

Slide 33 text

Get the presentation slides from: https://bit.ly/2Us6tFn https://github.com/koreyou/emnlp2018-meetup.git This presentation is licensed under CC0 1.0 Universal (CC0 1.0) Public Domain Dedication (except figures derived from the original paper [3]).

Slide 34

Slide 34 text

References

[1] Tao Lei, Regina Barzilay, and Tommi Jaakkola. "Rationalizing Neural Predictions". In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016, pp. 107–117.
[2] Wang Ling et al. "Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems". In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vol. 1. 2017, pp. 158–167.
[3] Yujia Bao et al. "Deriving Machine Attention from Human Rationales". In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2018. URL: http://www.aclweb.org/anthology/D18-1216.