EMNLP 2018 Paper Reading Group: Deriving Machine Attention from Human Rationales

Yuta Koreeda
December 09, 2018

Transcript

  1. Deriving Machine Attention from Human Rationales @koreyou 2018/12/9 EMNLP 2018 Paper Reading Group @ CyberAgent, Tokyo
  2. Who am I? Yuta Koreeda, researcher at an electronics company. Research history: • up to 2015: healthcare + robotics (university) • up to 2016: robotics + applied machine learning • since then: applied machine learning + natural language processing. koreyou koreyou_
  4. tl;dr Bao et al. 2018. Deriving Machine Attention from Human Rationales. EMNLP. Goal: improve classification accuracy in low-resource domains using data that marks the passages justifying each classification. Method: learn a domain-independent mapping from rationales to attention. • Rationale: a human-annotated passage of text that justifies the classification. Result: outperformed baselines on transfer across aspects and across domains in aspect-based sentiment analysis.
  5. Table of contents 1. Background 2. Proposed method 3. Experiments 4. Summary
  6. Deriving Machine Attention from Human Rationales Yujia Bao1, Shiyu Chang2, Mo Yu2, Regina Barzilay1 1Computer Science and Artificial Intelligence Lab, MIT 2MIT-IBM Watson AI Lab, IBM Research {yujia, regina}@csail.mit.edu, shiyu.chang@ibm.com, yum@us.ibm.com
  7. Background

  8. Aspect-based sentiment analysis. This study uses aspect-based sentiment analysis as its vehicle, but the approach is applicable to natural language tasks in general. • A concrete task is introduced first for clarity. Aspect-based sentiment analysis (ABSA): • classify whether the input text is positive or negative about each aspect. In this talk, read "domain" as "aspect". Example: "a nice and clean hotel to stay for business and leisure . but the location is not good if you need public transport . it took too long for transport and waiting for bus . but the swimming pool looks good ." with the aspects Location and Cleanliness.
  9. Rationale. Rationale-providing AI is drawing attention. Rationale = the passage of text that justifies a classification. • It explains why the model made its prediction, making the AI interpretable. Research on providing rationales is attracting attention [1, 2]. Example: in the hotel review above, different passages serve as rationales for the Location and Cleanliness labels.
  10. Research goal: use rationale data to improve accuracy in low-resource domains. Can we achieve high classification accuracy from a small amount of data by training not only on labeled classifications but also on the rationales for why those classifications should be made?

  11. Proposed method

  12. Using the attention mechanism for text classification. An attention mechanism improves text classification accuracy. Attention as pooling (a sketch follows below): • compute an attention value (a real number) from each word representation • take a weighted sum of the word representations using those values. Task: hotel location, label: negative, over the hotel review example above.
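
    A minimal sketch of attention pooling in PyTorch (the module name and the single-linear-layer scorer are illustrative assumptions, not the authors' exact architecture):

        import torch
        import torch.nn as nn

        class AttentionPooling(nn.Module):
            def __init__(self, hidden_dim):
                super().__init__()
                self.scorer = nn.Linear(hidden_dim, 1)  # one scalar score per word

            def forward(self, word_reprs, mask):
                # word_reprs: (batch, seq_len, hidden_dim), e.g. BiLSTM outputs
                # mask: (batch, seq_len); 1 for real tokens, 0 for padding
                scores = self.scorer(word_reprs).squeeze(-1)  # (batch, seq_len)
                scores = scores.masked_fill(mask == 0, -1e9)  # never attend to padding
                attention = torch.softmax(scores, dim=-1)     # sums to 1 per sentence
                pooled = (attention.unsqueeze(-1) * word_reprs).sum(dim=1)
                return pooled, attention  # pooled feeds the classification head

    The pooled vector plays the role of max/average pooling in front of the classifier, and the returned attention is the quantity the following slides supervise and transfer.
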
  13. Attention vs. Rationale. Convert rationales into attention-like values. Attention ≠ Rationale: • attention is continuous (it has degrees of strength) while rationales are binary • attention is optimized to maximize classification accuracy. Rather than feeding rationales into training directly, might it be better to first convert them into attention? ⇒ R2A (rationale to attention). • The attention best suited to learning the classifier is called the oracle attention. Task: hotel location, label: negative; in the example review, the attention and the rationale highlight different spans.
  14. Key point of the proposed method: transfer the rationale-to-oracle-attention mapping across domains. Oracle attention can be obtained by training the classifier on a large amount of labeled data, so it cannot be obtained in domains (aspects) with little labeled data. Hypothesis: the mapping from rationales to oracle attention is shared across domains • e.g. within a rationale, content words receive stronger attention. Transfer the mapping learned on data-rich domains.
  15. Overview of the proposed method [Pipeline figure: on the source domains (aspects), where data is plentiful, a classifier is trained from texts and labels to produce oracle attention, and R2A is trained to map rationales to that attention; on the target domain (aspect), where data is scarce, the trained R2A generates oracle attention from rationales, and the target classifier is trained from texts, labels, and the generated attention.]
  16. Problem setting (data)

      Data split                  | Labeled examples | Rationale annotations
      ----------------------------|------------------|-------------------------
      Source domains (training)   | many             | many (pseudo-generated)
      Target domain (training)    | few              | few
      Target domain (evaluation)  | none             | none
  17. Proposed method: overall architecture. (a) A classifier that produces the oracle attention. (b) Map every domain into a shared space. (c) Generate attention from rationales.

  18. Proposed method (a): on the source domains, jointly learn attention generation and classification. Source domains (one or more): word sequence ⇒ attention ⇒ classification label; the generated attention is passed on as the supervision data for R2A in the later step.

  19. Proposed method (b): map every domain into a shared space • a language-modeling loss so that enc extracts rich features • a Wasserstein-distance loss so that h_inv lies in a shared space. The language-modeling loss also covers the target domain.

  20. Proposed method (b), continued: the Wasserstein distance pulls the source and target representations into the same feature space.
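
    A minimal sketch of the Wasserstein-distance loss in its dual (critic) form, WGAN-style (how the Lipschitz constraint is enforced and how updates are scheduled are not on this slide, so treat those details as assumptions):

        import torch
        import torch.nn as nn

        class Critic(nn.Module):
            """Scores representations; trained to tell source from target apart."""
            def __init__(self, hidden_dim):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                    nn.Linear(hidden_dim, 1))

            def forward(self, h):
                return self.net(h).squeeze(-1)

        def wasserstein_loss(critic, h_src, h_tgt):
            # Dual form: sup over 1-Lipschitz f of E[f(h_src)] - E[f(h_tgt)].
            # The critic maximizes this estimate (weight clipping or a gradient
            # penalty would enforce the Lipschitz constraint); the encoder
            # minimizes it, pushing both domains into the same feature space.
            return critic(h_src).mean() - critic(h_tgt).mean()
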

  21. Proposed method (c): learn the mapping from rationales to oracle attention. The gold rationale is concatenated to the shared representation, and the module is trained against the oracle attention.
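
    A minimal sketch of such an R2A module (the recurrent scorer and the layer sizes are illustrative assumptions; the paper specifies the actual architecture):

        import torch
        import torch.nn as nn

        class R2A(nn.Module):
            def __init__(self, hidden_dim):
                super().__init__()
                # +1 input feature: the binary rationale bit for each word
                self.rnn = nn.LSTM(hidden_dim + 1, hidden_dim,
                                   batch_first=True, bidirectional=True)
                self.scorer = nn.Linear(2 * hidden_dim, 1)

            def forward(self, h_inv, rationale, mask):
                # h_inv: (batch, seq_len, hidden_dim) domain-invariant representations
                # rationale: (batch, seq_len) binary 0/1 annotations
                x = torch.cat([h_inv, rationale.unsqueeze(-1).float()], dim=-1)
                out, _ = self.rnn(x)
                scores = self.scorer(out).squeeze(-1).masked_fill(mask == 0, -1e9)
                return torch.softmax(scores, dim=-1)  # trained to match the oracle attention
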

  22. Proposed method: end-to-end training. On the source domains, the model is trained with multi-task learning over all of the above objectives.
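
    Schematically, the source-side objective ties the pieces above together (the λ weighting shown here is an assumption; the paper defines the exact objective):

        L_total = L_classification + λ_lm · L_language_model + λ_wd · L_wasserstein + λ_a · L_R2A

    That is, classification, language modeling, adversarial domain alignment, and rationale-to-attention regression are optimized jointly: the four-task multi-task learning referred to in the closing comments.
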

  23. Proposed method: transferring attention. Using the model trained on the source domains, generate oracle attention from the gold rationales on the target domain.

  24. Proposed method: training the classifier. On the target domain, train the classifier with multi-task learning from the generated oracle attention and the classification labels.
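
    A minimal sketch of that target-side multi-task loss, supervising the classifier's attention with the R2A output (cosine distance is the similarity the paper reports in Table 6; using it as the training loss, and the weight lam, are assumptions):

        import torch
        import torch.nn.functional as F

        def target_loss(logits, labels, model_attention, r2a_attention, lam=1.0):
            # Usual supervised classification loss on the few target labels.
            label_loss = F.cross_entropy(logits, labels)
            # Pull the classifier's attention toward the R2A-generated attention.
            cos = F.cosine_similarity(model_attention, r2a_attention, dim=-1)
            attention_loss = (1.0 - cos).mean()
            return label_loss + lam * attention_loss
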

  25. Experiments

  26. Transfer between aspects: the proposed method is effective. Adding rationales improves classification accuracy, and the proposed method, which goes through oracle attention, improves it further. • This confirms attention ≠ rationale.

      Source            | Target      | SVM   | RA-SVM‡ | RA-CNN‡ | TRANS† | RA-TRANS‡† | OURS‡† | ORACLE†
      ------------------|-------------|-------|---------|---------|--------|------------|--------|--------
      Beer aroma+palate | Beer look   | 74.41 | 74.83   | 74.94   | 72.75  | 76.41      | 79.53  | 80.29
      Beer look+palate  | Beer aroma  | 68.57 | 69.23   | 67.55   | 69.92  | 76.45      | 77.94  | 78.11
      Beer look+aroma   | Beer palate | 63.88 | 67.82   | 65.72   | 74.66  | 73.40      | 75.24  | 75.50

      Table 3: Accuracy of transferring between aspects. Models with † use labeled data from source aspects. Models with ‡ use human rationales on the target aspect. (RA-SVM/RA-CNN: baselines that use rationales in training; TRANS: transfer learning without rationales; RA-TRANS: learns attention from rationales; ORACLE: uses the target domain's true oracle attention.)
  27. Transfer between domains: the proposed method is effective here too; the same trend holds for cross-domain transfer. ORACLE, which uses the target domain's true oracle attention, performs better still.

      Source                               | Target            | SVM   | RA-SVM‡ | RA-CNN‡ | TRANS† | RA-TRANS‡† | OURS‡† | ORACLE†
      -------------------------------------|-------------------|-------|---------|---------|--------|------------|--------|--------
      Beer look + Beer aroma + Beer palate | Hotel location    | 78.65 | 79.09   | 79.28   | 80.42  | 82.10      | 84.52  | 85.43
                                           | Hotel cleanliness | 86.44 | 86.68   | 89.01   | 86.95  | 87.15      | 90.66  | 92.09
                                           | Hotel service     | 85.34 | 86.61   | 87.91   | 87.37  | 86.40      | 89.93  | 92.42

      Table 4: Accuracy of transferring between domains. Models with † use labeled data from source domains and unlabeled data from the target domain. Models with ‡ use human rationales on the target task.
  28. Evaluating each component. In the model trained with the Wasserstein-distance loss, h_inv does end up in a shared space. The attention generated by R2A is closer to the oracle attention than the raw rationales are.

      [Figure 5: t-SNE visualization of the learned hidden representations for beer reviews (blue circles) and hotel reviews (orange triangles): (a) OURS, (b) OURS w/o L_wd.]

      Target      | Human rationales | R2A-generated attention
      ------------|------------------|------------------------
      Location    | 0.5185           | 0.2371
      Cleanliness | 0.5948           | 0.3141
      Service     | 0.5833           | 0.2871

      Table 6: Avg. cosine distance to the oracle attention over the target training set. The R2A is trained on beer reviews with unlabeled hotel reviews.
  29. Efficiency of using rationales: annotating rationales is more efficient than adding data. One might object that instead of creating rationale data we could simply label more examples. With rationales, however, the same classification accuracy is reached with only 6.5%-50% of the data.

      [Figure 7: Learning curve of an attention-based classifier on three tasks: hotel location (left), hotel cleanliness (center), hotel service (right). The performance of the proposed approach trained on 200 examples with human rationales (84.52, 90.66, and 89.93 respectively) is shown as a reference.]
  30. Summary

  31. Summary. Bao et al. 2018. Deriving Machine Attention from Human Rationales. EMNLP. Goal: improve classification accuracy in low-resource domains using data that marks the passages justifying each classification. Method: learn a domain-independent mapping from rationales to attention. • Rationale: a human-annotated passage of text that justifies the classification. Result: outperformed baselines on transfer across aspects and across domains in aspect-based sentiment analysis.
  32. Comments. The paper poses various interesting hypotheses about the relationship between rationales and attention. Multi-task training over four tasks, including adversarial learning, must have been hellish to get working... I would have liked an ablation study showing which parts of the model actually mattered.

  33. Get the presentation slides from: https://bit.ly/2Us6tFn https://github.com/koreyou/emnlp2018-meetup.git This presentation is licensed under CC0 1.0 Universal (CC0 1.0) Public Domain Dedication (except figures derived from the original paper [3]).
  34. References
      [1] Tao Lei, Regina Barzilay, and Tommi Jaakkola. "Rationalizing Neural Predictions". In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016, pp. 107–117.
      [2] Wang Ling et al. "Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems". In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017, pp. 158–167.
      [3] Yujia Bao et al. "Deriving Machine Attention from Human Rationales". In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2018. URL: http://www.aclweb.org/anthology/D18-1216.