Paper Introduction: Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

Paper introduction, 2014/10/07. Natural Language Processing Laboratory, Nagaoka University of Technology. Shohei Okada

Reference
Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng and Christopher Potts. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1631-1642. 2013.

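Overview
• Addresses compositional effects of language in sentiment analysis
• Creation of the Stanford Sentiment Treebank
  – Sentiment labels are assigned at the subtree level of parse trees
• Proposal of the Recursive Neural Tensor Network (RNTN)
  – Exceeds the state of the art in sentiment polarity classification accuracy
  – Confirmed to handle the effect of negation words correctly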

Stanford Sentiment Treebank

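Stanford Sentiment Treebank
• The first corpus with fully labeled parse trees
• Based on the dataset of Pang and Lee (2005)
• 11,855 single sentences extracted from movie reviews
  – 215,154 unique phrases
  – Parsed with the Stanford parser
• Makes it possible to analyze the complex relationship between linguistic phenomena and sentiment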

Stanford Sentiment Treebank
• Annotated using Amazon Mechanical Turk
  – Three annotators
  – Each of the 215,154 phrases is given a value on a 25-step scale


Recursive Neural Models

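Recursive Neural Models
• Models for handling phrases of variable length and syntactic type
• The input (an n-gram) is parsed into a binary parse tree
  – Each word corresponds to a leaf
• The parent node's vector is computed from the vectors of its child nodes
  – Done recursively, bottom-up from the leaves
• Two existing Recursive Neural Models are explained first, followed by the proposed RNTN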
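The bottom-up recursion described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the nested-tuple tree encoding, the function name `encode`, the toy 2-dimensional vectors, and the averaging composition are all assumptions chosen to keep the example small. Any composition function with the same signature could be plugged in.

```python
from typing import Callable, List, Tuple, Union

# A tree is either a word (leaf) or a pair of subtrees (internal node).
Tree = Union[str, Tuple["Tree", "Tree"]]
Vector = List[float]


def encode(tree: Tree,
           embed: Callable[[str], Vector],
           compose: Callable[[Vector, Vector], Vector]) -> Vector:
    """Return the vector at the root of `tree`, computed bottom-up."""
    if isinstance(tree, str):      # leaf: look up the word vector
        return embed(tree)
    left, right = tree             # internal node: encode both children first
    return compose(encode(left, embed, compose),
                   encode(right, embed, compose))


# Hypothetical toy inputs: 2-d word vectors and an averaging composition.
toy_vectors = {"not": [-1.0, 0.0], "very": [0.0, 1.0], "good": [1.0, 1.0]}


def average(left: Vector, right: Vector) -> Vector:
    return [(x + y) / 2 for x, y in zip(left, right)]


# Binary tree for "not (very good)".
root = encode(("not", ("very", "good")), toy_vectors.__getitem__, average)
# root == [-0.25, 0.5]
```

Every internal node gets its own vector along the way, which is what allows a sentiment label to be predicted for each subtree and not only for the whole sentence.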

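Recursive Neural Models
• Each word is a $d$-dimensional vector
  – Word embedding matrix: $L \in \mathbb{R}^{d \times |V|}$
• Example: 5-class classification
  – Compute the posterior probability of each label from a node vector $a$: $y^a = \mathrm{softmax}(W_s a)$, $W_s \in \mathbb{R}^{5 \times d}$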
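As a rough illustration of the 5-class classification step, the sketch below applies a softmax classifier to one node vector. The dimension `d = 4`, the random values, and the variable names are assumptions for demonstration only. In the actual model the classification matrix is learned from the treebank labels.

```python
import numpy as np


def softmax(z: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-d array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


d, num_classes = 4, 5                 # hypothetical sizes
rng = np.random.default_rng(0)
W_s = rng.normal(scale=0.1, size=(num_classes, d))   # classification matrix, 5 x d
a = rng.normal(size=d)                               # vector of some word or phrase node

y_a = softmax(W_s @ a)                # posterior over the 5 sentiment labels
predicted_label = int(np.argmax(y_a)) # index of the most probable label
```

The same classifier is applied to every node of the tree, so leaves and phrases are scored in exactly the same way.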

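Recursive Neural Models
• Recursive Neural Network (RNN), for a trigram with leaf vectors $a$, $b$, $c$:
  $p_1 = f\left(W \begin{bmatrix} b \\ c \end{bmatrix}\right)$, $p_2 = f\left(W \begin{bmatrix} a \\ p_1 \end{bmatrix}\right)$, $W \in \mathbb{R}^{d \times 2d}$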
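The standard RNN composition can be sketched in NumPy as below: the two child vectors are concatenated, multiplied by a single shared matrix of shape d x 2d, and passed through a nonlinearity. The choice of `tanh` as the nonlinearity, the size `d = 3`, and the random values are assumptions for illustration, since the slide does not state them.

```python
import numpy as np


def rnn_compose(W: np.ndarray, left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Parent vector = f(W [left; right]) with f = tanh (assumed)."""
    return np.tanh(W @ np.concatenate([left, right]))


d = 3                                          # hypothetical vector size
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(d, 2 * d))     # shared composition matrix
a, b, c = rng.normal(size=(3, d))              # leaf vectors of a trigram

p1 = rnn_compose(W, b, c)                      # combine the two right-most words
p2 = rnn_compose(W, a, p1)                     # combine the first word with that phrase
```

Because the same matrix is reused at every node, the parent has the same dimension as its children and the function can be applied at any depth of the tree.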

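Recursive Neural Models
• Matrix-Vector RNN (MV-RNN)
• Represents each word and phrase as both a vector and a matrix
  $p_1 = f\left(W \begin{bmatrix} Cb \\ Bc \end{bmatrix}\right)$, $P_1 = f\left(W_M \begin{bmatrix} B \\ C \end{bmatrix}\right)$, $W_M \in \mathbb{R}^{d \times 2d}$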
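A possible NumPy sketch of the MV-RNN composition follows. Each child carries a vector (lowercase) and a matrix (uppercase); each child's vector is first multiplied by the other child's matrix, and the parent receives both a new vector and a new matrix. Using `tanh` for the nonlinearity, `d = 3`, and random parameters are assumptions made only to keep the example runnable.

```python
import numpy as np


def mvrnn_compose(W, W_M, b, B, c, C):
    """Compose children (b, B) and (c, C) into a parent (vector, matrix)."""
    # Each child's vector is transformed by the other child's matrix.
    p = np.tanh(W @ np.concatenate([C @ b, B @ c]))   # parent vector, shape (d,)
    # The child matrices are stacked to (2d, d) and mapped back to (d, d).
    P = np.tanh(W_M @ np.vstack([B, C]))              # parent matrix, shape (d, d)
    return p, P


d = 3                                                 # hypothetical vector size
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(d, 2 * d))
W_M = rng.normal(scale=0.1, size=(d, 2 * d))
b, c = rng.normal(size=(2, d))                        # child vectors
B, C = rng.normal(size=(2, d, d))                     # child matrices

p1, P1 = mvrnn_compose(W, W_M, b, B, c, C)
```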

Recursive Neural Models
• RNN
  – Input vectors (words and phrases) interact only through the nonlinear function
• MV-RNN
  – The number of parameters grows quickly with vocabulary size (a $d \times d$ matrix for every word)
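To make the parameter-growth point concrete, the sketch below counts composition parameters for the three models under stated assumptions. The values `d = 30` and a vocabulary of 20,000 words are hypothetical and are not taken from the paper; word vectors and the softmax classifier are left out because all three models share them.

```python
def composition_parameter_counts(d: int, vocab_size: int) -> dict:
    """Count composition parameters only (word vectors and softmax excluded)."""
    rnn = d * (2 * d)                                   # W: d x 2d
    mv_rnn = 2 * d * (2 * d) + vocab_size * d * d       # W and W_M, plus a d x d matrix per word
    rntn = (2 * d) * (2 * d) * d + d * (2 * d)          # tensor V: 2d x 2d x d, plus W
    return {"RNN": rnn, "MV-RNN": mv_rnn, "RNTN": rntn}


counts = composition_parameter_counts(d=30, vocab_size=20_000)
# counts == {"RNN": 1800, "MV-RNN": 18003600, "RNTN": 109800}
```

Under these assumed sizes the MV-RNN count is dominated by the per-word matrices and scales with the vocabulary, while the RNN and RNTN counts depend only on `d`.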

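Recursive Neural Models
• RNTN (proposed method)
  $p_1 = f\left(\begin{bmatrix} b \\ c \end{bmatrix}^{T} V^{[1:d]} \begin{bmatrix} b \\ c \end{bmatrix} + W \begin{bmatrix} b \\ c \end{bmatrix}\right)$,
  $p_2 = f\left(\begin{bmatrix} a \\ p_1 \end{bmatrix}^{T} V^{[1:d]} \begin{bmatrix} a \\ p_1 \end{bmatrix} + W \begin{bmatrix} a \\ p_1 \end{bmatrix}\right)$, $V^{[1:d]} \in \mathbb{R}^{2d \times 2d \times d}$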
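The tensor composition can be sketched in NumPy as below. Each of the d slices of the tensor defines one bilinear form over the concatenated children, and the standard RNN term is added before the nonlinearity. The use of `tanh`, the size `d = 3`, the random parameters, and the storage layout of the tensor as an array of shape (2d, 2d, d) are assumptions for illustration.

```python
import numpy as np


def rntn_compose(V: np.ndarray, W: np.ndarray,
                 left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Parent vector = f(x^T V x + W x), with x = [left; right] and f = tanh (assumed)."""
    x = np.concatenate([left, right])                 # shape (2d,)
    # Output entry k is the scalar x^T V[:, :, k] x.
    bilinear = np.einsum("i,ijk,j->k", x, V, x)       # shape (d,)
    return np.tanh(bilinear + W @ x)


d = 3                                                 # hypothetical vector size
rng = np.random.default_rng(0)
V = rng.normal(scale=0.1, size=(2 * d, 2 * d, d))     # composition tensor
W = rng.normal(scale=0.1, size=(d, 2 * d))            # standard RNN matrix
a, b, c = rng.normal(size=(3, d))                     # leaf vectors of a trigram

p1 = rntn_compose(V, W, b, c)
p2 = rntn_compose(V, W, a, p1)
```

Setting the tensor to all zeros reduces this function to the standard RNN composition, which is one way to see the RNTN as a direct extension of the RNN that lets the two children interact multiplicatively.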

Experiments

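Experiments
1. 5-class classification of each phrase (very negative to very positive)
2. Binary classification of each sentence (positive or negative)
3. Model analysis: the contrastive conjunction 'X but Y'
4. Model analysis: high-level negation
5. Model analysis: phrases expressing strongly positive or negative sentiment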

Experiments | Compared methods
• Naive Bayes with bag-of-words features (NB)
• SVM with bag-of-words features (SVM)
• Naive Bayes with bag-of-bigrams features (BiNB)
• Averaging of neural word vectors (VecAvg)
• RNN
• MV-RNN
• RNTN (proposed method)

Experiments | Dataset
• The Sentiment Treebank is split at the sentence level as follows (numbers in parentheses exclude neutral sentences)
  – train: 8,544 (6,920)
  – dev: 1,101 (872)
  – test: 2,210 (1,821)

Experiments | Results
• Accuracy on 5-class classification (left) and binary classification (right)

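Experiments | Results
• RNTN achieves the highest accuracy
• On binary classification, even the state of the art had not exceeded 80%
  → With the Sentiment Treebank, even the baselines exceed 80%
  → With coarse annotation, even powerful models could not capture complex linguistic phenomena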


Experiments | Results
• RNTN achieves the highest accuracy at most n-gram lengths
• Bag-of-features methods perform well only on longer phrases
  – Short phrases are strongly affected by negation and structure

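Experiments | Results
• Contrastive conjunction 'X but Y'
  – X and Y are phrases with different sentiments (including neutral)
  – A case counts as correct if the polarities of X and Y are classified correctly and the lowest node that dominates both the conjunction 'but' and the node spanning all of Y has the same polarity as Y
• On 131 cases, RNTN reaches 41% accuracy
  – MV-RNN: 37%, RNN: 36%, BiNB: 27%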
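The correctness criterion for an 'X but Y' case can be sketched as a small predicate. This is one reading of the rule on the slide, not the authors' evaluation script: the integer label encoding (0 = very negative to 4 = very positive), the mapping of labels to coarse polarity, and the function and parameter names are all hypothetical.

```python
def polarity(label: int) -> str:
    """Map an assumed 5-class label (0..4) to a coarse polarity."""
    if label < 2:
        return "negative"
    if label > 2:
        return "positive"
    return "neutral"


def but_case_is_correct(gold_x: int, gold_y: int,
                        pred_x: int, pred_y: int, pred_but_y: int) -> bool:
    """Check one 'X but Y' case.

    pred_but_y is the prediction at the lowest node that dominates both
    'but' and the node spanning all of Y.
    """
    x_ok = polarity(pred_x) == polarity(gold_x)
    y_ok = polarity(pred_y) == polarity(gold_y)
    dominated_ok = polarity(pred_but_y) == polarity(gold_y)
    return x_ok and y_ok and dominated_ok


# X is positive, Y is negative, and the 'but Y' node follows Y: counted as correct.
ok = but_case_is_correct(gold_x=3, gold_y=1, pred_x=4, pred_y=0, pred_but_y=1)
# The 'but Y' node is predicted positive although Y is negative: counted as wrong.
bad = but_case_is_correct(gold_x=3, gold_y=1, pred_x=4, pred_y=0, pred_but_y=3)
```

All three conditions must hold at once, which helps explain why the reported accuracies on this test are far lower than the per-phrase classification accuracies.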


Experiments | Results
High-level negation
• The dataset is split into two subsets for evaluation
  1. Negation of positive sentences
  2. Negation of negative sentences

Experiments | Results
Negation of positive sentences
• Negation changes the polarity from positive to negative

Experiments | Results
Negation of negative sentences
• Negation weakens the degree of negativity (the result is not necessarily positive)


Experiments | Results
• RNTN handles the behavior of negation most accurately

Experiments | Results
• Average sentiment value of the top 10 positive n-grams
