
EMNLP 2014: Opinion Mining with Deep Recurrent Neural Networks

peinan
December 04, 2014


EMNLP 2014 Reading Group @ Komachi Lab


Transcript

  1. Opinion Mining with Deep Recurrent Neural Networks
     Peinan ZHANG (M1, Komachi Lab)
     EMNLP 2014 reading @ Komachi Lab, 2014/12/04
     All figures and tables in these slides are cited from the paper.
  2. Introduction
     Fine-grained opinion analysis aims to detect the subjective expressions in a text (e.g. “hate” or “like”) and to characterize their
     • intensity (e.g. “strong” or “weak”)
     • sentiment (e.g. “negative” or “positive”)
     as well as to identify
     • the opinion holder: the entity expressing the opinion
     • the target, or topic, of the opinion: what the opinion is about
  3. Introduction: Tasks
     Detection of opinion expressions [Wiebe et al., 2005]
     • DSEs (Direct Subjective Expressions): explicit mentions of private states, or speech events expressing private states
     • ESEs (Expressive Subjective Elements): expressions that indicate sentiment, emotion, etc., without explicitly conveying them
  4. Introduction: Examples
     • DSE: explicitly expresses an opinion holder's attitude
     • ESE: indirectly expresses the attitude of the writer
  5. Introduction: Labeling
     Opinion extraction has often been tackled as a sequence labeling problem in previous work (a toy example is sketched below):
     • B: the beginning of an opinion-related expression
     • I: tokens inside the opinion-related expression
     • O: tokens outside any opinion-related class
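
As an illustration (this snippet is mine, not from the slides or the paper), the BIO encoding for the DSE example that appears later in the deck might look like this in Python:

```python
# Hypothetical BIO labeling for the DSE "did not accept"; the label names
# B_DSE / I_DSE / O mirror the paper's tagging scheme.
tokens = ["I", "did", "not", "accept", "his", "suggestion", "."]
labels = ["O", "B_DSE", "I_DSE", "I_DSE", "O", "O", "O"]

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```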
  6. Introduction: Methods (1/3)
     CRFs (Conditional Random Fields)
     • Variants of CRF approaches have been successfully applied to opinion expression extraction using this token-based view.
     semiCRF (state of the art)
     • Relaxes the Markovian assumption inherent to CRFs and operates at the phrase level rather than the token level, allowing the incorporation of phrase-level features.
     However, these CRF approaches hinge critically on access to an appropriate feature set, typically based on constituent and dependency parse trees, manually crafted opinion lexicons, named entity taggers and other preprocessing components.
  7. Introduction: Methods (2/3)
     RNN (Recurrent Neural Network)
     • latent features are modeled as distributed dense vectors in the hidden layers
     • can operate on sequential data of variable length
     • can also be applied as a sequence labeler
     Bidirectional RNN
     • incorporates information from preceding as well as following tokens
     • allows a lower-dimensional dense input representation
     • yields more compact networks
  8. Introduction: Methods (3/3)
     Deep Recurrent Network
     • lower layers capture short-term interactions among words
     • higher layers reflect interpretations aggregated over longer spans
     • such hierarchies might better model the multi-scale effects of language
     Deep Bidirectional RNN (the proposed method)
     • motivated by the recent success of deep architectures in general and deep recurrent networks in particular
  9. Agenda
     Introduction
       1. Tasks
       2. Examples
       3. Labeling
       4. Methods
     Methodology
       1. Recurrent Neural Network
       2. Bidirectionality
       3. Depth in Space
     Experiments
       1. Data and Metrics
       2. Baselines
       3. Tuning and Training
       4. Results and Discussion
     Conclusion
  10. Methodology: Recurrent Neural Network
      Elman-type network [Elman, 1990] (forward pass sketched below)
      • t: time step, with h_0 = 0
      • h: hidden layer
      • x: input layer
      • y: final output layer
      • f: nonlinear function (e.g. sigmoid)
      • g: output nonlinearity (e.g. softmax)
      • W: weight matrix between the input and hidden layer; V: recurrent weight matrix among the hidden units
      • U: output weight matrix
      • b, c: bias vectors
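
The following is a minimal NumPy sketch of the Elman forward pass summarized by this legend, i.e. h_t = f(W x_t + V h_{t-1} + b) and y_t = g(U h_t + c); it is an illustration under those assumptions, not the authors' implementation:

```python
import numpy as np

def elman_forward(X, W, V, U, b, c):
    """X: sequence of input vectors, shape (T, input_dim)."""
    def f(z):                                  # hidden nonlinearity (sigmoid)
        return 1.0 / (1.0 + np.exp(-z))
    def g(z):                                  # output nonlinearity (softmax)
        e = np.exp(z - z.max())
        return e / e.sum()

    h = np.zeros(V.shape[0])                   # h_0 = 0
    outputs = []
    for x_t in X:
        h = f(W @ x_t + V @ h + b)             # h_t = f(W x_t + V h_{t-1} + b)
        outputs.append(g(U @ h + c))           # y_t = g(U h_t + c)
    return np.stack(outputs)
```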
  11. Methodology: Recurrent Neural Network
      Problem with this model: the Elman-style unidirectional RNN lacks the representational power to model this task. For example:
      • I did not accept his suggestion.
      • I did not go to the rodeo.
      The first example has the DSE phrase “did not accept”. However, any such RNN will assign the same labels to the words “did” and “not” in both sentences, since the preceding sequences (the past) are the same.
  12. Methodology: Bidirectionality
      Bidirectional RNN [Schuster et al., 1997] (sketched below)
      • →: forward step (representations of the past)
      • ←: backward step (representations of the future)
      • initial states: h→_0 = h←_{T+1} = 0
      Note: the forward and backward parts of the network are independent of each other until the output layer, when they are combined.
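
A hedged sketch of this bidirectional combination: two independent Elman-style passes whose hidden states meet only at the output layer. The helper names and the use of tanh are my assumptions, not taken from the paper:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def elman_hidden(X, W, V, b):
    """Return the list of hidden states for one direction (initial state = 0)."""
    h = np.zeros(V.shape[0])
    states = []
    for x_t in X:
        h = np.tanh(W @ x_t + V @ h + b)
        states.append(h)
    return states

def bidirectional_forward(X, fwd, bwd, U_fwd, U_bwd, c):
    """fwd, bwd: (W, V, b) tuples for the forward and backward directions."""
    h_fwd = elman_hidden(X, *fwd)                    # left-to-right pass
    h_bwd = elman_hidden(X[::-1], *bwd)[::-1]        # right-to-left pass
    # The two directions are combined only at the output layer.
    return np.stack([softmax(U_fwd @ hf + U_bwd @ hb + c)
                     for hf, hb in zip(h_fwd, h_bwd)])
```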
  13. Methodology: Depth in Space (1/2)
      Deep RNN: constructed by stacking Elman-type RNNs on top of each other.
      For layers i > 1, the recurrence becomes h_t^(i) = f(W^(i) h_t^(i-1) + V^(i) h_{t-1}^(i) + b^(i)); the first layer reads the input x_t as before.
      Intuitively, every layer of the deep RNN treats the memory sequence of the previous layer as the input sequence, and computes its own memory representation.
  14. Methodology: Depth in Space (2/2)
      Bidirectional deep RNN: for layers i > 1, each direction's hidden layer takes both the forward and backward memory sequences of layer i-1 as its input.
      To compute the output layer, only the last (topmost) memory layers are employed (see the sketch below).
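
A minimal sketch (my own, not the authors' code) of "depth in space" in the unidirectional case: each layer re-runs an Elman recurrence over the memory sequence produced by the layer below, and only the topmost memory sequence would feed the output classifier:

```python
import numpy as np

def deep_rnn_hidden(X, layer_params):
    """layer_params: list of (W, V, b) tuples, one per stacked layer.
    Layer 0 reads the inputs X; layer i > 0 reads layer i-1's hidden sequence."""
    sequence = list(X)
    for W, V, b in layer_params:
        h = np.zeros(V.shape[0])
        next_sequence = []
        for inp in sequence:
            h = np.tanh(W @ inp + V @ h + b)
            next_sequence.append(h)
        sequence = next_sequence       # input sequence for the next layer up
    return sequence                    # only the last memory layer is returned
```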
  15. Experiments: Data and Metrics
      Data: MPQA 1.2 corpus [Wiebe et al., 2005]
      • 535 news articles, 11,111 sentences
      • manually annotated with both DSEs and ESEs at the phrase level
      • 135 documents used as a development set
      • 10-fold cross-validation over the remaining 400 documents
      Evaluation Metrics (illustrated below)
      • Binary Overlap: counts every overlapping match between a predicted and a true expression as correct
      • Proportional Overlap: imparts partial correctness to each match, proportional to the amount of overlap
      All statistical comparisons are done using a two-sided paired t-test with a confidence level of α = .05.
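
A rough, precision-style illustration of the two overlap measures (my reading of the slide, not the official evaluation script); spans are (start, end) token offsets with an exclusive end:

```python
def binary_overlap(predicted, gold):
    """Fraction of predicted spans that overlap at least one gold span."""
    def overlaps(a, b):
        return max(a[0], b[0]) < min(a[1], b[1])
    if not predicted:
        return 0.0
    return sum(any(overlaps(p, g) for g in gold) for p in predicted) / len(predicted)

def proportional_overlap(predicted, gold):
    """Credit each predicted span in proportion to how much of it is covered."""
    if not predicted:
        return 0.0
    credit = 0.0
    for p in predicted:
        covered = sum(max(0, min(p[1], g[1]) - max(p[0], g[0])) for g in gold)
        credit += min(covered, p[1] - p[0]) / (p[1] - p[0])
    return credit / len(predicted)
```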
  16. Experiments: Baselines
      Baselines
      • CRF and semiCRF
      • features: words, POS tags, membership in a manually constructed opinion lexicon
      Word Vectors (+VEC)
      • versions of the baselines that have access to pre-trained word vectors
      • CRF+VEC: word vectors are used as continuous features for every token
      • semiCRF+VEC: the mean of the word vectors of a phrase is taken as its phrase-level vector representation (see the sketch below)
      • 300-dimensional, trained on part of the Google News dataset (~100B words)
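
A minimal sketch of the semiCRF+VEC phrase representation described above: averaging the pre-trained vectors of the words in a phrase. The lookup table `word_vectors` (word to 300-dimensional array) is an assumed input, not something specified on the slide:

```python
import numpy as np

def phrase_vector(phrase_tokens, word_vectors, dim=300):
    """Mean of the available word vectors; zero vector if none are known."""
    vectors = [word_vectors[w] for w in phrase_tokens if w in word_vectors]
    return np.mean(vectors, axis=0) if vectors else np.zeros(dim)
```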
  17. Experiments: Tuning and Training
      Regularizer
      • Dropout: randomly set entries of the hidden representations to 0 with a probability called the dropout rate (sketched below)
      Network Training
      • SGD with a fixed learning rate of .005
      • weights updated after minibatches of 80 sentences
      • 200 epochs
      • weights initialized from small random uniform noise
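
A hedged sketch of the dropout regularizer as described on the slide: entries of a hidden representation are zeroed with probability `dropout_rate` at training time. Whether the surviving units are rescaled is not stated on the slide, so this version simply drops them:

```python
import numpy as np

def dropout(h, dropout_rate, rng=None):
    """Zero each entry of h independently with probability dropout_rate."""
    rng = rng or np.random.default_rng()
    mask = rng.random(h.shape) >= dropout_rate   # keep with probability 1 - rate
    return h * mask
```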
  18. Experiments: Results and Discussion
      Bidirectional vs. Unidirectional: shallow biRNN vs. uniRNN
      • each network has the same number of total parameters
      • 65 hidden units for the unidirectional network, 36 hidden units for the bidirectional network
      Results (biRNN vs. uniRNN):
      • DSEs: Proportional Overlap 63.83 vs. 60.35; Binary Overlap 69.31 vs. 68.31
      • ESEs: Proportional Overlap 54.22 vs. 51.51; Binary Overlap 65.44 vs. 63.65
      Thus, we will not include comparisons to the unidirectional RNNs in the remaining experiments.
  19. Experiments: Results and Discussion
      Adding Depth (in the results table: bold marks the best result; an asterisk marks performance statistically indistinguishable from the best)
      • for both DSEs and ESEs, the 3-layer RNN provides the best results
      • 2-, 3- and 4-layer RNNs show equally good performance for certain sizes
      • adding further layers degrades performance
  20. Conclusion
      • Explored an application of deep recurrent neural networks to the task of sentence-level opinion expression extraction.
      • Deep RNNs outperformed shallow RNNs.
      • Deep RNNs outperformed previous (semi)CRF baselines.
      • One potential future direction is to explore the effects of pre-training.
  21. Agenda
      Introduction
        1. Tasks
        2. Examples
        3. Labeling
        4. Methods
      Methodology
        1. Recurrent Neural Network
        2. Bidirectionality
        3. Depth in Space
      Experiments
        1. Data and Metrics
        2. Baselines
        3. Tuning and Training
        4. Results and Discussion
      Conclusion