Slide 1

Paper: Identifying Expressions of Opinion in Context
Authors: Eric Breck, Yejin Choi, and Claire Cardie
Presented by: Mahak Gupta
Date: 14th June 2016
International Joint Conference on Artificial Intelligence (IJCAI-2007)

Slide 2

§ Brief introduction to the text chunking task
§ Types of opinion expressions: direct subjective expressions (DSEs) and expressive subjective elements (ESEs)
§ Sequence labelling: a brief look at IOB encoding
§ Linear-chain conditional random field (CRF) model (0th order, 1st order)
§ No math equations ☺

Slide 3

1. Introduction
2. Task, Features and Model Used
3. Dataset and Evaluation
4. Baselines and Results
5. Conclusion

Slide 4

§ Two main types of information: facts and opinions
§ Traditional information extraction systems answer questions directly from text (facts).
§ More sophisticated systems can also answer questions about opinions.

Input text: "New Delhi is the capital and seat of government of India. It is also a municipality and district in Delhi and serves as the seat of the Government of Delhi. Mr. Narendra Modi is the current Prime Minister of India."*
Question 1: What is the capital of India? (Fact)
Question 2: What do people feel about Mr. Modi's government? (Opinion)
* Excerpt from Wikipedia

Slide 5

§ Sentence-level and clause-level opinion mining indicating subjectivity are well-studied tasks [Riloff and Wiebe, 2003; Wiebe and Wilson, 2002]
  • Mr. Modi's government won 282 of the 543 seats in the house. Do the majority of people like Mr. Modi's government? (Yes/No)
§ Identifying the expressions in context that indicate subjectivity is the research question of this paper:
  • "The people are pretty much excited to see Mr. Modi as their Prime Minister joining office after the Oath Ceremony this morning."
§ This is a text chunking problem, i.e. the problem of dividing a text into syntactically correlated groups of words.

Slide 6

§ The/DT people/NNS are/VBP pretty/RB much/JJ excited/VBN to/TO see/VB Mr./NNP Modi/NNP as/IN their/PRP$ Prime/NNP Minister/NNP joining/VBG office/NN after/IN Oath/NN Ceremony/NNP this/DT morning/NN ./.
§ Rule → ADV: RB JJ VBN
§ The/DT people/NNS are/VBP [pretty/RB much/JJ excited/VBN] to/TO see/VB Mr./NNP Modi/NNP as/IN their/PRP$ Prime/NNP Minister/NNP joining/VBG office/NN after/IN Oath/NN Ceremony/NNP this/DT morning/NN ./.
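The chunking rule above can be sketched in a few lines of Python. This is a toy illustration (not the paper's code): it scans a POS-tagged sentence and brackets any run of tags matching the pattern RB JJ VBN.

```python
# Toy sketch of the rule ADV -> RB JJ VBN: bracket any tag run matching the pattern.
def chunk_adv(tagged, pattern=("RB", "JJ", "VBN")):
    """tagged: list of (word, POS) pairs; returns the sentence with [ ... ]
    brackets around spans whose tag sequence matches `pattern`."""
    out, i = [], 0
    while i < len(tagged):
        tags = tuple(t for _, t in tagged[i:i + len(pattern)])
        if tags == pattern:
            span = " ".join(f"{w}/{t}" for w, t in tagged[i:i + len(pattern)])
            out.append(f"[{span}]")
            i += len(pattern)
        else:
            w, t = tagged[i]
            out.append(f"{w}/{t}")
            i += 1
    return " ".join(out)

sent = [("The", "DT"), ("people", "NNS"), ("are", "VBP"),
        ("pretty", "RB"), ("much", "JJ"), ("excited", "VBN"),
        ("to", "TO"), ("see", "VB")]
print(chunk_adv(sent))
# -> The/DT people/NNS are/VBP [pretty/RB much/JJ excited/VBN] to/TO see/VB
```

Real chunkers (e.g. the CASS parser used in the paper) apply cascades of such patterns rather than a single rule.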

Slide 7

§ Wiebe et al., 2005
  • Direct subjective expressions: direct mentions of opinions
    "The students were nervous before the exam."
    "The ISIS fears a spill-over from their anti-terrorist campaign." (source: Times of India)
  • Expressive subjective elements: spans of text whose choice of words indicates the speaker's subjectivity
    "The part of the US human rights report about China is full of absurdities and fabrications."

Slide 8

1. Introduction
2. Task, Features and Model Used
3. Dataset and Evaluation
4. Baselines and Results
5. Conclusion

Slide 9

Input text:
  The part of the US human rights report about China is full of absurdities and fabrications
Supervised learning:
  • Identify subjective phrases in context
  • Sequence labelling problem, linear-chain CRF model
Output text:
  The part of the US human rights report about China is [full of absurdities and fabrications]

Slide 10

§ Here the approach is to convert the chunking task into a tagging task.
§ For this, the authors use the IOB/IO encodings:
  • I → Inside
  • O → Outside
  • B → Begin

       The people are very much excited to have their …
  IOB:  O    O      O   B    I    I       O  O    O     O
  IO:   O    O      O   I    I    I       O  O    O     O
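The conversion from phrase spans to tags can be sketched as follows. This is an assumed helper (not from the paper): it turns a list of opinion spans over a token sequence into an IOB tag sequence.

```python
# Convert opinion spans into IOB tags: B marks the first token of a span,
# I marks the rest, O marks tokens outside any span.
def spans_to_iob(tokens, spans):
    """tokens: token sequence; spans: list of (start, end) indices, end exclusive."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags

tokens = "The people are very much excited to have their".split()
print(spans_to_iob(tokens, [(3, 6)]))
# -> ['O', 'O', 'O', 'B', 'I', 'I', 'O', 'O', 'O']
```

The IO encoding simply replaces B with I, at the cost of not distinguishing two adjacent spans.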

Slide 11

§ Lexical features: current word, nearby words (unigrams, bigrams, etc.)
§ Syntactic features: POS tags, constituent types*
§ Dictionary features, from three available sources:
  • WordNet: hypernyms of the current token (concern: anxiety, emotion, feeling, psych state)
  • FrameNet/Levin: communication words (speak, talk, etc.)
  • Subjectivity clues: weak or strong subjectivity

  Type                 Word       POS        Polarity  Stemmed
  Weak subjectivity    abandoned  adjective  negative  N
  Strong subjectivity  amaze      verb       positive  Y

* The CASS parser defines 4 constituent types: NP, VP, PP, Clause
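A per-token feature extractor in this spirit might look like the sketch below. The tiny lexicons are toy stand-ins for the WordNet, Levin/FrameNet, and subjectivity-clue resources the paper uses; the feature names are illustrative, not the paper's.

```python
# Toy per-token feature extraction: lexical context plus dictionary lookups.
SUBJ_CLUES = {"excited": "strong", "abandoned": "weak"}   # toy subjectivity lexicon
COMM_WORDS = {"speak", "talk", "say"}                     # toy communication-word list

def token_features(words, tags, i):
    """Feature dict for token i, given parallel word and POS-tag sequences."""
    return {
        "word": words[i].lower(),
        "pos": tags[i],
        "prev_word": words[i - 1].lower() if i > 0 else "<S>",
        "next_word": words[i + 1].lower() if i < len(words) - 1 else "</S>",
        "subj_clue": SUBJ_CLUES.get(words[i].lower(), "none"),
        "comm_word": words[i].lower() in COMM_WORDS,
    }

words = ["The", "people", "are", "pretty", "much", "excited"]
tags = ["DT", "NNS", "VBP", "RB", "JJ", "VBN"]
print(token_features(words, tags, 5))
```

In the actual system each such feature becomes a binary indicator fed to the CRF.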

Slide 12

§ Linear-chain conditional random field (using the MALLET toolkit)
§ CRF: an undirected graphical model encoding a conditional probability distribution over labels given a set of features.

[Figure: linear-chain CRF with label nodes Y(t-2) … Y(t+2) over observation nodes X(t-2) … X(t+2); example labelling: "are pretty much excited to" → O I I I O]

CRF-1 model:
  • X(t) is connected to Y(t)
  • consecutive Y's are connected to each other
CRF-0 model:
  • X(t) is connected to Y(t) only
  • essentially a maximum entropy model
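Decoding in a first-order linear-chain model can be sketched with the Viterbi algorithm. This is a toy illustration, not the paper's MALLET setup: the scores are made-up additive log-potentials, and dropping the transition term reduces it to the CRF-0 / maximum-entropy case of per-token classification.

```python
# Toy Viterbi decoding for a first-order linear-chain model over I/O tags.
TAGS = ["O", "I"]

def viterbi(emission, transition):
    """emission: per-position {tag: score}; transition: {(prev, cur): score}.
    Scores are additive (log-potentials); returns the highest-scoring tag path."""
    best = [dict(emission[0])]          # best[i][t]: best score ending in tag t at i
    back = [{}]                         # back[i][t]: best previous tag
    for i in range(1, len(emission)):
        best.append({})
        back.append({})
        for t in TAGS:
            prev = max(TAGS, key=lambda p: best[i - 1][p] + transition[(p, t)])
            best[i][t] = best[i - 1][prev] + transition[(prev, t)] + emission[i][t]
            back[i][t] = prev
    last = max(TAGS, key=lambda t: best[-1][t])
    path = [last]
    for i in range(len(emission) - 1, 0, -1):   # trace back the best path
        path.append(back[i][path[-1]])
    return path[::-1]

# "are pretty much excited to", with emission scores favouring O I I I O
emission = [{"O": 2, "I": 0}, {"O": 0, "I": 2}, {"O": 0, "I": 2},
            {"O": 0, "I": 2}, {"O": 2, "I": 0}]
transition = {(p, c): 0 for p in TAGS for c in TAGS}
print(viterbi(emission, transition))
# -> ['O', 'I', 'I', 'I', 'O']
```

A nonzero transition table is what lets CRF-1 reward or penalise particular tag-to-tag moves, which CRF-0 cannot do.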

Slide 13

1. Introduction
2. Task, Features and Model Used
3. Dataset and Evaluation
4. Baselines and Results
5. Conclusion

Slide 14

§ The MPQA corpus consists of 535 newswire documents.
§ 135 documents for training, 400 for testing/evaluation (10-fold cross-validation).
§ Annotations of subjective phrases:
  "Thousands of coup supporters celebrated overnight, waving flags, blowing whistles"

  Number of sentences       8,297
  Number of DSEs            6,712
  Number of ESEs            8,640
  Average length of DSEs    1.86 words
  Average length of ESEs    3.33 words

Slide 15

§ Precision, recall, F-score*
§ Edge cases: some complex colloquial language descriptions
  e.g. "The opposition house roundly criticized the new education reforms bill."
  Gold standard = "roundly criticized", predicted = "criticized"
§ To handle this, soft precision and soft recall are introduced (exact vs. overlap matching).
§ In the overlap setting, any overlapping prediction counts as a match.

* The talk presents results only as F-scores.
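The exact vs. overlap distinction can be sketched as follows. This is an assumed helper, not the paper's evaluation code: spans are (start, end) token offsets with the end exclusive, and the matching criterion is passed in as a function.

```python
# F-score over spans under a pluggable match criterion (exact vs. overlap).
def overlaps(a, b):
    """True if spans a and b share at least one token (end-exclusive)."""
    return a[0] < b[1] and b[0] < a[1]

def f_score(gold, pred, match):
    """Soft precision/recall: a predicted span counts if it matches any gold
    span, and vice versa; returns the harmonic mean F-score."""
    tp_p = sum(any(match(p, g) for g in gold) for p in pred)
    tp_g = sum(any(match(g, p) for p in pred) for g in gold)
    precision = tp_p / len(pred) if pred else 0.0
    recall = tp_g / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

gold = [(2, 4)]   # "roundly criticized"
pred = [(3, 4)]   # "criticized"
print(f_score(gold, pred, lambda a, b: a == b))   # exact  -> 0.0
print(f_score(gold, pred, overlaps))              # overlap -> 1.0
```

This makes concrete why overlap F-scores in the results are consistently higher than exact ones.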

Slide 16

1. Introduction
2. Task, Features and Model Used
3. Dataset and Evaluation
4. Baselines and Results
5. Conclusion

Slide 17

§ Wiebe and Riloff, 2005: focuses on distinguishing subjective from objective sentences. (the talk on subjectivity two weeks ago)
§ Wilson et al., 2005: identifies the contextual polarity of a large subset of sentiment expressions. (an earlier talk today)

Essentially this paper tackles the same task as Wilson et al. (2005), but approaches it via CRFs.

Slide 18

Results (DSEs):

  Method           Overlap F-score   Exact F-score
  Wiebe baseline        36.97            16.87
  Wilson baseline       39.44            17.52
  CRF-1                 68.44            49.01
  CRF-0                 69.83            42.10

Slide 19

Results (ESEs):

  Method           Overlap F-score   Exact F-score
  Wiebe baseline        48.66            11.92
  Wilson baseline       50.38            11.56
  CRF-1                 57.14            19.35
  CRF-0                 62.82            17.61

Slide 20

  Feature set                          Overlap F-score   Exact F-score
  base                                      56.60             36.41
  base + Levin/FN                           58.86             37.20
  base + Wilson                             61.81             38.57
  base + Wilson + Levin/FN                  63.26             39.21
  base + WordNet                            70.00             42.24
  base + Wilson + WordNet                   70.45             42.40
  base + Levin/FN + WordNet                 70.13             42.34
  base + Levin/FN + WordNet + Wilson        70.65             42.40

Slide 21

1. Introduction
2. Task, Features and Model Used
3. Dataset and Evaluation
4. Baselines and Results
5. Conclusion

Slide 22

§ The linear-chain CRF approach clearly outperforms the baseline methods.
§ CRF-0 outperforms CRF-1 on F-score in the overlap setting, but CRF-1 turns out better for exact phrase matching.
§ The feature-ablation results indicate that the WordNet features are by far the most useful.
§ Adding the other dictionaries also helps, but with much smaller impact than the other features.

Slide 23

§ Breck, Eric, Yejin Choi, and Claire Cardie. "Identifying Expressions of Opinion in Context." IJCAI, 2007.
§ Wilson, Theresa, Janyce Wiebe, and Paul Hoffmann. "Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis." Proceedings of HLT/EMNLP, Association for Computational Linguistics, 2005.
§ Wiebe, Janyce, and Ellen Riloff. "Creating Subjective and Objective Sentence Classifiers from Unannotated Texts." Computational Linguistics and Intelligent Text Processing, Springer Berlin Heidelberg, 2005, 486-497.

Slide 24

Thanks!! Confusions… Questions… Feedback… Please!