Opinion Expressions: Direct Subjective Expressions, Expressive Subjective Elements
§ Sequence Labelling: Brief about IOB Encoding
§ Linear-chain Conditional Random Field (CRF) Model (0 order, 1st order)
§ No Math Equations :)
14.06.16
Traditional Information Extraction systems: answer factual questions directly from text.
§ More complicated systems can also answer questions about opinions…
Input text: New Delhi is the capital and seat of government of India. It is also a municipality and district in Delhi and serves as the seat of Government of Delhi. Mr. Narendra Modi is the current Prime Minister of India.*
Question 1: What is the capital of India? (Fact)
Question 2: What do people feel about Mr. Modi’s Govt.? (Opinion)
* Excerpt from Wikipedia
13.06.16
well studied tasks [Riloff and Wiebe, 2003; Wiebe and Wilson, 2002]
§ Mr. Modi’s Govt. won 282/566 seats in the house. Do the majority of people like Mr. Modi’s Govt.? (Yes/No)
§ Identifying expressions in context that indicate subjectivity is the research question of this paper:
§ The people are pretty much excited to see Mr. Modi as their Prime Minister joining office after the Oath Ceremony this morning.
§ This is a text chunking problem, i.e. a problem of dividing a text into syntactically correlated groups of words.
mentions of opinions
The students were nervous before the exam.
The ISIS fears a spill-over from their anti-terrorist campaign. (source: Times of India)
• Expressive Subjective Elements: spans of text that indicate subjectivity through the speaker's choice of words.
The part of the US human rights report about China is full of absurdities and fabrications.
report about China is full of absurdities and fabrications
Supervised Learning
• Identify subjective phrases in context
• Sequence Labelling Problem, Linear CRF Model
Output Text
• The part of the US human rights report about China is full of absurdities and fabrications
tagging task
§ In this problem the authors have used the IOB/IO encoding:
• I -> Inside
• O -> Outside
• B -> Begin

Sentence:  The  people  are  very  much  excited  to  have  their  …
IOB:       O    O       O    B     I     I        O   O     O      O
IO:        O    O       O    I     I     I        O   O     O      O
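The encoding above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `encode_spans` and the half-open `(start, end)` span format are assumptions made for the example.

```python
from typing import List, Tuple


def encode_spans(n_tokens: int, spans: List[Tuple[int, int]], scheme: str = "IOB") -> List[str]:
    """Tag n_tokens tokens given half-open (start, end) spans (hypothetical helper)."""
    if scheme not in ("IOB", "IO"):
        raise ValueError("scheme must be 'IOB' or 'IO'")
    tags = ["O"] * n_tokens  # every token starts Outside
    for start, end in spans:
        for i in range(start, end):
            # IOB marks the first token of a span with B; IO uses I throughout
            tags[i] = "B" if (scheme == "IOB" and i == start) else "I"
    return tags


tokens = "The people are very much excited to have their".split()
spans = [(3, 6)]  # "very much excited"
iob_tags = encode_spans(len(tokens), spans, "IOB")
io_tags = encode_spans(len(tokens), spans, "IO")
for token, iob, io in zip(tokens, iob_tags, io_tags):
    print(f"{token:10s} {iob} {io}")
```

The only difference between the two schemes is the B tag: IOB can separate two adjacent expressions, whereas IO merges them into one span.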
§ Syntactic features (POS tagging, constituent types*)
§ Dictionary features: use 3 available sources
• WordNet: hypernyms of the current token (concern: anxiety, emotion, feeling, psych state)
• FrameNet/Levin: communication words (speak, talk, etc.)
• Subjectivity clues: weak or strong subjectivity
* The CASS parser defines 4 constituents: NP, VP, PP, Clause

Type                 Word       POS        Polarity  Stemmed
Weak subjectivity    Abandoned  Adjective  Negative  N
Strong subjectivity  Amaze      Verb       Positive  Y
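A rough sketch of how such per-token features might be assembled is given here. Everything in it is an assumption for illustration: the three lexicons are tiny hand-made stand-ins for WordNet, the FrameNet/Levin word lists and the subjectivity clue list, the POS tags are assigned by hand, and constituent-type features are left out.

```python
from typing import Dict, List

# Hypothetical toy lexicons standing in for the real resources.
HYPERNYMS = {"concern": ["anxiety", "emotion", "feeling", "psych state"]}
COMMUNICATION_WORDS = {"speak", "talk"}
SUBJECTIVITY_CLUES = {"abandoned": "weak", "amaze": "strong"}


def token_features(tokens: List[str], pos_tags: List[str], i: int) -> Dict[str, object]:
    """Build a feature dictionary for the token at position i."""
    word = tokens[i].lower()
    feats: Dict[str, object] = {
        "word": word,
        "pos": pos_tags[i],  # syntactic feature
        "is_communication_word": word in COMMUNICATION_WORDS,
        "subjectivity": SUBJECTIVITY_CLUES.get(word, "none"),
    }
    # One binary feature per hypernym of the current token
    for hypernym in HYPERNYMS.get(word, []):
        feats["hypernym=" + hypernym] = True
    return feats


tokens = ["Fans", "talk", "with", "concern"]
pos_tags = ["NNS", "VBP", "IN", "NN"]  # hand-assigned for the example
features = [token_features(tokens, pos_tags, i) for i in range(len(tokens))]
for token, feats in zip(tokens, features):
    print(token, feats)
```

Dictionaries of this shape are a common input format for CRF toolkits, but the exact format depends on the toolkit used.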
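Linear-chain CRF: undirected graphical model encoding a conditional probability distribution over the labels Y given the observations X, defined through a set of features.
[Figure: chain of label nodes Y(t-2) … Y(t+2), each linked to its observation node X(t-2) … X(t+2); example observations "are pretty much excited to" with labels O I I I O]
CRF-1 Model
- Each X is connected to its Y
- Consecutive Y’s are connected to each other
CRF-0 Model
- Each X is connected to its Y
- Basically a Maximum Entropy model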
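The difference between the two models shows up at decoding time, as the following sketch tries to illustrate. It is not the paper's model: the emission and transition scores are made-up numbers, whereas a real CRF learns feature weights from data. With an empty transition table, Viterbi decoding reduces to picking the best label for each token independently (the CRF-0 case); with transition scores, neighbouring labels influence each other (the CRF-1 case).

```python
from typing import Dict, List, Tuple

LABELS = ["O", "I"]


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the highest-scoring label sequence for the given scores."""
    best = [dict(emissions[0])]  # best[t][y]: best score of a path ending in y at t
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + transitions.get((p, y), 0.0))
            scores[y] = best[-1][prev] + transitions.get((prev, y), 0.0) + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):  # follow back-pointers from the last token
        path.append(pointers[path[-1]])
    return path[::-1]


tokens = ["are", "pretty", "much", "excited", "to"]
# Hypothetical per-token scores; "much" looks slightly more like O on its own.
emissions = [
    {"O": 2.0, "I": 0.0},
    {"O": 0.0, "I": 1.5},
    {"O": 0.6, "I": 0.4},
    {"O": 0.0, "I": 2.0},
    {"O": 2.0, "I": 0.0},
]
# Hypothetical transition scores that reward staying in the same label.
transitions = {("O", "O"): 0.5, ("I", "I"): 0.5}

crf0_path = viterbi(emissions, {})           # no label-label interaction
crf1_path = viterbi(emissions, transitions)  # consecutive labels interact
print("CRF-0:", crf0_path)
print("CRF-1:", crf1_path)
```

With these toy numbers the order-0 decoder splits the phrase at "much", while the order-1 decoder keeps "pretty much excited" as one span.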
documents for training, 400 for testing/evaluation (10-fold cross-validation)
§ Annotations of subjectivity phrases: Thousands of coup supporters celebrated overnight, waving flags, blowing whistles

Number of sentences       8297
Number of DSEs            6712
Number of ESEs            8640
Average length of DSEs    1.86 words
Average length of ESEs    3.33 words
colloquial language description
§ E.g. The opposition house roundly criticized the new education reforms bill. Gold standard (GS) = "roundly criticized", Predicted = "criticized"
§ To handle this, soft precision and soft recall are introduced (Exact vs Overlap)
§ In the overlap setting, a predicted span that overlaps a gold-standard span is counted as a match
* The talk presents results only in F-score
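The exact and overlap settings can be compared with a small scoring sketch. It assumes spans are half-open `(start, end)` token ranges and that any shared token counts as an overlap; the paper's precise definition of the soft measures may differ, and the function names are made up for the example.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token range (start, end)


def exact_match(a: Span, b: Span) -> bool:
    return a == b


def overlap_match(a: Span, b: Span) -> bool:
    # Two half-open ranges overlap if they share at least one token
    return a[0] < b[1] and b[0] < a[1]


def precision_recall_f1(gold: List[Span], pred: List[Span], mode: str = "exact"):
    """Score predicted spans against gold spans in 'exact' or 'overlap' mode."""
    if mode not in ("exact", "overlap"):
        raise ValueError("mode must be 'exact' or 'overlap'")
    match = exact_match if mode == "exact" else overlap_match
    matched_pred = sum(any(match(p, g) for g in gold) for p in pred)
    matched_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Token indices: The(0) opposition(1) house(2) roundly(3) criticized(4) ...
gold = [(3, 5)]  # "roundly criticized"
pred = [(4, 5)]  # "criticized"
exact_scores = precision_recall_f1(gold, pred, "exact")
overlap_scores = precision_recall_f1(gold, pred, "overlap")
print("exact:  ", exact_scores)
print("overlap:", overlap_scores)
```

On the slide's example, the prediction scores zero under exact matching and full marks under overlap matching, which is why both settings are reported.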
distinguishing subjective from objective sentences. (Two week’s talk on subjectivity)
§ Wilson et al., 2005: Identify the contextual polarity for a large subset of sentiment expressions (Earlier talk today)
Essentially, this paper does the same as the work of Wilson et al., 2005, but takes a CRF-based approach.
compared to the baseline methods used.
§ CRF-0 outperforms CRF-1 on F-score in the overlap setting, but CRF-1 turns out to be better for exact phrase matches.
§ Feature ablation results indicate that the WordNet features are by far the most useful.
§ Adding the other dictionary features is useful too, but their impact is very low compared to the other features.
of Opinion in Context." IJCAI. Vol. 7. 2007.
§ Wilson, Theresa, Janyce Wiebe, and Paul Hoffmann. "Recognizing contextual polarity in phrase-level sentiment analysis." Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2005.
§ Wiebe, Janyce, and Ellen Riloff. "Creating subjective and objective sentence classifiers from unannotated texts." Computational Linguistics and Intelligent Text Processing. Springer Berlin Heidelberg, 2005. 486-497.