Detecting Irony on Greek Political Tweets: A Text Mining Approach

Basilis Charalampakis, Dimitris Spathis, Elias Kouslis, Katia Kermanidis
Department of Informatics, Ionian University, Greece
EANN 2015, Rhodes (MHDW Workshop)

Topics: irony detection, text mining, Twitter sentiment, politics

Idea: how does humorous political commentary on social media relate to the election results?

Data*
44,438 Greek political tweets from May 2012, collected one week before and after the elections.
2 sub-datasets of tweets mentioning parties and leaders.
126 manually labelled tweets as the training set.
*Dataset available at di.ionio.gr/hilab

Data workflow
Cleaning: eliminating duplicates and artifacts (~20k).
Labelling: manually labelling 126 tweets as ironic or not.
Feature scoring: scoring every tweet in the dataset on each feature.
Training: machine learning with the labelled tweets as the training set and the rest of the dataset as the test set.
Counting: estimating the percentage of ironic tweets per party.
Correlating: looking for a trend between the ironic tweets each party received before the elections and its election result.
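The counting step of the workflow can be sketched as follows. This is a minimal illustration, not the authors' code: the input format (a list of `(party, is_ironic)` pairs produced by the classifier) and the placeholder party names are assumptions made for the example.

```python
from collections import defaultdict

def irony_percentage(labelled_tweets):
    """Return the percentage of ironic tweets per party.

    labelled_tweets: iterable of (party, is_ironic) pairs (assumed format).
    """
    totals = defaultdict(int)
    ironic = defaultdict(int)
    for party, is_ironic in labelled_tweets:
        totals[party] += 1
        ironic[party] += int(is_ironic)
    return {party: 100.0 * ironic[party] / totals[party] for party in totals}

# Hypothetical toy input with placeholder party names (not real data).
toy = [("party_a", True), ("party_a", False), ("party_b", True), ("party_b", True)]
print(irony_percentage(toy))
```

The resulting per-party percentages are what the correlating step would then compare against each party's election result.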

The Literature focuses on irony detection and computational linguistics
Detecting irony by examining the following features: signatures, unexpectedness, style, emotional scenarios, frequent trigrams and pleasantness (Reyes et al., 2013 & 2011).
The hashtag analysis method creates noisy results with low accuracy (Gonzalez-Ibanez et al., 2011).
Same dataset; positive-negative distinction; alignment between actual political results and web sentiment (Kermanidis & Maragoudakis, 2013).
High accuracy combining Amazon reviews, tweets and semi-supervised learning (Gonzalez-Ibanez et al., 2011).

Tools: Balkanet (WordNet), Python NLTK library, Weka

The Features take into consideration structural sentence formations and occurrences of unexpectedness
Spoken: spoken style applied in writing
Rarity: the rarest words
Meanings: number of WordNet synsets as a measure of ambiguity
Lexical: punctuation, prosodic repeated letters, metaphors
Emoticons: smiley faces etc.

Spoken
Verbal irony written as in daily chats, using dashes (-) and asterisks (*):
-γεια σου -τι γίνεται; *χειραψία* ("-hi -how's it going? *handshake*")
Use of spoken language is often related to unexpectedness.
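A minimal sketch of how such spoken-style markers might be counted with regular expressions. The two patterns and the one-point-per-marker scoring are assumptions for illustration; the slides do not specify the exact rules.

```python
import re

# Assumed pattern: a dash that opens a dialogue turn (at the start or after
# whitespace, directly followed by a word character).
DIALOGUE_DASH = re.compile(r"(?:^|\s)-(?=\w)")
# Assumed pattern: a word or phrase wrapped in asterisks, e.g. *handshake*.
STAGE_DIRECTION = re.compile(r"\*[^*\s][^*]*\*")

def spoken_score(tweet: str) -> int:
    """Count spoken-style markers (hypothetical scoring: one point each)."""
    dashes = len(DIALOGUE_DASH.findall(tweet))
    asterisks = len(STAGE_DIRECTION.findall(tweet))
    return dashes + asterisks

print(spoken_score("-γεια σου -τι γίνεται; *χειραψία*"))
```

Requiring the dash to follow whitespace keeps hyphenated words and emoticons such as `:-)` from being counted as dialogue turns.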

Rarity
Created a frequency dictionary of the dataset's words.
Isolated the rarest words.
Limited the upper bound to three occurrences.
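The rarity steps above can be sketched as follows. The tokenizer, the function names and the final per-tweet count are assumptions; the slides state only that a frequency dictionary was built and that rare words were capped at three occurrences.

```python
import re
from collections import Counter

TOKEN = re.compile(r"\w+")  # naive tokenizer (assumption)

def tokenize(text):
    return TOKEN.findall(text.lower())

def build_frequency(tweets):
    """Frequency dictionary over every word in the dataset."""
    return Counter(tok for tweet in tweets for tok in tokenize(tweet))

def rare_words(freq, max_count=3):
    """Words that occur at most max_count times in the dataset."""
    return {word for word, count in freq.items() if count <= max_count}

def rare_word_count(tweet, rare):
    """Number of rare-word tokens in one tweet (hypothetical scoring)."""
    return sum(1 for tok in tokenize(tweet) if tok in rare)

# Hypothetical toy corpus, not taken from the dataset.
corpus = ["a b c", "a b", "a b", "a b", "a d"]
rare = rare_words(build_frequency(corpus))
print(sorted(rare), rare_word_count("a c d", rare))
```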

Meanings
Balkanet (WordNet)
wn.synsets('χώρος', lang='ell')
[Synset('expanse.n.03'), Synset('space.n.02'), Synset('precinct.n.01'), Synset('space.n.03'), Synset('vacuum.n.03'), Synset('apartment.n.01'), Synset('space.n.01'), Synset('area.n.02'), Synset('space-time.n.01'), Synset('proportion.n.02'), Synset('size.n.01'), Synset('acreage.n.01')]
"We noticed that ironic tweets present words with more meanings (more WordNet synsets)" - Barbieri et al.

Lexical
Prosody: repeated letters
Metaphor: the words σα, σαν & σάν ("like", "as")
Punctuation: ! ? ... ;
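A rough sketch of how these three lexical cues might be counted. The threshold of three identical letters in a row, and treating each punctuation mark as one hit, are assumptions; the slides name only the cues themselves.

```python
import re

# Assumption: "repeated letters" means the same character three or more times in a row.
REPEATED_LETTER = re.compile(r"(\w)\1{2,}")
# Ellipsis counts once; ! ? ; count individually (; is the Greek question mark).
PUNCTUATION = re.compile(r"\.{3}|[!?;]")
SIMILE_WORDS = {"σα", "σαν", "σάν"}

def lexical_features(tweet):
    """Return raw counts for each lexical cue in one tweet."""
    tokens = re.findall(r"\w+", tweet.lower())
    return {
        "repeated_letters": len(REPEATED_LETTER.findall(tweet)),
        "punctuation": len(PUNCTUATION.findall(tweet)),
        "simile_words": sum(tok in SIMILE_WORDS for tok in tokens),
    }

print(lexical_features("Ο Καμμένος σαν λιοκαμένος μου φαίνεται"))
```

How the separate counts are combined into a single lexical score is not stated in the slides, so the sketch returns them separately.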

Emoticons
All possible variations of smiley, sad and mocking faces: :-) :-( :-D ...
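A small sketch of emoticon matching by regular expression. The pattern covers only a handful of Western-style emoticons, and the mapping of mouth characters to the three categories (for instance `P` as "mocking") is an assumption made for the example; the full list of variations used by the authors is not given.

```python
import re

# eyes, optional nose, one mouth character (deliberately small, assumed set)
EMOTICON = re.compile(r"[:;=]-?([()DPp])")
MOUTH_KIND = {")": "smiley", "D": "smiley", "(": "sad", "P": "mocking", "p": "mocking"}

def emoticon_counts(tweet):
    """Count emoticons per category in one tweet."""
    counts = {"smiley": 0, "sad": 0, "mocking": 0}
    for mouth in EMOTICON.findall(tweet):
        counts[MOUTH_KIND[mouth]] += 1
    return counts

print(emoticon_counts(":-) :-( :-D"))
```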

Example
"Ο Καμμένος σαν λιοκαμένος μου φαίνεται" ("Kammenos looks sunburnt to me")
Spoken score: 0
Rarity score: 1.67
Meaning score: 0
Lexical score: 1
Emoticon score: 0

Training Algorithms Performance

Is irony before elections related to the election results?

Hypothesis evaluation
No direct correlation with the election results. The percentage fluctuation shows a trend at the extremes: both the 'loser' parties, ND and PASOK, and the 'winner' parties, SYRIZA and XA, receive the most ironic tweets.

...meaning that many ironic tweets aimed at a political party could predict non-stagnation

Limitations
Average user: young, educated, mostly liberal.
3.7% of Greece's population uses Twitter.

Further work
Stemmer, lemmatizer, grammatical conjugation.
More training data.
Semi-supervised learning.
