Political Attack India

Analysing Social Media Text: A case study of political attacks in India
Sarah Masud, PhD @LCS2, IIITD; Visiting Researcher @TUM

Background into Indian Politics
• General elections and Assembly elections are held every 5 years at the national and state level, respectively. They elect representatives to the national and state assembly houses.
• At the national level there are 2 major parties: Bharatiya Janata Party (BJP) and Indian National Congress (INC). They contest elections in all states, either independently or via alliances.
• Other regional and national parties contest elections in fewer states, such as AAP (Delhi, Goa, Punjab) and SP (mainly UP).

Background into Indian Politics
• Since 2014, BJP has been in power at the national/central level.
• In this study we look at the Indian Assembly elections of February 2022, in which 5 states went to the polls:
  ◦ Uttar Pradesh (UP)
  ◦ Punjab
  ◦ Manipur
  ◦ Goa
  ◦ Uttarakhand
• Interestingly, UP contributes a large number of seats at the central level (80), compared to 48 for the second highest.

Image source: https://blog.google/intl/en-in/pollcheck-2022-digital-training-series-journalists-covering-upcoming-state-elections/

Use of Twitter to study Indian politics?
• India has almost 24M active users on Twitter (out of a population of 1.4B).
• Most political parties and groups have official Twitter accounts, as do most politicians.
• Political attacks via social media posts, videos and memes are very common in India (as are the spread of fake news and lynching rumours…).
• Support for Indic languages on social media has further accelerated its use by political parties to communicate with the general public all year round.

Step 1 of text analysis: Data Curation
• Collection
• Annotation
Social media text analysis: STEP 1
• Data Collection
• Data Analysis
• Data Preprocessing

Data Curation
• Twitter API and timeline crawler.
• The elections were held in February; we ran our data collection on a weekly basis from January to March 2022 (i.e., before, during and after the elections).
• We shortlisted 100 politicians active on Twitter and associated with the states contesting elections. They cover 17 parties and political groups in total.
• Using general knowledge, Twitter bios and Wikipedia, we mapped the politicians to their political groups.

Data Curation
• We scraped tweets with the hashtag <#STATENAME>Assembly Election 2022.
• We also scraped the official Twitter handles of 6 parties.
• We managed to collect 45k tweets in this process.
  ◦ 32k from politicians
• Hindi, English and Punjabi are the top-3 contributing languages.

Data Analysis
T: Tweets, U: Unique politicians, R: Retweets, L: Likes

Data Analysis
• BJP and INC are highly active.
• SP is the 3rd most active.
• SP has more interaction per tweet than BJP and INC. Is this finding of any significance?
• Are all followers REAL humans?
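
The interaction-per-tweet comparison above can be sketched as follows. This is a minimal illustration, not the study's code: it assumes "interaction" means retweets plus likes, reuses the T/R/L abbreviations from the table legend, and the party names and counts are made-up placeholders.

```python
def interactions_per_tweet(stats):
    """Return (retweets + likes) / tweets for every party with at least one tweet.

    `stats` maps a party name to a dict with keys "T" (tweets),
    "R" (retweets) and "L" (likes). The metric itself is an assumption.
    """
    return {
        party: (s["R"] + s["L"]) / s["T"]
        for party, s in stats.items()
        if s["T"] > 0
    }


# Hypothetical placeholder counts, not figures from the study.
example_stats = {
    "PartyA": {"T": 100, "R": 500, "L": 1500},
    "PartyB": {"T": 10, "R": 200, "L": 300},
}

per_tweet = interactions_per_tweet(example_stats)
ranking = sorted(per_tweet, key=per_tweet.get, reverse=True)
```

A party with few tweets can rank first on this metric, which is why raw activity and per-tweet interaction are worth reporting side by side.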

Data Preprocessing
• Perform the usual preprocessing:
  ◦ Remove emojis.
  ◦ Replace URLs and USER mentions.
  ◦ Remove other special characters.
• Challenges:
  ◦ Removing special characters can impact the detection of hashtags (#<WORDS>).
  ◦ Replacing USER mentions makes it hard to understand who the target is in cases of mud slandering.
  ◦ Hindi characters are treated as special characters and punctuation, and are removed completely.
  ◦ How to handle code mixing?

Data Preprocessing
• Trade-off between preprocessing and loss of information.
  ◦ Pick which aspects are critical.
  ◦ Add separate preprocessing based on the detected language (e.g., lower casing for English).
  ◦ For code mixing, how to generate embeddings?
• Our approach:
  ◦ User mentions are kept intact.
  ◦ Punctuation is removed after hashtags are detected separately.
  ◦ URLs are removed, as we do not build systems that require searching (what if this were about fact checking and not hatefulness?).
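
One way the approach above could look in code is sketched below. It is an illustration under stated assumptions rather than the pipeline used in the study: the function names are hypothetical, URLs are matched with a simple `https?://` pattern, and punctuation is filtered by Unicode category instead of `\w`, because Python's `\w` does not match Devanagari vowel signs and would otherwise break Hindi words apart.

```python
import re
import unicodedata

URL_RE = re.compile(r"https?://\S+")


def _strip_punct(text, keep="_"):
    """Drop punctuation (P*) and symbol (S*) characters, except those in `keep`.

    Letters and combining marks (including Devanagari vowel signs) survive;
    most emojis are in category So and are therefore removed.
    """
    return "".join(
        ch for ch in text
        if ch in keep or unicodedata.category(ch)[0] not in "PS"
    )


def preprocess(tweet):
    """Return (cleaned_text, hashtags) for one tweet.

    URLs are removed, user mentions and hashtags keep their leading marker,
    and hashtags are collected before any punctuation is stripped.
    """
    tweet = URL_RE.sub(" ", tweet)
    tokens, hashtags = [], []
    for token in tweet.split():
        prefix = token[0] if token[0] in "#@" else ""
        body = _strip_punct(token[len(prefix):])
        if not body:
            continue  # token was only punctuation, symbols or emojis
        if prefix == "#":
            hashtags.append(body)
        tokens.append(prefix + body)
    return " ".join(tokens), hashtags


cleaned, tags = preprocess("@user_1 का भाषण! #UPElection2022 https://t.co/abc 🙏")
```

Language-specific steps such as lower casing for English would sit on top of this, after language detection.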

Social media text analysis: STEP 2
• Data Annotation (manual)
• Data Annotation (modeling)

Manual Annotations of political attacks
• Political attacks, though offensive, are not hate speech, as political parties and politicians are not a protected class.
• We manually annotate 1.7k tweets with explicit, implicit and none labels of attack.
• For BJP, INC, SP and AAP, we also annotated whether the tweet is self-promotion, a demotion of the opposition, or both.

Annotations

Large-scale annotations
• We use the manual annotations to train two models, and use the pseudo-labels they generate for large-scale annotation of the rest of the tweets.
• Here we tested 2 approaches:
  ◦ N-gram based logistic regression approach
  ◦ Multilingual large language modeling approach
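
To make the first approach concrete, the sketch below shows how word n-gram counts could be extracted from a preprocessed tweet; such counts are the kind of features a logistic regression classifier would consume. The function name, whitespace tokenisation and the unigram-plus-bigram range are assumptions for illustration, not details from the study.

```python
from collections import Counter


def word_ngrams(text, n_max=2):
    """Count all word n-grams of length 1..n_max in a whitespace-tokenised text."""
    tokens = text.split()
    grams = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams


features = word_ngrams("vote for change vote for progress")
```

Because the features are plain token sequences, the same function works for Hindi, English or code-mixed text without any language-specific vocabulary.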

Large Scale Annotations Image Source: https://devopedia.org/n-gram-model

Large Scale Annotations
Can be any deep-learning-based system that can generate numeric embeddings for words.
Image Source: https://jalammar.github.io/illustrated-word2vec/

Annotations
• Upper row is manual annotation, lower row is machine-generated annotation.
  ◦ PA: Party handle
  ◦ PO: Politician
• Overall, # of neutral > explicit > implicit.
• Implicit is harder to detect.
• Explicit will catch the reader's attention faster.
• The curated dataset has 695 (resp. 23,838) neutral, 696 (resp. 17,771) explicit, and 329 (resp. 4,858) implicit instances of manually annotated (resp. model-predicted) samples of political attacks, represented in pictogram A (resp. D).
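
The class balance can be recomputed directly from the counts quoted above. The snippet below is a small helper for doing so; the counts are the ones stated on the slide, while the function name and the rounding to three decimals are choices made here for illustration.

```python
manual = {"neutral": 695, "explicit": 696, "implicit": 329}
predicted = {"neutral": 23838, "explicit": 17771, "implicit": 4858}


def shares(counts):
    """Return each label's share of the total, rounded to three decimals."""
    total = sum(counts.values())
    return {label: round(n / total, 3) for label, n in counts.items()}


manual_shares = shares(manual)
predicted_shares = shares(predicted)
```

Comparing the two dictionaries label by label gives a quick check on how far the model-predicted distribution drifts from the manually annotated one.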

Patterns in Volume of Attack
• Patterns from manual and machine annotations follow a similar trend.
• Attacks increase during election weeks, when in-person rallies were held.
• Neutral promotional content remains high even before and after the elections (hinting at year-round activity by political parties).

Patterns in Volume of Attack
• Neutral to attack is 3:2 in manually annotated samples.
• The ratio is 1:1 in predicted samples (over-predicting attack, maybe?).
• Direct attacks overshadow implicit ones by 2:1 and 3:1 in manual and predicted samples, respectively.

Social media text analysis: STEP 3
• Analysis and Findings

Patterns in Volume of Attack
• Based on significance testing, explicit attacks receive more retweets and likes than implicit ones in machine annotations, but the results are not significant in manually annotated samples.
• Can we trust one statistical test over the other?
• Probably not, because the machine-annotated samples are 20x in size.
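
The slides do not name the statistical test that was used. As one possible illustration, the sketch below runs a two-sided permutation test on the difference in mean retweets between explicit and implicit tweets. The function name, the choice of test, and the retweet counts are all assumptions made for this example.

```python
import random


def permutation_test(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference of group means.

    Returns an estimated p-value: the share of random relabellings whose
    mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical retweet counts, not data from the study.
explicit_retweets = [100] * 10
implicit_retweets = [0] * 10
p_value = permutation_test(explicit_retweets, implicit_retweets)
```

With the same underlying effect, a larger sample tends to yield a smaller p-value, which is one reason the manual and machine-annotated results above can disagree.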

Power dynamics of promotion and demotion
• Simply check the unique hashtags employed by each party and then manually assign them a promotion or demotion value.
  ◦ We observe that most parties use promotional hashtags more than demotional ones.
  ◦ This is most prominent for the incumbent BJP, which operates from a position of comfort and can therefore use implicit demotion/challenging hashtags.
  ◦ Opposition parties, on the other hand, employed more directly challenging hashtags, as they are attacking the one in power.
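
The hashtag procedure above can be sketched as follows: collect the unique hashtags per party, then tally them by a manually assigned stance. The function name, the case-insensitive matching, the "unlabelled" fallback, and the placeholder parties and hashtags are all assumptions for illustration.

```python
from collections import Counter


def stance_profile(tweets, stance_of):
    """Count each party's unique hashtags by manually assigned stance.

    `tweets` is an iterable of (party, hashtags) pairs; `stance_of` maps a
    lower-cased hashtag to "promotion" or "demotion".
    """
    unique = {}
    for party, hashtags in tweets:
        unique.setdefault(party, set()).update(tag.lower() for tag in hashtags)
    return {
        party: Counter(stance_of.get(tag, "unlabelled") for tag in tags)
        for party, tags in unique.items()
    }


# Hypothetical placeholder data, not hashtags from the study.
example_tweets = [
    ("PartyA", ["VoteForA", "BFails"]),
    ("PartyA", ["voteforA"]),
    ("PartyB", ["AFails", "AMustGo", "BForAll"]),
]
manual_stance = {
    "votefora": "promotion",
    "bforall": "promotion",
    "bfails": "demotion",
    "afails": "demotion",
    "amustgo": "demotion",
}

profile = stance_profile(example_tweets, manual_stance)
```

Counting unique hashtags rather than occurrences keeps a single heavily repeated campaign tag from dominating a party's profile.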

Power dynamics of promotion and demotion
• Employ manual annotations to mark promotion and demotion among the 1.7k manually annotated samples.
• INC, the largest opposition party at the centre (in terms of resources), attacks BJP the most (most of the attacks are criticisms).
• BJP focuses more on self-promotion. After self-promotion, the party it attacks the most is INC (no surprise).

Power dynamics of promotion and demotion
• Smaller parties like AAP and SP have to balance promotion and demotion.
• SP was focused on the elections in UP, which is BJP's stronghold; hence it attacks BJP as much as it self-promotes.
• AAP was focused on the elections in Punjab, where both INC and BJP have equal footing, and we see that in the distribution of attacks by AAP.

Conclusion
• Political attacks help us understand the power dynamics during a specific election season.
• These dynamics change from one election to the next and from one state to the next.
  ◦ Recently, in the 2023 state elections, similar promotional tactics did not help BJP in the Karnataka elections.
  ◦ While BJP came to power in Manipur, it has not been able to control the ongoing political tension and civic unrest.
  ◦ While SP has grown in popularity both online and offline (as visible from its vote share at state level), this does not indicate that it will be able to retain the same in the general elections.

Conclusion
• Use of name calling can prove a boon or a bane depending on how the audience perceives it, and who wins the elections.
• Political parties should engage in critical political attack without referring to the gender or caste of politicians, so as not to make the criticism hateful in nature.
• We need information curated from multiple sources like Twitter, Facebook, WhatsApp and news articles to be able to establish the overall sense of how politics shapes social discourse and vice versa. Until then, studies like ours remain anecdotal commentary.

Paper Link [email protected]

Thank You Q&A
