are held every 5 years at the national and state level, respectively. Both elect representatives to the national and state-level assembly houses. • At the national level there are 2 major parties, Bharatiya Janata Party (BJP) and Indian National Congress (INC); they contest elections in all states, either independently or via alliances. • Other regional and national parties contest elections in fewer states, e.g. AAP (Delhi, Goa, Punjab) and SP (mainly UP).
in power at the national/central level. • In this study we look at the Indian Assembly elections of February 2022, in which 5 states went to the polls. ◦ Uttar Pradesh (UP) ◦ Punjab ◦ Manipur ◦ Goa ◦ Uttarakhand • Interestingly, UP contributes far more seats at the central level (80) than the second-highest state (48).
almost 24M active users on Twitter (out of a population of 1.4B). • Most political parties and groups have official Twitter accounts, as do most politicians. • Political attacks via social media posts, videos and memes are very common in India (as is the spread of fake news and lynching rumours…). • Support for Indic languages on social media has further accelerated its use by political parties to communicate with the general public all year round.
elections were held in February, we ran our data collection on a weekly basis from January to March 2022 (i.e. before, during and after the elections). • We shortlisted 100 politicians active on Twitter who are associated with the states contesting elections. They cover 17 parties and political groups in total. • Using general knowledge, Twitter bios and Wikipedia, we mapped the politicians to their political groups.
Election 2022 • We also scraped the official Twitter handles of 6 parties. • We collected 45k tweets in this process. ◦ 32k from politicians • Hindi, English and Punjabi are the top-3 contributing languages.
◦ Replace URLs and USER mentions ◦ Remove other special characters. • Challenges: ◦ Removing special characters can break the detection of hashtags… #<WORDS> ◦ Replacing USER mentions makes it hard to understand who the target is in cases of mudslinging. ◦ Hindi characters are treated as special characters or punctuation and removed completely. ◦ How to handle code mixing?
information. ◦ Pick which aspects are critical ◦ Add separate preprocessing based on the detected language (lowercasing for English) ◦ For code mixing, how should embeddings be generated? • Our approach: ◦ User mentions kept intact ◦ Punctuation removed only after hashtags are detected separately. ◦ URLs removed, as we do not build systems that require searching (what if this were about fact checking rather than hatefulness?).
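The approach listed above can be sketched roughly as follows. This is a minimal illustration and not the pipeline used in the study: the function name `preprocess`, the regular expressions and the sample tweet are all assumptions. One detail worth showing is that Python's `\w` does not match Devanagari or Gurmukhi combining vowel signs, so a naive "remove non-word characters" step would split Hindi and Punjabi words; the sketch lists those Unicode ranges explicitly.

```python
import re

# Devanagari (minus the danda punctuation marks U+0964/U+0965) and Gurmukhi.
# Python's \w skips combining vowel signs, so the ranges are spelled out
# to keep Hindi and Punjabi words whole. (Assumed ranges, for illustration.)
INDIC = "\u0900-\u0963\u0966-\u097F\u0A00-\u0A7F"

URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(rf"#[\w{INDIC}]+")
# Anything that is not a word character, whitespace, '@', '#' or Indic script.
PUNCT_RE = re.compile(rf"[^\w\s@#{INDIC}]")


def preprocess(tweet):
    """Return (cleaned_text, hashtags) for one tweet.

    Order matters: URLs are dropped first, hashtags are detected next,
    and punctuation is removed last. User mentions are left intact.
    """
    text = URL_RE.sub(" ", tweet)
    hashtags = HASHTAG_RE.findall(text)
    text = PUNCT_RE.sub(" ", text)
    text = " ".join(text.split())  # collapse repeated whitespace
    return text, hashtags


# Made-up example tweet ("Come to the rally!"), not taken from the dataset.
sample = "रैली में आइए! @SomeUser #चुनाव2022 https://example.com/x"
cleaned, tags = preprocess(sample)
print(cleaned)
print(tags)
```

Language-specific steps such as lowercasing English tokens could be added after language detection; they are left out here because the source lists them only as an option.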
are not hate speech, as political parties and politicians are not a protected class. • We manually annotated 1.7k tweets with one of three attack labels: explicit, implicit or none. • For BJP, INC, SP and AAP we also annotated whether each tweet is self-promotion, demotion of the opposition, or both.
two models and use the pseudo-labels generated from them for large-scale annotation of the rest of the tweets. • Here we tested 2 approaches: ◦ N-gram based logistic regression approach ◦ Multilingual large language model approach
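To make the first approach concrete, the feature side of an n-gram model can be sketched as below. Whether the study used word or character n-grams, and which ranges, is not stated here, so both helpers and their default ranges are assumptions. The resulting counts would be the input to a logistic regression classifier, which is not shown.

```python
from collections import Counter


def word_ngrams(text, n_max=2):
    """Count word n-grams from unigrams up to n_max (assumed default: bigrams)."""
    tokens = text.split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def char_ngrams(text, n_min=2, n_max=4):
    """Count character n-grams; the text is padded with spaces at both ends.

    Character n-grams need no tokenizer, which is convenient for
    code-mixed Hindi/English/Punjabi text.
    """
    padded = f" {text} "
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts


features = word_ngrams("vote for change")
print(sorted(features))
```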
generated annotations. ◦ PA: Party handle ◦ PO: Politician • Overall, # of neutral > explicit > implicit. • Implicit attacks are harder to detect. • Explicit attacks catch the reader's attention faster. • The curated dataset has 695 (resp. 23,838) neutral, 696 (resp. 17,771) explicit, and 329 (resp. 4,858) implicit instances of manually annotated (resp. model-predicted) samples of political attacks, represented in pictogram A (resp. D).
machine annotations follow a similar trend. • Attacks increase during election weeks, when in-person rallies were held. • Neutral promotional content remains high even before and after the elections (hinting at year-round activity by political parties).
in manually annotated samples. • The ratio is 1:1 in predicted samples (possibly over-predicting attacks?). • Direct attacks in manual and predicted samples overshadow implicit ones by 2:1 and 3:1, respectively.
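These ratios can be checked against the label counts reported for the curated dataset. The sketch below assumes the counts read as 23,838, 17,771 and 4,858 for the predicted samples; the helper name `ratios` is hypothetical.

```python
# Label counts as reported for the curated dataset.
manual = {"neutral": 695, "explicit": 696, "implicit": 329}
predicted = {"neutral": 23838, "explicit": 17771, "implicit": 4858}


def ratios(counts):
    """Summarise one set of label counts."""
    attacks = counts["explicit"] + counts["implicit"]
    return {
        "total": sum(counts.values()),
        "attack_to_neutral": attacks / counts["neutral"],
        "explicit_to_implicit": counts["explicit"] / counts["implicit"],
    }


for name, counts in (("manual", manual), ("predicted", predicted)):
    summary = ratios(counts)
    print(
        name,
        summary["total"],
        round(summary["attack_to_neutral"], 2),
        round(summary["explicit_to_implicit"], 2),
    )
```

With these counts the explicit-to-implicit ratio comes out at about 2.1 for manual and about 3.7 for predicted samples, and the predicted attack-to-neutral ratio is about 0.95, i.e. close to 1:1.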
testing, explicit attacks receive more retweets and likes than implicit ones in machine annotations, but the results are not significant in manually annotated samples. • Can we trust one statistical test over the other? • Probably not, because the machine-annotated samples are 20x larger.
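The sample-size caveat can be illustrated with a small calculation. The test used in the study is not named here, so the sketch below substitutes a two-sample z-test with equal group sizes and unit variance; the effect size of 0.1 and the group sizes of 300 and 6,000 are hypothetical values chosen only to show a 20x difference.

```python
import math


def two_sided_p(effect_size, n_per_group):
    """Two-sided p-value of a two-sample z-test.

    Assumes equal group sizes and unit variance in both groups, so the
    standardised difference in means has z = d * sqrt(n / 2).
    """
    z = effect_size * math.sqrt(n_per_group / 2)
    return math.erfc(abs(z) / math.sqrt(2))


d = 0.1  # hypothetical standardised difference (e.g. in retweets)
p_small = two_sided_p(d, 300)    # order of a manually annotated subset
p_large = two_sided_p(d, 6000)   # 20x larger, machine-annotated scale

print(f"n=300:  p = {p_small:.3f}")
print(f"n=6000: p = {p_large:.2e}")
```

The same underlying effect is far from significant in the small sample and highly significant in the large one, so a significant result on machine annotations alone says more about sample size than about the strength of the effect.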
unique hashtags employed by each party and then manually assign them a promotion or demotion value. ◦ We observe that most parties use promotional hashtags more than demotion ones. ◦ This is most prominent for the incumbent BJP, which operates from a position of comfort and can therefore use implicit demotion/challenging hashtags. ◦ Opposition parties, on the other hand, employed more directly challenging hashtags, as they are attacking the one in power.
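The counting step behind this comparison can be sketched as follows. The party names, hashtags and polarity labels below are invented placeholders, and `hashtag_profile` is a hypothetical helper; in the study the promotion/demotion value of each hashtag was assigned manually. The sketch counts hashtag usage per party, which is one possible reading of the analysis.

```python
from collections import Counter, defaultdict


def hashtag_profile(tweets, polarity):
    """Count promotion vs. demotion hashtag uses per party.

    tweets:   iterable of (party, list_of_hashtags) pairs
    polarity: dict mapping a lowercased hashtag to 'promotion' or 'demotion'
              (hashtags without a manual label are skipped)
    """
    profile = defaultdict(Counter)
    for party, hashtags in tweets:
        for tag in hashtags:
            label = polarity.get(tag.lower())
            if label is not None:
                profile[party][label] += 1
    return profile


# Placeholder data, not taken from the dataset.
polarity = {"#voteforpartya": "promotion", "#partybfails": "demotion"}
tweets = [
    ("PartyA", ["#VoteForPartyA"]),
    ("PartyA", ["#VoteForPartyA", "#PartyBFails"]),
    ("PartyB", ["#Unlabelled"]),
]

for party, counts in hashtag_profile(tweets, polarity).items():
    print(party, dict(counts))
```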
to mark promotion and demotion among the 1.7k manually annotated samples. • INC, the largest opposition party at the centre (in terms of resources), attacks BJP the most (most of these attacks are criticisms). • BJP focuses more on self-promotion; of the parties it attacks, INC is the most frequent target (no surprise).
AAP and SP have to balance promotion and demotion. • SP was focused on the elections in UP, which is BJP’s stronghold; hence it attacks BJP as much as it self-promotes. • AAP was focused on the elections in Punjab, where INC and BJP are on an equal footing, and we see that in the distribution of attacks by AAP.
a specific election season. • These dynamics change from one election to the next and from one state to the next. ◦ Recently, in the 2023 state elections, BJP's similar promotional tactics did not help in Karnataka. ◦ While BJP came to power in Manipur, it has not been able to control the ongoing political tension and civic unrest. ◦ While SP has grown in popularity both online and offline (as visible from its vote share at the state level), this does not indicate that it will be able to retain the same support in the general elections.
or bane depending on how the audience perceives it, and who wins the elections. • Political parties should engage in critical political attacks without referring to the gender or caste of politicians, so as not to make the criticism hateful in nature. • We need information curated from multiple sources like Twitter, Facebook, WhatsApp and news articles to be able to establish an overall sense of how politics shapes social discourse and vice versa. Until then, studies like ours remain an anecdotal commentary.