Disclaimer
The following content contains extreme language (reproduced verbatim from social media), which does not reflect the opinions of myself or my collaborators. Reader discretion is advised.
Why is hate detection hard?
● No universal definition of hate.
● Definitions change with context, geography, and time.
● Hate carries power dynamics, yet there is no standard list of vulnerable groups.
● Annotation is subjective from the point of view of NLP modeling.
● Online, the problem becomes more complicated due to:
○ Anonymity
○ Network virality effects
○ Implicitness, which is hard to model
The UN defines hate speech as "any kind of communication in speech, writing or behaviour, that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, based on their religion, ethnicity, nationality, race, colour, descent, gender or other identity factor." [1]
[1]: https://www.un.org/en/hate-speech/understanding-hate-speech/what-is-hate-speech
Why is context important?
Fig 1: Implicit hate speech [1]
Fig 2: Hateful comments or hateful tweets [2]
[1]: Focal Inferential Infusion Coupled with Tractable Density Discrimination for Implicit Hate Speech Detection
[2]: https://hasocfire.github.io/hasoc/2021/ichcl/index.html
Workflow for Analysing and Mitigating Online Hate Speech
Fig 1: The various input signals (red), models (green), and user groups (blue) involved in analysing hate speech. [1]
[1]: Tanmoy and Sarah, Nipping in the bud: detection, diffusion and mitigation of hate speech on social media, ACM SIGWEB Winter, Invited Publication
Types of Signals: Auxiliary and Within-Dataset
Signals fall into endogenous (within-data) signals, exogenous signals, and auxiliary data.
Within-data signals include:
● Length of comments
● # Punctuations, capitalization
● URLs, hashtags, emojis, etc.
● Sentiment score
● Readability score
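As a concrete illustration, the sketch below computes a few of the surface-level within-data signals listed above for a single tweet. The feature set is illustrative, and the sentiment and readability scores are left to off-the-shelf scorers (e.g., VADER or textstat); treat this as a toy example rather than the feature extractor of any particular paper.

```python
import re
import string

def endogenous_features(text: str) -> dict:
    """Surface-level (within-data) signals for a single comment/tweet."""
    tokens = text.split()
    n_chars = len(text)
    return {
        "length_tokens": len(tokens),
        "length_chars": n_chars,
        "num_punct": sum(ch in string.punctuation for ch in text),
        "caps_ratio": sum(ch.isupper() for ch in text) / max(n_chars, 1),
        "num_urls": len(re.findall(r"https?://\S+", text)),
        "num_hashtags": len(re.findall(r"#\w+", text)),
        "num_mentions": len(re.findall(r"@\w+", text)),
        "num_emojis": len(re.findall(r"[\U0001F300-\U0001FAFF]", text)),
        # sentiment / readability scores would come from external tools
    }

print(endogenous_features("FU** trump!! #resist https://t.co/xyz @user"))
```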
Auxiliary Dataset Signal: Metadata
● Twitter metadata:
○ # Followers
○ # Followees
○ # Tweets / Retweets / Likes
○ Account age, etc.
Fig 1: Concatenating textual and metadata information from tweets for hate detection [1]
[1]: Founta et al.
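A minimal sketch of the fusion idea in Fig 1, assuming a late-fusion setup where a pre-computed text embedding (e.g., a BERT [CLS] vector) is concatenated with normalized account metadata. The module names, dimensions, and metadata ordering are illustrative assumptions, not the architecture of Founta et al.

```python
import torch
import torch.nn as nn

class TextMetaClassifier(nn.Module):
    """Late fusion: concatenate a text embedding with account metadata features."""
    def __init__(self, text_dim=768, meta_dim=5, hidden=128, n_classes=2):
        super().__init__()
        self.meta_norm = nn.BatchNorm1d(meta_dim)  # metadata scales differ wildly
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + meta_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, text_emb, meta):
        # text_emb: (batch, text_dim), e.g. a BERT [CLS] vector
        # meta: (batch, meta_dim), e.g. [followers, followees, tweets, likes, account_age]
        fused = torch.cat([text_emb, self.meta_norm(meta)], dim=-1)
        return self.classifier(fused)

model = TextMetaClassifier()
logits = model(torch.randn(4, 768), torch.rand(4, 5))  # toy usage
```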
Auxiliary Dataset Signal: User Network
● Infusing network information with textual features [1].
● node2vec is employed to map the graph into an embedding space [2].
Fig 1: Infusing textual and network information for hate detection [1]
[1]: Chowdhury et al., SRW-ACL'21
[2]: Grover et al., KDD'16
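A rough sketch of the second bullet: uniform random walks over a toy graph fed to Word2Vec. This is a simplification of node2vec, which additionally biases the walks with return/in-out parameters p and q; the graph and dimensions are placeholders for the actual follower/mention network.

```python
import random
import networkx as nx
from gensim.models import Word2Vec

def random_walks(G, num_walks=10, walk_len=20):
    """Uniform random walks; node2vec biases these with its p/q parameters."""
    walks = []
    for _ in range(num_walks):
        for node in G.nodes():
            walk = [node]
            while len(walk) < walk_len:
                nbrs = list(G.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(random.choice(nbrs))
            walks.append([str(n) for n in walk])
    return walks

G = nx.karate_club_graph()          # stand-in for the user follow network
walks = random_walks(G)
emb = Word2Vec(walks, vector_size=64, window=5, min_count=1, sg=1).wv
user_vec = emb["0"]                 # network embedding for user 0, later
                                    # concatenated with the tweet's textual features
```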
Within Dataset Signal: Exemplars
Offensive train sample (labelled corpus):
"$MENTION$ $MENTION$ $MENTION$ AND Remember president loco SAID MEXICO WILL PAY FUC**kfu ck trump f*** gop f*** republicans Make go fund me FOR HEALTH CARE, COLLEGE EDUCATION , CLIMATE CHANGE, SOMETHING GOOD AND POSITIVE !! Not for a fucking wall go fund the wall the resistance resist $URL$"
E1: Offensive train sample exemplar (can be by the same or a different author):
"$MENTION$ DERANGED DELUSIONAL DUMB DICTATOR DONALD IS MENTALLY UNSTABLE! I WILL NEVER VOTE REPUBLICAN AGAIN IF THEY DON'T STAND UP TO THIS TYRANT LIVING IN THE WHITE HOUSE! fk republicans worst dictator ever unstable dictator $URL$"
E2: Offensive train sample exemplar (can be by the same or a different author):
"$MENTION$ COULD WALK ON WATER AND THE never trump WILL CRAP ON EVERYTHING HE DOES. SHAME IN THEM. UNFOLLOW ALL OF THEM PLEASE!"
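One plausible way to pick such exemplars, sketched below: retrieve the training samples most similar to a query under TF-IDF cosine similarity, restricted to the query's label. This is only an illustration with made-up data; the exemplar-selection criterion in the cited work may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

train_texts = ["fk republicans worst dictator ever",
               "go fund the wall the resistance resist",
               "have a nice day everyone"]
train_labels = ["offensive", "offensive", "normal"]

vec = TfidfVectorizer().fit(train_texts)
X = vec.transform(train_texts)

def top_exemplars(text, label, k=2):
    """Return the k most similar training samples that share the query's label."""
    sims = cosine_similarity(vec.transform([text]), X)[0]
    ranked = [i for i in sims.argsort()[::-1] if train_labels[i] == label]
    return [train_texts[i] for i in ranked[:k]]

print(top_exemplars("worst dictator in the white house", "offensive"))
```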
"look at what Hindus living in mixed-population localities are facing, what Dhruv Tyagi had to face for merely asking his Muslim neighbors not to sexually harass his daughter...and even then, if u ask why people don’t rent to Muslims, get ur head examined $MENTION\$ $MENTION\$ naah...Islamists will never accept Muslim refugees, they will tell the Muslims to create havoc in their home countries and do whatever it takes to convert Dar-ul-Harb into Dar-ul Islam..something we should seriously consider doing with Pak Hindus too One of the tweet by author before Example 2 One of the tweet by author after Example 2 Accusatory tone timestamp t-1 Hateful tweet timestamp t Accusatory and instigating timestamp t+1 Auxiliary Dataset Signal: Timeline
Contextual Signal Infusion for Hate Detection
Fig 1: Motivation for auxiliary data signals [1]
[1]: Kulkarni et al., Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment, KDD 2023
Contextual Signal Infusion for Hate Detection
HEN-mBERT: History-, Exemplar- and Network-infused mBERT model.
Fig 1: Proposed model HEN-mBERT [1]
[1]: Kulkarni et al., Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment, KDD 2023
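A simplified sketch of the attentive-infusion idea (not the exact HEN-mBERT architecture): the tweet's mBERT representation attends over embeddings of the history, exemplar, and network signals, and the attended context is concatenated back before classification. Dimensions and head counts are assumptions for illustration.

```python
import torch
import torch.nn as nn

class SignalFusion(nn.Module):
    """Attend from the tweet's mBERT representation over auxiliary-signal embeddings."""
    def __init__(self, dim=768, n_classes=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(2 * dim, n_classes)

    def forward(self, tweet_emb, signal_embs):
        # tweet_emb: (B, dim) mBERT [CLS] vector
        # signal_embs: (B, S, dim) for S auxiliary signals, e.g. [history, exemplar, network]
        query = tweet_emb.unsqueeze(1)                     # (B, 1, dim)
        ctx, _ = self.attn(query, signal_embs, signal_embs)
        fused = torch.cat([tweet_emb, ctx.squeeze(1)], dim=-1)
        return self.classifier(fused)

model = SignalFusion()
logits = model(torch.randn(2, 768), torch.randn(2, 3, 768))  # toy usage
```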
Contextual Signal Infusion for Hate Detection: Takeaways
● Attentive infusion of the signals appears to reduce the noisy information in them.
● No single signal significantly dominates the others; different signals help different classes.
● Combining all four signals improves hate detection by 5 macro-F1 points.
Fig 1: Baselines and ablations [1]
[1]: Kulkarni et al., Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment, KDD 2023
RETINA Dataset
● Crawled a large-scale Twitter dataset, including:
○ Timelines
○ Follow network (2 hops)
○ Metadata
● Manually annotated a total of 17k tweets.
● Trained a hate detection model on the dataset.
● Additionally crawled online news articles (600k).
Exogenous Signal: Topical Affinity of Users
● Different users show varying tendencies to engage in hateful content depending on the topic.
● Hateful tweets spread faster and over a shorter period.
Fig 1: Hatefulness of different users towards different hashtags in RETINA [1]
Fig 2: Retweet cascades for hateful and non-hateful tweets in RETINA [1]
[1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
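The topical-affinity idea in the first bullet can be summarised per user as the fraction of their tweets on each hashtag that are hateful. A toy pandas sketch with made-up data follows; it is not the exact statistic plotted in Fig 1.

```python
import pandas as pd

# toy frame: one row per tweet, with its author, hashtag, and hate label
df = pd.DataFrame({
    "user":    ["u1", "u1", "u2", "u2", "u2"],
    "hashtag": ["#topicA", "#topicB", "#topicA", "#topicA", "#topicB"],
    "is_hate": [1, 0, 0, 1, 1],
})

# fraction of each user's tweets on each hashtag that are hateful
affinity = df.groupby(["user", "hashtag"])["is_hate"].mean().unstack(fill_value=0)
print(affinity)
```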
Exogenous Signal: Influence of News
● Crawled a large-scale Twitter dataset.
● Manually annotated a total of 17k tweets.
● Additionally crawled online news articles (600k).
X_N: News headline; X_T: Incoming tweet
[1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
Context-Infused Retweet Prediction
Fig 1: Exogenous attention mechanism [1]
Fig 2: Static retweet prediction [1]
Fig 3: Dynamic retweet prediction [1]
[1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
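A hedged sketch of the exogenous-attention idea in Fig 1: the tweet representation (X_T) attends over contemporaneous news-headline embeddings (X_N), and the attended context feeds a retweet-count regressor (the static setting). Dimensions, head counts, and the regression head are illustrative assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class ExogenousAttention(nn.Module):
    """Tweet representation attends over contemporaneous news-headline embeddings."""
    def __init__(self, dim=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
        self.regressor = nn.Linear(2 * dim, 1)   # predicted retweet count (static setting)

    def forward(self, tweet_emb, news_embs):
        # tweet_emb (X_T): (B, dim); news_embs (X_N): (B, N, dim) for N headlines
        query = tweet_emb.unsqueeze(1)
        news_ctx, _ = self.attn(query, news_embs, news_embs)
        fused = torch.cat([tweet_emb, news_ctx.squeeze(1)], dim=-1)
        return self.regressor(fused)

model = ExogenousAttention()
pred = model(torch.randn(2, 256), torch.randn(2, 8, 256))  # toy usage
```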
Context-Infused Retweet Prediction
Fig 1: Baseline comparisons [1]
Fig 2: Behaviour of cascades for different baselines; darker bars denote hate [1]
[1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
Political Attacks During Assembly Elections
● We shortlisted 100 politicians active on Twitter and associated with the states contesting elections; they cover 17 parties and political groups in total.
(T: tweets, U: unique politicians, R: retweets, L: likes)
Promotion vs Demotion
● We employ manual annotations to mark promotion and demotion among the 1.7k manually annotated samples.
● INC, the largest opposition party at the centre (in terms of resources), attacks BJP the most; most of these attacks are criticisms.
● BJP focuses more on self-promotion; after self-promotion, the party it attacks the most is INC (no surprise).
[1]: Masud & Chakraborty, Political mud slandering and power dynamics during Indian assembly elections, SNAM
Manual vs Large-Scale Pseudo Labels
● Attacks increase during election weeks when in-person rallies were held.
● Neutral promotional content remains high even before and after elections, hinting at the year-round activity of political parties.
● The neutral-to-attack ratio is 3:2 in the manually annotated samples.
● The ratio is 1:1 in the predicted samples (possibly over-predicting attacks).
● Direct attacks overshadow implicit ones by 2:1 in the manual samples and 3:1 in the predicted samples.
[1]: Masud & Chakraborty, Political mud slandering and power dynamics during Indian assembly elections, SNAM