…hate.
• Changes with context, geography, and time.
• Has power dynamics associated with it, yet there is no standard list of vulnerable groups.
• Annotation is subjective from the point of view of NLP modelling.
• In the online setting, the issue becomes more complicated due to:
  ◦ Anonymity
  ◦ Network virality effect
  ◦ Implicitness, which is hard to model

The UN defines hate speech as "any kind of communication in speech, writing or behaviour, that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, based on their religion, ethnicity, nationality, race, colour, descent, gender or other identity factor." [1]

[1]: https://www.un.org/en/hate-speech/understanding-hate-speech/what-is-hate-speech
Tractable Density Discrimination for Implicit Hate Speech Detection

Fig 1: Implicit hate speech [1]
Fig 1: Hateful comments or hateful tweets [2]

[2]: https://hasocfire.github.io/hasoc/2021/ichcl/index.html
[1]: Chakraborty and Sarah Masud, Nipping in the Bud: Detection, Diffusion and Mitigation of Hate Speech on Social Media, ACM SIGWEB Winter Issue, Invited Publication

Fig 1: The various input signals (red), models (green), and user groups (blue) involved in analysing hate speech. [1]
Signals: Auxiliary Data vs. Within-Data

Within-Data Signals
• Length of comments
• # Punctuation, capitalization
• URLs, hashtags, emojis, etc.
• Sentiment score
• Readability score
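To make the within-data signals listed above concrete, here is a minimal sketch of how such hand-crafted features could be computed per comment, assuming the `textblob` and `textstat` packages for the sentiment and readability scores; the exact feature set and tools used in the cited works may differ.

```python
import re
from textblob import TextBlob   # sentiment polarity
import textstat                  # readability metrics

def within_data_signals(comment: str) -> dict:
    """Hand-crafted auxiliary features for a single comment
    (an illustrative sketch, not the exact published feature set)."""
    tokens = comment.split()
    return {
        "length_tokens": len(tokens),
        "num_punct":     sum(ch in "!?.,;:" for ch in comment),
        "caps_ratio":    sum(ch.isupper() for ch in comment) / max(len(comment), 1),
        "num_urls":      len(re.findall(r"https?://\S+", comment)),
        "num_hashtags":  comment.count("#"),
        "num_mentions":  comment.count("@"),
        "sentiment":     TextBlob(comment).sentiment.polarity,   # in [-1, 1]
        "readability":   textstat.flesch_reading_ease(comment),
    }
```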
◦ # Followers
◦ # Followees
◦ # Tweets/Retweets/Likes
◦ Account age, etc.

Fig 1: Concatenating textual and metadata information from tweets for hate detection [1]

[1]: Founta et al.
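A minimal PyTorch sketch of the concatenation idea in Fig 1: a text embedding (e.g., a BERT [CLS] vector) is fused with user-metadata counts before classification. Layer sizes, the choice of metadata fields, and the normalization are illustrative assumptions, not the configuration of Founta et al.

```python
import torch
import torch.nn as nn

class TextMetaClassifier(nn.Module):
    """Late fusion sketch: concatenate a text embedding with user metadata
    (followers, followees, tweet count, account age, ...)."""
    def __init__(self, text_dim=768, meta_dim=5, hidden=128, num_classes=2):
        super().__init__()
        self.meta_norm = nn.BatchNorm1d(meta_dim)  # metadata counts vary wildly in scale
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + meta_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, text_emb, meta_feats):
        # text_emb: (B, text_dim), e.g. BERT [CLS]; meta_feats: (B, meta_dim)
        fused = torch.cat([text_emb, self.meta_norm(meta_feats)], dim=-1)
        return self.classifier(fused)
```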
…textual feature [1].
• Node2vec is employed to map the graph into an embedding space [2].

Fig 1: Infusing textual and network information for hate detection [1]

[1]: Chowdhury et al., SRW-ACL'21
[2]: Grover et al., KDD'16
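A small sketch of obtaining user embeddings with the `node2vec` package (an implementation of Grover et al., KDD'16); the toy graph and hyper-parameters below are placeholders, not the settings used in the cited work.

```python
import networkx as nx
from node2vec import Node2Vec   # pip package implementing Grover et al., KDD'16

# Toy follower graph; in practice this is the users' follow/retweet network.
G = nx.Graph()
G.add_edges_from([("u1", "u2"), ("u2", "u3"), ("u1", "u4")])

# Hyper-parameters are illustrative defaults, not the paper's configuration.
n2v = Node2Vec(G, dimensions=64, walk_length=20, num_walks=50, workers=2)
model = n2v.fit(window=5, min_count=1)

user_emb = model.wv["u1"]   # 64-d network embedding for user u1, later fused
                            # with the textual feature of that user's tweets
```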
…seeded topics from Twitter
• 50k tweets
• 3k hateful
• Code-mixed

[1]: Kulkarni et al., Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment, KDD 2023
Within Dataset Signal: Exemplars

Offensive train sample:
"PAY FUC**k fu ck trump f*** gop f*** republicans Make go fund me FOR HEALTH CARE, COLLEGE EDUCATION, CLIMATE CHANGE, SOMETHING GOOD AND POSITIVE !! Not for a fucking wall go fund the wall the resistance resist $URL$"

E1: Offensive train sample exemplar from the labelled corpus (can be same or different author):
"$MENTION$ DERANGED DELUSIONAL DUMB DICTATOR DONALD IS MENTALLY UNSTABLE! I WILL NEVER VOTE REPUBLICAN AGAIN IF THEY DON'T STAND UP TO THIS TYRANT LIVING IN THE WHITE HOUSE! fk republicans worst dictator ever unstable dictator $URL$"

E2: Offensive train sample exemplar from the labelled corpus (can be same or different author):
"$MENTION$ COULD WALK ON WATER AND THE never trump WILL CRAP ON EVERYTHING HE DOES. SHAME IN THEM. UNFOLLOW ALL OF THEM PLEASE!"
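One plausible way to pick such exemplars is semantic nearest-neighbour search over the labelled corpus, sketched below with `sentence-transformers`; the encoder name, the top-k value, and the similarity criterion are assumptions, not necessarily the selection procedure used in the cited work.

```python
from sentence_transformers import SentenceTransformer, util

# Sketch of exemplar selection: for a training tweet, pick the most similar
# tweets carrying the same label from the labelled corpus as its exemplars.
encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def pick_exemplars(sample_text, same_label_texts, top_k=2):
    """Return the top_k most similar same-label tweets as exemplars."""
    sample_emb = encoder.encode(sample_text, convert_to_tensor=True)
    corpus_embs = encoder.encode(same_label_texts, convert_to_tensor=True)
    hits = util.semantic_search(sample_emb, corpus_embs, top_k=top_k)[0]
    return [same_label_texts[h["corpus_id"]] for h in hits]
```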
Auxiliary Dataset Signal: Timeline

One of the tweets by the author before Example 2 (accusatory tone, timestamp t-1):
"what Dhruv Tyagi had to face for merely asking his Muslim neighbors not to sexually harass his daughter...and even then, if u ask why people don't rent to Muslims, get ur head examined"

Hateful tweet (Example 2, timestamp t)

One of the tweets by the author after Example 2 (accusatory and instigating, timestamp t+1):
"$MENTION$ $MENTION$ naah...Islamists will never accept Muslim refugees, they will tell the Muslims to create havoc in their home countries and do whatever it takes to convert Dar-ul-Harb into Dar-ul Islam..something we should seriously consider doing with Pak Hindus too"
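A simple sketch of assembling this timeline signal: gather the author's tweets immediately before and after the target tweet, ordered by timestamp. The record schema (`id`, `text`, `timestamp`) and the window size are illustrative assumptions; the cited work encodes the resulting history with a transformer rather than using the raw text directly.

```python
def timeline_context(author_tweets, target_id, window=2):
    """Collect the author's tweets just before and after the target tweet.

    author_tweets: list of dicts with keys "id", "text", "timestamp".
    Returns the surrounding texts as the timeline (history) signal.
    """
    ordered = sorted(author_tweets, key=lambda t: t["timestamp"])
    idx = next(i for i, t in enumerate(ordered) if t["id"] == target_id)
    before = [t["text"] for t in ordered[max(0, idx - window): idx]]
    after = [t["text"] for t in ordered[idx + 1: idx + 1 + window]]
    return {"history_before": before, "history_after": after}
```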
Contextual Signal Infusion for Hate Detection
HEN-mBERT: History, Exemplar, and Network infused mBERT model.

Fig 1: Proposed model HEN-mBERT [1]

[1]: Kulkarni et al., Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment, KDD 2023
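The core idea of attentively infusing the history, exemplar, and network signals into the mBERT tweet representation could look roughly like the sketch below; the head count, layer shapes, and single attention block are assumptions, not the published HEN-mBERT architecture.

```python
import torch
import torch.nn as nn

class AttentiveSignalFusion(nn.Module):
    """Minimal sketch of attentive signal infusion: the tweet representation
    (mBERT [CLS]) attends over the contextual signals (history, exemplars,
    network embedding) so that noisier signals get down-weighted."""
    def __init__(self, dim=768, num_classes=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=8, batch_first=True)
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, tweet_emb, context_embs):
        # tweet_emb: (B, dim); context_embs: (B, n_signals, dim)
        query = tweet_emb.unsqueeze(1)                      # (B, 1, dim)
        fused, _ = self.attn(query, context_embs, context_embs)
        fused = fused.squeeze(1)                            # (B, dim)
        return self.classifier(torch.cat([tweet_emb, fused], dim=-1))
```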
…and Ablation [1]

• O: Attentive infusion of signals seems to help reduce the noisy information in them.
• T: No single signal significantly dominates the others. Different signals seem to help different classes.
• T: Combining all 4 signals leads to an improvement in hate detection of 5 macro-F1 points!

[1]: Kulkarni et al., Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment, KDD 2023
◦ Follow network (2-hops)
◦ Metadata
• Manually annotated a total of 17k tweets.
• Trained a hate detection model for our dataset.
• Additionally crawled online news articles (600k).
Fig 1: …different users towards different hashtags in RETINA [1]
Fig 2: Retweet cascades for hateful and non-hate tweets in RETINA [1]

• Different users show varying tendencies to engage with hateful content depending on the topic.
• Hate speech spreads faster, within a shorter time period.

[1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
[1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021

XN: News headline
XT: Incoming tweet

• Crawled a large-scale Twitter dataset.
• Manually annotated a total of 17k tweets.
• Additionally crawled online news articles (600k).
Context Infused Retweet Prediction

Fig 1: Baseline comparisons [1]
Fig 2: Behaviour of cascades for different baselines. Darker bars are hate [1].

[1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
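A rough sketch of context-infused retweet prediction: the incoming tweet representation (XT) attends over embeddings of exogenous news headlines (XN) before a regression head predicts the retweet count. The embedding dimension, attention setup, and simple regression head are assumptions, not the model of Masud et al.

```python
import torch
import torch.nn as nn

class ContextRetweetPredictor(nn.Module):
    """Illustrative topic-aware retweet predictor: fuse the incoming tweet
    (XT) with contemporary news-headline context (XN) via attention, then
    regress the expected retweet count."""
    def __init__(self, dim=768):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
        self.head = nn.Sequential(nn.Linear(2 * dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, tweet_emb, news_embs):
        # tweet_emb: (B, dim) for XT; news_embs: (B, n_headlines, dim) for XN
        q = tweet_emb.unsqueeze(1)
        ctx, _ = self.attn(q, news_embs, news_embs)
        return self.head(torch.cat([tweet_emb, ctx.squeeze(1)], dim=-1))
```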
R: Retweets; L: Likes

• We shortlisted 100 politicians active on Twitter, associated with the states contesting elections. They cover 17 parties and political groups in total.
[1]: Masud & Chakraborty, Political mud slandering and power dynamics during Indian assembly elections, SNAM

Promotion vs. Demotion
• Employ manual annotations to mark promotion and demotion among the 1.7k manually annotated samples.
• INC, the largest opposition party at the centre (in terms of resources), attacks BJP the most (most of the attacks are criticisms).
• BJP focuses more on self-promotion. Among the parties it attacks, the one it targets the most (after self-promotion) is INC (no surprise).
…rallies were held.
• Neutral promotional content remains high even before and after the elections (hinting at year-round activity by political parties).
• The neutral-to-attack ratio is 3:2 in the manually annotated samples.
• The ratio is 1:1 in the predicted samples (perhaps over-predicting attack?).
• Direct attacks in the manual and predicted samples overshadow implicit ones by 2:1 and 3:1, respectively.

Manual vs. Large-Scale Pseudo Labels

[1]: Masud & Chakraborty, Political mud slandering and power dynamics during Indian assembly elections, SNAM