Hateful Signals In Indic Context and Where to Find Them

_themessier
Ketchup Talk, IIIT Delhi, November 17, 2023

Transcript

  1. Disclaimer: The following content contains extreme language (verbatim from social media), which does not reflect the opinions of myself or my collaborators. Reader's discretion is advised.
  2. Why is hate detection hard?
     • There is no universal definition of hate.
     • It changes with context, geography, and time.
     • It has power dynamics associated with it, yet there is no standard list of vulnerable groups.
     • Annotation is subjective from the point of view of NLP modeling.
     • Online, the issue becomes more complicated due to:
       ◦ Anonymity
       ◦ Network virality effects
       ◦ Implicitness, which is hard to model
     The UN defines hate as “any kind of communication in speech, writing or behaviour, that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, based on their religion, ethnicity, nationality, race, colour, descent, gender or other identity factor.” [1]
     [1]: https://www.un.org/en/hate-speech/understanding-hate-speech/what-is-hate-speech
  3. Why is context important?
     Fig 1: Implicit hate speech [1]
     Fig 2: Hateful comments or hateful tweets [2]
     [1]: Focal Inferential Infusion Coupled with Tractable Density Discrimination for Implicit Hate Speech Detection
     [2]: https://hasocfire.github.io/hasoc/2021/ichcl/index.html
  4. Workflow for Analysing and Mitigating Online Hate Speech
     Fig 1: The various input signals (red), models (green) and user groups (blue) involved in analysing hate speech. [1]
     [1]: Tanmoy and Sarah, Nipping in the bud: detection, diffusion and mitigation of hate speech on social media, ACM SIGWEB Winter, Invited Publication
  5. Types of Signals: Auxiliary and Within Dataset
     Signals can be endogenous or exogenous, and can come from auxiliary data or from within the dataset itself. Within-data signals include:
     • Length of comments
     • # Punctuations, capitalization
     • URLs, hashtags, emojis, etc.
     • Sentiment score
     • Readability score
     (A minimal feature-extraction sketch for these surface signals follows below.)
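As a quick illustration, the surface-level signals listed above are cheap to compute directly from the text. Below is a minimal Python sketch; the function name is ours, and sentiment/readability scores are left out since they usually come from external libraries (e.g. VADER, textstat).

```python
# Minimal sketch: surface-level "within data" signals for one comment.
import re
import string

def within_data_signals(comment: str) -> dict:
    """Compute simple surface features of the kind listed on the slide."""
    tokens = comment.split()
    return {
        "length_chars": len(comment),
        "length_tokens": len(tokens),
        "num_punct": sum(ch in string.punctuation for ch in comment),
        "num_caps": sum(ch.isupper() for ch in comment),
        "num_urls": len(re.findall(r"https?://\S+", comment)),
        "num_hashtags": len(re.findall(r"#\w+", comment)),
        "num_mentions": len(re.findall(r"@\w+", comment)),
    }

print(within_data_signals("@user THIS is UNACCEPTABLE!!! #outrage https://t.co/x"))
```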
  6. Auxiliary Dataset Signal: Metadata
     • Twitter metadata:
       ◦ # Followers
       ◦ # Followees
       ◦ # Tweets / retweets / likes
       ◦ Account age, etc.
     Fig 1: Concatenating textual and metadata information from tweets for hate detection [1]
     [1]: Founta et al.
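The fusion in Fig 1 is essentially a concatenation of a text embedding with a normalized metadata vector ahead of a classifier head. A minimal PyTorch sketch, with illustrative dimensions and layer choices (not the exact configuration of Founta et al.):

```python
import torch
import torch.nn as nn

class TextPlusMetadata(nn.Module):
    def __init__(self, text_dim=768, meta_dim=4, hidden=128, n_classes=2):
        super().__init__()
        self.meta_norm = nn.BatchNorm1d(meta_dim)  # rescale raw counts (followers, etc.)
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + meta_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, text_emb, meta):
        # text_emb: (batch, text_dim), e.g. a [CLS] embedding of the tweet
        # meta:     (batch, meta_dim), e.g. followers, followees, tweets, account age
        fused = torch.cat([text_emb, self.meta_norm(meta)], dim=-1)
        return self.classifier(fused)

model = TextPlusMetadata()
logits = model(torch.randn(8, 768), torch.rand(8, 4))
```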
  7. Auxiliary Dataset Signal: User Network
     • Infusing network information with textual features [1].
     • node2vec is employed to map graphs to an embedding space [2].
     Fig 1: Infusing textual and network information for hate detection [1]
     [1]: Chowdhury et al., SRW-ACL’21
     [2]: Grover et al., KDD’16
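A hedged sketch of this pipeline, assuming the community `node2vec` pip package (built on gensim) and a toy follower graph; in practice the graph and embedding dimensions are much larger:

```python
import networkx as nx
import numpy as np
from node2vec import Node2Vec  # pip install node2vec

# Toy follower graph; real graphs come from the 2-hop follow network.
G = nx.Graph()
G.add_edges_from([("u1", "u2"), ("u2", "u3"), ("u1", "u3"), ("u3", "u4")])

# Random walks + skip-gram, as in Grover et al. [2]; hyperparameters are toy values.
n2v = Node2Vec(G, dimensions=64, walk_length=20, num_walks=50, workers=1)
w2v = n2v.fit(window=5, min_count=1)           # gensim Word2Vec under the hood

user_emb = w2v.wv["u1"]                        # (64,) network embedding of user u1
text_emb = np.random.rand(768)                 # stand-in for a BERT tweet embedding
fused = np.concatenate([text_emb, user_emb])   # input to a downstream classifier, as in [1]
```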
  8. GOTHate Dataset
     • 7 neutrally seeded topics from Twitter
     • 50k tweets, of which 3k are hateful
     • Code-mixed
     Fig 1: Dataset stats [1]
     [1]: Kulkarni et al., Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment, KDD 2023
  9. Within Dataset Signal: Exemplars
     Offensive train sample (labelled corpus): "$MENTION$ $MENTION$ $MENTION$ AND Remember president loco SAID MEXICO WILL PAY FUC**kfu ck trump f*** gop f*** republicans Make go fund me FOR HEALTH CARE, COLLEGE EDUCATION , CLIMATE CHANGE, SOMETHING GOOD AND POSITIVE !! Not for a fucking wall go fund the wall the resistance resist $URL$"
     E1: Offensive train sample exemplar (can be same or different author): "$MENTION$ DERANGED DELUSIONAL DUMB DICTATOR DONALD IS MENTALLY UNSTABLE! I WILL NEVER VOTE REPUBLICAN AGAIN IF THEY DON'T STAND UP TO THIS TYRANT LIVING IN THE WHITE HOUSE! fk republicans worst dictator ever unstable dictator $URL$"
     E2: Offensive train sample exemplar (can be same or different author): "$MENTION$ COULD WALK ON WATER AND THE never trump WILL CRAP ON EVERYTHING HE DOES. SHAME IN THEM. UNFOLLOW ALL OF THEM PLEASE!"
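One plausible way to surface exemplars like E1 and E2 is nearest-neighbour retrieval among training samples that share the query's label. The sketch below is an illustrative assumption, not necessarily the exact GOTHate procedure:

```python
import numpy as np

def top_k_exemplars(query_idx, embs, labels, k=2):
    """Indices of the k most similar same-label samples, excluding the query itself."""
    q = embs[query_idx]
    cand = np.where(labels == labels[query_idx])[0]
    cand = cand[cand != query_idx]
    sims = embs[cand] @ q / (
        np.linalg.norm(embs[cand], axis=1) * np.linalg.norm(q) + 1e-9
    )  # cosine similarity to the query
    return cand[np.argsort(-sims)[:k]]

rng = np.random.default_rng(0)
embs = rng.normal(size=(100, 768))      # stand-in sentence embeddings of the corpus
labels = rng.integers(0, 3, size=100)   # toy labels, e.g. neutral / offensive / hateful
print(top_k_exemplars(0, embs, labels, k=2))
```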
  10. "look at what Hindus living in mixed-population localities are facing,

    what Dhruv Tyagi had to face for merely asking his Muslim neighbors not to sexually harass his daughter...and even then, if u ask why people don’t rent to Muslims, get ur head examined $MENTION\$ $MENTION\$ naah...Islamists will never accept Muslim refugees, they will tell the Muslims to create havoc in their home countries and do whatever it takes to convert Dar-ul-Harb into Dar-ul Islam..something we should seriously consider doing with Pak Hindus too One of the tweet by author before Example 2 One of the tweet by author after Example 2 Accusatory tone timestamp t-1 Hateful tweet timestamp t Accusatory and instigating timestamp t+1 Auxiliary Dataset Signal: Timeline
  11. Contextual Signal Infusion for Hate Detection
      Fig 1: Motivation for auxiliary data signals [1]
      [1]: Kulkarni et al., Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment, KDD 2023
  12. Contextual Signal Infusion for Hate Detection
      HEN-mBERT: a History, Exemplar and Network infused mBERT model (a schematic sketch of the infusion idea follows below).
      Fig 1: Proposed model HEN-mBERT [1]
      [1]: Kulkarni et al., Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment, KDD 2023
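A schematic PyTorch sketch of the attentive-infusion idea: the tweet representation attends over the history, exemplar, and network signal embeddings before classification. The shapes, fusion order, and classifier head are simplifying assumptions; see [1] for the actual HEN-mBERT architecture.

```python
import torch
import torch.nn as nn

class AttentiveInfusion(nn.Module):
    def __init__(self, dim=768, n_heads=8, n_classes=4):  # n_classes is dataset-specific
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, n_classes)

    def forward(self, tweet_emb, signal_embs):
        # tweet_emb:   (batch, dim)            mBERT [CLS] embedding of the tweet
        # signal_embs: (batch, n_signals, dim) history / exemplar / network embeddings
        q = tweet_emb.unsqueeze(1)                       # (batch, 1, dim) query
        infused, _ = self.attn(q, signal_embs, signal_embs)
        fused = torch.cat([tweet_emb, infused.squeeze(1)], dim=-1)
        return self.classifier(fused)

model = AttentiveInfusion()
logits = model(torch.randn(4, 768), torch.randn(4, 3, 768))
```

Letting the tweet attend over the signals (rather than plain concatenation) is what allows noisy signals to be down-weighted per instance, which matches the first takeaway on the next slide.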
  13. Contextual Signal Infusion for Hate Detection: Takeaways
      • O: Attentive infusion of signals appears to reduce the noisy information in them.
      • T: No single signal significantly dominates the others; different signals seem to help different classes.
      • T: Combining all 4 signals improves hate detection by 5 macro-F1 points.
      Fig 1: Baseline and ablation [1]
      [1]: Kulkarni et al., Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment, KDD 2023
  14. RETINA Dataset
      • Crawled a large-scale Twitter dataset:
        ◦ Timelines
        ◦ Follow network (2 hops)
        ◦ Metadata
      • Manually annotated a total of 17k tweets.
      • Trained a hate detection model on the dataset.
      • Additionally crawled online news articles (600k).
  15. Exogenous Signal: Topical Affinity of Users
      • Different users show varying tendencies to engage with hateful content depending on the topic (see the sketch below).
      • Hate speech spreads faster and over a shorter period.
      Fig 1: Hatefulness of different users towards different hashtags in RETINA [1]
      Fig 2: Retweet cascades for hateful and non-hate tweets in RETINA [1]
      [1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
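The per-user, per-hashtag hatefulness of Fig 1 boils down to a grouped hate rate. A minimal pandas sketch with illustrative column names and toy data:

```python
import pandas as pd

# Toy labelled tweets; in RETINA, labels come from the trained hate detector.
tweets = pd.DataFrame({
    "user":    ["u1", "u1", "u1", "u2", "u2"],
    "hashtag": ["#tagA", "#tagA", "#tagB", "#tagA", "#tagB"],
    "is_hate": [1, 0, 1, 0, 0],
})

# Fraction of each user's tweets per hashtag that are hateful.
affinity = tweets.groupby(["user", "hashtag"])["is_hate"].mean().unstack(fill_value=0)
print(affinity)  # rows: users, columns: hashtags, values: hate rate
```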
  16. Exogenous Signal: Influence of News
      • Crawled a large-scale Twitter dataset.
      • Manually annotated a total of 17k tweets.
      • Additionally crawled online news articles (600k).
      XN: News headline; XT: Incoming tweet
      [1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
  17. Context Infused Retweet Prediction
      [1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
  18. Context Infused Retweet Prediction
      Fig 1: Exogenous attention mechanism [1]
      Fig 2: Static retweet prediction [1]
      Fig 3: Dynamic retweet prediction [1]
      [1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
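A minimal sketch of the exogenous-attention idea in Fig 1: the incoming tweet XT attends over candidate news headlines XN, and the attended context feeds a retweet predictor. The bilinear scoring and the regression head here are illustrative assumptions, not the exact formulation of [1].

```python
import torch
import torch.nn as nn

class ExogenousAttention(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        self.score = nn.Bilinear(dim, dim, 1)   # relevance of each headline to the tweet
        self.predict = nn.Linear(2 * dim, 1)    # e.g. predicted (log) retweet count

    def forward(self, tweet, news):
        # tweet: (batch, dim) embedding of XT; news: (batch, n_headlines, dim) embeddings of XN
        t = tweet.unsqueeze(1).expand_as(news)
        alpha = torch.softmax(self.score(t, news).squeeze(-1), dim=-1)  # (batch, n_headlines)
        context = (alpha.unsqueeze(-1) * news).sum(dim=1)               # (batch, dim)
        return self.predict(torch.cat([tweet, context], dim=-1))

model = ExogenousAttention()
pred = model(torch.randn(4, 768), torch.randn(4, 10, 768))
```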
  19. Context Infused Retweet Prediction
      Fig 1: Baseline comparisons [1]
      Fig 2: Behaviour of cascades for different baselines; darker bars are hate [1].
      [1]: Masud et al., Hate is the New Infodemic: A Topic-aware Modeling of Hate Speech Diffusion on Twitter, ICDE 2021
  20. Political Attacks During Assembly Elections
      • We shortlisted 100 politicians active on Twitter and associated with the states contesting elections; they cover 17 parties and political groups in total.
      Legend: T: tweets, U: unique politicians, R: retweets, L: likes
  21. [1]: Masud & Chakraboty, Political mud slandering and power dynamics

    during Indian assembly elections, SNAM Promotion vs Demotion • Employ manual annotations to mark promotion and demotion among the 1.7k manually annotated samples. • INC the largest opposition party at center (in terms of resources) attacks BJP the most (most of the attacks are criticisms). • BJP focuses more on self-promotion. Among the parties it attacks the most after self-promotion, it is INC (no surprise).
  22. Manual vs Large Scale Pseudo Labels
      • Attacks increase during election weeks, when in-person rallies are held.
      • Neutral promotional content remains high even before and after elections, hinting at year-round activity by political parties.
      • The neutral-to-attack ratio is 3:2 in the manually annotated samples, but 1:1 in the predicted samples (possibly over-predicting attacks?).
      • Direct attacks overshadow implicit ones by 2:1 in the manual samples and 3:1 in the predicted samples.
      [1]: Masud & Chakraborty, Political mud slandering and power dynamics during Indian assembly elections, SNAM