in nature.
• Intensity/severity of hate speech captures how explicit the hate speech is.
• High-intensity hate is more likely to contain an offensive lexicon, offensive spans, direct attacks, and mentions of the target entity.
Consuming coffee is bad, I hate it! (the world can live with this opinion)
Let us bomb every coffee shop and kill all coffee makers (this is a threat)
Pyramid of Hate [1]
[1]: Pyramid of Hate
• Instagram
• YouTube
Semi-moderated
• Reddit
Unmoderated
• Gab
• 4chan
• BitChute
• Parler
• StormFront
Anonymity has led to an increase in anti-social behaviour [1], hate speech being one form of it [2].
[1]: J Suler
[2]: Luke Munn
post has been made and we are intervening to prevent it from spreading further.
Proactive countering
Intervene before the post goes public.
Strategies for countering hate speech [3]
[3]: Mudit et al.
text that counters the existing hate.
• Ask influential members of the community to help spread the counter-narrative.
Reactive Methods of Intervention
[4]: Manoel Horta Ribeiro et al.
Hate Interventions on Web
al.
Data Collection Strategy for Counter Narration
• CRAWL: real-world samples of both hate and counter-hate
• CROWD: real-world samples of hate and synthetic samples of counter-hate
• NICHE: synthetic samples of both hate and counter-hate
Characteristics of counter hate dataset [6]
Countering hate speech on Twitter [5]
in the study. 50% were randomly assigned to the control group.
• H1: Are prompted users less likely to post the current offensive content?
• H2: Are prompted users less likely to post offensive content in the future?
User behaviour statistics as a part of intervention study [7]
Twitter reply test for offensive replies [7]
[7]: Katsaros et al., ICWSM '22
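The 50/50 random assignment and the H1 comparison can be sketched as follows. This is a minimal illustration, not the procedure from [7]: the function names `assign_groups`, `posting_rate`, and `h1_supported` are hypothetical, and the comparison is a raw difference in rates rather than a statistical test.

```python
import random


def assign_groups(user_ids, seed=0):
    """Randomly split users 50/50 into control and treatment (prompted) groups."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}


def posting_rate(posted_flags):
    """Fraction of users who went ahead and posted the flagged content."""
    flags = list(posted_flags)
    return sum(flags) / len(flags) if flags else 0.0


def h1_supported(control_flags, treatment_flags):
    """H1 (descriptive only): prompted users post the flagged content less often."""
    return posting_rate(treatment_flags) < posting_rate(control_flags)


groups = assign_groups(range(1000))
print(len(groups["control"]), len(groups["treatment"]))
```

A real study would add a significance test and track the same users over time to address H2.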
is to obtain its normalized (sensitised) form $t'$ such that the intensity of hatred $\phi_{t}$ is reduced while the meaning is still conveyed. [1]
$$\phi_{t'} < \phi_{t}$$
Example of original high intensity vs normalised sentence [8]
[8]: Masud et al., Proactively Reducing the Hate Intensity of Online Posts via Hate Speech Normalization, KDD 2022
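The intensity constraint can be illustrated with a toy scorer. This is only a sketch under loud assumptions: `TOY_LEXICON` and its weights are invented for illustration, `phi` is a simple lexicon sum rather than the intensity model of [8], and the check covers only the intensity inequality, not whether the meaning is preserved. The two sentences are the coffee examples used earlier in the deck.

```python
# Hypothetical lexicon with made-up weights; not taken from [8].
TOY_LEXICON = {"hate": 3.0, "kill": 5.0, "bomb": 5.0}


def phi(text):
    """Toy hate-intensity score: sum of lexicon weights over the tokens."""
    tokens = (tok.strip(".,!?") for tok in text.lower().split())
    return sum(TOY_LEXICON.get(tok, 0.0) for tok in tokens)


def satisfies_intensity_constraint(original, candidate):
    """True when the candidate scores strictly lower than the original."""
    return phi(candidate) < phi(original)


high = "Let us bomb every coffee shop and kill all coffee makers"
low = "Consuming coffee is bad, I hate it!"
print(phi(high), phi(low), satisfies_intensity_constraint(high, low))
```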
datasets.
• Manually annotated for hate intensity and hateful spans.
• Hate intensity is marked on a scale of 1-10.
• Manual generation of the normalised counterpart and its intensity. (k = 0.88)
Original and Normalised Intensity Distribution [8]
Dataset Stats [8]
term, or an abusive term directly attacking a minority group/individual.
• A phrase that advocates violent action or hate crime against a group/individual.
• Negatively stereotyping a group/individual with unfounded claims or false criminal accusations.
• Hashtag(s) supporting one or more of the points mentioned above.
Span Labelling [1]
• Score [8-10]: The sample promotes hate crime and calls for violence against the individual/group.
• Score [6-7]: The sample is mainly composed of sexist/racist terms or portrays a sense of gender/racial superiority on the part of the person sharing the sample.
• Score [4-5]: The sample mainly consists of offensive hashtags, or most hateful phrases are in the form of offensive hashtags.
• Score [1-3]: The sample uses dark humor or implicit hateful terms.
Intensity Labelling [1]
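The intensity rubric above maps an integer score to one of four bands, which can be written as a small lookup. The names `INTENSITY_BANDS` and `describe_intensity` are hypothetical, and the band descriptions are shortened paraphrases of the slide text; this is a convenience sketch, not annotation tooling from [8].

```python
# (low, high, description) for each band of the 1-10 rubric.
INTENSITY_BANDS = [
    (8, 10, "promotes hate crime or calls for violence"),
    (6, 7, "mainly sexist/racist terms or a sense of superiority"),
    (4, 5, "mainly offensive hashtags"),
    (1, 3, "dark humor or implicit hateful terms"),
]


def describe_intensity(score):
    """Return the rubric description for an integer score between 1 and 10."""
    if not isinstance(score, int) or not 1 <= score <= 10:
        raise ValueError("score must be an integer between 1 and 10")
    for low, high, description in INTENSITY_BANDS:
        if low <= score <= high:
            return description


print(describe_intensity(9))
print(describe_intensity(2))
```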
towards non-hate.
• Does not force a change in sentiment or opinion.
• Evidently leads to less virality.
Fig 1: Difference in predicted number of comments per set per iteration. [1]
Angry by design: toxic communication and technical architectures
3. Countering Online Hate Speech: An NLP Perspective
4. Automated Content Moderation Increases Adherence to Community Guidelines
5. Empathy-based counterspeech can reduce racist hate speech in a social media field experiment
6. Generating Counter Narratives against Online Hate Speech: Data and Strategies
7. Reconsidering Tweets: Intervening during Tweet Creation Decreases Offensive Content
8. Proactively Reducing the Hate Intensity of Online Posts via Hate Speech Normalization
9. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension