
Moderation Tools and User Safety: Data-Driven Approaches at Twitch

With over a billion chat and private messages sent every month, Twitch is not only a great place for streamers and game developers to grow their communities, but also a large-scale moderation challenge. This talk will describe how moderation currently works on Twitch, and how we’ve used data to approach toxic user behavior from three angles: measuring its extent, determining impact, and building tools to make our platform a safer place.

Ruth Toner

October 04, 2017

Transcript

  1. GAME UX SUMMIT ’17 / TORONTO (#GameUXSummit)
     Toxicity and Moderation: Data-Based Approaches at Twitch
     Ruth Toner, Data Scientist, Twitch Interactive

  2. Introduction
     TWITCH:
     - Live streaming and on-demand video
     - Fourth largest source of internet traffic in the US; mostly (but not only!) gaming content
     In a single month:
     - 2.2 million broadcasters and content creators, including gamers, esports, devs, and non-gaming content
     - 15 million daily active viewers
     - 1 billion+ chat and private messages sent

  3. Twitch Chat: Why Do We Care?
     Chat is:
     - The main way users interact with a broadcaster
     - Where subscriptions and “cheering” happen
     - A key part of the funnel to engaged, paying viewers
     We want being social on Twitch to be a good experience.

  4. Introduction
     1 BILLION MESSAGES = HARASSMENT AND ABUSE HAPPEN
     This talk: how Twitch uses data to understand:
     - How abuse happens on Twitch
     - How we build better tools to fight it
     - How we can combine data science and human insight

  5. Human-centric Data Science
     Intelligence Augmentation: “The ultimate goal is not building machines that think like humans, but designing machines that help humans think better.”
     (Guszcza, Lewis, Evans-Greenwood, “Cognitive Collaboration: Why Humans and Computers Think Better Together,” Deloitte University Press, Jan 2017)
     [Diagram: a spectrum from Pure Qualitative (smaller-scale insights) through The Sweet Spot (good data science + UX) to Pure Quantitative (pure data, but also “artificial stupidity”)]

  6. Moderation + Data Science
     1. Extent: How do we describe and quantify abuse on Twitch?
     2. Impact: How do we answer questions about the impact of abuse and of our tools?
     3. Tools: How do we use data to build effective tools to fight abuse?
     The Goal:
     - Help our content creators build the communities they want (within limits…)
     - No one leaves Twitch because they feel unsafe or harassed

  7. Any User: Twitch Site-wide Moderation
     - Reports are sent from a user to Twitch’s site-wide human admin moderation staff
     - These admins can issue a Strike: a temporary suspension or permanent ban from Twitch

  8. Data Source: Reports and Strikes
     - Safety violation signal: TWITCH TERMS OF SERVICE VIOLATIONS
     - ToS: among many other things, the basic rules of conduct for broadcasting and chatting (no harassment, threats, impersonation, etc.)
     - A viewer or broadcaster reported for violating these basic rules of conduct can receive a strike limiting use of their account.
     - Human judgement:
       - Reports: people mislabel spam as harassment; behavior was bad but didn’t break ToS; people report each other as a joke.
       - Strikes: a 100% accurate source of data, but not a complete picture of unsafe behavior.

  9. Channel Moderators: Timeouts and Bans
     Every channel can appoint moderators, who can:
     - Time out chatters (temporary)
     - Ban chatters (permanent)

  10. Data Source: Timeouts and Bans
     - Safety violation signal: COMMUNITY RULE BREAKING
     - A channel moderator can ban or time out someone, keeping them from participating in chat, when they break the rules of a community.
     - We give broadcasters autonomy to decide what conversation is acceptable in their community (within Terms of Service limits…).
     - Human judgement: not all rule violations are safety violations. Moderators also moderate for spam, links, all-caps, spoilers, or (again!) as a joke (“Mods plz ban me!”).

  11. Data Source: AutoMod
     - Safety violation signal: UNACCEPTABLE LANGUAGE
     - The broadcaster decides how ‘risky’ they want language to be on their channel, from just removing hate speech to forbidding cursing.
     - Two signals:
       - AutoMod ratings: how risky AutoMod thinks a chat message is.
       - Mod approvals + denials: what the channel moderators thought.
     - Human judgement: we are missing the social context of the messages.

  12. Data from Moderation Tools
     - Each data source tells us something about how safe or happy our viewers and broadcasters feel on Twitch.
     - BUT ALSO: false positives, noise, unclear signals.
     - “A flag is not merely a technical feature: It is a complex interplay between users and platforms, humans and algorithms, and the social norms and regulatory structures of social media.” (Crawford and Gillespie, “What Is a Flag For? Social Media Reporting Tools and the Vocabulary of Complaint,” New Media & Society, July 2014)
     - We understand these signals and their noise by exploring the data and talking to our users.

  13. Example: Two Types of Abuser
     Question: What does a troll look like?
     Chatters suspended for harassment share a few things in common:
     - Multiple channel bans
     - Younger-than-average accounts
     - Higher-than-expected language risk
     However, if we talk to our admins and then take a closer look at our data, it turns out this question is too simple…
     [Chart: account age, regular vs. suspended users]

  14. Example: Two Types of Abuser
     Better question: What do different types of troll look like?
     We see two major subcategories:
     - Chat harassers: higher-risk language; young and old accounts alike.
     - Ban evaders: younger accounts with low activity and low levels of verification.
     We need different solutions for different types of abuse. Mixing quantitative analysis and qualitative assessment allowed us to update our intuition about trolling (a toy sketch follows)…
     [Chart: account age among suspended users, ban evaders vs. harassers]

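A toy sketch of how such subtypes might fall out of the data, clustering suspended accounts on hypothetical per-account features with scikit-learn; this is illustrative only, not Twitch’s actual analysis:

```python
# Illustrative sketch (not Twitch's pipeline): cluster suspended accounts
# into behavioral subtypes. All feature names and values are hypothetical.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Features: account_age_days, channel_ban_count, language_risk, is_verified
X = np.array([
    [400, 6, 0.8, 1],   # long-lived account, risky language -> "chat harasser"
    [2,   9, 0.3, 0],   # brand-new, unverified, many bans   -> "ban evader"
    [650, 4, 0.7, 1],
    [1,  12, 0.2, 0],
], dtype=float)

# Standardize so no single feature dominates the distance metric.
X_scaled = StandardScaler().fit_transform(X)

# Two clusters, matching the two subtypes observed qualitatively.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)
print(labels)  # e.g. [0 1 0 1]: harassers vs. ban evaders
```
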
  15. Abuse: Impact
     NEXT, WE NEED TO ASK THE RIGHT QUESTIONS WITH THE RIGHT TOOLS…
     [Diagram: understanding our data, then measuring impact]

  16. Data Science Tools: Questions + Problems
     - We want to turn our qualitative user insights into testable hypotheses.
     - A/B testing: gives causal answers, but brings ethical considerations and user confusion… Better for smaller product iterations or helper tools.
     - Quasi-experimental studies: cheaper, but self-selection effects and confounding variables everywhere!
     - Example: a channel which bans a lot of users may actually be a healthier channel, since it has a staff of moderators and bots.

  17. Viewership Impacts?
     Key question: How does abusive behavior impact the health of our community?
     - Reduced broadcaster RETENTION?
     - Reduced viewer ENGAGEMENT?
     Lots of third-party UX and DS research:
     - Pew Research (2017), Online Harassment
     - Riot Games and other industry research: https://www.polygon.com/2012/10/17/3515178/the-league-of-legends-team-of-scientists-trying-to-cure-toxic
     - Talking directly to our viewers and broadcasters
     - Tanya DePass, “How to Keep Safe in the Land of Twitch”: https://www.twitch.tv/videos/174334243

  18. Moderation Workload Impact?
     Key question: What is it like to actually use our moderation products?
     - How fast can administrators respond to reports?
     - How many actions do our human channel moderators need to perform when they moderate a chat room?
     - What are the gaps in the system?
     We start by talking to our user base and performing qualitative studies to identify these pain points, and then try to study and verify them with our quantitative data.

  19. Growth and Moderation Workload
     - User complaint: as chat gets bigger and faster, moderators have to mod faster and act on a larger % of messages.
     - Very busy chats have a full moderation staff, but moderation efficiency still goes down.
     - Solution: build moderation tools which reduce the amount of work our moderators need to do per message (a sketch of the metric follows).
     [Charts: mod actions per message and extra human mod staff vs. conversation speed, in chat messages/min, from 1 message per 100 minutes up to 10 messages per second]

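A small illustrative sketch of the efficiency metric implied by these charts, moderator actions per chat message bucketed by conversation speed; the DataFrame columns and numbers are hypothetical:

```python
# Hypothetical aggregate data: per-channel message volume and mod actions,
# grouped by how fast the chat moves. Not real Twitch numbers.
import pandas as pd

chat = pd.DataFrame({
    "messages_per_min": [5, 5, 900, 900],
    "mod_actions": [1, 0, 40, 55],
    "messages": [100, 120, 9000, 11000],
})

by_speed = chat.groupby("messages_per_min").sum()
# The chart's y-axis: moderator workload normalized per message.
by_speed["mod_actions_per_message"] = by_speed["mod_actions"] / by_speed["messages"]
print(by_speed)
```
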
  20. Impact Study: Chat Rules
     - Intended impact: get rid of timeouts and bans caused by misunderstanding of channel rules.
     - A/B test: when entering a channel for the first time, chatters were shown either the control or the variant:
       - Chat rules: click to agree
       - No chat rules
     - Results: no significant impact on chat participation, and a statistically significant reduction in timeouts and bans for the ‘click to agree’ variant! (A sketch of this kind of readout follows.)
     [Image: GOG.com’s Twitch chat rules]

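As referenced above, a hedged sketch of how an A/B readout like this might be checked, using a two-proportion z-test from statsmodels; all counts below are invented for illustration:

```python
# Compare timeout/ban rates between control ("no rules") and variant
# ("click to agree"). Invented counts; not the talk's actual data.
from statsmodels.stats.proportion import proportions_ztest

timed_out_or_banned = [480, 390]      # [control, variant]
first_time_chatters = [50_000, 50_000]

stat, p_value = proportions_ztest(count=timed_out_or_banned,
                                  nobs=first_time_chatters)
print(f"z = {stat:.2f}, p = {p_value:.4f}")
# A small p-value with a lower variant rate matches the talk's result:
# fewer timeouts/bans, with chat participation checked separately.
```
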
  21. Toxicity: Tools
     LET’S USE THESE LEARNINGS TO BUILD SOMETHING THAT MAKES OUR USERS SAFER
     [Diagram: understanding our data, measuring impact, then intervention]

  22. AutoMod
     - Data product problem: Can we help broadcasters passively filter hate speech, bullying, and sexual language they don’t want in their chat?
     - Solution: AutoMod, automated filtering of language based on topic category and perceived level of risk.
     - The algorithm was designed using a combination of statistical learning and human qualitative review.

  23. Designing AutoMod
     Start with a pre-trained, off-the-shelf ML solution. It:
     - Segments and normalizes each chat message.
     - Categorizes sentence fragments by risk topic (hate, sex, bullying, etc.) and severity (high risk, medium risk, etc.).
     - Can handle over ten languages, combos of words and emotes, misspellings, and (important!) attempts to get around the filter.
     Example (a code sketch of the normalization step follows):
       Original: “Omg. You should killll yooorseeeeeefff.”
       Parsed: [ omg ] = no risk; [ {you/he/she} | should | {self harm} ] = bullying, high risk level

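A toy sketch of the normalization step described above, assuming a hypothetical fragment table; AutoMod’s actual model is proprietary and far more capable:

```python
# Illustrative only (not AutoMod's real model): collapse character repeats
# and simple leetspeak before fragments are categorized by topic/severity.
import re

# Hypothetical fragment table: normalized phrase -> (topic, severity)
RISK_FRAGMENTS = {"you should self harm": ("bullying", "high")}

LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "$": "s"})

def normalize(text: str) -> str:
    text = text.lower().translate(LEET)
    # Collapse runs of 3+ repeated characters to one ("killll" -> "kil").
    # A real normalizer would also restore legitimate doubles ("kill").
    text = re.sub(r"(.)\1{2,}", r"\1", text)
    text = re.sub(r"[^a-z\s]+", " ", text)
    return re.sub(r"\s+", " ", text).strip()

print(normalize("Omg. You should killll yooorseeeeeefff."))
# -> "omg you should kil yorsef", then matched (fuzzily) against RISK_FRAGMENTS
```
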
  24. Designing AutoMod
     Making this work for Twitch:
     - For each sentence fragment f, compare message counts N_cat for banned vs. non-banned users to get an AutoMod risk likelihood, roughly the likelihood of a user being banned for that fragment:

       L_f ~ log [ ((N_{f,banned} + 1) / (N_{all,banned} + 1)) / ((N_{f,not banned} + 1) / (N_{all,not banned} + 1)) ]

     - Use L_f to flag individual expressions which were obvious false positives or incorrectly rated.
     - Choose risk thresholds for our preset options, Rule Levels 1-4.
     - Get it running in the field:
       - Initial dry run: the DNC/RNC conventions, 2016.
       - Small closed beta to refine usability and filter accuracy.

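A minimal Python transcription of that add-one-smoothed log-likelihood ratio; the counts below are invented for illustration:

```python
# L_f from the slide: how over-represented fragment f is in messages
# from users who were later banned vs. users who were not.
import math

def fragment_risk(n_f_banned, n_all_banned, n_f_not_banned, n_all_not_banned):
    banned_rate = (n_f_banned + 1) / (n_all_banned + 1)
    ok_rate = (n_f_not_banned + 1) / (n_all_not_banned + 1)
    return math.log(banned_rate / ok_rate)

# Example: a fragment heavily over-represented in pre-ban chat.
print(fragment_risk(900, 1_000_000, 100, 99_000_000))  # ~6.8 > 0 -> risky
```
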
  25. Maintaining AutoMod
     - Full opt-in launch of AutoMod on Dec 15, 2016.
     - Improving accuracy: use Approve and Deny actions to determine which AutoMod recommendations our users agree and disagree with, via an L′_f factor over unique channel counts C_cat for each fragment f:

       L′_f ~ log [ ((C_{f,denied} + 1) / (C_{all,denied} + 1)) / ((C_{f,approved} + 1) / (C_{all,approved} + 1)) ]

     - Surface a list of recommended rule changes, which are then vetted by our admin staff.
     - Sep 2017: false positives reduced by 33% since launch!
     - 25% of all chat messages go through AutoMod.
     - We continue to develop based on performance and user feedback…

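The same construction as L_f, sketched with invented counts: fragments that moderators mostly Deny score high, and fragments they mostly Approve score low, surfacing candidates for rule changes:

```python
# L'_f from the slide: moderator feedback on AutoMod's flags, counted
# over unique channels rather than messages.
import math

def fragment_feedback(c_f_denied, c_all_denied, c_f_approved, c_all_approved):
    denied_rate = (c_f_denied + 1) / (c_all_denied + 1)
    approved_rate = (c_f_approved + 1) / (c_all_approved + 1)
    return math.log(denied_rate / approved_rate)

# A fragment approved in many channels but rarely denied is a likely
# false positive and a candidate for a recommended rule change:
print(fragment_feedback(3, 10_000, 800, 12_000))  # ~-5.1 < 0 -> over-flagged
```
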
  26. Conclusions
     - Our punchline: quantitative analysis and qualitative research alone can’t capture exactly what’s happening with safety in our products and community.
     - Combine data science with qualitative learnings from our UX team, our admins, and from talking to our viewers and broadcasters for better decisions.
     Where we apply this:
     - Extent: figure out what signal your data is giving you about safety.
     - Impact: what are the right questions we should be asking, and using what tools and metrics?
     - Tools: using these data and questions, we can craft powerful tools for safety!

  27. Twitch ToS: Relevant Sections
     9. Prohibited Conduct
     You agree that you will comply with these Terms of Service and Twitch’s Community Guidelines and will not:
     i. create, upload, transmit, distribute, or store any content that is inaccurate, unlawful, infringing, defamatory, obscene, pornographic, invasive of privacy or publicity rights, harassing, threatening, abusive, inflammatory, or otherwise objectionable;
     ii. impersonate any person or entity, falsely claim an affiliation with any person or entity, or access the Twitch Services accounts of others without permission, forge another person’s digital signature, misrepresent the source, identity, or content of information transmitted via the Twitch Services, or perform any other similar fraudulent activity;
     v. defame, harass, abuse, threaten or defraud users of the Twitch Services, or collect, or attempt to collect, personal information about users or third parties without their consent;