Antisocial Computing

ANTISOCIAL COMPUTING: Explaining and Predicting Negative Behavior Online
Justin Cheng, Stanford University

Vieweg, et al. (2010); Kittur, et al. (2013); Burke &
Kraut (2016)

Time (2016); The Atlantic (2016); Vanity Fair (2017)

47% of online users have been harassed. Data & Society (2017)

Popular Science (2013); The Verge (2015); Chicago Sun-Times (2014)

Research Question: Why is bad behavior so prevalent? (╯°□°)╯︵ ┻━┻

Implications: Understanding bad behavior helps us build healthier communities (systems, guidelines, interventions)

Prior Work: Antisocial behavior is largely due to sociopaths. Donath (1999); Hardaker (2010); Buckels, et al. (2014)

This Work: Antisocial behavior is largely due to ordinary people

Talk Outline: Antisocial Behavior & Its Spread. What causes antisocial behavior? Can such cascades be predicted? Does it worsen over time?

Research Approach: Large-scale Analysis + Experiments (Data Mining + Crowdsourcing)

The Broader Picture: Identifying principles of online behavior (Data + ML + Network Science + HCI)


CAN ANYONE BECOME A TROLL? Causes of Antisocial Behavior in Online Discussions
CSCW 2017 (Best Paper); ICWSM 2015 (Honorable Mention), with M. Bernstein, C. Danescu-Niculescu-Mizil, J. Leskovec

CONTENT WARNING! This talk contains depictions of trolling that use strong language.


"It also shows that Islam and Christianity teaching women to dress modest could be right afterall."
"Clearly that is the only logical conclusion to this article. Now if you'll excuse me, I need to iron my tarp. I have work on Monday, and I want to appear 'modest'."
"fail at life. go bomb yourself."
"Religious nut alert"

We studied multiple large comment-based news communities: 470M posts, 831M votes, 76M users.

What is trolling? Engaging in negatively marked online behavior; taking pleasure in upsetting others; not following the rules; disrupting a group while staying undercover. Donath (1999); Hardaker (2010); Kirman (2012); Schwartz (2008)

Our Definition: Trolling is behavior that occurs outside community norms, defined using community guidelines (e.g., name-calling, personal attacks, profanity, threats, hate speech, ethnically/racially offensive material).

Are trolls just a vocal minority? Donath (1999); Hardaker (2010);
Shachaf & Hara (2010); NYT (2008); Wired (2014); Vox (2014)

How much do trolls troll? [Figure: proportion of banned users (0 to 0.4) by proportion of deleted posts (0 to 1)]

The distribution of trolls is bimodal [Figure: proportion of banned users (0 to 0.4) by proportion of deleted posts (0 to 1)]

Are there two types of trolls? Situational trolling? Lifelong trolling? [Figure: proportion of banned users (0 to 0.4) by proportion of deleted posts (0 to 1)]

What if antisocial behavior is situational?

Challenge: how to show that antisocial behavior is situational? Observational data isn't causal, and experiments are hard to generalize.

Solution: Experiment + Observational Study (Simulated Discussion Experiment + Large-Scale Analysis)

Our Hypothesis: Anyone can become a troll

“Broken windows” theory Zimbardo (1969); Wilson (1982)

Unpleasant stimuli increase aggression Jones & Bogat (1978); Rotton &
Frey (1985)

Experiment: simulated discussion forum

Experimental method: Quiz × Discussion (N=667, 40% female). The quiz sets positive/negative mood; the discussion sets positive/negative context.

Easy quiz (positive mood)

Difficult quiz (negative mood)

Positive discussion context

Negative discussion context

Manipulation Check: People reported being in a worse mood after the difficult quiz. Easy: 12.2; Difficult: 40.8 (POMS mood disturbance, where higher scores = worse mood; p < 0.01)

Manipulation Check: Initial seed posts in the negative context condition were perceived as worse. Positive: 90% upvoted; Negative: 36% upvoted (p < 0.01)

How did trolling differ across conditions? Two expert raters labeled
posts independently

Trolling is lowest with positive conditions, increases with either negative condition, and almost doubles in the worst case.

% Troll Posts       Positive Mood   Negative Mood
Positive Context    35%             49%
Negative Context    47%             68%
(p < 0.05 using a mixed effects logistic regression model)

Negative affect almost triples

% Negative Affect Words (LIWC)   Positive Mood   Negative Mood
Positive Context                 1.1%            1.4%
Negative Context                 2.3%            2.9%
(p < 0.05)

Positive Mood + Context: "Hilary is a solid candidate. As a woman, I appreciate that she's a woman, but it's not the only reason I think she would do well in office."

Negative Mood + Context: "Anyone who votes for her is a complete idiot. These supporters are why this country is in such bad shape now. Uneducated people."

Bad mood and negative discussion context increase trolling

Online Experiment + Observational Study: Simulated Discussion Experiment; Large-Scale Analysis of CNN.com

Replicating Mood: Can trolling, like mood, vary with the time of day and day of week? Golder & Macy (2011) [Figure: negative affect by time of day]

How does trolling vary with time of day? [Figure: proportion of flagged posts (0.03 to 0.042) by time of day (0 to 24)]

Trolling peaks in the evening… [Figure: proportion of flagged posts (0.03 to 0.042) by time of day (0 to 24), shown with negative affect (Golder & Macy)]

…and early in the work week. [Figure: proportion of flagged posts (0.03 to 0.04) by day of week (Mon to Sun)]

Trolling peaks when moods are worse [Figure: proportion of flagged posts, negative affect, and proportion of downvotes, by time of day and day of week]
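A curve like "proportion of flagged posts by time of day" could be computed roughly as follows. This is a minimal sketch, assuming each post is reduced to an (hour, flagged) pair; the function name and the toy data are hypothetical and do not come from the talk.

```python
from collections import defaultdict

def flagged_proportion_by_hour(posts):
    """Return {hour: proportion of flagged posts} for (hour, flagged) pairs."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for hour, is_flagged in posts:
        totals[hour] += 1
        if is_flagged:
            flagged[hour] += 1
    return {hour: flagged[hour] / totals[hour] for hour in sorted(totals)}

# Hypothetical toy data: (hour of day 0-23, whether the post was flagged).
sample_posts = [(9, False), (9, False), (9, True), (9, False),
                (21, True), (21, False)]
proportions = flagged_proportion_by_hour(sample_posts)
```

The same grouping works for day of week by swapping the hour for a weekday key.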

Replicating Mood: Mood spills over from prior discussions [Diagram: a user's posts in earlier, unrelated discussions, followed by their post in a later discussion]

Replicating Mood: A user who trolled in a previous discussion is twice as likely to troll in a later, unrelated discussion (p < 0.01)

Replicating Context: The initial post affects subsequent trolling [Diagram: separate discussions of the same article, each with a different initial post]

Replicating Context: An initial troll post increases subsequent trolling by 63% (p < 0.01)

Can we predict trolling before it happens? Balanced dataset of 120K posts; logistic regression

What factors affect trolling? Mood and context (trolling is situational) vs. the user (trolling is innate)

How predictable are troll posts? AUC by feature set: Combined 0.78; Discussion Context 0.74; Mood 0.60; User-specific 0.66
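For readers unfamiliar with the metric, AUC is the probability that a randomly chosen troll post is scored higher than a randomly chosen non-troll post. A minimal sketch of that definition follows; the labels and scores are made up for illustration and are not the talk's data or model.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores: 1 = troll post, 0 = non-troll post.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
example_auc = auc(labels, scores)
```

On a balanced dataset a score of 0.5 corresponds to random guessing, which is the baseline the reported values should be read against.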

Troll or not? User + Mood + Other users

Because trolling is situational, ordinary people can end up trolling

Can voting mitigate bad behavior?

Our Hypothesis: Downvoting causes negative behavior to worsen

CAN ANTISOCIAL BEHAVIOR SPIRAL? How Antisocial Behavior Worsens
ICWSM 2014, with C. Danescu-Niculescu-Mizil, J. Leskovec


What effects do evaluations have? (Positively evaluated vs. negatively evaluated)

What is a positive or negative evaluation?

Defining positive and negative evaluations (validated using a crowdsourcing experiment)
Positive evaluation (9 upvotes, 1 downvote): N↑ / (N↑ + N↓) = 9 / (9 + 1) = 0.9 ≥ p↑
Negative evaluation (2 upvotes, 8 downvotes): N↑ / (N↑ + N↓) = 2 / (2 + 8) = 0.2 ≤ p↓
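The definition above can be sketched as a small function. The slide does not give the threshold values p↑ and p↓, so they are required arguments here; the values passed in the usage lines are placeholders chosen only so the two slide examples classify as shown.

```python
def upvote_proportion(n_up, n_down):
    """Proportion of upvotes, N_up / (N_up + N_down)."""
    total = n_up + n_down
    if total == 0:
        raise ValueError("post has no votes")
    return n_up / total

def classify_evaluation(n_up, n_down, p_up, p_down):
    """Label a post's evaluation given thresholds p_up (upper) and p_down (lower)."""
    p = upvote_proportion(n_up, n_down)
    if p >= p_up:
        return "positive"
    if p <= p_down:
        return "negative"
    return "neither"

# Placeholder thresholds (assumptions, not the talk's values).
positive_example = classify_evaluation(9, 1, p_up=0.8, p_down=0.3)
negative_example = classify_evaluation(2, 8, p_up=0.8, p_down=0.3)
```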


Does feedback encourage better behavior? Skinner (1938)

Or is bad stronger than good? Brinko (1993); Baumeister, et
al. (2001)

Four large comment-based news communities

What effects do evaluations have? Compare posts before vs. after a positive evaluation, and before vs. after a negative evaluation.

Challenge: how to compare different users and posts? Aren’t downvoted
users/posts inherently worse?

Solution: propensity score matching of positively and negatively evaluated posts. PSM: Rosenbaum (1983); CEM: Iacus, et al. (2012)
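To illustrate the idea of matching, here is a minimal sketch of greedy nearest-neighbor matching on a single score with a caliper. This is a simplified stand-in, not the PSM or CEM procedure cited on the slide; the identifiers, scores, and caliper value are all hypothetical.

```python
def match_pairs(treated, control, caliper=0.05):
    """Greedily pair each treated item with the closest unused control item.

    treated, control: lists of (item_id, score). A pair is kept only if the
    two scores differ by at most `caliper`.
    """
    available = list(control)
    pairs = []
    for t_id, t_score in sorted(treated, key=lambda item: item[1]):
        if not available:
            break
        best = min(available, key=lambda item: abs(item[1] - t_score))
        if abs(best[1] - t_score) <= caliper:
            pairs.append((t_id, best[0]))
            available.remove(best)
    return pairs

# Hypothetical posts, each with a score such as text quality.
negatively_evaluated = [("n1", 0.70), ("n2", 0.90)]
positively_evaluated = [("p1", 0.72), ("p2", 0.50), ("p3", 0.89)]
matched = match_pairs(negatively_evaluated, positively_evaluated)
```

After matching, any remaining difference in later behavior between the paired posts is less likely to be explained by the matched covariate.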

Match on text quality: similar text quality q

Computing text quality: learn p with bigrams (binomial regression). Text quality q is the predicted p. [Diagram: example posts with known vote counts are used to predict q for a new post]
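A toy sketch of this idea: represent each post by its bigram counts and fit a binomial (logistic) regression on upvote and downvote counts, then use the predicted proportion as q. The slide names only bigrams and binomial regression; the training loop, learning rate, epoch count, and example posts below are assumptions for illustration.

```python
import math
from collections import Counter

def bigrams(text):
    """Count adjacent word pairs in a lowercased, whitespace-split text."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_quality(weights, bias, text):
    """Predicted proportion of upvotes q for a text."""
    z = bias + sum(weights.get(bg, 0.0) * n for bg, n in bigrams(text).items())
    return sigmoid(z)

def fit_text_quality(posts, epochs=500, lr=0.05):
    """Fit by stochastic gradient ascent. posts: list of (text, n_up, n_down)."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, n_up, n_down in posts:
            q = predict_quality(weights, bias, text)
            # Gradient of the binomial log-likelihood w.r.t. the linear score.
            error = n_up - (n_up + n_down) * q
            bias += lr * error
            for bg, n in bigrams(text).items():
                weights[bg] = weights.get(bg, 0.0) + lr * error * n
    return weights, bias

# Hypothetical training posts: (text, upvotes, downvotes).
toy_posts = [("thank you for sharing", 9, 1), ("you are an idiot", 2, 8)]
weights, bias = fit_text_quality(toy_posts)
q_good = predict_quality(weights, bias, "thank you for sharing")
q_bad = predict_quality(weights, bias, "you are an idiot")
```

A real model would need regularization and held-out evaluation; this sketch only shows how vote counts and bigram features connect to a predicted proportion.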

Validating text quality: manually label a subset (n=171) using crowdsourcing, with q' = # Good / # Total labels per post.
p: actual proportion of upvotes; q: text quality (predicted proportion); q': crowd guess
corr(q', p) = 0.11 (n.s.); corr(q', q) = 0.25 (p < 0.01)

Validating text quality [Figure: absolute residuals |q − q'| (0 to 1) vs. text quality q (0.6 to 1); n.s. using a Breusch-Pagan test]

Match on text quality: similar text quality, q(c↑) = q(c↓)

…as well as other covariates: similar history (# posts, overall proportion of upvotes, etc.)

How are subsequent posts evaluated?

How much are evaluations due to textual or community effects?

How much are evaluations due to textual effects (i.e., people writing worse)? That is, downvoting because of post content (e.g., "f***ing a******")

How much are evaluations due to community effects (i.e., inherent bias)? That is, downvoting because the community dislikes the author ("We dislike you.")

Textual Effects: Do people write better/worse after a positive/negative evaluation?

Negativity bias: Text quality drops significantly after a negative evaluation (p < 0.05, mean effect size r = 0.18)…

…but doesn't change after a positive evaluation (n.s.)

Community Effects: How does community bias change after an evaluation?

Measuring community bias
Community bias = proportion of upvotes − text quality = p(c) − q(c), where p(c) = N↑ / (N↑ + N↓)
Example: p(c) = 0.8 and q(c) = 0.5, so community bias = 0.8 − 0.5 = 0.3
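The measure can be written as a one-line computation. A minimal sketch: the vote counts 8 and 2 are assumed here only because they reproduce the slide's p(c) = 0.8; the slide does not state the underlying counts.

```python
def community_bias(n_up, n_down, text_quality):
    """Community bias p(c) - q(c): observed upvote proportion minus text quality."""
    total = n_up + n_down
    if total == 0:
        raise ValueError("post has no votes")
    return n_up / total - text_quality

# Assumed counts giving p(c) = 0.8, with the slide's q(c) = 0.5.
bias_example = community_bias(n_up=8, n_down=2, text_quality=0.5)
```

A positive value means the community rates the post better than its text alone would predict; a negative value means worse.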

Halo effect: Community bias increases more after a negative than after a positive evaluation (p < 0.01, mean effect size r = 0.13)

Negative Eval. vs. Positive Eval. [Figure: before and after the evaluation; vertical axis: more positive]
Before: similar history; similar text quality, q(c↑) = q(c↓)
After: worse text quality, q(c↑(1,3)) > q(c↓(1,3)) *
After: worse perception, q(c↓(1,3)) − p(c↓(1,3)) > q(c↑(1,3)) − p(c↑(1,3)) *

What happens to negatively-evaluated users? They post worse content. Perceptions of them become worse. They post more frequently.* They evaluate others more negatively.*
* More details in our ICWSM 2014 paper (http://bit.ly/feedback-paper)

Trolls may start out normal, but tip into a spiral
and never recover

Do communities worsen over time?

Communities may worsen over time (?) [Figure: proportion of upvotes (0.6 to 0.8) over time, December 2012 to August 2013]

CAN CASCADES BE PREDICTED? The Predictability of Information Cascades in Social Networks
WWW 2014; WWW 2016, with L. Adamic, P. A. Dow, J. Kleinberg, J. Leskovec

Rumors on Facebook ICWSM 2014 (with A. Friggeri, L. Adamic,
and D. Eckles)

Same rumor, different popularity

Are these cascades predictable?

Are cascades unpredictable?

Large cascades are rare [Figure: empirical CCDF (0 to 1) vs. cascade size (0 to 1000); marked point: 0.09 at cascade size 100]
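An empirical CCDF gives, for each size, the fraction of cascades at least that large. A minimal sketch follows, assuming the "at least" convention (some definitions use "strictly greater than"); the cascade sizes are invented for illustration.

```python
def empirical_ccdf(sizes, threshold):
    """Fraction of cascades whose size is at least `threshold`."""
    if not sizes:
        raise ValueError("need at least one cascade")
    return sum(1 for s in sizes if s >= threshold) / len(sizes)

# Hypothetical cascade sizes (number of reshares).
sizes = [1, 2, 2, 3, 5, 8, 40, 150, 3, 1]
share_reaching_100 = empirical_ccdf(sizes, 100)
```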

“Increasing the strength of social influence increased both inequality and unpredictability of success.” Salganik, Dodds & Watts (2006)

Cascades can recur after long periods

Our Hypothesis: Cascades are predictable (size, structure, content), even if they recur

How do we begin to predict cascade growth?

Challenge: how to predict cascade growth? (k=5 reshares observed)
Will a cascade get 100 reshares? (class imbalance)
Exactly how big will a small cascade get? (outliers skew results)
Only consider the largest cascades? (selection bias)

Solution: will a cascade reach the median? With k=5 reshares observed, will its final size be ≤ the median f(k) or ≥ the median f(k)?

Solution: will a cascade double in size?

Cascade Growth Prediction Problem: Given that a cascade has obtained k reshares, will it double in size? (Balanced; tracks growth over time)
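The labeling step of this problem can be sketched as follows: keep only cascades that reached k reshares, then label each by whether it reached 2k. The final sizes are invented so that the median of the retained cascades sits near 2k, which is what makes the two classes roughly balanced; the function names are hypothetical.

```python
from statistics import median

def doubling_labels(final_sizes, k):
    """Label cascades that reached k reshares by whether they reached 2k."""
    observed = [s for s in final_sizes if s >= k]
    return [(s, s >= 2 * k) for s in observed]

def positive_rate(labels):
    """Fraction of labeled cascades that doubled."""
    return sum(1 for _, doubled in labels if doubled) / len(labels)

# Hypothetical final cascade sizes.
final_sizes = [2, 5, 6, 9, 10, 12, 30, 4, 7, 250]
labels = doubling_labels(final_sizes, k=5)
median_size = median([s for s, _ in labels])
doubled_share = positive_rate(labels)
```

Repeating the labeling for increasing k turns one cascade into a sequence of prediction tasks, which is how the formulation tracks growth over time.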

Reshare cascades on Facebook: 70M cascades, 5B reshares, activity over 28 days

What factors affect predictability?
Content: has overlaid text, captions, …
User: friend count, gender, …
Structural: tree depth, outdegree, …
Temporal: time between shares, change in time, …

How predictable is cascade doubling? AUC (k=5) by feature set: All 0.88; Temporal 0.87; All but temporal 0.79; Structural 0.74; User 0.71; Content 0.58


Given that a cascade has obtained 5 reshares, will it

Given that a cascade has obtained 100 reshares, will it

How does performance change with k? k = 5: k > 10? k = 10: k > 20? (Less data vs. more data; shorter-term vs. longer-term)

Easier to predict larger cascades doubling [Figure: accuracy (0.78 to 0.82) vs. number of reshares observed, k (0 to 100)]

Cascade growth is predictable

Cascade structure is predictable: AUC = 0.80 for predicting structural virality
* More details in our WWW 2014 paper (http://bit.ly/memes-paper)

Cascade recurrence is predictable: AUC = 0.89 for predicting a subsequent burst
* More details in our WWW 2016 paper (http://bit.ly/cascades-paper)

ANTISOCIAL COMPUTING
What we thought: Trolls are a vocal minority. What we now know: Trolls can be ordinary people.
What we thought: Trolling is innate. What we now know: Trolling can spiral from a single bad post.
What we thought: Cascades are unpredictable. What we now know: Cascades can be predicted.

Future Directions: Predicting the demise of communities [Figure: proportion of upvotes (0.6 to 0.8) over time, December to August]

Future Directions: Designing prosocial discussion platforms

Future Directions: Introducing conversation mediators (e.g., bots). Munger (2016)
"Don’t be a n****r."
"Hey man, just remember that there are real people who are hurt when you harass them with that kind of language."

Future Directions: Identifying different types of trolling (e.g., sockpuppets). WWW 2017 (with S. Kumar, J. Leskovec, and V.S. Subrahmanian)
"Possibly the best blog I’ve ever read major props to you"
"Thanks. I knew Marvel fans would try to flame me, but they have nothing other than “oh that’s your opinion”"
"Quit talking to yourself […]"

Future Directions: Addressing polarization; measuring algorithmic impact; tracking cascades at scale

Research Approach: Holistic approaches for analyzing and building social systems. Large-scale Analysis + Experimentation; Macro-scale + Micro-scale; Understand + Build

Multi-method analyses identify patterns in data, verify hypotheses, make predictions, and inform the design of better social systems.

Thank you! Jure Leskovec, Michael Bernstein, Jon Kleinberg, Lada Adamic, James Landay, Jeff Hancock, Cristian Danescu-Niculescu-Mizil, Dan Cosley
Stanford HCI Group, SNAP Group, Stanford VPGE, Microsoft Research, Facebook, Pinterest, Disqus

ANTISOCIAL COMPUTING: Explaining and Predicting Negative Behavior Online
Justin Cheng / @jcccf / clr3.com, Stanford University
More resources and credits: http://bit.ly/jobtalkcredits
