Controversy on Social Media: Collective Attention, Echo Chambers, and Price of Bipartisanship

Gianmarco De Francisci Morales, ISI Foundation
with Kiran Garimella, Aristides Gionis, Michael Mathioudakis

Controversy: from Latin contra (against) vertere (turn) “turned against, disputed”

There are two sides to every story

Goal
Understand how controversies unfold in social media
• Network → Measure of Controversy
• Network + Time → Collective Attention
• Network + Content → Echo Chambers

Quantifying Controversy in Social Media
WSDM 2016

Black/Blue or White/Gold?

Desiderata
• In the wild
• Not necessarily political
• No domain knowledge
• Language independent
• Allows comparison

Problem Formulation
• Graph-based unsupervised formulation
• Conversation graph for a topic (endorsements)
• Find partition of graph (represents 2 sides)
• Measure distance between partitions (random walks)
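
A minimal sketch of the random-walk idea, assuming the conversation graph has already been partitioned into two sides X and Y. It estimates by simulation the probability that a walk stopping at a high-degree node of one side started on the same side, and combines the four probabilities as P_XX·P_YY − P_XY·P_YX. The toy graphs, the hub choice, and all function names are illustrative assumptions, not the authors' implementation.

```python
import random

def to_adj(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

def clique(nodes):
    """All pairs among `nodes` (a fully connected group)."""
    return [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]]

def rwc(adj, side, hubs, walks=2000, seed=0):
    """Monte Carlo estimate of a random-walk controversy score.

    adj:  dict node -> list of neighbours (endorsement graph)
    side: dict node -> "X" or "Y" (the two-way partition)
    hubs: set of high-degree nodes at which a walk stops
    """
    rng = random.Random(seed)
    # counts[(a, b)]: walks that started on side a and stopped on side b
    counts = {(a, b): 0 for a in "XY" for b in "XY"}
    starts = {s: [n for n in adj if side[n] == s] for s in "XY"}
    for s in "XY":
        for _ in range(walks):
            node = rng.choice(starts[s])
            while node not in hubs:
                node = rng.choice(adj[node])
            counts[(s, side[node])] += 1

    def p(a, b):
        # Pr(walk started on side a | walk stopped on side b)
        total = counts[("X", b)] + counts[("Y", b)]
        return counts[(a, b)] / total if total else 0.0

    return p("X", "X") * p("Y", "Y") - p("X", "Y") * p("Y", "X")

# Toy data (hypothetical): two cliques joined by one edge vs. one big clique.
xs = ["x%d" % i for i in range(5)]
ys = ["y%d" % i for i in range(5)]
side = {n: n[0].upper() for n in xs + ys}
hubs = {"x0", "y0"}
polarized = to_adj(clique(xs) + clique(ys) + [("x4", "y4")])
mixed = to_adj(clique(xs + ys))
score_polarized = rwc(polarized, side, hubs)
score_mixed = rwc(mixed, side, hubs)
```

On these toy graphs the two-clique graph should score clearly higher than the fully mixed one, which is the behaviour a controversy measure of this kind is meant to capture.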

Example
#beefban, #марш, #sxsw, #germanwings
Controversial vs. non-controversial

Pipeline
• Graph building: Retweets, Follow, Mentions, Content
• Graph partitioning: METIS, Spectral, Label propagation
• Controversy measure: Random walk, Edge betweenness, 2d embedding, Sentiment variance

Random Walk (sides X and Y)

Controversy Detection

The Effect of Collective Attention on Controversial Debates on Social Media
WebSci 2017 (Best Paper Award)

"Trump taxes" on Google
Rachel Maddow show on 2005 tax return

Obamacare on Twitter

Gun Control on Twitter

Literature so far
• Controversial debates examined in isolation
• As static snapshots

Contribution
• Controversial debates are dynamic
• They change with collective attention
• Analyze controversial debates over time
• Particularly when collective attention increases
• When external ‘event’ happens

Data
• Twitter
• 4 longitudinal polarized topics: Obamacare, Abortion, Gun control, Fracking
• 5 years (2011-2016)
• Hundreds of thousands of users
• Millions of tweets

Definitions
• Retweet Graph
• Reply Graph
• Core Users

Retweet Graph

Reply Graph

Core Users

Experiments
Compare these two points

Retweet Graph
1) New users enter the discussion
2) Most retweets go to existing core users
3) Cross-side retweets decrease
4) Within-side retweets increase
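
Observations 3) and 4) can be checked by comparing the share of cross-side retweets at two points in time. The sketch below assumes each user already has a side label and that the comparison is made on fractions of retweets; the user names, the labels "L" and "R", and the two edge lists are made-up toy data.

```python
def side_mixing(retweets, side):
    """Fraction of retweets that cross sides vs. stay within a side.

    retweets: list of (retweeter, retweeted) pairs
    side:     dict user -> side label
    """
    total = len(retweets)
    cross = sum(1 for u, v in retweets if side[u] != side[v])
    return {"cross": cross / total, "within": (total - cross) / total}

# Hypothetical side assignment and retweets at the two compared points.
side = {"a": "L", "b": "L", "c": "R", "d": "R"}
normal = [("a", "b"), ("b", "a"), ("c", "d"), ("a", "c")]
spike = [("a", "b"), ("b", "a"), ("a", "b"), ("c", "d"),
         ("d", "c"), ("c", "d"), ("d", "c"), ("a", "c")]

mix_normal = side_mixing(normal, side)
mix_spike = side_mixing(spike, side)
```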

Controversy Measure
Figure 2: RWC score as a function of the activity in the retweet network. An increase in interest in the controversial topic corresponds to an increase in the controversy score of the retweet network.

Core-Periphery Openness
Figure 12: Core–periphery openness as a function of activity in the retweet network. As the interest increases, the number of core-periphery edges, normalized by the expected number of edges in a random network, increases. This suggests a propensity of periphery nodes to connect with the core nodes when interest increases.
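
A small sketch of how such a normalized count could be computed. The caption does not say which random network is used as the baseline; the code assumes a random graph with the same number of nodes and edges, in which every node pair is equally likely to be connected. The function name and the toy graph are hypothetical.

```python
def core_periphery_openness(edges, core):
    """Observed core-periphery edges divided by their expected number.

    edges: list of undirected (u, v) pairs, no duplicates
    core:  set of core users; every other node counts as periphery
    Assumed baseline: a random graph with the same nodes and edge count.
    """
    nodes = {n for edge in edges for n in edge}
    core = core & nodes
    periphery = nodes - core
    # An edge is core-periphery when exactly one endpoint is in the core.
    observed = sum(1 for u, v in edges if (u in core) != (v in core))
    all_pairs = len(nodes) * (len(nodes) - 1) / 2
    expected = len(edges) * len(core) * len(periphery) / all_pairs
    return observed / expected

# Toy graph: core {a, b}, periphery {c, d, e, f}.
edges = [("a", "b"), ("a", "c"), ("a", "d"),
         ("b", "e"), ("b", "f"), ("c", "d")]
openness = core_periphery_openness(edges, {"a", "b"})
```

Here 4 of the 6 edges link core and periphery, against 3.2 expected under the assumed baseline, so the ratio is above 1.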

Reply Graph
• Attention increases
• Cross-side edges increase: more discussion

Content
Pro Life vs. Pro Choice, under normal condition and under attention increase
Content becomes uniform across the sides

Long-Term Polarization

Summary
Controversial debates during external events:
• Polarization increases
• Retweet graph becomes hierarchical (core-periphery)
• More replies across sides
• Content becomes more uniform
Many more results in the paper!

Political Discourse on Social Media: Echo Chambers, Gatekeepers, and the Price of Bipartisanship
WWW 2018

Political Discourse on Social Media
• Characterized by heavy polarization
• Emergence of echo chambers ("Hear your own voice")
• Might hamper deliberative process in democracy
• Lack of shared world view
• Concern expressed by former US Presidents, Facebook, Twitter, and more

Polarization Cause
• Selective exposure? People see only content that agrees with their pre-existing opinion
• Biased assimilation? People pay more attention to content that agrees with their pre-existing opinion

Echo Chamber Definition
• Echo = opinion
• Chamber = network
• Joint content + network definition
• Echo chamber = political leaning of content that users receive from network agrees with that of content they share to the network

Production/Consumption
• Consumption: what you receive in your feed, i.e., what your followees tweet
• Production: what you tweet

Political Leaning Scores
• Based on source of the content (500 domains)
• Score derived from self-declared affiliation of sharers on FB
• FoxNews.com is aligned with conservatives (CP = 0.9)
• HuffingtonPost.com is aligned with liberals (CP = 0.17)

Production/Consumption Scores
• Polarity scores based on “content” leaning (from source)
• Production score: average political leaning of the content the user tweets
• Consumption score: average political leaning of the content the user receives on their feed
• Results of selection by the user
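
The two averages can be sketched as follows. The two domain scores are the ones quoted on the leaning-scores slide; the users, their shared links, and the follow lists are made-up toy data, and the function names are hypothetical.

```python
from statistics import mean

# Leaning per source domain (values quoted on the slides).
leaning = {"foxnews.com": 0.9, "huffingtonpost.com": 0.17}

# Toy data: domains each user tweeted, and whom each user follows.
shared = {
    "alice": ["foxnews.com", "foxnews.com"],
    "bob": ["huffingtonpost.com"],
    "carol": ["foxnews.com", "huffingtonpost.com"],
}
follows = {"carol": ["alice", "bob"]}

def production_score(user):
    """Average leaning of the content the user tweets (None if no data)."""
    scores = [leaning[d] for d in shared.get(user, []) if d in leaning]
    return mean(scores) if scores else None

def consumption_score(user):
    """Average leaning of the content the user's followees tweet."""
    feed = [leaning[d]
            for followee in follows.get(user, [])
            for d in shared.get(followee, [])
            if d in leaning]
    return mean(feed) if feed else None

carol_production = production_score("carol")    # own tweets
carol_consumption = consumption_score("carol")  # tweets of alice and bob
```

In this sketch a user with no scored content gets `None` rather than a neutral score, so missing data is not mistaken for a balanced feed.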

δ-partisanship
Figure 1: Example showing the definition of δ-partisan users. The dotted red lines are drawn at δ and 1−δ. Users on the left of the leftmost dashed red line or right of the rightmost one are δ-partisan.

δ-{partisan,consumer,gatekeeper}
• δ-partisan: produces content with polarity beyond δ
• δ-bipartisan: produces content with polarity within δ
• δ-consumer: consumes content with polarity beyond δ
• δ-gatekeeper: δ-partisan but not δ-consumer
  consumes from both sides but produces content aligned with only one side
  blocks information flow towards its community
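
These definitions translate into a few comparisons. The sketch assumes polarity scores lie in [0, 1] and reads "beyond δ" as below δ or above 1−δ, matching the two threshold lines in the δ-partisanship figure. The function names and example scores are hypothetical.

```python
def beyond(score, delta):
    """True when a polarity score in [0, 1] lies outside [delta, 1 - delta]."""
    return score < delta or score > 1 - delta

def classify(production, consumption, delta):
    """Return the set of delta-labels that apply to one user."""
    labels = set()
    # Production side: partisan or bipartisan.
    labels.add("partisan" if beyond(production, delta) else "bipartisan")
    # Consumption side: consumer when the feed is one-sided.
    if beyond(consumption, delta):
        labels.add("consumer")
    # Gatekeeper: one-sided production, two-sided consumption.
    if "partisan" in labels and "consumer" not in labels:
        labels.add("gatekeeper")
    return labels

echo_user = classify(production=0.9, consumption=0.85, delta=0.2)
gatekeeper = classify(production=0.9, consumption=0.5, delta=0.2)
moderate = classify(production=0.5, consumption=0.5, delta=0.2)
```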

Network Measures
• Network-based latent-space user polarity, based on following politicians with aligned ideology
• Network centrality (PageRank)
• Local clustering coefficient
• Retweet/favorite rates and volumes

Correlation
Figure 3: Distribution of production and consumption polarity, for P (first row) and NP (second row) datasets. The scatter plots display the production (x-axis) and consumption (y-axis) polarities of each user in a dataset. Colors indicate user polarity sign, following [6] (grey = democrat, yellow = republican). The one-dimensional plots along the axes show the distributions of the production and consumption polarities for democrats and republicans.

Correlation: Gun Control

Variance
Figure 4: Top: Production polarity variance vs. production polarity (mean). Bottom: Consumption polarity variance vs. consumption polarity (mean).
However, differently from the rest of the side they align with, they show a lower clustering coefficient, an indication that they are not completely embedded in a single community. Given that they receive content also from the opposing side, this result is to be ...
Finally, given that both partisans and gatekeepers sport higher centrality, we compare their PageRank values directly and find that there is a significant difference: partisans have a higher PageRank compared to gatekeepers (figure not shown). This effect is more ...

Variance: panels (b) and (c)

Price of Bipartisanship
Figure 5: Absolute value of the user polarity scores for δ-partisan and δ-bipartisan users. Panels: (a) Large, (b) Combined, (c) Guncontrol, (d) Obamacare, (e) Abortion. Axis: Threshold δ (0.2, 0.3, 0.4).
Figure 6: PageRank for δ-partisan and δ-bipartisan users. Same panels and threshold axis.
Table 3: Comparison between δ-gatekeeper users and a random sample of normal users. A ✓ indicates that the corresponding property is significantly higher for gatekeepers (p < 0.001) for at least 4 of the 6 thresholds used. A minus next to the checkmark (-) indicates that the property is significantly lower.
Table 4: Accuracy for prediction of users who are partisans (p) or gatekeepers. (net) indicates network and profile features only, (n-gram) indicates just n-gram features. The last two columns show results for all features combined.

Price of Bipartisanship: PR
Figure 6 (detail): PageRank vs. threshold δ for partisan and bipartisan users. Panels: (b) Combined, (c) Guncontrol, (d) Obamacare.

Partisans vs Bipartisans, Gatekeepers vs Non-gatekeepers
Table 2: Comparison of various features for partisans & bipartisans and gatekeepers & non-gatekeepers. A ✓ indicates that the corresponding feature is significantly higher for the group of the column (p < 0.001) for at least 4 of the 6 thresholds used, for most datasets. A minus next to the checkmark (-) indicates that the feature is significantly lower.

Features                 Partisans   Gatekeepers
PageRank                 ✓           ✓
clustering coefficient   ✓ (-)       ✓ (-)
user polarity            ✓ (-)       ✓ (-)
degree                   ✓           ✓
retweet rate             ✓           ✗
retweet volume           ✓           ✗
favorite rate            ✓           ✗
favorite volume          ✓           ✗
# followers              ✗           ✗
# friends                ✗           ✗
# tweets                 ✗           ✗
age on Twitter           ✗           ✗

Conclusions
• How to quantify controversy of a topic discussed in social media
• Collective attention increases polarization
• Periphery users tend to retweet core ones
• Interplay between content and network in echo chambers
• Evidence of selective exposure
• Bipartisan users pay a price in terms of network centrality and content appreciation

What's next?
• Joint opinion formation + network generation model
• Temporal dynamics of the process
• Application to other contexts (Reddit, Facebook)
• Interventions: can we do something about it?

Thanks! Ask me two questions!
@gdfm7
