Slide 1

Slide 1 text

Controversy on Social Media: Collective Attention, Echo Chambers, and Price of Bipartisanship
Gianmarco De Francisci Morales, ISI Foundation
with Kiran Garimella, Aristides Gionis, and Michael Mathioudakis

Slide 2

Slide 2 text

Controversy: from Latin contra (against) + vertere (to turn): “turned against, disputed”

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

There are two sides to every story

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Goal

Slide 12

Slide 12 text

Goal
Understand how controversies unfold in social media

Slide 13

Slide 13 text

Goal
Understand how controversies unfold in social media
• Network → Measure of Controversy

Slide 14

Slide 14 text

Goal
Understand how controversies unfold in social media
• Network → Measure of Controversy
• Network + Time → Collective Attention

Slide 15

Slide 15 text

Goal
Understand how controversies unfold in social media
• Network → Measure of Controversy
• Network + Time → Collective Attention
• Network + Content → Echo Chambers

Slide 16

Slide 16 text

Quantifying Controversy in Social Media
WSDM 2016

Slide 17

Slide 17 text

Black/Blue or White/Gold?

Slide 18

Slide 18 text

Desiderata
• In the wild
• Not necessarily political
• No domain knowledge
• Language independent
• Allows comparison

Slide 19

Slide 19 text

Problem Formulation
• Graph-based unsupervised formulation
• Conversation graph for a topic (endorsements)
• Find partition of graph (represents 2 sides)
• Measure distance between partitions (random walks)
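The last step of the formulation above can be illustrated with a small Monte Carlo sketch. It is not the authors' implementation: the function name `rwc_score`, the choice of the top-`k` highest-degree users on each side as walk end points, and the final formula (P_XX · P_YY − P_XY · P_YX, where P_AB is the estimated probability that a walk ending on side B started on side A) are assumptions made for illustration. The graph is assumed to be undirected, with no isolated nodes, and already partitioned into two sides.

```python
import random

def rwc_score(adj, side_x, side_y, k=1, walks=2000, max_steps=10_000, seed=0):
    """Estimate a random-walk controversy score (illustrative sketch).

    adj: dict mapping node -> list of neighbours (undirected, no isolated nodes).
    side_x, side_y: the two sides of the partition (disjoint collections of nodes).
    k: number of highest-degree nodes per side that act as walk end points (assumption).
    """
    rng = random.Random(seed)

    def hubs(side):
        # Top-k nodes by degree; ties are broken by node name for determinism.
        return set(sorted(side, key=lambda v: (-len(adj[v]), v))[:k])

    hub_side = {v: "X" for v in hubs(side_x)}
    hub_side.update({v: "Y" for v in hubs(side_y)})
    starts = {"X": sorted(side_x), "Y": sorted(side_y)}
    counts = {(a, b): 0 for a in "XY" for b in "XY"}

    for i in range(walks):
        start = "XY"[i % 2]                 # alternate the starting side
        node = rng.choice(starts[start])
        for _ in range(max_steps):
            if node in hub_side:            # the walk ends when it reaches a hub
                counts[(start, hub_side[node])] += 1
                break
            node = rng.choice(adj[node])
        # Walks that never reach a hub within max_steps are discarded.

    def p(a, b):
        # Estimated probability that a walk ending on side b started on side a.
        total = counts[("X", b)] + counts[("Y", b)]
        return counts[(a, b)] / total if total else 0.0

    return p("X", "X") * p("Y", "Y") - p("X", "Y") * p("Y", "X")

# Two disconnected triangles: no walk can cross sides, so the score is maximal.
two_camps = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
    "d": ["e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
print(rwc_score(two_camps, {"a", "b", "c"}, {"d", "e", "f"}))  # 1.0
```

A score near 1 means walks rarely cross between the two sides; a score near 0 means the side where a walk ends says little about where it started.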

Slide 20

Slide 20 text

Example

Slide 21

Slide 21 text

Example #beefban #марш #sxsw #germanwings

Slide 22

Slide 22 text

Example #beefban #марш #sxsw #germanwings Controversial Non controversial

Slide 23

Slide 23 text

Pipeline
Graph: • Retweets • Follow • Mentions • Content
Partition: • METIS • Spectral • Label propagation
Measure: • Random walk • Edge betweenness • 2d embedding • Sentiment variance

Slide 24

Slide 24 text

Random Walk

Slide 25

Slide 25 text

Random Walk X Y

Slide 26

Slide 26 text

Random Walk X Y

Slide 27

Slide 27 text

Controversy Detection

Slide 28

Slide 28 text

The Effect of Collective Attention on Controversial Debates on Social Media
WebSci 2017 (Best Paper Award)

Slide 29

Slide 29 text

The Effect of Collective Attention on Controversial Debates on Social Media

Slide 30

Slide 30 text

The Effect of Collective Attention on Controversial Debates on Social Media

Slide 31

Slide 31 text

The Effect of Collective Attention on Controversial Debates on Social Media

Slide 32

Slide 32 text

The Effect of Collective Attention on Controversial Debates on Social Media

Slide 33

Slide 33 text

"Trump taxes" on Google

Slide 34

Slide 34 text

"Trump taxes" on Google Rachel Maddow show on 2005 tax return

Slide 35

Slide 35 text

Obamacare on Twitter

Slide 36

Slide 36 text

Gun Control on Twitter

Slide 37

Slide 37 text

Literature so far
• Controversial debates examined in isolation
• As static snapshots

Slide 38

Slide 38 text

Contribution
• Controversial debates are dynamic
• They change with collective attention
• Analyze controversial debates over time
• Particularly when collective attention increases
• When external ‘event’ happens

Slide 39

Slide 39 text

Data
• Twitter
• 4 longitudinal polarized topics: Obamacare, Abortion, Gun control, Fracking
• 5 years (2011-2016)
• Hundreds of thousands of users
• Millions of tweets

Slide 40

Slide 40 text

Definitions
• Retweet Graph
• Reply Graph
• Core Users

Slide 41

Slide 41 text

Retweet Graph

Slide 42

Slide 42 text

Reply graph

Slide 43

Slide 43 text

Core

Slide 44

Slide 44 text

Core Core Users

Slide 45

Slide 45 text

Experiments

Slide 46

Slide 46 text

Experiments
Compare these two points

Slide 47

Slide 47 text

Retweet Graph

Slide 48

Slide 48 text

Retweet Graph
1) New users enter the discussion

Slide 49

Slide 49 text

Retweet Graph
1) New users enter the discussion
2) Most retweets to existing core users

Slide 50

Slide 50 text

Retweet Graph
1) New users enter the discussion
2) Most retweets to existing core users
3) Cross-side retweets decrease

Slide 51

Slide 51 text

Retweet Graph
1) New users enter the discussion
2) Most retweets to existing core users
3) Cross-side retweets decrease
4) Within-side retweets increase

Slide 52

Slide 52 text

Controversy Measure
Figure 2: RWC score as a function of the activity in the retweet network. An increase in interest in the controversial topic corresponds to an increase in the controversy score of the retweet network.

Slide 53

Slide 53 text

Core-Periphery Openness
Figure 12: Core–periphery openness as a function of activity in the retweet network. As the interest increases, the number of core-periphery edges, normalized by the expected number of edges in a random network, increases. This suggests a propensity of periphery nodes to connect with the core nodes when interest increases.
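The normalization described in the caption can be sketched as follows. The slide does not say which random-network model is used, so this sketch assumes the simplest one: a random graph with the same number of nodes and edges, in which every pair of nodes is equally likely to be connected. The function name `core_periphery_openness` is hypothetical.

```python
def core_periphery_openness(edges, core, periphery):
    """Observed core-periphery edges divided by the number expected at random.

    edges: iterable of undirected (u, v) pairs, without duplicates.
    core, periphery: disjoint sets that together cover all nodes.
    Null model (assumption): same node and edge counts, edges placed uniformly.
    """
    edges = list(edges)
    n = len(core) + len(periphery)
    m = len(edges)
    observed = sum(
        1 for u, v in edges
        if (u in core and v in periphery) or (u in periphery and v in core)
    )
    possible_pairs = n * (n - 1) / 2
    # Each edge joins a core node and a periphery node with probability
    # |core| * |periphery| / possible_pairs under the uniform null model.
    expected = m * len(core) * len(periphery) / possible_pairs
    return observed / expected

# Every edge links the core to the periphery, so openness is above 1.
edges = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")]
print(core_periphery_openness(edges, {"a", "b"}, {"c", "d"}))  # 1.5
```

Values above 1 indicate more core-periphery edges than the null model predicts, which is the direction the caption reports as interest increases.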

Slide 54

Slide 54 text

Reply Graph
Attention increases → cross-side edges increase: more discussion

Slide 55

Slide 55 text

Content
Sides: Pro Life, Pro Choice
Conditions: Normal Condition, Attention Increase

Slide 56

Slide 56 text

Content
Sides: Pro Life, Pro Choice
Conditions: Normal Condition, Attention Increase
Content becomes uniform across the sides

Slide 57

Slide 57 text

Long-Term Polarization

Slide 58

Slide 58 text

Summary
Controversial debates during external events
• Polarization increases
• Retweet graph becomes hierarchical (core-periphery)
• More replies across sides
• Content becomes more uniform
Many more results in the paper!

Slide 59

Slide 59 text

Political Discourse on Social Media: Echo Chambers, Gatekeepers, and the Price of Bipartisanship
WWW 2018

Slide 60

Slide 60 text

Political Discourse on Social Media
• Characterized by heavy polarization
• Emergence of echo chambers ("Hear your own voice")
• Might hamper deliberative process in democracy
• Lack of shared world view
• Concern expressed by former US Presidents, Facebook, Twitter, and more

Slide 61

Slide 61 text

Polarization Cause
• Selective exposure? People see only content that agrees with their pre-existing opinion
• Biased assimilation? People pay more attention to content that agrees with their pre-existing opinion

Slide 62

Slide 62 text

Echo Chamber Definition
• Echo = opinion
• Chamber = network
• Joint content + network definition
• Echo chamber = the political leaning of the content that users receive from the network agrees with that of the content they share to the network

Slide 63

Slide 63 text

Production/Consumption
• Consumption: what you receive in your feed (what your followees tweet)
• Production: what you tweet

Slide 64

Slide 64 text

Political Leaning Scores
• Based on source of the content (500 domains)
• Score derived from the self-declared affiliation of sharers on Facebook
• FoxNews.com is aligned with conservatives (CP = 0.9), HuffingtonPost.com is aligned with liberals (CP = 0.17)

Slide 65

Slide 65 text

Production/Consumption Scores
Polarity scores based on “content” leaning (from source)
• Production score: average political leaning of the content the user tweets
• Consumption score: average political leaning of the content the user receives on their feed
Results of selection by the user
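A minimal sketch of the two averages, assuming each tweet is reduced to the domain it links to and that a user's feed consists of everything their followees tweet. The two leaning values come from the earlier slide; the lowercase domain keys, the data layout, and the function names are hypothetical.

```python
from statistics import mean

# Leaning of content sources (values from the slides; the full list has 500 domains).
LEANING = {"foxnews.com": 0.9, "huffingtonpost.com": 0.17}

def production_score(user, tweets, leaning=LEANING):
    """Average leaning of the scored domains the user tweets (None if there are none)."""
    scores = [leaning[d] for d in tweets.get(user, []) if d in leaning]
    return mean(scores) if scores else None

def consumption_score(user, tweets, followees, leaning=LEANING):
    """Average leaning of the scored domains tweeted by the user's followees."""
    scores = [
        leaning[d]
        for followee in followees.get(user, [])
        for d in tweets.get(followee, [])
        if d in leaning
    ]
    return mean(scores) if scores else None

# Hypothetical toy data: user -> domains they tweeted, user -> accounts they follow.
tweets = {
    "alice": ["foxnews.com", "foxnews.com"],
    "bob": ["huffingtonpost.com"],
    "carol": ["foxnews.com", "huffingtonpost.com"],
}
followees = {"alice": ["bob", "carol"]}

print(production_score("alice", tweets))              # alice only tweets foxnews.com
print(consumption_score("alice", tweets, followees))  # average over bob's and carol's tweets
```

Comparing the two numbers per user is what allows production and consumption polarity to be plotted against each other.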

Slide 66

Slide 66 text

δ-partisanship
Figure 1: Example showing the definition of δ-partisan users. The dotted red lines are drawn at δ and 1-δ. Users on the left of the leftmost dashed red line or right of the rightmost one are δ-partisan.

Slide 67

Slide 67 text

δ-{partisan,consumer,gatekeeper}
• δ-partisan: produces content with polarity beyond δ
• δ-bipartisan: produces content with polarity within δ
• δ-consumer: consumes content with polarity beyond δ
• δ-gatekeeper: δ-partisan but not δ-consumer
  • consumes from both sides but produces content aligned with only one side
  • blocks information flow towards its community
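The definitions above can be written as a small classifier. This sketch assumes polarity scores lie in [0, 1] and reads "beyond δ" as "below δ or above 1-δ", following the Figure 1 caption on the previous slide. The function name `classify` and the strict inequalities at the boundaries are assumptions.

```python
def classify(production, consumption, delta):
    """Label a user from production and consumption polarity in [0, 1]."""
    def beyond(score):
        # Outside the band [delta, 1 - delta], i.e. leaning clearly to one side.
        return score < delta or score > 1 - delta

    partisan = beyond(production)
    consumer = beyond(consumption)
    return {
        "partisan": partisan,
        "bipartisan": not partisan,
        "consumer": consumer,
        # Gatekeeper: one-sided production, but consumption from both sides.
        "gatekeeper": partisan and not consumer,
    }

# One-sided production (0.9) with balanced consumption (0.5) at delta = 0.2.
print(classify(0.9, 0.5, 0.2))
# {'partisan': True, 'bipartisan': False, 'consumer': False, 'gatekeeper': True}
```

Because every label depends on δ, the same user can change category as the threshold moves, which is why the later figures report results for several values of δ.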

Slide 68

Slide 68 text

Network Measures
• Network-based latent-space user polarity
  • Based on following politicians with aligned ideology
• Network centrality (PageRank)
• Local clustering coefficient
• Retweet/favorite rates and volumes

Slide 69

Slide 69 text

Correlation
Figure 3: Distribution of production and consumption polarity, for Political (first row) and Non-Political (second row) datasets. The scatter plots display the production (x-axis) and consumption (y-axis) polarities of each user in a dataset. Colors indicate user polarity sign, following [6] (grey = democrat, yellow = republican). The one-dimensional plots along the axes show the distributions of the production and consumption polarities for democrats and republicans.

Slide 70

Slide 70 text

Correlation: Gun Control

Slide 71

Slide 71 text

Variance
Figure 4: Top: Production polarity variance vs. production polarity (mean). Bottom: Consumption polarity variance vs. consumption polarity (mean).
Excerpt from the paper text visible on the slide: However, differently from the rest of the side they align with, they show a lower clustering coefficient, an indication that they are not completely embedded in a single community. Given that they receive content also from the opposing side, this result is to be ... Finally, given that both partisans and gatekeepers sport higher centrality, we compare their PageRank values directly and find that there is a significant difference: partisans have a higher PageRank compared to gatekeepers (figure not shown). This effect is more ...

Slide 72

Slide 72 text

Variance: panels (b) and (c)

Slide 73

Slide 73 text

Price of Bipartisanship
Figure 5: Absolute value of the user polarity scores for δ-partisan and δ-bipartisan users. Panels: (a) Large, (b) Combined, (c) Guncontrol, (d) Obamacare, (e) Abortion; x-axis: Threshold δ (0.2, 0.3, 0.4).
Figure 6: PageRank for δ-partisan and δ-bipartisan users. Panels: (a) Large, (b) Combined, (c) Guncontrol, (d) Obamacare, (e) Abortion; x-axis: Threshold δ (0.2, 0.3, 0.4).
Table 3: Comparison between δ-gatekeeper users and a random sample of normal users. A ✓ indicates that the corresponding property is significantly higher for gatekeepers (p < 0.001) for at least 4 of the 6 thresholds used. A minus next to the checkmark (-) indicates that the property is significantly lower.
Table 4: Accuracy for prediction of users who are partisans or gatekeepers. (net) indicates network and profile features only, (n-gram) indicates just n-gram features. The last two columns show results for all features combined.

Slide 74

Slide 74 text

Price of Bipartisanship: PR
Cropped figure: partisan vs. bipartisan users by Threshold δ; panels (b) Combined, (c) Guncontrol, (d) Obamacare.

Slide 75

Slide 75 text

Partisans vs Bipartisans
Gatekeepers vs Non-gatekeepers
Table 2: Comparison of various features for partisans & bipartisans and gatekeepers & non-gatekeepers. A ✓ indicates that the corresponding feature is significantly higher for the group of the column (p < 0.001) for at least 4 of the 6 thresholds used, for most datasets. A minus next to the checkmark (-) indicates that the feature is significantly lower.

Features                  Partisans   Gatekeepers
PageRank                  ✓           ✓
clustering coefficient    ✓ (-)       ✓ (-)
user polarity             ✓ (-)       ✓ (-)
degree                    ✓           ✓
retweet rate              ✓           ✗
retweet volume            ✓           ✗
favorite rate             ✓           ✗
favorite volume           ✓           ✗
# followers               ✗           ✗
# friends                 ✗           ✗
# tweets                  ✗           ✗
age on Twitter            ✗           ✗

Slide 76

Slide 76 text

Conclusions
• How to quantify controversy of a topic discussed in social media
• Collective attention increases polarization
• Periphery users tend to retweet core ones
• Interplay between content and network in echo chambers
• Evidence of selective exposure
• Bipartisan users pay a price in terms of network centrality and content appreciation

Slide 77

Slide 77 text

What's next?
• Joint opinion formation + network generation model
• Temporal dynamics of the process
• Application to other contexts (Reddit, Facebook)
• Interventions: can we do something about it?

Slide 78

Slide 78 text

Thanks! Ask me two questions!
@gdfm7