Controversy on Social Media: Collective Attention, Echo Chambers, and Price of Bipartisanship

How do we discuss controversial topics on social media? 
Answering this question is not only interesting from a societal point of view, but also has concrete implications for policy makers, news agencies, and internet companies.
In this talk, we first take a look at how collective attention, which is typically related to external events that increase the visibility of the topic, changes the debate. Our analysis shows that, in long-lived controversial debates on Twitter, increased collective attention is associated with increased network polarization.
Then, we show how content and network interact in the formation of echo chambers. As expected, Twitter users are mostly exposed to political opinions that agree with their own. In addition, users who try to bridge the echo chambers by sharing content of diverse leanings have to pay a “price of bipartisanship” in terms of their network centrality and content appreciation.


Gianmarco De Francisci Morales

September 19, 2018

Transcript

  1. Controversy on Social Media: Collective Attention, Echo Chambers, and Price of Bipartisanship

     Gianmarco De Francisci Morales (ISI Foundation)
     with Kiran Garimella, Aristides Gionis, and Michael Mathioudakis
  2. Controversy: from Latin contra (against) + vertere (to turn), “turned against, disputed”

  7. There are two sides to every story

  11. Goal

  12. Goal: Understand how controversies unfold in social media

  13. Goal: Understand how controversies unfold in social media
      Network → Measure of Controversy

  14. Goal: Understand how controversies unfold in social media
      Network → Measure of Controversy
      Network + Time → Collective Attention

  15. Goal: Understand how controversies unfold in social media
      Network → Measure of Controversy
      Network + Time → Collective Attention
      Network + Content → Echo Chambers

  16. Quantifying Controversy in Social Media (WSDM 2016)

  17. Black/Blue or White/Gold?

  18. Desiderata

      In the wild
      Not necessarily political
      No domain knowledge
      Language independent
      Allows comparison

  19. Problem Formulation

      Graph-based, unsupervised formulation
      Conversation graph for a topic (endorsements)
      Find a partition of the graph (represents the 2 sides)
      Measure the distance between the partitions (random walks)

  20. Example

  21. Example #beefban #марш #sxsw #germanwings

  22. Example: #beefban, #марш (controversial) vs. #sxsw, #germanwings (non-controversial)

  23. Pipeline

      Graph building: Retweets • Follow • Mentions • Content
      Graph partitioning: METIS • Spectral • Label propagation
      Controversy measures: Random walk • Edge betweenness • 2d embedding • Sentiment variance
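
A minimal sketch of the first two pipeline stages (building the endorsement graph and partitioning it into two sides), assuming networkx and tweets represented as dicts with hypothetical "user"/"retweeted_user" fields; Kernighan-Lin bisection is only a stand-in for the METIS/spectral partitioning used in the talk.

```python
# Sketch of the graph-building and partitioning stages of the pipeline.
# Assumptions: tweets are dicts with "user" and "retweeted_user" keys
# (hypothetical field names); Kernighan-Lin bisection stands in for the
# METIS / spectral partitioning mentioned on the slide.
import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

def build_endorsement_graph(tweets):
    """Retweet (endorsement) graph: one edge per retweet pair."""
    g = nx.Graph()
    for t in tweets:
        if t.get("retweeted_user"):
            g.add_edge(t["user"], t["retweeted_user"])
    return g

def partition_sides(g):
    """Split the conversation graph into two putative sides."""
    side_x, side_y = kernighan_lin_bisection(g, seed=0)
    return side_x, side_y
```
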
  24. Random Walk

  25. Random Walk X Y

  26. Random Walk X Y
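
A Monte Carlo sketch of one way to estimate a random-walk controversy (RWC) style score over the two sides X and Y: walks started on each side are followed until they hit a high-degree node, and the score contrasts same-side with cross-side arrivals. The parameters k and n_walks are illustrative choices, not the paper's exact settings, and the graph is assumed connected.

```python
# Monte Carlo sketch of an RWC-style score: P(XX)*P(YY) - P(XY)*P(YX),
# where P(AB) estimates the probability that a walk started on side A
# terminates at a high-degree node of side B. Assumes a connected graph.
import random

def rwc_score(g, side_x, side_y, k=10, n_walks=2000, seed=0):
    rng = random.Random(seed)
    top = lambda side: set(sorted(side, key=g.degree, reverse=True)[:k])
    targets_x, targets_y = top(side_x), top(side_y)

    def walk(start):
        node = start
        while True:
            if node in targets_x:
                return "X"
            if node in targets_y:
                return "Y"
            node = rng.choice(list(g.neighbors(node)))

    prob = {}
    for start_side, nodes in (("X", list(side_x)), ("Y", list(side_y))):
        ends = [walk(rng.choice(nodes)) for _ in range(n_walks)]
        prob[start_side + "X"] = ends.count("X") / n_walks
        prob[start_side + "Y"] = ends.count("Y") / n_walks

    return prob["XX"] * prob["YY"] - prob["XY"] * prob["YX"]
```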

  27. Controversy Detection

  28. The Effect of Collective Attention on Controversial Debates on Social Media (WebSci 2017, Best Paper Award)

  33. "Trump taxes" on Google

  34. "Trump taxes" on Google Rachel Maddow show on 2005 tax

    return
  35. Obamacare on Twitter

  36. Gun Control on Twitter

  37. Literature so far: controversial debates examined in isolation, as static snapshots

  38. Contribution

      Controversial debates are dynamic: they change with collective attention
      Analyze controversial debates over time, particularly when collective attention increases (when an external ‘event’ happens)

  39. Data

      Twitter, 4 longitudinal polarized topics: Obamacare, Abortion, Gun control, Fracking
      5 years (2011-2016), hundreds of thousands of users, millions of tweets

  40. Definitions: Retweet Graph, Reply Graph, Core Users

  41. Retweet Graph

  42. Reply Graph

  43. Core

  44. Core Core Users
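
A sketch of how these objects might be built with networkx, assuming tweets carry hypothetical "user", "retweeted_user", and "replied_to_user" fields; the k-core is only a convenient proxy for the talk's notion of core users, whose exact definition may differ.

```python
# Sketch of the retweet graph, reply graph, and a core-user proxy.
# Field names are hypothetical; the k-core is a stand-in for the
# paper's definition of core users.
import networkx as nx

def build_graphs(tweets):
    retweet_g, reply_g = nx.DiGraph(), nx.DiGraph()
    for t in tweets:
        if t.get("retweeted_user"):
            retweet_g.add_edge(t["user"], t["retweeted_user"])
        if t.get("replied_to_user"):
            reply_g.add_edge(t["user"], t["replied_to_user"])
    return retweet_g, reply_g

def core_users(retweet_g, k=20):
    # k-core of the undirected retweet graph as a rough "core" proxy.
    und = nx.Graph(retweet_g)
    und.remove_edges_from(list(nx.selfloop_edges(und)))
    return set(nx.k_core(und, k=k).nodes())
```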

  45. Experiments

  46. Experiments: compare these two points

  47. Retweet Graph

  48. Retweet Graph: 1) New users enter the discussion

  49. Retweet Graph: 1) New users enter the discussion 2) Most retweets to existing core users

  50. Retweet Graph: 1) New users enter the discussion 2) Most retweets to existing core users 3) Cross-side retweets decrease

  51. Retweet Graph: 1) New users enter the discussion 2) Most retweets to existing core users 3) Cross-side retweets decrease 4) Within-side retweets increase

  52. Controversy Measure

      [Figure 2: RWC score as a function of the activity in the retweet network. An increase in interest in the controversial topic corresponds to an increase in the controversy score of the retweet network.]

  53. Core-Periphery Openness

      [Figure 12: Core–periphery openness as a function of activity in the retweet network. As the interest increases, the number of core-periphery edges, normalized by the expected number of edges in a random network, increases. This suggests a propensity of periphery nodes to connect with the core nodes when interest increases.]
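
As a rough illustration of the quantity in the caption, the sketch below counts core-periphery edges and normalizes by the number expected in a random graph of the same density; the exact null model used in the paper may differ.

```python
# Sketch of a core-periphery "openness" quantity for an undirected graph:
# observed core-periphery edges divided by the count expected in a random
# graph with the same density. The paper's exact null model may differ.
import networkx as nx

def core_periphery_openness(g, core):
    core = set(core)
    periphery = set(g.nodes()) - core
    cross = sum(1 for u, v in g.edges() if (u in core) != (v in core))
    n, m = g.number_of_nodes(), g.number_of_edges()
    density = 2 * m / (n * (n - 1))            # undirected edge density
    expected = density * len(core) * len(periphery)
    return cross / expected if expected else float("nan")
```
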
  54. Reply Graph: as attention increases, cross-side edges increase (more discussion)

  55. Content: Pro Life vs. Pro Choice, normal condition vs. attention increase

  56. Content: Pro Life vs. Pro Choice, normal condition vs. attention increase
      Content becomes uniform across the sides

  57. Long-Term Polarization

  58. Summary

      Controversial debates during external events: polarization increases
      Retweet graph becomes hierarchical (core-periphery)
      More replies across sides
      Content becomes more uniform
      Many more results in the paper!

  59. Political Discourse on Social Media: Echo Chambers, Gatekeepers, and the Price of Bipartisanship (WWW 2018)

  60. Political Discourse on Social Media

      Characterized by heavy polarization
      Emergence of echo chambers ("hear your own voice")
      Might hamper the deliberative process in democracy: lack of a shared world view
      Concern expressed by former US Presidents, Facebook, Twitter, and more

  61. Polarization Cause

      Selective exposure? People see only content that agrees with their pre-existing opinion
      Biased assimilation? People pay more attention to content that agrees with their pre-existing opinion

  62. Echo Chamber Definition

      Echo = opinion, Chamber = network
      Joint content + network definition
      Echo chamber = the political leaning of the content that users receive from the network agrees with that of the content they share to the network

  63. Production/Consumption

      Consumption: what you receive in your feed (what your followees tweet)
      Production: what you tweet

  64. Political Leaning Scores

      Based on the source of the content (500 domains)
      Score derived from the self-declared affiliation of sharers on Facebook
      FoxNews.com is aligned with conservatives (CP = 0.9), HuffingtonPost.com is aligned with liberals (CP = 0.17)

  65. Production/Consumption Scores

      Polarity scores based on "content" leaning (from the source)
      Production score: average political leaning of the content the user tweets
      Consumption score: average political leaning of the content the user receives on their feed (result of selection by the user)
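
A minimal sketch of the two scores, assuming a mapping domain_leaning from news domains to a [0, 1] leaning score (e.g. foxnews.com near 0.9, huffingtonpost.com near 0.17) and two placeholder helpers, domains_tweeted_by and followees, which are not part of the original pipeline.

```python
# Sketch of production/consumption polarity. domain_leaning and the two
# helper callables are hypothetical placeholders, not the paper's code.
from statistics import mean

def production_polarity(user, domain_leaning, domains_tweeted_by):
    scores = [domain_leaning[d] for d in domains_tweeted_by(user)
              if d in domain_leaning]
    return mean(scores) if scores else None

def consumption_polarity(user, domain_leaning, domains_tweeted_by, followees):
    scores = [domain_leaning[d]
              for f in followees(user)
              for d in domains_tweeted_by(f)
              if d in domain_leaning]
    return mean(scores) if scores else None
```
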
  66. δ-partisanship

      [Figure 1: Example showing the definition of δ-partisan users. The dotted red lines are drawn at δ and 1-δ. Users on the left of the leftmost dashed red line or right of the rightmost one are δ-partisan.]

  67. δ-{partisan, consumer, gatekeeper}

      δ-partisan: produces content with polarity beyond δ
      δ-bipartisan: produces content with polarity within δ
      δ-consumer: consumes content with polarity beyond δ
      δ-gatekeeper: δ-partisan but not δ-consumer; consumes from both sides but produces content aligned with only one side; blocks information flow towards its community
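
A direct, if simplified, translation of these definitions for polarity scores in [0, 1], where "beyond δ" is read as ≤ δ or ≥ 1-δ (as in Figure 1); the default threshold 0.3 is only illustrative.

```python
# Sketch of the δ-based user classes; delta=0.3 is an arbitrary example.
def is_extreme(polarity, delta):
    """Polarity 'beyond delta': in [0, delta] or [1 - delta, 1]."""
    return polarity <= delta or polarity >= 1 - delta

def classify_user(production, consumption, delta=0.3):
    partisan = is_extreme(production, delta)
    consumer = is_extreme(consumption, delta)
    return {
        "delta_partisan": partisan,
        "delta_bipartisan": not partisan,
        "delta_consumer": consumer,
        "delta_gatekeeper": partisan and not consumer,
    }
```
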
  68. Network Measures

      Network-based latent-space user polarity (based on following politicians with aligned ideology)
      Network centrality (PageRank)
      Local clustering coefficient
      Retweet/favorite rates and volumes
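
A short networkx sketch of the two structural measures listed above (the latent-space polarity and the engagement statistics need external data and are omitted); follow_graph is a placeholder name for a directed follower graph.

```python
# Sketch of the structural network measures, given a directed follow graph.
import networkx as nx

def network_measures(follow_graph):
    return {
        "pagerank": nx.pagerank(follow_graph),
        "clustering": nx.clustering(follow_graph.to_undirected()),
    }
```
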
  69. Correlation

      [Figure 3: Distribution of production and consumption polarity, for Political (first row) and Non-Political (second row) datasets. The scatter plots display the production (x-axis) and consumption (y-axis) polarities of each user in a dataset. Colors indicate user polarity sign, following [6] (grey = democrat, yellow = republican). The one-dimensional plots along the axes show the distributions of the production and consumption polarities for democrats and republicans.]

  70. Correlation: Gun Control

  71. Variance

      [Figure 4: Top: production polarity variance vs. production polarity (mean). Bottom: consumption polarity variance vs. consumption polarity (mean).]

  72. Variance [detail of Figure 4]

  73. Price of Bipartisanship

      [Figure 5: Absolute value of the user polarity scores for δ-partisan and δ-bipartisan users.]
      [Figure 6: PageRank for δ-partisan and δ-bipartisan users.]

  74. Price of Bipartisanship: PageRank

      [Detail of Figure 6: PageRank for δ-partisan and δ-bipartisan users, as a function of the threshold δ.]

  75. Partisans vs Bipartisans, Gatekeepers vs Non-gatekeepers

      Table 2: Comparison of various features for partisans & bipartisans and gatekeepers & non-gatekeepers. A ✓ indicates that the corresponding feature is significantly higher for the group of the column (p < 0.001) for at least 4 of the 6 thresholds used, for most datasets. A minus next to the checkmark (-) indicates that the feature is significantly lower.

      Features                 Partisans   Gatekeepers
      PageRank                 ✓           ✓
      clustering coefficient   ✓ (-)       ✓ (-)
      user polarity            ✓ (-)       ✓ (-)
      degree                   ✓           ✓
      retweet rate             ✓           ✗
      retweet volume           ✓           ✗
      favorite rate            ✓           ✗
      favorite volume          ✓           ✗
      # followers              ✗           ✗
      # friends                ✗           ✗
      # tweets                 ✗           ✗
      age on Twitter           ✗           ✗

  76. Conclusions

      How to quantify the controversy of a topic discussed in social media
      Collective attention increases polarization; periphery users tend to retweet core ones
      Interplay between content and network in echo chambers; evidence of selective exposure
      Bipartisan users pay a price in terms of network centrality and content appreciation

  77. What's next?

      Joint opinion formation + network generation model
      Temporal dynamics of the process
      Application to other contexts (Reddit, Facebook)
      Interventions: can we do something about it?

  78. Thanks! Ask me two questions! @gdfm7 · gdfm@acm.org