
rinkou_02

tom--bo
July 20, 2016


Learning "Information network or Social network?: the structure of the twitter follow graph"



Transcript

  1. Information network or Social network?: the structure of the Twitter follow graph
     tom__bo
     Seth A. Myers, Aneesh Sharma, Pankaj Gupta, Jimmy Lin. Proceedings of the 23rd International Conference on World Wide Web, 2014, Pages 494-498
  2. Introduction
     • This paper provides a characterization of the topological features of the Twitter follow graph, analyzing properties such as:
       • degree distributions
       • connected components
       • shortest path lengths
       • clustering coefficients
       • degree assortativity
     • Goals
       1. Present a set of authoritative descriptive statistics.
       2. Use these characterizations to offer new insight into the question "Is Twitter a social network or an information network?"
  3. Definition: Social network / Information network
     • Social network
       • high degree assortativity
       • small shortest path lengths
       • large connected components
       • high clustering coefficients
       • high degree of reciprocity
     • Information network
       • large vertex degrees
       • a lack of reciprocity
       • large two-hop neighborhoods
  4. Data
     • Twitter follow graph (second half of 2012)
       • 175 million active users
       • 20 billion edges
     • Three country subgraphs plus the complete graph
       • Brazil (BR)
       • Japan (JP)
       • United States (US)
     • Graph
       • Directed graph
       • 42% of edges reciprocated (4 billion undirected edges)
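The reciprocity figure on this slide can be illustrated with a minimal sketch. It assumes the graph is given as a list of `(follower, followee)` pairs with no self-loops; the function name `reciprocity` and the toy edge list are hypothetical and not taken from the paper.

```python
def reciprocity(edges):
    """Return (fraction of directed edges that are reciprocated, number of mutual pairs).

    Assumption: each edge (u, v) means "u follows v" and there are no self-loops.
    """
    edge_set = set(edges)
    # A directed edge (u, v) is reciprocated when (v, u) is also present.
    reciprocated = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    # Every mutual pair contributes two reciprocated directed edges.
    mutual_pairs = reciprocated // 2
    fraction = reciprocated / len(edge_set) if edge_set else 0.0
    return fraction, mutual_pairs


# Hypothetical toy graph: a<->b and c<->d are mutual, a->c and b->d are one-way.
toy_edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("d", "c"), ("b", "d")]
fraction, mutual_pairs = reciprocity(toy_edges)
```

Applied to the slide's figures, 42% of 20 billion directed edges is 8.4 billion reciprocated edges, that is, about 4.2 billion mutual pairs, which agrees with the roughly 4 billion undirected edges quoted.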
  5. Contrast
     • Contrasted with studies of other social networks
       • Facebook: 721 million vertices, 68.7 billion undirected edges
       • MSN Messenger: 180 million vertices, 1.3 billion undirected edges
  6. Graph Characteristics: Degree Distributions
     • The Twitter follow graph is directed.
     • Inbound degree (in-degree): the number of users who follow a given user
     • Outbound degree (out-degree): the number of users a given user follows
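The two degree definitions above can be sketched as follows. This is a minimal illustration that assumes edges are `(follower, followee)` pairs; the names `degrees` and `toy_edges` are hypothetical.

```python
from collections import Counter


def degrees(edges):
    """Return (in_degree, out_degree) counters for a directed follow graph.

    Assumption: each edge (u, v) means "u follows v".
    """
    in_deg, out_deg = Counter(), Counter()
    for follower, followee in edges:
        out_deg[follower] += 1  # the follower follows one more account
        in_deg[followee] += 1   # the followee gains one more follower
    return in_deg, out_deg


# Hypothetical toy graph: three users follow "star", and "star" follows "a" back.
toy_edges = [("a", "star"), ("b", "star"), ("c", "star"), ("star", "a")]
in_deg, out_deg = degrees(toy_edges)

# The degree distribution maps each degree value to the number of users with that degree.
in_degree_distribution = Counter(in_deg.values())
```

A heavy-tailed distribution would show up here as a small number of users with very large degree values.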
  7. Graph Characteristics: Degree Distributions

  8. Graph Characteristics: Degree Distributions
     • Heavy tail (both in-degree and out-degree)
     • Some users follow hundreds of thousands of accounts
       • celebrities choose to reciprocate the follows of their fans
       • "non-social" behavior
     • Individuals can only maintain around 150 relationships
     • The out-degree distribution is inconsistent with that of a social network: such out-degrees imply too many social relationships
  9. Graph Characteristics: Connected Components
     • Strongly/weakly connected
       • Weakly connected: connectivity ignores edge direction
       • Strongly connected: every pair of vertices must be reachable from each other through directed paths
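To make the difference between the two notions concrete, here is a minimal sketch that finds the size of the largest weakly and strongly connected components of a small directed graph. It uses a plain traversal for the weak case and Kosaraju's two-pass algorithm for the strong case; both are standard textbook techniques chosen for illustration, not necessarily what the paper used at Twitter scale. All names and the toy graph are hypothetical.

```python
from collections import defaultdict


def largest_wcc_size(nodes, edges):
    """Largest weakly connected component: traverse while ignoring edge direction."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            x = stack.pop()
            size += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        best = max(best, size)
    return best


def largest_scc_size(nodes, edges):
    """Largest strongly connected component via Kosaraju's two-pass algorithm."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)
    # Pass 1: record vertices in order of DFS finishing time on the forward graph.
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(fwd[start]))]
        while stack:
            node, successors = stack[-1]
            advanced = False
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    advanced = True
                    break
            if not advanced:
                order.append(node)
                stack.pop()
    # Pass 2: traverse the reversed graph in decreasing finishing time;
    # each traversal collects exactly one strongly connected component.
    seen, best = set(), 0
    for start in reversed(order):
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            x = stack.pop()
            size += 1
            for y in rev[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        best = max(best, size)
    return best


# Hypothetical toy graph: a->b->c->a is a directed cycle, c->d is one-way, e is isolated.
toy_nodes = ["a", "b", "c", "d", "e"]
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
wcc = largest_wcc_size(toy_nodes, toy_edges)
scc = largest_scc_size(toy_nodes, toy_edges)
```

In the toy graph, `d` belongs to the largest weakly connected component but not to the largest strongly connected one, because nothing leads from `d` back to the cycle.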
  10. Graph Characteristics: Connected Components
      • Largest weakly connected component: 92.9% of all active users
      • Largest strongly connected component: 68.7%
        • less than Facebook and MSN Messenger (99%)
      • The Twitter graph is less well connected than those social networks.
  11. Graph Characteristics: Shortest Path Lengths
      • Shortest path length: the number of edge traversals required to reach one vertex from another
      • Identifying exact shortest path lengths is infeasible at this scale
      • HyperANF algorithm
        • probabilistic estimation
        • HyperLogLog counters
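The idea behind neighborhood-function estimators such as HyperANF can be sketched with exact sets on a tiny graph. The sketch below is not HyperANF itself: it keeps an exact Python `set` per vertex, where HyperANF would keep an approximate HyperLogLog counter so that the same iteration fits in memory for billions of edges. Function names and the toy graph are hypothetical, and every edge endpoint is assumed to appear in `nodes`.

```python
def neighborhood_function(nodes, edges):
    """Return counts where counts[t] is the number of pairs (u, v) with v reachable from u in <= t hops.

    Exact-set version of the iteration that HyperANF approximates:
    ball(v, t) = ball(v, t-1) united with ball(w, t-1) for every successor w of v.
    """
    succ = {v: [] for v in nodes}
    for u, v in edges:
        succ[u].append(v)
    balls = {v: {v} for v in nodes}          # at t = 0 every vertex reaches only itself
    counts = [sum(len(b) for b in balls.values())]
    while True:
        new_balls = {}
        for v in nodes:
            ball = set(balls[v])
            for w in succ[v]:
                ball |= balls[w]
            new_balls[v] = ball
        total = sum(len(b) for b in new_balls.values())
        if total == counts[-1]:              # no ball grew, so a fixed point is reached
            return counts
        counts.append(total)
        balls = new_balls


def average_distance(counts):
    """Average shortest path length over reachable ordered pairs of distinct vertices."""
    pairs = counts[-1] - counts[0]
    if pairs == 0:
        return 0.0
    # counts[t] - counts[t-1] is the number of pairs at distance exactly t.
    weighted = sum(t * (counts[t] - counts[t - 1]) for t in range(1, len(counts)))
    return weighted / pairs


# Hypothetical toy graph: a directed 4-cycle a->b->c->d->a.
toy_nodes = ["a", "b", "c", "d"]
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
counts = neighborhood_function(toy_nodes, toy_edges)
avg = average_distance(counts)
```

On the 4-cycle, every vertex reaches one new vertex per hop, so the distances 1, 2, and 3 each occur four times and the average is 2.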
  12. Graph Characteristics: Shortest Path Lengths

  13. Graph Characteristics: Shortest Path Lengths
      • Twitter average: 4.17 (mutual graph: 4.05)
      • Other networks
        • Facebook: 4.74
        • MSN Messenger: 6.6 (mutual)
      • The Twitter follow graph exhibits properties that are consistent with a social network
  14. Graph Characteristics: Clustering Coefficient
      • Clustering coefficient: for a user, the fraction of pairs of that user's friends who are themselves friends
      • Twitter's clustering coefficient is lower than Facebook's, but still high.
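A minimal sketch of the local clustering coefficient on an undirected (mutual) graph follows. It assumes the graph is stored as a dictionary mapping each user to the set of that user's friends; the name `local_clustering` and the toy graph are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Fraction of pairs of v's friends that are themselves friends.

    Assumption: adj maps each vertex to the set of its neighbors in an
    undirected graph, so b in adj[a] implies a in adj[b].
    """
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two friends: no pairs to check
    linked = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)


# Hypothetical toy graph: a, b, c form a triangle; d is attached only to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
cc_a = local_clustering(adj, "a")
cc_b = local_clustering(adj, "b")
```

User `a` has three friends and only one of the three possible friend pairs is connected, so its coefficient is 1/3, while `b` sits entirely inside the triangle and scores 1.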
  15. Graph Characteristics: Clustering Coefficient
      • Idiosyncrasy in the Japan subgraph
        • higher clustering coefficient
        • much higher reciprocity
        • higher edge-to-vertex ratio
        • the clustering coefficient increases at a degree of 200 and peaks at 1000
        • (possible explanation) these users are members of cliques
      • The Twitter mutual graph exhibits characteristics that are consistent with a social network
  16. Graph Characteristics: Two-Hop Neighborhoods
      • Two-hop neighborhood: the set of vertices that are neighbors of a vertex's neighbors
      • Outbound/inbound
        • the outbound two-hop neighborhood characterizes "information gathering potential"
        • the inbound two-hop neighborhood characterizes "information dissemination potential"
      • Unique/non-unique
        • unique: no overlap (each vertex is counted once)
        • non-unique: simply the sum of the sizes of the neighbors' neighborhoods
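The unique and non-unique counts can be sketched as below. This assumes a dictionary `succ` mapping each user to the set of accounts that user follows, and it does not exclude the starting vertex or its direct neighbors from the two-hop set; whether the paper excludes them is not stated on the slide, so treat that as an assumption. The names are hypothetical.

```python
def two_hop_out(succ, v):
    """Return (unique, non_unique) sizes of the outbound two-hop neighborhood of v.

    unique: number of distinct accounts followed by the accounts v follows.
    non_unique: sum of the out-degrees of the accounts v follows (overlap counted repeatedly).
    Assumption: v itself and v's direct followees are not filtered out.
    """
    unique = set()
    non_unique = 0
    for w in succ.get(v, ()):
        followees = succ.get(w, ())
        unique.update(followees)
        non_unique += len(followees)
    return len(unique), non_unique


# Hypothetical toy graph: a follows b and c; b follows d and e; c follows e and f.
succ = {
    "a": {"b", "c"},
    "b": {"d", "e"},
    "c": {"e", "f"},
}
unique, non_unique = two_hop_out(succ, "a")
```

The inbound two-hop neighborhood can be computed with the same function by passing a map from each user to that user's followers instead of followees.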
  17. Graph Characteristics: Two-Hop Neighborhoods
      • Twitter behaves efficiently as an information network.

  18. Graph Characteristics: Degree Assortativity
      • Degree assortativity: the preference of a graph's vertices to attach to others that are similar (or dissimilar) in degree
      • Between 0.1 and 0.4 in social networks
      • Facebook: 0.226
  19. Graph Characteristics: Degree Assortativity
      • SOD vs. DOD (source out-degree vs. destination out-degree)
        • positive correlation: 0.272
        • "the more people you follow, the more people those people are likely to follow"
      • SID vs. DOD (source in-degree vs. destination out-degree)
        • positive correlation: 0.241
        • "the more popular you are, the more people those you follow tend to follow"
  20. Graph Characteristics: Degree Assortativity
      • SOD vs. DID (source out-degree vs. destination in-degree)
        • negative correlation: -0.118
        • "the more people you follow, the less popular those people are likely to be"
      • SID vs. DID (source in-degree vs. destination in-degree)
        • negative correlation: -0.296
        • "the more popular you are, the less popular the people you follow are"
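The four source/destination degree correlations can be sketched on a toy graph as follows. The sketch assumes SOD, SID, DOD, and DID stand for the source's and destination's out- and in-degree, measured once per directed edge, and it uses a plain Pearson correlation on raw degrees; the paper's exact procedure (for example any transformation of the degrees) may differ. All names are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient; returns NaN if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def degree_assortativity(edges):
    """Correlate source and destination degrees over all directed edges (u follows v)."""
    in_deg, out_deg = Counter(), Counter()
    for u, v in edges:
        out_deg[u] += 1
        in_deg[v] += 1
    sod = [out_deg[u] for u, _ in edges]  # source out-degree
    sid = [in_deg[u] for u, _ in edges]   # source in-degree
    dod = [out_deg[v] for _, v in edges]  # destination out-degree
    did = [in_deg[v] for _, v in edges]   # destination in-degree
    return {
        "SOD-DOD": pearson(sod, dod),
        "SID-DOD": pearson(sid, dod),
        "SOD-DID": pearson(sod, did),
        "SID-DID": pearson(sid, did),
    }


# Hypothetical toy graph: a, b, c follow hub "h"; "h" follows a and b back.
toy_edges = [("a", "h"), ("b", "h"), ("c", "h"), ("h", "a"), ("h", "b")]
result = degree_assortativity(toy_edges)
```

The toy graph is far too small to say anything about Twitter; it only shows how each of the four coefficients is assembled from per-edge degree pairs.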
  21. Discussion
      • Twitter behaves more like an information network, but other analyses show that it also exhibits characteristics consistent with social networks.
      • Twitter starts as an information network and evolves to behave more like a social network
        • A new user chooses popular accounts to follow (preferential attachment)
        • ↓
        • As the user follows more people and becomes more "experienced", the user discovers a community with which to engage
  22. Future Work and Conclusion
      • Summary
        • This paper presents evidence that Twitter differs from previously studied social networks
        • It also exhibits social properties
      • Hypothesis
        • There are two major "modes"
          • information consumption
          • reciprocated social ties
      • Further analysis of this mixture is needed
      • At an intuitive level, this hybrid structure seems plausible