Information network or social network?: the structure of the Twitter follow graph • tom__bo • Proceedings of the 23rd International Conference on World Wide Web, Seth A. Myers, Aneesh Sharma, Pankaj Gupta, Jimmy Lin, 2014, Pages 494-498
Introduction • This paper provides a characterization of the topological features of the Twitter follow graph, analyzing properties such as: • degree distributions • connected components • shortest path lengths • clustering coefficients • degree assortativity • Goals: 1. Present a set of authoritative descriptive statistics 2. Use these characterizations to offer new insight into the question "Is Twitter a social network or an information network?"
Definition • Social network • high degree assortativity • small shortest path lengths • large connected components • high clustering coefficients • high degree of reciprocity • Information network • large vertex degrees • a lack of reciprocity • large two-hop neighborhoods • (Figure: social network vs. information network)
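Reciprocity appears on both sides of this definition, so it helps to pin down one way of measuring it. The following is a minimal sketch, assuming the follow graph is given as a list of `(follower, followee)` pairs; the function name `reciprocity` and the toy edge list are illustrative and not taken from the paper.

```python
def reciprocity(edges):
    """Fraction of distinct directed edges (u, v) whose reverse (v, u) also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


# Toy follow graph: a and b follow each other; the other two edges are one-way.
toy_edges = [("a", "b"), ("b", "a"), ("a", "c"), ("d", "c")]
toy_reciprocity = reciprocity(toy_edges)  # 2 of the 4 edges are reciprocated
```

Under this measure, a value near 1 points toward the "social network" profile and a value near 0 toward the "information network" profile.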
Graph Characteristics: Degree Distributions • The Twitter follow graph is directed. • inbound degree (in-degree): the number of users who follow a given user • outbound degree (out-degree): the number of users a given user follows
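The in-degree and out-degree definitions can be sketched in a few lines. This is an illustration under the assumption that each edge is a `(follower, followee)` pair; `degree_counts`, `degree_distribution`, and the toy data are hypothetical names introduced here.

```python
from collections import Counter


def degree_counts(edges):
    """Return (in_degree, out_degree) Counters for a directed follow graph."""
    in_deg, out_deg = Counter(), Counter()
    for follower, followee in edges:
        out_deg[follower] += 1  # follower follows one more account
        in_deg[followee] += 1   # followee gains one more follower
    return in_deg, out_deg


def degree_distribution(deg, vertices):
    """Map each degree value to the number of vertices having that degree."""
    return Counter(deg[v] for v in vertices)


# Toy graph: three accounts follow "c", and "c" follows "a" back.
toy_edges = [("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")]
toy_in, toy_out = degree_counts(toy_edges)
toy_in_dist = degree_distribution(toy_in, ["a", "b", "c", "d"])
```

Plotting such a distribution on log-log axes is the usual way to inspect whether it has a heavy tail.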
Graph Characteristics: Degree Distributions • heavy tail (both in-degree and out-degree) • some users follow hundreds of thousands of accounts • e.g. celebrities who choose to reciprocate the follows of their fans • "non-social" behavior • individuals can maintain only around 150 social relationships (Dunbar's number) • such out-degrees are inconsistent with a social network: they imply far more relationships than a person could maintain
Graph Characteristics: Connected Components • strongly/weakly connected • weakly connected • connectivity ignores edge direction • strongly connected • every pair of vertices must be mutually reachable through directed paths
Graph Characteristics: Connected Components • weakly connected component • largest => 92.9% of all active users • strongly connected component • largest => 68.7% • less than Facebook and MSN Messenger (99%) • The Twitter graph is less well connected than these social networks.
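The difference between weakly and strongly connected components can be made concrete on a toy graph. The sketch below is not the method used in the paper: it uses a plain depth-first search and an intersection of forward and backward reachability, which is only practical for small graphs. All function names and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def _reachable(start, adj):
    """Set of vertices reachable from start by following adj (includes start)."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def weakly_connected_components(vertices, edges):
    """Components of the graph after edge direction is ignored."""
    undirected = defaultdict(set)
    for u, v in edges:
        undirected[u].add(v)
        undirected[v].add(u)
    assigned, components = set(), []
    for v in vertices:
        if v not in assigned:
            comp = _reachable(v, undirected)
            assigned |= comp
            components.append(comp)
    return components


def strongly_connected_components(vertices, edges):
    """Components in which every pair is mutually reachable by directed paths."""
    fwd, bwd = defaultdict(set), defaultdict(set)
    for u, v in edges:
        fwd[u].add(v)
        bwd[v].add(u)
    assigned, components = set(), []
    for v in vertices:
        if v not in assigned:
            # Vertices that v can reach AND that can reach v.
            comp = _reachable(v, fwd) & _reachable(v, bwd)
            assigned |= comp
            components.append(comp)
    return components


toy_vertices = ["a", "b", "c", "d", "e"]
toy_edges = [("a", "b"), ("b", "a"), ("b", "c"), ("d", "e")]
toy_wcc = weakly_connected_components(toy_vertices, toy_edges)
toy_scc = strongly_connected_components(toy_vertices, toy_edges)
largest_wcc = max(len(c) for c in toy_wcc)  # {a, b, c}
largest_scc = max(len(c) for c in toy_scc)  # {a, b}
```

The largest strongly connected component can never be bigger than the largest weakly connected one, which matches the ordering of the 68.7% and 92.9% figures.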
Graph Characteristics: Shortest Path Lengths • Shortest path length: the number of edge traversals required to reach one vertex from another • Computing exact shortest path lengths for all pairs is infeasible at this scale • HyperANF algorithm • probabilistic estimation • HyperLogLog counters
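To show what quantity is being estimated, here is an exact computation of the shortest-path-length distribution by breadth-first search from every vertex. This is deliberately not HyperANF: it is the brute-force baseline that becomes infeasible on a graph the size of Twitter, which is why a probabilistic estimate is used instead. The function names and the toy graph are illustrative assumptions.

```python
from collections import Counter, defaultdict, deque


def shortest_path_length_distribution(edges):
    """Count ordered reachable pairs (s, t), s != t, by directed distance."""
    adj = defaultdict(list)
    vertices = set()
    for u, v in edges:
        adj[u].append(v)
        vertices.update((u, v))
    counts = Counter()
    for source in vertices:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # standard BFS along edge direction
            node = queue.popleft()
            for nxt in adj.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        for target, d in dist.items():
            if target != source:
                counts[d] += 1
    return counts


def average_path_length(counts):
    """Mean distance over all reachable ordered pairs."""
    total_pairs = sum(counts.values())
    return sum(d * n for d, n in counts.items()) / total_pairs


# Directed 3-cycle: every vertex reaches one vertex in 1 hop and one in 2 hops.
toy_counts = shortest_path_length_distribution([("a", "b"), ("b", "c"), ("c", "a")])
toy_average = average_path_length(toy_counts)
```

The cost is one BFS per vertex, which is what makes the exact approach impractical for very large graphs.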
Graph Characteristics: Clustering Coefficient • Clustering coefficient: for a given user, the fraction of pairs of that user's friends who are themselves friends • Twitter's clustering coefficient is lower than Facebook's, but still high.
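A minimal sketch of the local clustering coefficient follows, assuming an undirected friendship graph (for Twitter, this would correspond to keeping only mutual follows). The function name `local_clustering` and the toy edges are hypothetical.

```python
from collections import defaultdict


def local_clustering(edges):
    """Per-vertex fraction of friend pairs that are themselves connected."""
    nbrs = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            nbrs[u].add(v)
            nbrs[v].add(u)
    coeffs = {}
    for node, friends in nbrs.items():
        k = len(friends)
        if k < 2:
            coeffs[node] = 0.0  # fewer than two friends: no pairs to check
            continue
        # Count each unordered pair of friends once (a < b).
        links = sum(1 for a in friends for b in friends
                    if a < b and b in nbrs[a])
        coeffs[node] = 2.0 * links / (k * (k - 1))
    return coeffs


# Triangle a-b-c with an extra friend d attached only to a.
toy_coeffs = local_clustering([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
toy_average = sum(toy_coeffs.values()) / len(toy_coeffs)
```

Here `b` and `c` score 1.0 because their two friends know each other, while `a` scores 1/3 because only one of its three friend pairs is connected.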
Graph Characteristics: Clustering Coefficient • idiosyncrasy in the Japan subgraph • higher clustering coefficient • reciprocity is much higher • higher edge-to-vertex ratio • the clustering coefficient increases at a degree of 200 and peaks at 1000 • (possible explanation) members of cliques • The Twitter mutual graph exhibits characteristics that are consistent with a social network
Graph Characteristics: Degree Assortativity • Degree assortativity: the preference of a graph's vertices to attach to others that are similar (or dissimilar) in degree • between 0.1 and 0.4 in social networks • Facebook: 0.226
Graph Characteristics: Degree Assortativity • SOD (source out-degree) vs. DOD (destination out-degree) • positive correlation 0.272 • "the more people you follow, the more people that those people are likely to follow" • SID (source in-degree) vs. DOD • positive correlation 0.241 • "the more popular you are, the more the people you follow will tend to follow other people"
Graph Characteristics: Degree Assortativity • SOD (source out-degree) vs. DID (destination in-degree) • negative correlation -0.118 • "the more people you follow, the less popular those people are likely to be" • SID (source in-degree) vs. DID • negative correlation -0.296 • "the more popular you are, the less popular the people you follow are"
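In a directed graph each edge has four degrees attached to it (source in/out, destination in/out), which gives the four pairings discussed for degree assortativity. The sketch below computes a plain Pearson correlation over edges for each pairing. Whether the paper used raw or transformed degrees is not stated here, so treating them as raw counts is an assumption, as are the function names and the toy graph.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (nan if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def degree_assortativity(edges):
    """Correlate source and destination degrees across all directed edges."""
    in_deg, out_deg = Counter(), Counter()
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    sod = [out_deg[s] for s, _ in edges]  # source out-degree
    sid = [in_deg[s] for s, _ in edges]   # source in-degree
    dod = [out_deg[d] for _, d in edges]  # destination out-degree
    did = [in_deg[d] for _, d in edges]   # destination in-degree
    return {
        "SOD_DOD": pearson(sod, dod),
        "SID_DOD": pearson(sid, dod),
        "SOD_DID": pearson(sod, did),
        "SID_DID": pearson(sid, did),
    }


# Toy graph: three fans follow hub "h"; "h" follows two of them back.
toy_edges = [("a", "h"), ("b", "h"), ("c", "h"), ("h", "a"), ("h", "b")]
toy_assortativity = degree_assortativity(toy_edges)
```

In this hub-and-fans toy graph all four correlations come out negative, because low-degree fans point at a high-degree hub and vice versa.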
Discussion • Some analyses show that Twitter behaves more like an information network, but others show that it exhibits characteristics consistent with social networks. • Twitter starts as an information network and evolves to behave more like a social network • New users follow popular accounts (preferential attachment) • ↓ • As a user follows more people and becomes more "experienced", the user discovers a community with which to engage
Future Work and Conclusion • Summary • This paper presents evidence that Twitter differs from previously studied social networks • It also exhibits social properties • Hypothesis • There are two major "modes" • information consumption • reciprocated social ties • Further analysis of this mixture is needed • At an intuitive level, this hybrid structure seems plausible