SNAMS-2019

Isa Dutse
October 24, 2019

Slides for SNAMS 2019


Transcript

  1. Introduction Our Approach Prediction Framework Usefulness of Simmelian ties Conclusion

    Outline: Introduction; Our Approach; Prediction Framework; Usefulness of Simmelian ties; Conclusion
  3. The eccentricity of Twitter connections challenges mining tasks, e.g. community detection and content veracity

    Figure: Connection types on Twitter. (a) dyads; (b) RT and hashtag; (c) @mention; (d) relative proportion.
  5. Difficult to keep track of socially cohesive groups on Twitter

    The large network size affects a user's ability to maintain a cohesive social relationship.
    Twitter: ≈ M daily users & ≈ M content items (source: www.omnicoreagency.com/twitter-statistics).
    There is an inverse relationship between social cohesion and network size [ ].
    Figure: Classification of social groups and the degree of cohesiveness.
  9. Level of trust is stronger among cliques [ ]

    A group of nodes with reciprocal ties will be more helpful in mining tasks.
    Supporting theories: cognitive balance theory (strong ties in a small group prevent unethical behaviour) [ ] and social homophily [ ].
    Focus: the Simmelian tie [ ], a strong social relationship within groups of three or more.
    It is synonymous with transitivity: people become friends with a friend-of-a-friend more easily [ ].
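
A minimal sketch of how groups of three with Simmelian ties could be located in a follow graph. It assumes the usual reading of the term: every pair in the triple is connected by a reciprocal (mutual-follow) tie. The edge list and the function names are hypothetical and are not taken from the authors' code.

```python
from itertools import combinations


def reciprocal_pairs(edges):
    """Return unordered pairs {a, b} for which both a->b and b->a exist."""
    directed = set(edges)
    return {
        frozenset((a, b))
        for a, b in directed
        if a != b and (b, a) in directed
    }


def simmelian_triads(edges):
    """Return sorted triples of nodes whose three pairs are all reciprocal."""
    pairs = reciprocal_pairs(edges)
    # Build an undirected adjacency map over reciprocal ties only.
    neighbours = {}
    for pair in pairs:
        a, b = tuple(pair)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    triads = set()
    for node, nbrs in neighbours.items():
        # Two mutual neighbours that are also mutually tied close the triad.
        for x, y in combinations(sorted(nbrs), 2):
            if frozenset((x, y)) in pairs:
                triads.add(tuple(sorted((node, x, y))))
    return sorted(triads)


# Hypothetical (follower, followed) edges.
follows = [
    ("a", "b"), ("b", "a"),
    ("b", "c"), ("c", "b"),
    ("a", "c"), ("c", "a"),
    ("c", "d"),  # one-way tie, so d is not part of any triad
]
print(simmelian_triads(follows))
```

The one-way tie from c to d is ignored because only reciprocated pairs enter the adjacency map.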
  11. Basic units of structural relationships: dyadic and transitive ties

    Given the sets of nodes $a, b, c, \ldots, n \in V$ and edges $e_1, e_2, \ldots, e_n \in E$, with $V, E \in D$, our target is to find the likelihood of reciprocity $p(R_{a,b})$ between any pair of nodes.
    We collected and analysed user profiles from m accounts (Table ).
    Table: Dataset summary. C: category; S: seed size; V: visited users; P: pairwise ties; T: transitive ties; D: search duration (min.). Rows: verified (three rows), unverified (three rows) and ego-Twitter.
  13. User-centric attributes to estimate likelihood of reciprocity

    We propose a model to predict ties between users, modelled as functions of easily accessible features $f \in X$.
    Figure: (a) Possible triads and (b) relevant features.
  15. Distribution of ties and relevant metrics in the data

    For each user $a$ followed by $m$ users: $a \leftrightarrow b = 1$ if there is a reciprocal tie, $\leftrightarrow$, between $a$ and $b \in m$; $a \leftrightarrow b = 0$ otherwise.
    User category: a higher proportion of reciprocal ties exists in the unverified users category.
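
The indicator on this slide can be sketched in a few lines: for a user a, each follower b scores 1 if a follows b back and 0 otherwise, and the proportion of reciprocal ties is the mean of those scores. The edge set and function names below are hypothetical.

```python
def reciprocal_indicator(a, b, follows):
    """Return 1 if a and b follow each other, else 0."""
    return 1 if (a, b) in follows and (b, a) in follows else 0


def reciprocity_proportion(a, follows):
    """Fraction of a's followers that a follows back (0.0 if a has none)."""
    followers = {src for src, dst in follows if dst == a}
    if not followers:
        return 0.0
    hits = sum(reciprocal_indicator(a, b, follows) for b in followers)
    return hits / len(followers)


# Hypothetical (follower, followed) edges: b and c follow a, a follows b back.
follows = {("b", "a"), ("c", "a"), ("a", "b")}
print(reciprocity_proportion("a", follows))
```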
  16. ECDF of relevant ties

    x axis: the measured quantity; y axis: the fraction of the data (%).
    Figure: ECDF of relevant ties.
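
For readers unfamiliar with the plot type, an empirical cumulative distribution function (ECDF) pairs each sorted observation with the fraction of the data at or below it. A minimal sketch, with hypothetical sample values:

```python
def ecdf(values):
    """Return (x, fraction of data <= x) pairs for the sorted values."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


# Hypothetical counts of reciprocal ties for five users.
for x, fraction in ecdf([12, 3, 7, 3, 40]):
    print(x, fraction)
```

Multiply the fraction by 100 to obtain the percentage scale used on the y axis.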
  21. Observations on the likelihood of reciprocity

    A user's network size is likely to grow if the user: is verified; has many followers.
    The likelihood of reciprocity is high if the user: is unverified; has a relatively large network.
    Unverified users are more likely to reciprocate a followership.
    Users with large networks (> K) have a low proportion of reciprocated ties.
  22. Likelihood of reciprocity and profile attributes

    Figure: The effect of a user's attributes in enabling reciprocal ties.
  25. Bayesian process of simulating the real data and making inference

    Why? Due to the rarity of transitive ties.
    Given a set of nodes $X$, Bayesian analysis is useful to estimate mean reciprocity and mean values for features such as indegree and outdegree in the ground-truth data.
    We use the information represented in the features to propose a prediction model.
    Reciprocity effect: to understand why some users have reciprocated ties and some do not, we propose a generative model to improve the prediction of reciprocal ties.
  26. A log-linear model as a linear combination of the user's attributes

    The model enables the simulation of the observed data $D$ and generates a synthetic version $\hat{D}$ indistinguishable from the observed information $D$:
    $$y_i = \beta_{u_i} + \gamma_{c_{u_i}} + \epsilon_{u_i}$$
    where $\beta_{u_i}$, $\gamma_{c_{u_i}}$ and $\epsilon_{u_i}$ denote mean reciprocity among users, mean reciprocity between users' categories and the error term, respectively.
    The parameters of the model are treated as random variables specified by probability distribution functions $p(\cdot)$.
    Prior $\theta$ and likelihood $f(y \mid \theta, x)$: represent the set of variables that are likely to characterise the data, informed by previous knowledge about the data. We assume $\theta_i$ comes from a probability distribution that describes the individual differences among users.
    Posterior $p(\theta \mid D)$ or $p(\theta \mid y, x)$: given as a function of the likelihood and the prior, which is simply the evidence in the data based on Bayes' rule. The rule entails updating beliefs about $\theta$ given the observed data $D$.
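
To make the prior-to-posterior update concrete, here is a deliberately simplified stand-in: a Beta prior on a single mean-reciprocity parameter, updated with counts of reciprocated and non-reciprocated ties. This conjugate Beta-Binomial update is not the log-linear model from the slide; it only illustrates how Bayes' rule revises a belief about a parameter once data is observed. The prior parameters and the counts are hypothetical.

```python
def update_beta(prior_a, prior_b, reciprocated, total):
    """Posterior Beta(a, b) parameters after observing tie outcomes."""
    if reciprocated < 0 or total < reciprocated:
        raise ValueError("need 0 <= reciprocated <= total")
    return prior_a + reciprocated, prior_b + (total - reciprocated)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Uniform prior Beta(1, 1); hypothetical data: 3 of 10 ties reciprocated.
post_a, post_b = update_beta(1.0, 1.0, reciprocated=3, total=10)
print(post_a, post_b, beta_mean(post_a, post_b))
```

The posterior mean sits between the prior mean (0.5) and the observed proportion (0.3), and moves towards the latter as more ties are observed.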
  27. Likelihood maximisation given the observed data

    Data sampling and posterior distribution.
    Figure: Sampling results showing the error term, indegree and outdegree. Some of the samples are unstable, as evidenced by the perturbations in the results in the second column.
  29. Reciprocity prediction: does similarity between users' attributes induce reciprocity?

    We use attributes of nodes to predict a reciprocal tie.
    We focus on easily accessible attributes $X_f$: network size, indegree (ind), outdegree (out) and category (cat); thus $\{ind, out, cat\} \subset X_f$.
    For a pair of nodes $v_i, v_j$, the corresponding features are $X_{f_{v_i}} = \{ind_{v_i}, out_{v_i}, cat_{v_i}\}$ and $X_{f_{v_j}} = \{ind_{v_j}, out_{v_j}, cat_{v_j}\}$.
    We take the ratio of corresponding attributes, e.g. $\frac{ind_{v_i}}{ind_{v_j}} \in \mathbb{R}$, $\forall f \in X_{f_{v_i,v_j}}$.
    If the ratio lies in [ . , . ], return similarity (1), else dissimilarity (0).
    The resulting binaries are used to compute the overall similarity between pairs using the Jaccard similarity coefficient $J$:
    $$J(X_{f_{v_i}}, X_{f_{v_j}}) = \frac{|X_{f_{v_i}} \cap X_{f_{v_j}}|}{|X_{f_{v_i}} \cup X_{f_{v_j}}|}$$
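
One possible reading of this step, sketched under stated assumptions: each numeric attribute is flagged as similar (1) when the ratio between the two users falls inside a tolerance interval, the category is flagged as similar when it is equal, and the Jaccard coefficient is taken as matched features over all compared features. The interval bounds are not legible in the transcript, so `low` and `high` below are placeholders, and the attribute values are hypothetical.

```python
def ratio_match(x, y, low, high):
    """Return 1 if x / y lies in [low, high], else 0."""
    if y == 0:
        # Avoid division by zero: two zero counts are treated as similar.
        return 1 if x == 0 else 0
    return 1 if low <= x / y <= high else 0


def pair_similarity(u, v, low=0.5, high=2.0):
    """Jaccard-style similarity over the binary per-feature matches."""
    flags = [
        ratio_match(u["ind"], v["ind"], low, high),
        ratio_match(u["out"], v["out"], low, high),
        1 if u["cat"] == v["cat"] else 0,
    ]
    # Intersection: features flagged similar; union: all compared features.
    return sum(flags) / len(flags)


# Hypothetical profiles: indegree, outdegree and verification category.
u = {"ind": 100, "out": 50, "cat": "unverified"}
v = {"ind": 120, "out": 200, "cat": "unverified"}
print(pair_similarity(u, v))
```

Here indegree and category match while outdegree does not, so two of the three features count towards the intersection.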
  30. Each prediction is associated with a decision error

    The error is modelled using a probabilistic preference model, or response probability.
    Response probability captures various scenarios in which an actor is offered a set of features, and the decision process is associated with a constant probability of making an error in the choice, the trembling hand error [ ].
    Thus, the error term $\epsilon_{v_i,v_j}$ is given as a function of the similarity index $J(v_i, v_j)$ between pairs:
    $$\epsilon_{v_i,v_j} = \zeta \times (1 + \log(J(v_i, v_j) + \zeta))$$
    $\zeta$ corresponds to the constant error term, and the final relation is given by:
    $$p(R_{v_i,v_j}) = \frac{1}{1 + \exp \phi}$$
    where $\phi = -\log(\epsilon_{v_i,v_j} + J(v_i, v_j)) \times (\epsilon_{v_i,v_j} + J(v_i, v_j))$.
    We use ζ ≥ . and each item in the predicted ties, $\kappa$, satisfies (Eq. ).
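
The chain from similarity index to probability of reciprocity can be sketched as follows: an error term is derived from J and the constant ζ, the two are combined into ϕ, and a logistic form maps ϕ to a probability. The transcript does not preserve the value of ζ or the additive constant inside the error term, so `zeta=0.1` and `offset=1.0` are placeholder assumptions, and the logistic form itself is a reconstruction of the garbled formula.

```python
import math


def reciprocity_probability(j, zeta=0.1, offset=1.0):
    """Map a similarity index j in [0, 1] to a probability of reciprocity."""
    if zeta <= 0 or j < 0:
        raise ValueError("zeta must be positive and j non-negative")
    # Error term as a function of the similarity index.
    eps = zeta * (offset + math.log(j + zeta))
    s = eps + j
    if s <= 0:
        # The logarithm in phi is undefined for very low similarity.
        raise ValueError("eps + j must be positive")
    phi = -math.log(s) * s
    return 1.0 / (1.0 + math.exp(phi))


for j in (0.5, 1.0):
    print(j, round(reciprocity_probability(j), 4))
```

Under these placeholder constants a higher similarity index yields a higher probability, and very small values of J fall outside the domain of the logarithm, which the function reports as an error rather than guessing a value.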
  32. The relation for $p(R_{v_i,v_j})$ enables the computation of the probability of reciprocity

    We can identify as many nodes with a high likelihood of establishing reciprocal ties as required, adding a layer of social cohesion related to community detection.
    The likelihood of a reciprocal tie between any pair of users:
    $$L(R_{v_i,v_j}) = 1 - \prod_{f \in \chi_s} (1 - p(R_{v_i,v_j}))$$
    This defines a generative process where $p(R_{v_i,v_j})$ is the marginal reciprocity effect of each feature $f \in X_f$.
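
Reading the likelihood as one minus the product of the per-feature complements (a noisy-OR combination, which is an interpretation of the garbled formula rather than a confirmed detail), the computation is short. The per-feature probabilities below are hypothetical.

```python
def tie_likelihood(feature_probs):
    """Combine per-feature reciprocity probabilities as 1 - prod(1 - p)."""
    remaining = 1.0
    for p in feature_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        remaining *= 1.0 - p
    return 1.0 - remaining


# Hypothetical marginal effects for indegree, outdegree and category.
print(tie_likelihood([0.5, 0.5, 0.2]))
```

Each additional feature can only raise the combined likelihood, which matches the idea that every feature contributes a marginal reciprocity effect.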
  34. Utility of Simmelian ties

    Content veracity: Simmelian ties represent strong relationships, vital in preventing unethical behaviours [ , ]. Using Simmelian ties can enhance the quality of interactions on Twitter.
    Information diffusion: some nodes act as hop-skippers, i.e. users with many reciprocal ties that connect disparate community parts.
    Note: we use the term hop-skippers à la [ ].
  35. Hop-skippers are helpful in community detection

    Figure: An example of users with reciprocated ties.
  36. Enhancing community detection

    Clustering using a collection of nodes with Simmelian ties: ground-truth vs. predicted.
    We use two state-of-the-art community detection algorithms: Girvan–Newman (G–N) [ ] and Label Propagation (LP) [ ].
  37. Community detection result

    Table: Community detection on three different datasets using two different algorithms, Girvan–Newman (G–N) and Label Propagation (LP). #DC: number of detected communities.
    Columns, for each of G–N and LP: metrics Q and NMI, and #DC. Rows: Ground-truth, ego-Twitter, Predicted.
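
The metric Q in the table is presumably Newman's modularity, which compares the share of edges inside each community with the share expected from node degrees alone. A minimal pure-Python sketch for an undirected, unweighted graph follows; the toy graph and partition are hypothetical and unrelated to the datasets in the table.

```python
def modularity(edges, communities):
    """Modularity Q = sum over communities of L_c/m - (d_c/(2m))^2."""
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    community_of = {
        node: idx for idx, group in enumerate(communities) for node in group
    }
    internal = [0] * len(communities)    # L_c: edges inside community c
    degree_sum = [0] * len(communities)  # d_c: total degree of community c
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] += 1
        degree_sum[cb] += 1
        if ca == cb:
            internal[ca] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in range(len(communities))
    )


# Two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
print(modularity(edges, [{1, 2, 3}, {4, 5, 6}]))
```

Splitting the toy graph at the bridge gives a positive Q, whereas placing all six nodes in one community gives Q = 0.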
  40. Simmelian ties empirically quantify and evaluate social relationships

    Challenging mining-related tasks: e.g. clustering, information diffusion and content veracity.
    Simmelian ties exhibit useful behaviour, such as connecting large groups of users.
    We demonstrated how Simmelian ties can be leveraged to improve such tasks.
    We proposed a way to identify Simmelian ties on Twitter.
    The data is freely available for further analysis: https://github.com/ijdutse/simmelian_ties_on_Twitter
  41. Reference

    See the full paper for further information and details about the references:
    Inuwa-Dutse, I., Liptrott, M. and Korkontzelos, Y., 2019. Simmelian ties on Twitter: empirical analysis and prediction. Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS). IEEE.