Predicting Reciprocity in Social Networks

Justin Cheng
October 10, 2011

Presented at SocialCom 2011.

When looking at how people interact on Twitter, how can network factors help us predict which interactions are reciprocal (i.e., both parties participate) and which aren't (i.e., one user pesters another)? Which factors best predict reciprocity?

Transcript

  1. Predicting Reciprocity
    in Social Networks
    Justin Cheng, Daniel M. Romero,
    Brendan Meeder, Jon Kleinberg

  2. [Image-only slide]

  3. In real life, people engage
    in conversations

  4. But lots of online
    communication is
    directed

  5. An @-message is sent from
    one user to another
    Is this a conversation?

  6. How about this?

  7. Why is A contacting B?
    or
    @ladygaga  
    @random_fan  

  8. Online relationships can be
    reciprocal or non-reciprocal

  9. A superposition of two networks

  10. Reciprocity can be
    subtle

  11. Given characteristics of
    two users, can we
    determine whether they
    know each other?

  12. How do we differentiate
    between symmetric and
    asymmetric interactions?

  13. Can we predict if a
    relationship is
    reciprocal?

  14. The @-message Graph

  15. Predicting symmetry (SYM)
    Given a graph $G$ and a node pair $\{v, w\}$, predict whether
    both $v \to w$ and $w \to v$ exist,
    or only one of these does.

  16. Predicting a reverse edge (REV)
    Given the graph $G$ and that $v$ links to $w$,
    does $w$ link back to $v$?
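
To make the two tasks concrete, here is a minimal sketch of how their labels could be read off a set of directed edges. The function names and the `edges` representation (a set of `(sender, recipient)` tuples) are assumptions for illustration and do not come from the talk.

```python
def sym_label(edges, v, w):
    """SYM: True if both v->w and w->v exist, False if exactly one does.

    Returns None when neither edge exists, since such a pair is
    outside the task.
    """
    forward, backward = (v, w) in edges, (w, v) in edges
    if not (forward or backward):
        return None
    return forward and backward


def rev_label(edges, v, w):
    """REV: given that v->w exists, report whether w->v exists too."""
    if (v, w) not in edges:
        raise ValueError("REV assumes the edge v->w is present")
    return (w, v) in edges


# Hypothetical toy graph: a and b write to each other, c only writes to a.
edges = {("a", "b"), ("b", "a"), ("c", "a")}
print(sym_label(edges, "a", "b"))  # True
print(sym_label(edges, "a", "c"))  # False
print(rev_label(edges, "c", "a"))  # False
```

SYM treats the pair as unordered, while REV starts from a known direction and asks only about the way back.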

  17. The edge $(v, w)$ is reciprocated if both $v$ and $w$ have
    sent at least $k$ messages to each other.
    The edge $(v, w)$ is unreciprocated if $v$ sent at
    least $k$ messages to $w$
    but $w$ sent none in return.
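
The two definitions can be sketched as a small function over a message log. This is an illustrative sketch only: the name `split_edges`, the input format (one `(sender, recipient)` tuple per @-message), and the choice to skip self-mentions are assumptions, not details from the talk.

```python
from collections import Counter


def split_edges(messages, k):
    """Split user pairs into reciprocated and unreciprocated sets.

    messages: iterable of (sender, recipient) tuples, one per @-message.
    k: the message threshold from the definitions.
    Pairs that fit neither definition (for example 3 messages one way
    and 1 back when k = 3) are left out of both sets.
    """
    counts = Counter(messages)  # (sender, recipient) -> message count
    reciprocated, unreciprocated = set(), set()
    for (v, w), sent in list(counts.items()):
        if v == w:
            continue  # assumption: ignore self-mentions
        returned = counts[(w, v)]  # Counter gives 0 for a missing key
        if sent >= k and returned >= k:
            reciprocated.add(frozenset((v, w)))  # unordered pair
        elif sent >= k and returned == 0:
            unreciprocated.add((v, w))  # directed: v wrote, w never replied
    return reciprocated, unreciprocated


# Hypothetical message log.
messages = (
    [("ann", "bob")] * 3 + [("bob", "ann")] * 4  # both directions reach k
    + [("fan", "star")] * 5                      # one-way only
    + [("eve", "dan")] * 3 + [("dan", "eve")]    # one reply: fits neither
)
rec, unrec = split_edges(messages, k=3)
# rec holds the unordered pair {ann, bob}; unrec holds ("fan", "star").
```

Note that the definitions leave a gap: a pair where one side replied, but fewer than k times, is neither reciprocated nor unreciprocated.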

  18. This relationship is reciprocated:
    one user sent $k$ messages, and the other sent $k$ messages.
    But this one is unreciprocated:
    one user sent $k$ messages, and the other sent no messages.

  19. Identify reciprocated and
    unreciprocated edges in $G$, and
    for each of these edges, try to
    predict whether the relationship is
    reciprocal.

  20. Given the full network, hide only the
    link from $w$ to $v$ (if it exists).
    Try to predict whether the link
    actually exists.

  21. Outline
    Features that might predict reciprocity and
    how well they work
    – Individually,
    – Or in combination
    The structure of the reciprocated and
    unreciprocated sub-networks

  22. Link reciprocity depends a lot on
    the relative status of two
    individuals
    @ladygaga  
    @average_joe  
    @average_jane  

  23. Link reciprocity prediction
    vs.
    Link prediction
    Liben-Nowell and Kleinberg (2004)  

  24. Link reciprocity prediction
    vs.
    Tie strength prediction
    Gilbert and Karahalios (2009)  
    [Figure: edges labeled S and W]

  25. Link reciprocity prediction
    vs.
    Sign prediction
    Leskovec, Huttenlocher and Kleinberg (2010)  
    [Figure: edges labeled + and −]

  26. What are good
    indicators of
    reciprocity?

  27. For each feature, choose the
    threshold value, above or below
    which we predict reciprocity, that
    maximizes accuracy.
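
A single-feature threshold classifier of this kind can be sketched as follows. The slides do not say how the threshold was searched for, so the exhaustive scan over observed values, the function name `best_threshold`, and the toy data are all assumptions.

```python
def best_threshold(values, labels):
    """Find the threshold and direction that maximize accuracy.

    values: one feature value per edge.
    labels: True if the edge is reciprocated, False otherwise.
    Returns (threshold, direction, accuracy), where direction "below"
    means "predict reciprocated when value <= threshold" and "above"
    means "predict reciprocated when value >= threshold".
    """
    best = (None, None, 0.0)
    n = len(values)
    for t in sorted(set(values)):  # only observed values can change the split
        for direction in ("below", "above"):
            if direction == "below":
                preds = [x <= t for x in values]
            else:
                preds = [x >= t for x in values]
            accuracy = sum(p == y for p, y in zip(preds, labels)) / n
            if accuracy > best[2]:
                best = (t, direction, accuracy)
    return best


# Hypothetical feature values where small values go with reciprocation.
values = [0.2, 0.5, 0.9, 3.0, 7.0, 12.0]
labels = [True, True, True, False, False, False]
print(best_threshold(values, labels))  # (0.9, 'below', 1.0)
```

Trying both directions matters because some features predict reciprocity when small and others when large.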

  28. Outdegree-indegree Ratio
    $\frac{\deg^+(v) / \deg^-(v)}{\deg^+(w) / \deg^-(w)}$
    [Figure: nodes $v$ and $w$ labeled with $\deg^-(v)$, $\deg^+(v)$, $\deg^-(w)$, $\deg^+(w)$]
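
As a rough illustration, the ratio can be computed from an edge list as below. The sketch assumes the usual convention that deg+ is outdegree and deg− is indegree (which matches the slide title), and it assumes degrees are counted over distinct directed edges; the function name and the decision to return `None` for undefined ratios are illustrative choices.

```python
from collections import defaultdict


def out_in_ratio(edges, v, w):
    """(outdeg(v) / indeg(v)) / (outdeg(w) / indeg(w)), or None if undefined."""
    outdeg, indeg = defaultdict(int), defaultdict(int)
    for sender, recipient in edges:
        outdeg[sender] += 1
        indeg[recipient] += 1
    if indeg[v] == 0 or indeg[w] == 0 or outdeg[w] == 0:
        return None  # a zero denominator leaves the ratio undefined
    return (outdeg[v] / indeg[v]) / (outdeg[w] / indeg[w])


# Hypothetical graph: "s" is contacted by four users, "f1" by two.
edges = {
    ("f1", "s"), ("f2", "s"), ("f3", "s"), ("f4", "s"),
    ("s", "x"), ("y", "f1"), ("z", "f1"),
}
print(out_in_ratio(edges, "f1", "s"))  # 2.0  (0.5 / 0.25)
print(out_in_ratio(edges, "s", "f1"))  # 0.5  (0.25 / 0.5)
```

Swapping the roles of v and w inverts the ratio, so the feature is sensitive to which endpoint initiated contact.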

  29. Individually,
    Outdegree-indegree ratio
    performed the best with
    82% accuracy

  30. A smaller outdegree-indegree
    ratio indicated reciprocation
    $\frac{\deg^+(v) / \deg^-(v)}{\deg^+(w) / \deg^-(w)}$

  31. A smaller outdegree-indegree
    ratio indicated reciprocation
    Ratio of Preferential Attachments:
    $\frac{\deg^+(v)\,\deg^-(w)}{\deg^-(v)\,\deg^+(w)}$
    Bracketed values on the slide: 69%, 53%

  32. Other features we tried
    •  Indegree and outdegree
    •  Incoming and outgoing messages
    •  Incoming message-indegree ratio (and the outgoing message-outdegree ratio)
    •  Two-step paths in both directions
    •  Two-step paths ratio
    •  Mutual in-neighbors and out-neighbors
    •  Jaccard’s coefficient
    •  Adamic/Adar’s page similarity measure
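
Of the features listed, the Adamic/Adar measure is the least self-explanatory: as commonly used in link prediction, it sums the inverse log-degree of every neighbor that two nodes share, so rarely connected common neighbors count for more. Below is a minimal sketch assuming an undirected neighbor map; the slides do not say how edge direction was handled, and the function name and toy data are hypothetical.

```python
import math


def adamic_adar(neighbors, v, w):
    """Sum of 1 / log(degree(z)) over the common neighbors z of v and w."""
    score = 0.0
    for z in neighbors[v] & neighbors[w]:
        degree = len(neighbors[z])
        if degree > 1:  # log(1) == 0 would divide by zero
            score += 1.0 / math.log(degree)
    return score


# Hypothetical undirected graph: v and w share neighbors a (degree 2)
# and b (degree 3).
neighbors = {
    "v": {"a", "b"},
    "w": {"a", "b"},
    "a": {"v", "w"},
    "b": {"v", "w", "c"},
    "c": {"b"},
}
score = adamic_adar(neighbors, "v", "w")  # 1/ln(2) + 1/ln(3)
```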

  33. Degree/Message
    Outdegree  
    Indegree  
    Outgoing Messages  
    Incoming Messages  
    And ratios between them  

  34. Two-step Hops
    Mutual Neighbors
    Two-step paths
    [Figures: node pairs v and w]
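
Both kinds of two-step feature can be counted directly from a successor map. This is a sketch under assumptions: `succ` maps each node to the set of nodes it links to, and the function names and example graph are made up for illustration.

```python
def two_step_paths(succ, v, w):
    """Count directed two-step paths v -> z -> w through a third node z."""
    return sum(
        1
        for z in succ.get(v, set())
        if z not in (v, w) and w in succ.get(z, set())
    )


def mutual_out_neighbors(succ, v, w):
    """Nodes that both v and w link to."""
    return (succ.get(v, set()) & succ.get(w, set())) - {v, w}


# Hypothetical directed graph.
succ = {
    "v": {"a", "b", "c", "w"},
    "a": {"w"},
    "b": {"w", "c"},
    "w": {"c"},
    "c": {"v"},
}
forward = two_step_paths(succ, "v", "w")   # via a and b
backward = two_step_paths(succ, "w", "v")  # via c
print(forward, backward, forward / backward)  # 2 1 2.0
print(mutual_out_neighbors(succ, "v", "w"))   # {'c'}
```

The two-step paths ratio from the feature list would then be the forward count divided by the backward count, when the latter is non-zero.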

  35. “Link prediction” features
    Jaccard’s coefficient = Common Neighbors / Total Neighbors
    Example: 3 common neighbors / 10 total neighbors
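
The slide's example works out as below. The sketch assumes each node's neighbors are given as a Python set; the node ids are arbitrary placeholders chosen to give 3 common and 10 total neighbors.

```python
def jaccard(neighbors_v, neighbors_w):
    """Shared neighbors divided by all neighbors of either node."""
    union = neighbors_v | neighbors_w
    if not union:
        return 0.0  # assumption: define the coefficient as 0 with no neighbors
    return len(neighbors_v & neighbors_w) / len(union)


neighbors_v = set(range(0, 6))   # {0, 1, 2, 3, 4, 5}
neighbors_w = set(range(3, 10))  # {3, 4, 5, 6, 7, 8, 9}
print(jaccard(neighbors_v, neighbors_w))  # 0.3
```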

  36. “Link prediction” features
    Preferential attachment:
    product of the indegree of $v$ and the outdegree of $w$

  37. The Top 3
    Outdegree-indegree ratio: 82%
    Two-step paths ratio: 76%
    Indegree ratio: 76%

  38. But the outdegree-indegree
    ratio and two-step paths ratio
    seem
    suspiciously similar…

  39. Outdegree-indegree ratio
    [Figure: nodes v and w]

  40. Two-step paths ratio
    [Figure: nodes v and w]

  41. Marketer  
    Customers  
    Who’ll respond?  

  42. It is better to know
    about than
    in predicting a reverse
    edge
    v
    w

  43. So what happens when we use
    all the features we know?
    Link Pred: 74%
    Two-step Hops: 80%
    Deg/Msg: 83%
    Deg/Msg Ratio: 86%

  44. Decision Tree Accuracy on
    Sets of Features
    74%, 80%, 83%, 86%

  45. Decision Trees of Sets of
    Features
    80%, 74%, 83%, 86%

  46. In a decision tree of all attributes,
    (STILL) Outdegree-Indegree Ratio
    86% accuracy

  47. Types of Edges
    Unreciprocated
    Reciprocated

  48. Clustering Coefficient
    Reciprocated: 0.19
    Unreciprocated: 0.02
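
The slide does not say which definition of clustering coefficient was used. As one possibility, the sketch below computes the average local clustering coefficient of an undirected graph: for each node, the fraction of its neighbor pairs that are themselves connected, averaged over all nodes. The function name, the symmetric adjacency-map input, and the toy graph are assumptions.

```python
from itertools import combinations


def average_clustering(adj):
    """Average local clustering coefficient of an undirected graph.

    adj: dict mapping each node to the set of its neighbors (symmetric).
    Nodes with fewer than two neighbors contribute 0.
    """
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count neighbor pairs that are directly connected to each other.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


# Hypothetical graph: a triangle a-b-c with a pendant node d attached to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
value = average_clustering(adj)  # (1/3 + 1 + 1 + 0) / 4 = 7/12
```

Under this definition, a sub-network built from tight groups of mutual contacts scores high, while one dominated by star-shaped patterns scores close to zero.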

  49. Are there two types of users
    on Twitter?
    “Reciprocators” and “Non-reciprocators”
    cf. informers and me-formers (Naaman et al.)

  50. Types of Nodes
    Both: 65
    Reciprocated Edges Only: 30
    Unreciprocated Edges Only: 5
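
A breakdown like this can be derived by checking, for each node, which kinds of edges it touches. The sketch below is illustrative only: it assumes a node counts as taking part in an unreciprocated edge whether it is the sender or the recipient, and the function name and user names are hypothetical.

```python
def node_types(reciprocated, unreciprocated):
    """Map each node to 'both', 'reciprocated only', or 'unreciprocated only'.

    reciprocated: iterable of (u, v) pairs with messages in both directions.
    unreciprocated: iterable of (sender, recipient) one-way edges.
    """
    in_rec = {node for pair in reciprocated for node in pair}
    in_unrec = {node for edge in unreciprocated for node in edge}
    types = {}
    for node in in_rec | in_unrec:
        if node in in_rec and node in in_unrec:
            types[node] = "both"
        elif node in in_rec:
            types[node] = "reciprocated only"
        else:
            types[node] = "unreciprocated only"
    return types


# Hypothetical example: joe chats with two friends and also writes to a
# celebrity who never replies.
types = node_types(
    reciprocated=[("joe", "friend1"), ("joe", "friend2")],
    unreciprocated=[("joe", "celebrity")],
)
print(types["joe"])        # both
print(types["friend1"])    # reciprocated only
print(types["celebrity"])  # unreciprocated only
```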

  51. Most users take part in both
    reciprocated and
    unreciprocated interactions.
    @ladygaga  
    @average_joe  
    @friend_of_joe1  
    @friend_of_joe2  
    “I love your music @ladygaga!”  

  52. Social, reciprocal relationships
    are associated with active,
    continued use of Twitter.

  53. Features that approximate the
    relative status of two nodes
    seem most effective at
    predicting reciprocity between
    them.

  54. Social networks are a superposition
    of reciprocated and unreciprocated
    relationships
    Reciprocity affects how we
    experience these sites
    Using network features, we can
    predict reciprocity in relationships

  55. Thanks for Listening! Questions?
    Slide design heavily inspired by Paul Adams. Icons courtesy of The Noun Project.
