Justin Cheng
October 10, 2011

# Predicting Reciprocity in Social Networks

Presented at SocialCom 2011.

When looking at how people interact on Twitter, how can network factors help us predict which interactions are reciprocal (i.e., both parties participate) and which aren't (i.e., one user pesters another)? Which factors best predict reciprocity?

## Transcript

1. Predicting Reciprocity
in Social Networks
Justin Cheng, Daniel M. Romero,
Brendan Meeder, Jon Kleinberg

2. In real life, people engage
in conversations

3. But lots of online
communication is
directed

4. An @-message is sent from
one user to another
Is this a conversation?

6. Why is A contacting B?
or
@random_fan

7. Online relationships can be
reciprocal or non-reciprocal

8. A superposition of two networks

9. Reciprocity can be
subtle

10. Given characteristics of
two users, can we
determine whether they
know each other?

11. How do we differentiate
between symmetric and
asymmetric interactions?

12. Can we predict if a
relationship is
reciprocal?

13. The @-message Graph


14. Predicting symmetry (SYM)
Given a graph G and a node pair {v, w}, predict whether
both v → w and w → v exist,
or only one of these does

15. Predicting a reverse edge (REV)
Given the graph G and an edge between v and w in one direction,
predict whether the edge in the reverse direction also exists

16. The edge (v, w) is reciprocated if both v and w have
sent at least k messages to each other
The edge (v, w) is unreciprocated if v sent at
least k messages to w
but w sent none in return

17. This relationship is reciprocated:
one user sent k messages, and the other sent k messages back
But this one is unreciprocated:
one user sent k messages, and the other sent no messages
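
The definitions on slides 16 and 17 can be sketched in Python as follows. This is a minimal sketch, not the authors' code: the `label_edges` function, the toy `messages` list, and the choice of k = 2 are all hypothetical. Edges that fit neither definition (for example, fewer than k messages in one direction) are simply left unlabeled here.

```python
from collections import Counter

def label_edges(messages, k):
    """Label directed edges (v, w) as 'reciprocated' or 'unreciprocated'.

    messages: iterable of (sender, recipient) tuples, one per @-message.
    k: minimum message count (the slides leave k symbolic; any value is an assumption).
    """
    counts = Counter(messages)  # counts[(v, w)] = number of messages from v to w
    labels = {}
    for (v, w), sent in counts.items():
        if v == w:
            continue  # ignore self-messages in this sketch
        back = counts.get((w, v), 0)
        if sent >= k and back >= k:
            labels[(v, w)] = "reciprocated"
        elif sent >= k and back == 0:
            labels[(v, w)] = "unreciprocated"
    return labels

# Toy data, made up for illustration.
messages = [("a", "b")] * 3 + [("b", "a")] * 2 + [("c", "a")] * 4 + [("a", "d")]
labels = label_edges(messages, k=2)
```

With this toy data, the pair a/b is labeled reciprocated in both directions, the edge from c to a is unreciprocated, and the single message from a to d is too weak to label.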

18. Identify reciprocated and
unreciprocated edges in G, and
for each of these edges, try to
predict whether the relationship is
reciprocal.

19. Given the full network G, hide only the
link from v to w (if it exists).
Try to predict whether the link
actually exists.
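
The hide-and-predict setup on slide 19 can be sketched as a small helper. This is only an illustration under assumptions: the graph is represented as a set of directed `(sender, recipient)` pairs, and `rev_instance` is a hypothetical name, not something from the talk.

```python
def rev_instance(edges, v, w):
    """Build one prediction example: hide the link from v to w, keep its presence as the label."""
    edges = set(edges)
    label = (v, w) in edges        # does the hidden link actually exist?
    visible = edges - {(v, w)}     # everything the predictor is allowed to see
    return visible, label

# Toy graph, made up for illustration.
edges = {("a", "b"), ("b", "a"), ("c", "a")}
visible_ab, label_ab = rev_instance(edges, "a", "b")
visible_ac, label_ac = rev_instance(edges, "a", "c")
```

Features are then computed on the visible graph only, so the label cannot leak into the prediction.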

20. Outline
Features that might predict reciprocity and
how well they work
– Individually,
– Or in combination
The structure of the reciprocated and
unreciprocated sub-networks

21. Link reciprocity depends a lot on
the relative status of two
individuals
@average_joe
@average_jane

22. vs.
Link prediction
Liben-Nowell and Kleinberg (2004)

23. vs.
Tie strength prediction
Gilbert and Karahalios (2009)

24. vs.
Sign prediction
Leskovec, Huttenlocher and Kleinberg (2010)

25. What are good
indicators of
reciprocity?

26. For each feature, choose the
threshold value that maximizes
accuracy when we predict
reciprocity above (or below) it.
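
A single-feature threshold rule like this can be sketched as an exhaustive search over candidate cut points. This is a minimal sketch, not the authors' procedure: `best_threshold`, the toy feature values, and the toy labels are all hypothetical, and accuracy is measured on the same data used to pick the threshold.

```python
def best_threshold(values, labels):
    """Return (accuracy, threshold, direction) with the highest accuracy.

    values: one feature value per example.
    labels: True if the example is reciprocated.
    direction 'below' predicts reciprocity when value <= threshold;
    direction 'above' predicts reciprocity when value > threshold.
    """
    n = len(values)
    best = (0.0, None, None)
    for t in sorted(set(values)):
        below_correct = sum((x <= t) == y for x, y in zip(values, labels))
        # The 'above' rule is the exact complement, so it gets the rest right.
        for direction, correct in (("below", below_correct), ("above", n - below_correct)):
            accuracy = correct / n
            if accuracy > best[0]:
                best = (accuracy, t, direction)
    return best

# Toy data, made up for illustration.
values = [0.2, 0.5, 0.9, 1.5, 3.0, 4.0]
labels = [True, True, True, False, True, False]
accuracy, threshold, direction = best_threshold(values, labels)
```

On this toy data the search settles on predicting reciprocity for values at or below 0.9, which gets five of the six examples right.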

27. Outdegree-indegree Ratio
$$\frac{\deg^+(v)/\deg^-(v)}{\deg^+(w)/\deg^-(w)}$$
[Diagram: nodes v and w, annotated with deg−(v), deg+(v), deg−(w) and deg+(w)]
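
As a rough illustration, the outdegree-indegree ratio can be computed from a directed edge list. This sketch assumes deg+ means outdegree and deg− means indegree (the usual convention), counts distinct edges rather than messages, and returns `None` when a needed degree is zero; the function names and that zero-handling choice are hypothetical, not taken from the talk.

```python
from collections import defaultdict

def degrees(edges):
    """Return (outdegree, indegree) dictionaries for a set of directed edges."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for v, w in set(edges):
        out_deg[v] += 1
        in_deg[w] += 1
    return out_deg, in_deg

def out_in_ratio(edges, v, w):
    """(outdeg(v) / indeg(v)) / (outdeg(w) / indeg(w)), or None if undefined."""
    out_deg, in_deg = degrees(edges)
    if 0 in (in_deg[v], out_deg[w], in_deg[w]):
        return None  # assumed handling; the slides do not say what to do here
    return (out_deg[v] / in_deg[v]) / (out_deg[w] / in_deg[w])

# Toy graph, made up for illustration: v sends widely, w mostly receives.
edges = {("v", "w"), ("v", "x"), ("v", "y"), ("x", "v"),
         ("x", "w"), ("y", "w"), ("w", "x")}
ratio = out_in_ratio(edges, "v", "w")
```

Here v has outdegree 3 and indegree 1, while w has outdegree 1 and indegree 3, so the ratio is large.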

28. Individually,
Outdegree-indegree ratio
performed the best with
82% accuracy

29. A smaller outdegree-indegree
ratio indicated reciprocation
$$\frac{\deg^+(v)/\deg^-(v)}{\deg^+(w)/\deg^-(w)}$$

30. A smaller outdegree-indegree
ratio indicated reciprocation
Ratio of Preferential Attachments
$$\frac{\deg^+(v)\,\deg^-(w)}{\deg^-(v)\,\deg^+(w)}$$
69%
53%
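
The outdegree-indegree ratio and the ratio of preferential attachments are the same quantity written two ways. Taking deg+ to be outdegree and deg− to be indegree (the usual convention, assumed here), and assuming all four degrees are nonzero, the algebra is:

```latex
\frac{\deg^+(v)/\deg^-(v)}{\deg^+(w)/\deg^-(w)}
  = \frac{\deg^+(v)}{\deg^-(v)} \cdot \frac{\deg^-(w)}{\deg^+(w)}
  = \frac{\deg^+(v)\,\deg^-(w)}{\deg^-(v)\,\deg^+(w)}
```

The numerator and denominator of the last form are each a product of one node's outdegree and the other node's indegree.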

31. Other features we tried
•  Indegree and outdegree
•  Incoming and outgoing messages
•  Incoming message – indegree ratio (and out)
•  Two-step paths in both directions
•  Two-step paths ratio
•  Mutual in-neighbors and out-neighbors
•  Jaccard’s coefficient

32. Degree/Message
Outdegree
Indegree
Outgoing Messages
Incoming Messages
And ratios between them

33. Two-step Hops
Mutual Neighbors
Two-step paths
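
One plausible way to count these two-step features on a directed edge set is sketched below. The exact definitions used in the talk are not given in the transcript, so the function names and the choice to exclude v and w themselves as intermediates are assumptions.

```python
def two_step_paths(edges, v, w):
    """Count directed paths v -> x -> w, with x different from v and w."""
    edges = set(edges)
    successors_v = {b for a, b in edges if a == v}
    predecessors_w = {a for a, b in edges if b == w}
    return len((successors_v & predecessors_w) - {v, w})

def mutual_neighbors(edges, v, w):
    """Return (number of mutual in-neighbors, number of mutual out-neighbors)."""
    edges = set(edges)
    in_v = {a for a, b in edges if b == v}
    in_w = {a for a, b in edges if b == w}
    out_v = {b for a, b in edges if a == v}
    out_w = {b for a, b in edges if a == w}
    return len((in_v & in_w) - {v, w}), len((out_v & out_w) - {v, w})

# Toy graph, made up for illustration.
edges = {("v", "x"), ("x", "w"), ("v", "y"), ("y", "w"),
         ("w", "z"), ("z", "v"), ("v", "w"), ("w", "x")}
forward = two_step_paths(edges, "v", "w")
backward = two_step_paths(edges, "w", "v")
mutual_in, mutual_out = mutual_neighbors(edges, "v", "w")
```

A two-step paths ratio could then be formed from `forward` and `backward`, with the same care about zero denominators as any other ratio feature.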

34. Jaccard’s coefficient = Common Neighbors / Total Neighbors
3 common neighbors, 10 total neighbors
35. Preferential attachment
Product of indegree of v and outdegree of w

36. The Top 3
Outdegree-indegree ratio: 82%
Two-step paths ratio: 76%
Indegree ratio: 76%

37. But the outdegree-indegree
ratio and two-step paths ratio
seem
suspiciously similar…

38. Outdegree-indegree ratio

39. Two-step paths ratio

40. Marketer
Customers
Who’ll respond?

41. It is better to know
in predicting a reverse
edge
v
w

42. So what happens when we use
all the features we know?
Feature sets: Hops; Deg/Msg; Deg/Msg Ratio
74% 80% 83% 86%

43. Decision Tree Accuracy on
Sets of Features
74% 80% 83% 86%

44. Decision Trees of Sets of
Features
80%
74%
83%
86%

45. In a decision tree of all attributes,
Outdegree-Indegree Ratio
86%
accuracy
(STILL)

46. Types of Edges
Unreciprocated
Reciprocated

47. Clustering Coefficient
0.19
0.02
Reciprocated
Unreciprocated
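
The transcript does not say which clustering coefficient was measured. As one possible reading, the sketch below computes the average local clustering coefficient of a sub-network while ignoring edge direction; the function name, the toy edges, and the convention that nodes with fewer than two neighbors contribute 0 are all assumptions.

```python
from itertools import combinations

def average_clustering(edges):
    """Average local clustering coefficient, treating edges as undirected."""
    neighbors = {}
    for v, w in edges:
        if v == w:
            continue  # skip self-loops
        neighbors.setdefault(v, set()).add(w)
        neighbors.setdefault(w, set()).add(v)
    total = 0.0
    for node, ns in neighbors.items():
        if len(ns) < 2:
            continue  # contributes 0 (assumed convention)
        linked = sum(1 for a, b in combinations(ns, 2) if b in neighbors[a])
        total += 2 * linked / (len(ns) * (len(ns) - 1))
    return total / len(neighbors) if neighbors else 0.0

# Toy sub-network, made up for illustration: a triangle a-b-c plus a pendant node d.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")]
coefficient = average_clustering(edges)
```

Running it separately on the reciprocated and unreciprocated edge sets would give one number per sub-network, which is the kind of comparison the slide reports.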

48. Are there two types of users
“Reciprocators”
cf. informers and me-formers (Naaman et al.)
“Non-reciprocators”

49. Types of Nodes
65, 30, 5
Both, Reciprocated Edges Only, Unreciprocated Edges Only
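
Grouping nodes into these three types is straightforward once each edge carries a label. The sketch below assumes a dictionary from directed edges to the strings "reciprocated" or "unreciprocated"; that representation, the `node_types` name, and the toy data are hypothetical.

```python
def node_types(edge_labels):
    """Map each node to 'both', 'reciprocated only', or 'unreciprocated only'."""
    kinds_seen = {}
    for (v, w), kind in edge_labels.items():
        kinds_seen.setdefault(v, set()).add(kind)
        kinds_seen.setdefault(w, set()).add(kind)
    types = {}
    for node, kinds in kinds_seen.items():
        if len(kinds) == 2:
            types[node] = "both"
        else:
            types[node] = next(iter(kinds)) + " only"
    return types

# Toy labels, made up for illustration.
edge_labels = {
    ("a", "b"): "reciprocated",
    ("c", "a"): "unreciprocated",
    ("d", "e"): "reciprocated",
}
types = node_types(edge_labels)
```

Counting how many nodes fall into each type gives a breakdown of the kind shown on the slide.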

50. Most users take part in both
reciprocated and
unreciprocated interactions.
@average_joe
@friend_of_joe1
@friend_of_joe2

51. Social, reciprocal relationships
are associated with active,

52. Features that approximate the
relative status of two nodes
seem most effective at
predicting reciprocity between
them.

53. Social networks are a superposition
of reciprocated and unreciprocated
relationships
Reciprocity affects how we
experience these sites
Using network features, we can
predict reciprocity in relationships

54. Thanks for Listening! Questions?
Slide design heavily inspired by Paul Adams. Icons courtesy of The Noun Project.