Finding Leaders with Maximum Spread of Influence through Social Networks

Tsung An Yeh, En Tzu Wang, and Arbee L.P. Chen
Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
Cloud Computing Center for Mobile Applications, Industrial Technology Research Institute, Hsinchu, Taiwan
Department of Computer Science, National Chengchi University, Taipei, Taiwan
December 12, 2012

Introduction
• Social network: G = (V, E)
  – V: users
  – E: relationships
• Social influence

Motivation
• Influence maximization
  – Seed set: S ⊆ V
  – |S| = k
(Figure: example network with a highlighted seed set, k = 3)

Problem Definition
• Social influence
  – Propagation probability
(Figure: nodes V1 and V2, labeled Active and Inactive, joined by an edge with propagation probability 0.8)

Problem Definition
• Scalable Influence Maximization for Prevalent Viral Marketing in Large-Scale Social Networks, W. Chen, C. Wang and Y. Wang (KDD’10)
  – MIA Model
    • Activation probability
    • Incremental influence spread

Problem Definition
Activation Probability ap(v, S)
(Figure: example network annotated with the values 1, 0.5, 0.5, 0.5, 0.25)

Problem Definition
Activation Probability ap(v, S)
(Figure: nodes v1, v2, v3, each with activation probability 1, point to v with propagation probabilities 0.7, 0.5, 0.3)
ap(v, S) = 1 – (1 – 1 × 0.7)(1 – 1 × 0.3)(1 – 1 × 0.5) = 1 – 0.3 × 0.7 × 0.5 = 0.895
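The worked value above can be reproduced with a few lines of Python. This is a minimal sketch, assuming each in-neighbor w of v contributes independently with probability ap(w, S) × pp(w, v); the function name `activation_probability` and the list-of-pairs input are illustrative choices, not names from the talk.

```python
from math import prod

def activation_probability(in_neighbors):
    """Return ap(v, S) from (ap(w, S), pp(w, v)) pairs, one per in-neighbor w.

    Assumes the in-neighbors act independently, so v stays inactive
    only if every in-neighbor fails to activate it.
    """
    return 1.0 - prod(1.0 - ap_w * pp_wv for ap_w, pp_wv in in_neighbors)

# Numbers from the slide: three in-neighbors with ap = 1 and
# propagation probabilities 0.7, 0.3 and 0.5.
ap_v = activation_probability([(1.0, 0.7), (1.0, 0.3), (1.0, 0.5)])
print(round(ap_v, 3))
```

With the slide's numbers the printed value matches the 0.895 derived above.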

Problem Definition
Influence Maximization: ∑v∈V ap(v, S)
(Figure: example network with every node's value shown as 1)

Problem Definition
Incremental Influence Spread IncInf(v, S)
(Figure: three-step build on an example network with edge probabilities 0.3, 0.8, 0.6; the value at v changes from 0 to 1 and the surrounding nodes are then annotated with their gains)
IncInf(v, S) = (1 – 0) × (0.4 + 0.3 + 0.5 + 0.4 + 0.3) = 1.9
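The arithmetic on this slide follows the pattern (1 – ap(v, S)) × (sum of gains at the nodes v can reach). A minimal sketch of that pattern, assuming the per-node gains are already known; the function name `incremental_influence` and the flat list of gains are illustrative, not taken from the talk.

```python
def incremental_influence(ap_v, gains):
    """(1 - ap(v, S)) times the summed gains at the nodes v can reach.

    ap_v  : activation probability of v under the current seed set S
    gains : increase in activation probability that v would cause at
            each reachable node (values read off the slide here)
    """
    return (1.0 - ap_v) * sum(gains)

# Slide example: v is not yet influenced (ap = 0) and reaches five nodes.
inc = incremental_influence(0.0, [0.4, 0.3, 0.5, 0.4, 0.3])
print(round(inc, 2))
```

The factor (1 – ap(v, S)) discounts the gain of a node that the current seeds already activate with some probability.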

Related Works
Greedy (KDD’03)
(Figure: bar chart of Influence Spread for nodes V1, V2, V3, …, Vi)

• Power-law distribution
Lower bound, k << |V|
(Figure: charts with axes labeled "degree", "user", and "Influence Spread")

Observations
(Figure: nodes v1, v2, v3, each with activation probability 1, point to v with propagation probabilities 0.7, 0.5, 0.3)
• ap(v, S) = 1 – (1 – 1*0.7)(1 – 1*0.3)(1 – 1*0.5) = 0.895
• ap(v, S) = 1*0.7 + 1*0.3(1 – 1*0.7) + 1*0.5(1 – 1*0.7)(1 – 1*0.3) = 0.7 + 0.09 + 0.105 = 0.895 (running totals: 0.7, 0.79, 0.895)
• ap(v, S) = 1*0.3 + 1*0.7(1 – 1*0.3) + 1*0.5(1 – 1*0.3)(1 – 1*0.7) = 0.3 + 0.49 + 0.105 = 0.895 (running totals: 0.3, 0.79, 0.895)
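The two expansions above add the in-neighbors one at a time in different orders and end at the same value. The sketch below checks that for every order of the slide's three contributions; `running_totals` is an illustrative helper name, and each contribution is assumed to be the product ap(w, S) × pp(w, v).

```python
from itertools import permutations

def running_totals(contribs):
    """Add contributions one at a time; return ap(v, S) after each step.

    Each new contribution only counts for the probability mass that the
    earlier contributions left inactive.
    """
    totals, acc, remaining = [], 0.0, 1.0
    for c in contribs:
        acc += c * remaining
        remaining *= 1.0 - c
        totals.append(acc)
    return totals

for order in permutations([0.7, 0.3, 0.5]):
    print(order, [round(t, 3) for t in running_totals(order)])
```

Every order prints a different sequence of intermediate totals but the same final entry, which is why seeds can be added incrementally without recomputing ap(v, S) from scratch.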

Observations: Out-path Node
(Figure: example network around v1 and v2, annotated with edge probabilities)
IncInf(v1, S) = (1 – 0)*(0.4 + 0.2 + 0.3) = 0.9
IncInf’(v1, S) = (1 – 0)*(0.4 + 0 + 0.3) = 0.7

Observations: In-path Node
(Figure: example network around v1 and v2, annotated with edge probabilities; ap(v1, S) rises from 0 to 0.5)
IncInf(v1, S) = (1 – 0)*(0.4 + 0.2 + 0.3) = 0.9
IncInf(v2, S) = (1 – 0)*(0.5 + 0.2 + 0.1 + 0.15) = 0.95
IncInf’(v1, S) = (1 – 0.5)*(0.4 + 0.2 + 0.3) = 0.45

Observations: Competition Node
(Figure: example network in which v1 and v2 both reach a common node v’, annotated with edge probabilities)
IncInf(v1, S) = (1 – 0)*(0.4 + 0.2 + 0.3) = 0.9
ap(v’, S) = 1 – (1 – 1*0.3)(1 – 1*0.4)(1 – 1*0.2) = 1*0.3 + 1*0.4(1 – 1*0.3) + 1*0.2(1 – 1*0.3)(1 – 1*0.4) = 0.3 + 0.28 + 0.084 = 0.664
ap(v’, S) = 1 – (1 – 1*0.3)(1 – 1*0.4)(1 – 1*0.2) = 1*0.4 + 1*0.3(1 – 1*0.4) + 1*0.2(1 – 1*0.4)(1 – 1*0.3) = 0.4 + 0.18 + 0.084 = 0.664
Competition from v2: IncInf’(v1, S) = (1 – 0)*(0.4 + 0.2 + 0.3(1 – 1*0.4)) = 0.4 + 0.2 + 0.18 = 0.78
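The competition case can be reproduced numerically. In this sketch, v1's gain at the shared node v’ is scaled by the probability that the competitor has not already activated v’. The helper name `discounted_gain` and the extension to several competitors (a product over all of them) are assumptions made for illustration; the slide shows only a single competitor, v2.

```python
def discounted_gain(own_gain, competitor_probs):
    """Scale a node's gain at a shared target by the chance that no
    competitor has already activated that target."""
    remaining = 1.0
    for p in competitor_probs:
        remaining *= 1.0 - p
    return own_gain * remaining

# Slide example: v1 gains 0.4 and 0.2 at two nodes of its own and 0.3 at v',
# which v2 also reaches with probability 1 * 0.4.
gains = [0.4, 0.2, discounted_gain(0.3, [0.4])]
inc_v1 = (1.0 - 0.0) * sum(gains)
print(round(inc_v1, 2))
```

With no competitors the helper returns the gain unchanged, which recovers the original IncInf(v1, S) = 0.9.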

Correlated with user v1: In-path Node, Competition Node, Out-path Node
(Figure: example network around v1, annotated with edge probabilities and marking the three node types)
IncInf(v1, S) = (1 – 0)*(0.4 + 0.2 + 0.3) = 0.9

Greedy (KDD’03)
(Figure: bar chart of Influence Spread for nodes V1, V2, V3, …, Vi, with Minimum Influence MinInf(v) and Maximum Influence MaxInf(v) marked)

MinInf ≤ IncInf ≤ MaxInf
– MinInf: worst case
– MaxInf: best case

k = 5
Lower Bound (LB): kth largest minimum influence
(Figure: bar chart of Influence Spread for nodes V1, V2, V3, …)

Correlated with user v
(Figure: nodes around v, grouped into those with MaxInf > MaxInf(v) and those with MaxInf < MaxInf(v))

Correlated with user v
(Figure: three views of the network around v: MaxInf(v) (best case), IncInf(v), and MinInf(v) (worst case))


Methods
ECE (Efficient Candidates Elimination Algorithm)
(Figure: bar chart of Influence Spread for nodes V1, V2, V3, …)
1. Compute MaxInf
2. Sort by MaxInf
3. Compute Lower Bound
4. Prune nodes
5. Select node into seed set
6. Update candidate nodes
7. Prune nodes
8. Sort candidate nodes
9. Optimal Pruning
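Steps 2 to 4 can be sketched as follows. This is only an illustration under stated assumptions: the MinInf and MaxInf values are made-up inputs (the talk computes them from the network), the function name `initial_candidates` is hypothetical, and the pruning rule "drop every node whose MaxInf is below the lower bound" is inferred from the bound MinInf ≤ IncInf ≤ MaxInf rather than quoted from the authors.

```python
import heapq

def initial_candidates(min_inf, max_inf, k):
    """Return (lower_bound, candidates sorted by MaxInf, descending).

    min_inf, max_inf : dicts mapping node -> bound on its influence
    k                : number of seeds to select
    """
    # Step 3: the lower bound is the k-th largest minimum influence.
    lower_bound = heapq.nlargest(k, min_inf.values())[-1]
    # Step 2: order nodes by their best-case influence.
    ordered = sorted(max_inf, key=max_inf.get, reverse=True)
    # Step 4 (assumed rule): a node whose best case is below the bound is dropped.
    candidates = [v for v in ordered if max_inf[v] >= lower_bound]
    return lower_bound, candidates

# Hypothetical bounds for five nodes.
min_inf = {"a": 13, "b": 9, "c": 7, "d": 3, "e": 2}
max_inf = {"a": 25, "b": 22, "c": 18, "d": 8, "e": 6}
lb, cands = initial_candidates(min_inf, max_inf, k=2)
print(lb, cands)
```

In this toy input the bound is 9, so the two nodes whose best case falls below 9 are discarded before any exact influence computation is done.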

Methods
(Figure, four-step build: a node with MaxInf = 25 surrounded by nodes with values 12, 18, 16, 22, 17, 12, 16, 13, 13; its MinInf starts at 25, marked "Unstable", and is then updated to 15, 14, and 13)

Methods
ECE (Efficient Candidates Elimination Algorithm), k = 5
(Figure, three-step build: bar chart of Influence Spread for nodes V1, V2, V3, …, marking the Lower Bound and then the Candidate Nodes)

Methods
ECE (Efficient Candidates Elimination Algorithm), k = 5
Optimal Pruning and Initial Optimal Pruning
(Figure: bar charts of Influence Spread for nodes V1, V2, V3, …)

Experiments
• Datasets
                     NetHEPT   NetPHY    Epinions  Slashdot0902  Amazon0302  Amazon0601
Node                 15,233    37,154    75,888    82,168        262,111     403,394
Edge                 32,235    180,826   508,837   948,465       1,234,877   3,387,388
Maximal Out Degree   44        131       1801      2511          5           10
Maximal In Degree    60        152       3035      2553          420         2751
Average Degree       2.12      4.87      6.71      11.54         4.71        8.40

Experiments
(Figure: chart whose labels did not survive extraction; data values 14.87, 43.37, 12.93, 257.72, 721.82, 10.27)

Experiments
(Figure: four panels, NetHEPT, Epinions, NetPHY, and Slashdot0902, each showing Running time (sec) for the categories Initial and 1 through 10)

Experiments
• Pruning performance (k = 10)
                     NetHEPT   NetPHY    Epinions  Slashdot0902  Amazon0302  Amazon0601
Node                 15,233    37,154    75,888    82,168        262,111     403,394
Initial Candidate    15        11        10        10            10          13
Pruning Rate         99.90%    99.97%    99.99%    99.99%        99.99%      99.99%
Time                 0.69%     0.16%     1.42%     1.23%         0.01%       0.01%
Optimal Pruning      O         X         O         O             O           X

Conclusions
• We propose the ECE algorithm to solve the problem of influence maximization in social networks.
• The ECE algorithm is based on a pruning strategy that prunes more than 96% of nodes and costs only about 1% of the total running time.
• The experimental results show that ECE is as efficient as CELF in running time and outperforms Greedy and PMIA.
• When ECE achieves initial optimal pruning, it is much faster than CELF.

Q&A

Related Works
• Maximizing the Spread of Influence through a Social Network, D. Kempe, J. Kleinberg and É. Tardos (KDD’03)
  – NP-hard
    • Greedy algorithm
    • 1 – 1/e (≈ 63%) of optimal
(Figure: example with k = 2 and node values 6, 7, 10)
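The greedy algorithm cited above repeatedly adds the node with the largest marginal gain in spread. A minimal sketch, assuming the spread function is supplied by the caller; in the cited work the spread is estimated by simulation, whereas the toy `coverage` function and the `reach` sets below are hypothetical stand-ins chosen only to make the example runnable.

```python
def greedy_seeds(nodes, spread, k):
    """Pick k seeds, each time adding the node with the largest marginal gain."""
    seeds = []
    for _ in range(k):
        base = spread(seeds)
        best = max((v for v in nodes if v not in seeds),
                   key=lambda v: spread(seeds + [v]) - base)
        seeds.append(best)
    return seeds

# Hypothetical toy spread: each node "reaches" a fixed set of users and the
# spread of a seed set is the number of distinct users reached.
reach = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {5, 6}, "d": {1}}

def coverage(seeds):
    return len(set().union(*(reach[s] for s in seeds)))

print(greedy_seeds(list(reach), coverage, k=2))
```

The second pick is the node that adds the most new users, not the node with the second-largest individual reach, because most of that node's reach overlaps with the first seed.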

Related Works
• Cost-effective Outbreak Detection in Networks, J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen and N. Glance (KDD’07)
  – Lazy-forward
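Lazy-forward selection avoids re-evaluating every node in every round: marginal gains can only shrink as the seed set grows, so a stale gain is an upper bound and only the top of a priority queue needs refreshing. The sketch below is one common way to write it, not the authors' code; `celf_seeds`, `reach`, and `coverage` are illustrative names, and the toy spread function is a hypothetical stand-in for a real influence estimate.

```python
import heapq

def celf_seeds(nodes, spread, k):
    """Lazy greedy selection; assumes marginal gains never increase."""
    seeds, current = [], spread([])
    # Heap entries: (-gain, node, size of the seed set when gain was computed).
    heap = [(-(spread([v]) - current), v, 0) for v in nodes]
    heapq.heapify(heap)
    while heap and len(seeds) < k:
        neg_gain, v, stamp = heapq.heappop(heap)
        if stamp == len(seeds):
            # Gain is up to date and still the largest: select the node.
            seeds.append(v)
            current += -neg_gain
        else:
            # Gain is stale: recompute it against the current seeds and requeue.
            gain = spread(seeds + [v]) - current
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds

# Hypothetical toy spread: number of distinct users reached by the seeds.
reach = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {5, 6}, "d": {1}}

def coverage(seeds):
    return len(set().union(*(reach[s] for s in seeds)))

print(celf_seeds(list(reach), coverage, k=2))
```

On this toy input the node with the smallest initial gain is never re-evaluated in the second round, which is where the savings over plain greedy come from.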

Related Works
CELF (KDD’07)
(Figure, two-step build: bar chart of Influence Spread for nodes V1, V2, V3, …, Vi; in the second step Vi appears at the front of the order)

Related Works
PMIA (KDD’10)

Related Works
PMIA (KDD’10)
(Figure: bar chart of Influence Spread for nodes V1, V2, V3, …, Vi)
