Network archaeology: Phase transition in the recoverability of network history

Talk presented at NetSci 2018 (https://www.netsci2018.com/)

Paper (open access): https://journals.aps.org/prx/abstract/10.1103/PhysRevX.9.041056
Code: https://github.com/jg-you/network-archaeology
arXiv: https://arxiv.org/abs/1803.09191

Network growth processes can be understood as generative models of the structure and history of complex networks. This point of view naturally leads to the problem of network archaeology: reconstructing all the past states of a network from its structure, a difficult permutation inference problem. In this paper, we introduce a Bayesian formulation of network archaeology, with a generalization of preferential attachment as our generative mechanism. We develop a sequential importance sampling algorithm to evaluate the posterior averages of this model, as well as an efficient heuristic that uncovers the history of a network in linear time. We use these methods to identify and characterize a phase transition in the quality of the reconstructed history when they are applied to artificial networks generated by the model itself. Despite the existence of a no-recovery phase, we find that non-trivial inference is possible in a large portion of the parameter space as well as on empirical data.

Jean-Gabriel Young

June 13, 2018

Transcript

  1. Network archaeology: Phase transition in the recoverability of network history
    J.-G. Young (1)
    L. Hébert-Dufresne (1,2), E. Laurence (1), C. Murphy (1),
    G. St-Onge (1) and P. Desrosiers (1,3)
    June 2018, NetSci, Theory I
    (1) Département de physique, Université Laval, Québec, QC, Canada
    (2) Vermont Complex Systems Center, University of Vermont, Burlington, VT, USA
    (3) Centre de recherche de CERVO, Québec, QC, Canada

  2. Erdős-Rényi
    Complex networks
    Grid

  3. Growth models
    One of the great tools of network science

  4. Great fit of macro quantities via growth models:
    A.-L. Barabási & R. Albert, Science
    L. Hébert-Dufresne et al., Phys. Rev. E
    J.-G. Young et al., Phys. Rev. E
    L. Hébert-Dufresne et al., Phys. Rev. E

  7. Natural question: What else can we learn from growth models?
    Some answers:
    Micro-evolution [J. Leskovec et al., ACM SIGKDD]
    Prominence of growth rules [T. Pham et al., PLoS ONE]
    First node(s) [S. Bubeck et al., Random Struct. Algor.]
    History [J. Pinney et al., PNAS; A. Magner et al., WWW]

  8. T

  9. The network archaeology problem: concept
    W ?
    Older edges: thick, dark strokes

  10. The network archaeology problem: definitions
    G: Unannotated network
    X: Modifications to G, ordered in time
    E
    Possible history: $X = (e_1, e_2, e_3, e_4, e_5)$ in $T = 5$ steps.

  11. The network archaeology problem: Bayesian formulation
    S I
    We assume that the parameters θ are known, such that
    $$P(X \mid G, \theta) = \frac{P(G \mid X, \theta)\, P(X \mid \theta)}{P(G \mid \theta)} \propto P(G \mid X, \theta)\, P(X \mid \theta).$$
    Probabilities defined by a model:
    Likelihood $P(G \mid X, \theta)$: probability of G given history X (logical)
    Prior $P(X \mid \theta)$: probability of producing X
    Evidence $P(G \mid \theta) = \sum_X P(G \mid X, \theta)\, P(X \mid \theta)$
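For very small networks, the posterior on this slide can be evaluated exactly by enumerating every ordering of the edges. The sketch below does this under simplifying assumptions that are not in the slides: the network is a tree (b = 1), the seed is a single edge, arrival times start at 0, and the kernel is k^γ. The function names `history_weight` and `expected_arrival_times` are hypothetical. The talk itself relies on sequential importance sampling, which this brute-force enumeration only stands in for.

```python
from itertools import permutations


def history_weight(order, gamma):
    """Unnormalized P(G|X, theta) P(X|theta) of one edge ordering of a tree.

    Assumes b = 1 (every event adds a new node) and a single-edge seed.
    Returns 0.0 when the ordering could not have produced the tree, which
    plays the role of the 0/1 ("logical") likelihood.
    """
    (u, v), rest = order[0], order[1:]
    degree = {u: 1, v: 1}
    weight = 1.0
    for p, q in rest:
        # A valid growth step has exactly one endpoint already present.
        if (p in degree) == (q in degree):
            return 0.0
        old, new = (p, q) if p in degree else (q, p)
        total = sum(k ** gamma for k in degree.values())
        weight *= degree[old] ** gamma / total
        degree[old] += 1
        degree[new] = 1
    return weight


def expected_arrival_times(edges, gamma):
    """Posterior mean arrival time of each edge, by exhaustive enumeration."""
    totals = {e: 0.0 for e in edges}
    norm = 0.0
    for order in permutations(edges):
        w = history_weight(order, gamma)
        norm += w
        for t, e in enumerate(order):
            totals[e] += w * t
    return {e: s / norm for e, s in totals.items()}


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]
    print(expected_arrival_times(path, gamma=0.0))
```

The enumeration visits T! orderings, so it is only practical for a handful of edges; it is meant to make the roles of the likelihood, prior, and evidence concrete.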

  12. A

  13. Parametrized random attachment model: concept
    Preferential attachment with
    a general attachment kernel $g(k) \propto k^{\gamma}$ ($\gamma \in \mathbb{R}$);
    events between existing nodes (prob. 1 − b).
    At each discrete time t: new edge, choose site with prob. ∝ g(k)
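The growth rule on this slide can be sketched as a short simulation. The details below are assumptions made for illustration and are not stated in the slides: the seed is a single edge between nodes 0 and 1, a growth event (probability b) attaches one new node to an existing node, and an event between existing nodes (probability 1 − b) draws both endpoints independently, so multi-edges and self-loops can occur. The name `grow_network` is hypothetical.

```python
import random


def grow_network(num_edges, gamma, b, seed=None):
    """Return the edges of a grown network, listed in order of arrival.

    Each step adds one edge. Existing nodes are chosen with probability
    proportional to degree ** gamma.
    """
    rng = random.Random(seed)
    edges = [(0, 1)]   # assumed seed: a single edge
    degree = [1, 1]    # degree[i] is the current degree of node i
    while len(edges) < num_edges:
        weights = [k ** gamma for k in degree]
        nodes = range(len(degree))
        if rng.random() < b:
            # Growth event: a new node attaches to an existing node.
            u = rng.choices(nodes, weights=weights)[0]
            v = len(degree)
            degree.append(0)
        else:
            # Event between two existing nodes (may repeat an edge
            # or create a self-loop in this sketch).
            u, v = rng.choices(nodes, weights=weights, k=2)
        degree[u] += 1
        degree[v] += 1
        edges.append((u, v))
    return edges


if __name__ == "__main__":
    history = grow_network(num_edges=10, gamma=1.0, b=1.0, seed=42)
    print(history)
```

With b = 1 every event adds a node, so the result is a tree; lowering b adds edges among existing nodes and makes the network loopier.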

  14. Parametrized random attachment model: network zoo
    [Figure: example networks generated for three ranges of γ relative to 0]

  15. A .

  16. Algorithms for network archaeology
    Our goal:
    Order the edges of G, assuming that G was generated by preferential attachment (PA)
    We compare three methods:
    1. Degree ordering. Higher degree = older.
    2. Onion decomposition (generalizes k-core). Central = older.
    3. Principled inference by sampling. Evaluate the expected arrival time of each edge according to P(X|G, θ).
    Onion decomposition: [L. Hébert-Dufresne, J. Grochow, and A. Allard, Sci. Rep.]
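The two structural heuristics listed on this slide can be sketched as follows. The onion decomposition peels the network layer by layer: all nodes whose degree does not exceed the current core value are removed together, and deeper layers are treated as older. How node scores are turned into an edge ordering is not specified in the slides; the rule used here (rank an edge by the lower score of its two endpoints, then by the higher one) is an assumption, and the names `node_degrees`, `onion_layers`, and `order_edges` are hypothetical. The sketch treats the network as a simple graph.

```python
from collections import defaultdict


def node_degrees(edges):
    """Map each node to its degree."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def onion_layers(edges):
    """Map each node to its onion layer (1 = outermost layer)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-loops are ignored in this sketch
            neighbors[u].add(v)
            neighbors[v].add(u)
    degree = {n: len(nb) for n, nb in neighbors.items()}
    layers, layer, core = {}, 0, 0
    while degree:
        # The core value never decreases while peeling.
        core = max(core, min(degree.values()))
        layer += 1
        peel = [n for n, k in degree.items() if k <= core]
        for n in peel:
            layers[n] = layer
            del degree[n]
        for n in peel:
            for m in neighbors[n]:
                if m in degree:
                    degree[m] -= 1
    return layers


def order_edges(edges, node_score):
    """Sort edges from (estimated) oldest to newest.

    Assumed rule: a higher score means older, and an edge is ranked by
    the lower score of its endpoints, then by the higher one.
    """
    def key(edge):
        a, b = node_score[edge[0]], node_score[edge[1]]
        return (min(a, b), max(a, b))
    return sorted(edges, key=key, reverse=True)


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3), (3, 4)]
    print(order_edges(path, node_degrees(path)))
    print(order_edges(path, onion_layers(path)))
```

On a path, both heuristics place the central edges first, which matches the intuition that central, well-connected parts of a grown network tend to be older.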

  17. E

  18. Experiments and results: real system
    Social network built with emails ( day)
    Nodes: Researchers
    Edges: Reciprocated emails (40+)
    [Paranjape et al., ACM Web Search and Data Mining]

  19. Experiments and results: real system
    [Figure: estimated arrival time versus true arrival time of each edge e, both axes from 0 to 400, in three panels: (a) Degree (ρ = 0.39), (b) Onion (ρ = 0.41), (c) Sampled (ρ = 0.62)]
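The quality score ρ reported for each method compares estimated and true arrival times. The slides do not say which correlation coefficient ρ denotes; the sketch below assumes the Pearson product-moment correlation, and the function name `pearson` is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    true_times = [0, 1, 2, 3, 4]
    estimated_times = [0.5, 0.5, 2.0, 3.5, 3.5]  # made-up illustrative values
    print(pearson(true_times, estimated_times))
```

A value near 1 means the estimated ordering tracks the true history closely, while a value near 0 means the estimate carries almost no information about it.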

  20. But what are the limits of inference?

  21. Experiments and results: artificial networks (1 of 3)
    E
    Generate artificial networks with fixed loopiness b and vary the
    strength of the rich-get-richer mechanism via γ.

  22. Experiments and results: artificial networks (2 of 3)
    [Figure: correlation (vertical axis, 0 to 1) over a horizontal axis from −10 to 10, for the Bayesian, Degree, and Onion methods. Panel (a): tree networks (b = 1). Panel (b): loopy networks (b < 1).]

  23. Experiments and results: artificial networks (3 of 3)
    [Figure: correlation (vertical axis, 0 to 1) over a horizontal axis from −10 to 10, for the Bayesian, Degree, and Onion methods, annotated with three regimes]
    Chains: Easy
    Condensation begins: Possible but imperfect
    Nearly star-graphs: Impossible
    Phenomenology of the model [Krapivsky et al., Phys. Rev. Lett.]

  24. C

  29. Take-home message
    Network archaeology: Recover history encoded in structure.
    Best inference results rely on full knowledge of the model and a Bayesian formulation, but an efficient approximation exists.
    There are fundamental limits to inference.
    Imperfect but non-trivial inference is possible on real systems.
    Reference: arxiv.org/abs/1803.09191
    Software: github.com/jg-you/network-archaeology

  30. Reference: arxiv.org/abs/1803.09191
    Software: github.com/jg-you/network-archaeology
    jgyoung.ca @_jgyou

  31. Selected references
    O
    J.-G. Young, L. Hébert-Dufresne, E. Laurence, C. Murphy, G. St-Onge and P. Desrosiers, arXiv:1803.09191
    Archaeology in PPI networks
    J. W. Pinney et al., PNAS
    S. Navlakha and C. Kingsford, PLoS Comput. Biol.
    Archaeology in SF networks
    S. Bubeck et al., Random Struct. Algor.
    A. Magner et al., WWW