
Robust and fully Bayesian inference of complex networks from noisy data

Paper: https://arxiv.org/abs/2008.03334
Code: https://github.com/jg-you/noisy-networks-measurements
Tutorial: https://github.com/jg-you/noisy-networks-measurements/blob/master/tutorial/tutorial.ipynb

Most empirical studies of complex networks do not return direct, error-free measurements of network structure. Instead, they typically rely on indirect measurements that are often error-prone and unreliable. A fundamental problem in empirical network science is how to make the best possible estimates of network structure given such unreliable data. In this paper we describe a fully Bayesian method for reconstructing networks from observational data in any format, even when the data contain substantial measurement error and when the nature and magnitude of that error is unknown. The method is introduced through pedagogical case studies using real-world example networks, and specifically tailored to allow straightforward, computationally efficient implementation with a minimum of technical input. Computer code implementing the method is publicly available.

Jean-Gabriel Young

September 21, 2020

Transcript

  1. NetSci MMXX
    Robust and fully Bayesian inference of complex networks from noisy data
    Jean-Gabriel Young
    Center for the Study of Complex Systems, University of Michigan, Ann Arbor, MI, USA
    Department of Computer Science, University of Vermont, Burlington, VT, USA
    jg-you.github.io @_jgyou
    Joint work with George T. Cantwell and M.E.J. Newman

  2. In the empirical sciences,
    measurements are treated as noisy observations of reality.
    [Figure: probability distribution of a measured height; x-axis: Height (cm), y-axis: Probability.]

  3. In network science,
    measurements are treated as direct observations of reality.
    [Figure: "Dolphin companionship" matrix; axes: Dolphin ID vs. Dolphin ID; color scale: Number of observations (0 to 30). Shown with a network drawing on nodes 1 to 13.]

  4. This talk:
    How to convert noisy measurements to networks*
    *efficiently, from first principles, and EASILY

  5. How are network data born?

  6. Statistical approach to network measurement (1 of 2)
    $P(A, \theta \mid X) \propto P(X \mid A, \theta)\, P(A \mid \theta)\, P(\theta)$
    (network $A$, data $X$, parameters $\theta$)
    Probabilities defined by a measurement model:
    ⊲ Prior $P(\theta)$
    What is the likely range of parameters?
    ⊲ Network model $P(A \mid \theta)$
    What class of networks are we considering?
    ⊲ Data model $P(X \mid A, \theta)$
    How would a network $A$ lead to data $X$?

  7. Statistical approach to network measurement (2 of 2)
    $P(A, \theta \mid X) \propto P(X \mid A, \theta)\, P(A \mid \theta)\, P(\theta)$
    Statistical measurement can mean any of the following:
    ⊲ Computing the distribution $P(A \mid X)$.
    ⊲ Estimating the probability of every edge $P(A_{ij} = 1 \mid X)$.
    ⊲ Estimating the probability of triangles $P(A_{ij} = 1 \wedge A_{jk} = 1 \wedge A_{ik} = 1 \mid X)$.
    ⊲ And more...
    “Just” averages of the form $\sum_A \int f(A, \theta, X)\, P(A, \theta \mid X)\, d\theta$
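To make the "just averages" point concrete, here is a minimal sketch of how edge and triangle probabilities reduce to sample means once networks have been drawn from the posterior. The array `samples` and the matrix `p` are hypothetical stand-ins invented for this illustration (the edges are drawn independently, which a real posterior need not satisfy); they are not the talk's data or code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for posterior samples: S symmetric 0/1 adjacency
# matrices on n nodes. In practice these would be draws from P(A | X).
S, n = 5000, 4
p = np.array([[0.0, 0.9, 0.8, 0.1],
              [0.9, 0.0, 0.7, 0.2],
              [0.8, 0.7, 0.0, 0.3],
              [0.1, 0.2, 0.3, 0.0]])
upper = np.triu(rng.random((S, n, n)) < p, k=1)      # one coin flip per pair i < j
samples = (upper | upper.transpose(0, 2, 1)).astype(int)

# P(A_ij = 1 | X): average of the indicator A_ij over the samples.
edge_prob = samples.mean(axis=0)

# P(A_01 = 1 and A_12 = 1 and A_02 = 1 | X): average of the product of
# the three edge indicators over the samples.
tri_prob = (samples[:, 0, 1] * samples[:, 1, 2] * samples[:, 0, 2]).mean()

print(edge_prob.round(2))
print(round(tri_prob, 3))
```

Any other network statistic can be handled the same way: evaluate it on each sampled network and average.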

  8. How can we compute $\sum_A \int f(A, \theta, X)\, P(A, \theta \mid X)\, d\theta$ ...
    ... for your data?
    ... with a model that suits your measurements?
    ... easily?
    ... and efficiently?

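  9. The method in a nutshell (1 of 2)
    Key insight: consider a smaller (but expressive) class of models.
    $P(\theta)$ = arbitrary
    $P(A \mid \theta) = \prod_{i<j} [\nu_{ij}(0)]^{1 - A_{ij}} [\nu_{ij}(1)]^{A_{ij}}$
    $P(X \mid A, \theta) = \prod_{i<j} [\mu_{ij}(0)]^{1 - A_{ij}} [\mu_{ij}(1)]^{A_{ij}}$
    where:
    Network model
    $\nu_{ij}(1)$: Prob. of an edge $(i, j)$
    $\nu_{ij}(0)$: Prob. of no edge $(i, j)$
    Data model
    $\mu_{ij}(1)$: Prob. of $X_{ij}$ when $(i, j)$ is an edge
    $\mu_{ij}(0)$: Prob. of $X_{ij}$ when $(i, j)$ is not an edge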

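  10. The method in a nutshell (2 of 2)
    Why is it helpful? Because we know the closed forms:
    $P(\theta \mid X) \propto P(\theta) \prod_{i<j} [\nu_{ij}(0)\mu_{ij}(0) + \nu_{ij}(1)\mu_{ij}(1)]$
    $P(A \mid X, \theta) = \prod_{i<j} [Q_{ij}(\theta)]^{A_{ij}} [1 - Q_{ij}(\theta)]^{1 - A_{ij}}$
    with $Q_{ij}(\theta) = \nu_{ij}(1)\mu_{ij}(1) / [\nu_{ij}(0)\mu_{ij}(0) + \nu_{ij}(1)\mu_{ij}(1)]$
    With these we can evaluate
    $\sum_A \int f(A, \theta, X)\, P(A, \theta \mid X)\, d\theta = \sum_A \int f(A, \theta, X)\, P(A \mid X, \theta)\, P(\theta \mid X)\, d\theta \approx \frac{1}{n} \sum_{s=1}^{n} f(A_s, \theta_s, X)$
    in two easy steps:
    1. Draw $\theta_s$ from $P(\theta \mid X)$ (automatic with Stan, PyMC, etc.)
    2. Draw $A_s$ from $P(A \mid X, \theta_s)$ (just coin flips)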
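The closed forms can be checked with a short derivation. The sketch below assumes the factorized model of the previous slide and uses notation chosen here for illustration: $\nu_{ij}(a)$ is the network-model probability that $A_{ij} = a$, $\mu_{ij}(a)$ is the data-model probability of $X_{ij}$ given $A_{ij} = a$, and $Q_{ij}$ is the resulting posterior edge probability.

```latex
\begin{align}
P(A, \theta \mid X)
  &\propto P(\theta) \prod_{i<j}
     \bigl[\nu_{ij}(0)\,\mu_{ij}(0)\bigr]^{1 - A_{ij}}
     \bigl[\nu_{ij}(1)\,\mu_{ij}(1)\bigr]^{A_{ij}} \\
% Each A_ij appears in exactly one factor, so the sum over A factorizes:
P(\theta \mid X) = \sum_{A} P(A, \theta \mid X)
  &\propto P(\theta) \prod_{i<j}
     \bigl[\nu_{ij}(0)\,\mu_{ij}(0) + \nu_{ij}(1)\,\mu_{ij}(1)\bigr] \\
% Dividing the joint by the marginal leaves independent Bernoulli factors:
P(A \mid X, \theta) = \frac{P(A, \theta \mid X)}{P(\theta \mid X)}
  &= \prod_{i<j} Q_{ij}^{A_{ij}} \, (1 - Q_{ij})^{1 - A_{ij}},
  \qquad
  Q_{ij} = \frac{\nu_{ij}(1)\,\mu_{ij}(1)}
                {\nu_{ij}(0)\,\mu_{ij}(0) + \nu_{ij}(1)\,\mu_{ij}(1)}
\end{align}
```

The network is summed out exactly, so only the parameters need a general-purpose sampler; given the parameters, every pair is an independent coin flip with bias $Q_{ij}$.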

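  11. Example of model
    # of times dolphins seen swimming together
    [Figure: "Dolphin companionship" matrix; axes: Dolphin ID vs. Dolphin ID; color scale: Number of observations (0 to 30).]
    [R. C. Connor, R. A. Smolker and A. F. Richards, ( )]
    Network model
    $\nu_{ij}(0) = 1 - \rho$
    $\nu_{ij}(1) = \rho$
    Data model
    $X_{ij} \mid A_{ij} = 0 \sim \mathrm{Poisson}(\lambda_0)$, i.e. $\mu_{ij}(0) = \lambda_0^{X_{ij}} e^{-\lambda_0} / X_{ij}!$
    $X_{ij} \mid A_{ij} = 1 \sim \mathrm{Poisson}(\lambda_1)$, i.e. $\mu_{ij}(1) = \lambda_1^{X_{ij}} e^{-\lambda_1} / X_{ij}!$
    Prior: $\lambda_0 < \lambda_1$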
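As an illustration of the two sampling steps applied to this Poisson model, here is a minimal NumPy sketch. It is not the code from the repository. It assumes a flat prior on the region 0 < rho < 1, 0 < lam0 < lam1, replaces Stan or PyMC with a hand-written random-walk Metropolis sampler, and runs on synthetic counts rather than the dolphin data. The names `rho`, `lam0`, `lam1`, `log_terms`, `log_posterior`, and `sample_posterior` are all chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the count data (NOT the real dolphin observations).
n = 13
iu = np.triu_indices(n, k=1)                      # unordered pairs i < j
A_true = rng.random(iu[0].size) < 0.2             # hidden edges
X = rng.poisson(np.where(A_true, 10.0, 0.5))      # counts per pair

def log_terms(theta, X):
    """Per-pair log[nu(a) * mu(a)] for a = 0, 1.

    The 1/X! factor is shared by both terms, so it is dropped: it cancels
    in Q and only shifts the log posterior by a constant.
    """
    rho, lam0, lam1 = theta
    l0 = np.log1p(-rho) + X * np.log(lam0) - lam0
    l1 = np.log(rho) + X * np.log(lam1) - lam1
    return l0, l1

def log_posterior(theta, X):
    """log P(theta | X) up to a constant, with the assumed flat prior."""
    rho, lam0, lam1 = theta
    if not (0 < rho < 1 and 0 < lam0 < lam1):
        return -np.inf
    l0, l1 = log_terms(theta, X)
    return np.logaddexp(l0, l1).sum()

def sample_posterior(X, n_samples=2000, burn=1000):
    """Step 1: Metropolis draw of theta. Step 2: one coin flip per pair."""
    step = np.array([0.05, 0.1, 0.5])             # proposal scales (hand-tuned)
    theta = np.array([0.5, 1.0, 5.0])
    lp = log_posterior(theta, X)
    thetas, nets = [], []
    for t in range(burn + n_samples):
        prop = theta + step * rng.normal(size=3)  # symmetric proposal
        lp_prop = log_posterior(prop, X)
        if np.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        if t >= burn:
            l0, l1 = log_terms(theta, X)
            Q = np.exp(l1 - np.logaddexp(l0, l1)) # P(A_ij = 1 | X, theta)
            thetas.append(theta.copy())
            nets.append(rng.random(Q.size) < Q)
    return np.array(thetas), np.array(nets)

thetas, nets = sample_posterior(X)
edge_prob = nets.mean(axis=0)                     # posterior edge probabilities
print(thetas.mean(axis=0), edge_prob.round(2))
```

Each row of `nets` is one sampled network, stored as a vector over the pairs in `iu`. A production analysis would more likely delegate step 1 to a tuned sampler, as the slides suggest.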

  12. The method in action
    Dolphin data set, with the example model
    Input: [Figure: "Dolphin companionship" matrix; axes: Dolphin ID vs. Dolphin ID; color scale: Number of observations (0 to 30).]
    Outputs: [Figure: five network drawings on nodes 1 to 13.]

  13. The method in action
    Dolphin data set, with the example model
    Input: [Figure: "Dolphin companionship" matrix; axes: Dolphin ID vs. Dolphin ID; color scale: Number of observations (0 to 30).]
    Outputs: [Figure: two density plots, each with a marker labeled "Thresholded"; x-axes: Transitivity (0.7 to 1.0) and Mean eigenvector centrality (0.200 to 0.215); y-axes: Density.]
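The contrast between a distribution of a network statistic and a single thresholded value can be sketched as follows. Everything here is a hypothetical illustration: the edge probabilities `Q` are made up, they are held fixed across samples (a full analysis would recompute them for each parameter draw), and the "thresholded" network is obtained by cutting `Q` at 0.5, which may differ from the thresholding used on the slide.

```python
import numpy as np

rng = np.random.default_rng(2)

def transitivity(A):
    """3 * triangles / connected triples, for a symmetric 0/1 matrix."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    paths = (k * (k - 1)).sum()          # 2 * number of connected triples
    closed = np.trace(A @ A @ A)         # 6 * number of triangles
    return closed / paths if paths > 0 else 0.0

n = 13
iu = np.triu_indices(n, k=1)
Q = rng.beta(0.5, 1.5, size=iu[0].size)  # hypothetical posterior edge probabilities

def to_matrix(edges):
    """Expand a vector over pairs i < j into a symmetric adjacency matrix."""
    A = np.zeros((n, n), dtype=int)
    A[iu] = edges
    return A + A.T

# Distribution of the statistic over sampled networks (one coin flip per pair).
T = np.array([transitivity(to_matrix(rng.random(Q.size) < Q))
              for _ in range(1000)])

# Single value from one thresholded network.
T_thresholded = transitivity(to_matrix(Q > 0.5))

print(T.mean(), T.std(), T_thresholded)
```

The array `T` carries the uncertainty that the single number `T_thresholded` hides.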

  14. Actual applications
    Plant-pollinator networks
    [JGY, F. S. Valdovinos, M. E. J. Newman, bioRxiv: ( ).]
    M I
    [K. Leyba et al., forthcoming ( ).]

  15. Take-home message
    ⊲ Measurements are not networks.
    ⊲ Getting networks from measurements is an inference problem.
    ⊲ We delineated models for which this problem is easy.
    ⊲ References: arXiv:2008.03334 (method) and bioRxiv: (application).
    ⊲ Software: github.com/jg-you/noisy-networks-measurements
    ⊲ Tutorial: https://bit.ly/32bnKsv

  16. Complete tutorial available on the repository!
    https://bit.ly/32bnKsv
