
# Robust and fully Bayesian inference of complex networks from noisy data

Most empirical studies of complex networks do not return direct, error-free measurements of network structure. Instead, they typically rely on indirect measurements that are often error-prone and unreliable. A fundamental problem in empirical network science is how to make the best possible estimates of network structure given such unreliable data. In this paper we describe a fully Bayesian method for reconstructing networks from observational data in any format, even when the data contain substantial measurement error and when the nature and magnitude of that error are unknown. The method is introduced through pedagogical case studies using real-world example networks, and is specifically tailored to allow straightforward, computationally efficient implementation with a minimum of technical input. Computer code implementing the method is publicly available.

## Jean-Gabriel Young

September 21, 2020

## Transcript

1. N S MMXX
R
Jean-Gabriel Young
Center for the Study of Complex Systems, University of Michigan, Ann Arbor, MI, USA
Department of Computer Science, University of Vermont, Burlington, VT, USA
jg-you.github.io @_jgyou
Joint work with George T. Cantwell and M.E.J. Newman

2. In the empirical sciences,
measurements are treated as noisy observations of reality.
[Figure: probability distribution of a height measurement; x-axis: Height (cm), y-axis: Probability]

3. In network science,
measurements are treated as direct observations of reality.
[Figure: "Dolphin companionship" matrix; both axes: Dolphin ID, color scale: Number of observations (0 to 30). Shown next to a network drawing with nodes labeled 1 to 13.]

4. This talk :
How to convert noisy measurements to networks*
*efficiently, from first principles, and EASILY

5. How are network data born?

6. Statistical approach to network measurement (1 of 2)
B
$P(A, \theta \mid X) \propto P(X \mid A, \theta)\, P(A \mid \theta)\, P(\theta)$
Probabilities defined by a measurement model:
⊲ Prior $P(\theta)$
What is the likely range of parameters?
⊲ Network model $P(A \mid \theta)$
What class of networks are we considering?
⊲ Data model $P(X \mid A, \theta)$
How would a network $A$ lead to data $X$?

7. Statistical approach to network measurement (2 of 2)
B
$P(A, \theta \mid X) \propto P(X \mid A, \theta)\, P(A \mid \theta)\, P(\theta)$
Statistical measurement can mean any of the following:
⊲ Computing the distribution $P(A \mid X)$.
⊲ Estimating the probability of every edge $P(A_{ij} = 1 \mid X)$.
⊲ Estimating the probability of triangles $P(A_{ij} = 1 \wedge A_{jk} = 1 \wedge A_{ik} = 1 \mid X)$.
⊲ And more...
“Just” averages of the form $\int f(A, \theta, X)\, P(A, \theta \mid X)$
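As a minimal sketch of what "just averages" means in practice, suppose we already hold a set of adjacency matrices drawn from the posterior over networks. Edge and triangle probabilities are then sample means. The array `samples` below is a hypothetical stand-in for such draws, and the function names are illustrative rather than taken from the talk's software.

```python
import numpy as np

def edge_probabilities(samples):
    """Estimate P(A_ij = 1 | X) as the mean of the sampled adjacency matrices."""
    return np.mean(samples, axis=0)

def triangle_probability(samples, i, j, k):
    """Estimate P(A_ij = 1 and A_jk = 1 and A_ik = 1 | X) from the samples."""
    closed = samples[:, i, j] * samples[:, j, k] * samples[:, i, k]
    return float(closed.mean())

# Hypothetical posterior draws: four symmetric 0/1 matrices on three nodes.
edge_lists = [
    [(0, 1), (1, 2), (0, 2)],
    [(0, 1), (1, 2)],
    [(0, 1)],
    [(0, 1), (1, 2), (0, 2)],
]
samples = np.zeros((len(edge_lists), 3, 3), dtype=int)
for s, edges in enumerate(edge_lists):
    for i, j in edges:
        samples[s, i, j] = samples[s, j, i] = 1

edge_probs = edge_probabilities(samples)          # one probability per node pair
tri_prob = triangle_probability(samples, 0, 1, 2)  # fraction of draws with the triangle
```

Any other network statistic can be handled the same way: evaluate it on each draw and average.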

8. How can we compute $\int f(A, \theta, X)\, P(A, \theta \mid X)$ ...
... with a model that suits your measurements?
... easily?
... and efficiently?

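9. The method in a nutshell (1 of 2)
Key insight: consider a smaller (but expressive) class of models.
$P(\theta)$ = arbitrary
$P(A \mid \theta) = \prod_{i<j} [\nu_{ij}(0)]^{1 - A_{ij}}\, [\nu_{ij}(1)]^{A_{ij}}$
$P(X \mid A, \theta) = \prod_{i<j} [\mu_{ij}(0)]^{1 - A_{ij}}\, [\mu_{ij}(1)]^{A_{ij}}$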
F “ ” :
Network model
$\nu_{ij}(1)$: Prob. of an edge $(i, j)$
$\nu_{ij}(0)$: Prob. of no edge $(i, j)$
Data model
$\mu_{ij}(1)$: Prob. of $X_{ij}$ when $(i, j)$ is an edge
$\mu_{ij}(0)$: Prob. of $X_{ij}$ when $(i, j)$ is not an edge
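To make the factorized form concrete, here is a small sketch that evaluates the logarithm of a product over node pairs, in which each pair contributes one factor when it is an edge and another when it is not. The same helper serves for both the network model and the data model. The function name and the example numbers are assumptions made for illustration.

```python
import numpy as np

def log_factorized(A, p0, p1):
    """Log of prod_{i<j} p0_ij^(1 - A_ij) * p1_ij^(A_ij), over the upper triangle."""
    iu = np.triu_indices(A.shape[0], k=1)
    a = A[iu]
    return float(np.sum((1 - a) * np.log(p0[iu]) + a * np.log(p1[iu])))

# Hypothetical three-node network with the single edge (0, 1).
A = np.zeros((3, 3), dtype=int)
A[0, 1] = A[1, 0] = 1

# Hypothetical network model: every pair is an edge with probability 0.2.
edge_prob = np.full((3, 3), 0.2)
no_edge_prob = 1.0 - edge_prob

log_prior = log_factorized(A, no_edge_prob, edge_prob)
```

Working in log space avoids underflow when the product runs over many node pairs.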

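10. The method in a nutshell (2 of 2)
Why is it helpful? Because we know the closed forms:
$P(X \mid \theta) = \prod_{i<j} [\mu_{ij}(0)\, \nu_{ij}(0) + \mu_{ij}(1)\, \nu_{ij}(1)]$
$P(A \mid X, \theta) = \prod_{i<j} [Q_{ij}(\theta)]^{A_{ij}}\, [1 - Q_{ij}(\theta)]^{1 - A_{ij}}$
With these we can evaluate
$\int f(A, \theta, X)\, P(A, \theta \mid X) = \int f(A, \theta, X)\, P(A \mid X, \theta)\, P(\theta \mid X) \approx \frac{1}{n} \sum_{s=1}^{n} f(A_s, \theta_s, X)$
in two easy steps:
   1. Draw $\theta$ from $P(\theta \mid X)$ (automatic with stan, pymc, etc.)
   2. Draw $A$ from $P(A \mid X, \theta)$ (just coin flips)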
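The coin-flip step can be sketched as follows. For a factorized model, the posterior probability of each edge given the parameters is the edge term divided by the sum of the edge and no-edge terms, and a network is drawn by flipping one biased coin per node pair. This sketch assumes the four per-pair factors are already available as arrays. In the full method they would be recomputed for every parameter draw produced by Stan or PyMC, whereas here they are fixed hypothetical numbers.

```python
import numpy as np

def edge_posterior(data0, data1, net0, net1):
    """Per-pair probability of an edge given the data and the parameters."""
    edge_term = data1 * net1
    return edge_term / (data0 * net0 + edge_term)

def sample_network(Q, rng):
    """Draw a symmetric 0/1 adjacency matrix: one coin flip per pair i < j."""
    n = Q.shape[0]
    iu = np.triu_indices(n, k=1)
    A = np.zeros((n, n), dtype=int)
    A[iu] = rng.random(len(iu[0])) < Q[iu]
    return A + A.T

rng = np.random.default_rng(42)
n = 4
# Hypothetical factor values, identical for every pair.
data0 = np.full((n, n), 0.1)  # prob. of the data if the pair is not an edge
data1 = np.full((n, n), 0.6)  # prob. of the data if the pair is an edge
net0 = np.full((n, n), 0.8)   # prob. of no edge under the network model
net1 = np.full((n, n), 0.2)   # prob. of an edge under the network model

Q = edge_posterior(data0, data1, net0, net1)
draws = [sample_network(Q, rng) for _ in range(500)]

# Monte Carlo average of a network statistic, here the number of edges.
mean_edges = float(np.mean([A.sum() / 2 for A in draws]))
```

Because the edges are conditionally independent given the parameters, no Markov chain over networks is needed for this step.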

11. Example of model
Number of times dolphins were seen swimming together
[Figure: "Dolphin companionship" matrix; both axes: Dolphin ID, color scale: Number of observations (0 to 30)]
[ R. C. Connor, R. A. Smolker and A. F. Richards, ( )]
O
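Network model
$\nu_{ij}(0) = 1 - \rho$
$\nu_{ij}(1) = \rho$
Data model
$X_{ij} \mid A_{ij} = 0 \sim \mathrm{Poisson}(\lambda_0)$, i.e. $\mu_{ij}(0) = (\lambda_0)^{X_{ij}} e^{-\lambda_0} / X_{ij}!$
$X_{ij} \mid A_{ij} = 1 \sim \mathrm{Poisson}(\lambda_1)$, i.e. $\mu_{ij}(1) = (\lambda_1)^{X_{ij}} e^{-\lambda_1} / X_{ij}!$
Prior: $\lambda_0 < \lambda_1$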
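A minimal sketch of this example model, assuming a count matrix `X` of joint sightings, an edge density `rho`, and Poisson rates `lam0 < lam1` for non-edges and edges. It computes the per-pair posterior edge probability and the log marginal likelihood of the counts given the parameters, which is the quantity a sampler such as Stan or PyMC would combine with a prior. The function names and the numbers are illustrative assumptions, not the talk's code or the dolphin data.

```python
import numpy as np
from math import lgamma

def poisson_logpmf(x, lam):
    """Log of the Poisson probability lam^x * exp(-lam) / x!."""
    x = np.asarray(x, dtype=float)
    log_factorial = np.vectorize(lgamma)(x + 1.0)
    return x * np.log(lam) - lam - log_factorial

def edge_posterior_poisson(X, rho, lam0, lam1):
    """Posterior probability that each pair is an edge, given the parameters."""
    log_w0 = np.log1p(-rho) + poisson_logpmf(X, lam0)  # no-edge term
    log_w1 = np.log(rho) + poisson_logpmf(X, lam1)     # edge term
    return 1.0 / (1.0 + np.exp(log_w0 - log_w1))

def log_marginal_likelihood(X, rho, lam0, lam1):
    """Log probability of the counts with the network summed out, pairs i < j."""
    x = X[np.triu_indices(X.shape[0], k=1)]
    log_w0 = np.log1p(-rho) + poisson_logpmf(x, lam0)
    log_w1 = np.log(rho) + poisson_logpmf(x, lam1)
    return float(np.sum(np.logaddexp(log_w0, log_w1)))

# Hypothetical counts for three animals (not the dolphin data).
X = np.array([[0, 12, 1],
              [12, 0, 0],
              [1, 0, 0]])
rho, lam0, lam1 = 0.3, 0.5, 10.0  # hypothetical parameter values

Q = edge_posterior_poisson(X, rho, lam0, lam1)
log_ml = log_marginal_likelihood(X, rho, lam0, lam1)
```

With these numbers the pair seen together 12 times is almost certainly an edge, while the pair never seen together is almost certainly not.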

12. The method in action
Dolphin data set, with the example model
Input: [Figure: "Dolphin companionship" matrix; both axes: Dolphin ID, color scale: Number of observations (0 to 30)]
Outputs: [Figure: five network drawings, each with nodes labeled 1 to 13]

13. The method in action
Dolphin data set, with the example model
Input: [Figure: "Dolphin companionship" matrix; both axes: Dolphin ID, color scale: Number of observations (0 to 30)]
Outputs: [Figure: two density plots, one over transitivity (0.7 to 1.0) and one over mean eigenvector centrality (0.200 to 0.215), each with a marker labeled "Thresholded"]
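The comparison in these plots can be sketched as follows: compute a network statistic on many sampled networks to obtain its distribution, and compare that with the single value obtained from one thresholded network. The edge probabilities `Q` below are hypothetical, and thresholding them at 0.5 is an assumption made for illustration. The slide does not say how its thresholded network was built.

```python
import numpy as np

def transitivity(A):
    """Global clustering coefficient: trace(A^3) / sum_i k_i (k_i - 1)."""
    k = A.sum(axis=1)
    triples = np.sum(k * (k - 1))
    if triples == 0:
        return 0.0
    return float(np.trace(A @ A @ A) / triples)

def sample_network(Q, rng):
    """Draw a symmetric 0/1 adjacency matrix: one coin flip per pair i < j."""
    n = Q.shape[0]
    iu = np.triu_indices(n, k=1)
    A = np.zeros((n, n), dtype=int)
    A[iu] = rng.random(len(iu[0])) < Q[iu]
    return A + A.T

rng = np.random.default_rng(0)
n = 6
Q = np.full((n, n), 0.1)  # hypothetical edge probabilities
Q[:3, :3] = 0.9           # one tightly connected group of three nodes
np.fill_diagonal(Q, 0.0)

# Distribution of the statistic over sampled networks.
posterior_values = np.array(
    [transitivity(sample_network(Q, rng)) for _ in range(200)]
)
# Single point estimate from one thresholded network.
thresholded_value = transitivity((Q > 0.5).astype(int))
```

The spread of `posterior_values` expresses the uncertainty that a single thresholded network hides.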

14. Actual applications
P -
[JGY, F. S. Valdovinos, M. E. J. Newman, bioarxiv: ( ).]
M I
[K. Leyba et al., forthcoming ( ).]

15. Take-home message
⊲ Measurements are not networks.
⊲ Getting networks from measurements is an inference problem.
⊲ We delineated models for which this problem is easy.
⊲ References : arXiv: . (method) and bioarxiv: (application).
⊲ Software : github.com/jg-you/noisy-networks-measurements
⊲ Tutorial : https://bit.ly/32bnKsv

16. Complete tutorial available on the repository!
https://bit.ly/32bnKsv