Slide 1

Slide 1 text

Estimating Microbial Diversity Chris Fonnesbeck Department of Biostatistics Wednesday, February 1, 12

Slide 2

Slide 2 text

Diversity Measures: taxon- and phylogenetic-based

Slide 3

Slide 3 text

α diversity (species richness, evenness)

Slide 4

Slide 4 text

β diversity (species turnover)

Slide 5

Slide 5 text

generalization of population estimation methods

Slide 6

Slide 6 text

sample-based estimate of diversity

Slide 7

Slide 7 text


Slide 8

Slide 8 text


Slide 9

Slide 9 text


Slide 10

Slide 10 text

n < N

Slide 11

Slide 11 text

n ≪ N

Slide 12

Slide 12 text

estimate model data

Slide 13

Slide 13 text

community → (sampling) → physical sample
[Figure: two bar charts with axes "Species" and "Frequency", one for the community and one for the physical sample]

Slide 14

Slide 14 text

physical sample → (amplification) → amplified sample
[Figure: two bar charts with axes "Species" and "Frequency", one for the physical sample and one for the amplified sample]

Slide 15

Slide 15 text

amplified sample → (sequencing) → identified species
[Figure: two bar charts with axes "Species" and "Frequency", one for the amplified sample and one for the identified species]

Slide 16

Slide 16 text

Rarefaction curves
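
As a rough illustration of what a rarefaction curve plots, the sketch below computes the expected number of species in a random subsample of m individuals drawn without replacement. The function name `rarefied_richness` and the example counts are hypothetical and are not taken from the slides.

```python
from math import comb

def rarefied_richness(counts, m):
    """Expected number of species in a random subsample of m individuals,
    drawn without replacement from the pooled counts."""
    total = sum(counts)
    if not 0 <= m <= total:
        raise ValueError("m must be between 0 and the total count")
    denom = comb(total, m)
    # A species with count c is missed with probability C(total - c, m) / C(total, m).
    return sum(1 - comb(total - c, m) / denom for c in counts if c > 0)

# Hypothetical community: 7 species, 100 individuals in total.
counts = [50, 30, 10, 5, 3, 1, 1]
curve = [(m, rarefied_richness(counts, m)) for m in (1, 10, 25, 50, 100)]
for m, expected in curve:
    print(m, round(expected, 2))
```

Plotting the expected richness against m gives the curve; it flattens as m grows when most species have already been seen.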

Slide 17

Slide 17 text

source: Hughes et al. 2001

Slide 18

Slide 18 text


Slide 19

Slide 19 text

Diversity indices

Slide 20

Slide 20 text

Shannon-Wiener Index
$H = -\sum_{i=1}^{n} p_i \log p_i$
$E = \frac{H}{\log n}$ ("evenness")
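
A minimal sketch of the two quantities above, assuming the input is a list of per-species counts and that the natural logarithm is used. The function names and the two example communities are hypothetical.

```python
import math

def shannon_index(counts):
    """H = -sum(p_i * log(p_i)) over species with nonzero counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)

def evenness(counts):
    """E = H / log(n), with n the number of observed species."""
    n = sum(1 for c in counts if c > 0)
    if n < 2:
        return 0.0  # log(1) = 0 leaves E undefined; 0 is returned here by convention
    return shannon_index(counts) / math.log(n)

even = [25, 25, 25, 25]   # four equally abundant species
skewed = [97, 1, 1, 1]    # one dominant species
print(shannon_index(even), evenness(even))
print(shannon_index(skewed), evenness(skewed))
```

Both communities have the same richness, but the skewed one has lower H and lower E.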

Slide 21

Slide 21 text

Non-parametric models

Slide 22

Slide 22 text

Chao1 (Chao 1984)
$\hat{N} = S_{obs} + \frac{n_1^2}{2 n_2}$

Slide 23

Slide 23 text

Variance of Chao1
$\mathrm{Var}(\hat{S}) = n_2 \left[ \frac{(n_1/n_2)^4}{4} + (n_1/n_2)^3 + \frac{(n_1/n_2)^2}{2} \right]$

Slide 24

Slide 24 text

Abundance-based Coverage Estimator
$\hat{S}_{ACE} = S_{abund} + \frac{S_{rare}}{C_{ACE}} + \frac{F_1}{C_{ACE}} \gamma^2_{ACE}$
where $C_{ACE} = 1 - \frac{F_1}{\sum_{i=1}^{10} i F_i}$
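
The sketch below shows one way the ACE formula could be evaluated from per-species counts. The slide does not define $\gamma^2_{ACE}$, so the code assumes the usual Chao-Lee coefficient-of-variation term; that definition, the function name `ace`, and the example counts are assumptions, not material from the slides.

```python
from collections import Counter

def ace(counts, rare_cutoff=10):
    """Abundance-based Coverage Estimator from per-species counts."""
    counts = [int(c) for c in counts if c > 0]
    freq = Counter(c for c in counts if c <= rare_cutoff)  # F_i for rare species
    s_abund = sum(1 for c in counts if c > rare_cutoff)
    s_rare = sum(freq.values())
    n_rare = sum(i * f for i, f in freq.items())
    f1 = freq.get(1, 0)
    if s_rare == 0:
        return float(s_abund)
    if f1 == n_rare:
        raise ValueError("all rare species are singletons, so C_ACE is zero")
    c_ace = 1 - f1 / n_rare
    # ASSUMPTION: gamma^2 follows the usual Chao-Lee form (not shown on the slide).
    spread = sum(i * (i - 1) * f for i, f in freq.items())
    gamma2 = max(s_rare / c_ace * spread / (n_rare * (n_rare - 1)) - 1, 0.0)
    return s_abund + s_rare / c_ace + f1 / c_ace * gamma2

# Hypothetical counts: six rare species and two abundant ones.
example = [1, 1, 1, 2, 2, 3, 15, 20]
print(ace(example))
```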

Slide 25

Slide 25 text

Example

>>> x1
array([ 2., 1., 1., 1., 10., 1., 1., 1., 1., 1.,
        1., 29., 1., 1., 3., 2., 1., 31., 11., 159.,
        408., 23., 62., 95., 42., 105., 4., 7910., 702., 13.,
        1., 2., 7., 1., 2., 3., 13., 1., 1., 2.,
        1., 6., 2., 8., 1., 1., 15., 16., 13., 3.,
        1., 3., 4., 1., 5., 4., 1., 2., 10., 4.,
        1., 3., 1., 1., 1., 8., 1., 1., 1., 2.,
        1., 2., 2., 1., 1., 1., 1., 1., 2., 1.,
        1., 3., 2., 5., 2., 1., 2., 229., 2., 1.,
        1., 4., 1., 2., 1., 1., 3., 1., 1., 1.,
        12., 5., 45., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 3., 1., 1., 1., 2., 11., 1., 1.,
        9., 1., 1., 1., 2., 4., 1., 2., 1., 1.,
        13., 4., 6., 44.])

Slide 26

Slide 26 text

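Example

>>> import numpy as np
>>> s = len(x1)                 # observed species
>>> n1 = len(x1[x1 == 1])       # singletons
>>> n2 = len(x1[x1 == 2])       # doubletons
>>> s, n1, n2
(134, 66, 19)
>>> chao1 = s + n1**2 / (2 * n2)    # Chao1 estimator, using true division
>>> print(f"{chao1:.2f}")
248.63
>>> r = n1 / n2                 # true division keeps the fractional ratio
>>> se = np.sqrt(n2 * (r**4 / 4 + r**3 + r**2 / 2))    # SE
>>> print(f"{se:.2f}")
40.03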

Slide 27

Slide 27 text

Chao2
$\hat{N} = S_{obs} + \left(1 - \frac{1}{t}\right) \frac{Q_1^2}{2 Q_2}$
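
A small sketch of Chao2 computed from a species-by-sample incidence matrix, reading $Q_1$ and $Q_2$ as the numbers of species found in exactly one and exactly two of the $t$ samples. The function name `chao2` and the example matrix are hypothetical.

```python
def chao2(incidence):
    """Chao2 from a list of rows (one per species) of 0/1 values (one per sample)."""
    t = len(incidence[0])                      # number of samples
    totals = [sum(row) for row in incidence]   # samples in which each species occurs
    s_obs = sum(1 for k in totals if k > 0)
    q1 = sum(1 for k in totals if k == 1)      # species seen in exactly one sample
    q2 = sum(1 for k in totals if k == 2)      # species seen in exactly two samples
    if q2 == 0:
        raise ValueError("Q2 is zero, so the formula is undefined")
    return s_obs + (1 - 1 / t) * q1 ** 2 / (2 * q2)

# Hypothetical incidence matrix: 5 species, 4 samples.
x = [
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
]
print(chao2(x))  # 5 observed species, Q1 = 2, Q2 = 2, t = 4
```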

Slide 28

Slide 28 text

sensitivity to library size
[Figure from Gihring et al. 2011]

Slide 29

Slide 29 text

Parametric models

Slide 30

Slide 30 text

Empirical distributions
[Figure: histogram with axes "# Individuals" and "# Species"]

Slide 31

Slide 31 text

Empirical distributions

Slide 32

Slide 32 text

detection parameter
$E(n_i) = N_i p_i$

Slide 33

Slide 33 text

detection parameter
$\hat{N}_i = n_i / \hat{p}_i$

Slide 34

Slide 34 text

abundance and detection
$p_{ij} = 1 - \prod_{k=1}^{n_j} (1 - p_{ijk})$
(individual $k$, species $j$, sample $i$)

Slide 35

Slide 35 text

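abundance and detection
$p_{ij} = 1 - \prod_{k=1}^{n_j} (1 - p_{ijk}) = 1 - (1 - p_{ij})^{n_j}$
(the second equality takes every individual to share one detection probability, written $p_{ij}$ inside the parentheses as on the slide)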
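
To make the link between abundance and detection concrete, the sketch below evaluates both forms of the expression. The per-individual probability is called `r` here to keep it distinct from the species-level probability; that name, the function names, and the numbers are hypothetical.

```python
import math

def detection_from_individuals(probs):
    """Species-level detection: 1 minus the chance that every individual is missed."""
    return 1 - math.prod(1 - p for p in probs)

def detection_equal_individuals(r, n):
    """Same quantity when all n individuals share the per-individual probability r."""
    return 1 - (1 - r) ** n

# With a small per-individual probability, detection rises quickly with abundance.
for n in (1, 10, 100):
    print(n, round(detection_equal_individuals(0.05, n), 3))
```

Because detection depends on abundance in this way, rare species are missed far more often than common ones, even with identical per-individual detectability.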

Slide 36

Slide 36 text

mark-recapture designs

Slide 37

Slide 37 text

unique markings

Slide 38

Slide 38 text

$n_1$: species in first sample
$n_2$: species in second sample
$m$: species in $n_2$ also seen in $n_1$

Slide 39

Slide 39 text

marked proportion in 2nd sample = proportion captured in 1st sample
$\frac{m}{n_2} = \frac{n_1}{N} = p_1$

Slide 40

Slide 40 text

Lincoln-Petersen estimator
$\hat{N} = \frac{n_1}{\hat{p}_1} = \frac{n_1 n_2}{m}$
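
A minimal sketch of the estimator, with hypothetical counts. The function name `lincoln_petersen` is not from the slides.

```python
def lincoln_petersen(n1, n2, m):
    """Estimate total richness N from two samples with m species in common."""
    if m == 0:
        raise ValueError("no species shared between samples; estimator undefined")
    p1_hat = m / n2          # estimated proportion captured in the first sample
    return n1 / p1_hat       # algebraically equal to n1 * n2 / m

# Hypothetical example: 40 species in sample 1, 50 in sample 2, 20 seen in both.
print(lincoln_petersen(40, 50, 20))
```

With these numbers the estimated capture proportion is 0.4, so the 40 species seen first are scaled up to an estimate of 100.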

Slide 41

Slide 41 text

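multinomial model
$P(n_1, n_2, m \mid N, p_1, p_2) = \frac{N!}{m!\,(n_1 - m)!\,(n_2 - m)!\,(N - n)!}$
$\quad \times (p_1 p_2)^{m} \, [p_1 (1 - p_2)]^{n_1 - m}$
$\quad \times [(1 - p_1) p_2]^{n_2 - m} \, [(1 - p_1)(1 - p_2)]^{N - n}$
where $n = n_1 + n_2 - m$ is the number of distinct species observed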
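
The multinomial model can be evaluated on the log scale as sketched below. This assumes $0 < p_1, p_2 < 1$ and uses $n = n_1 + n_2 - m$ for the number of distinct species seen. The function name and the example values are hypothetical.

```python
from math import lgamma, log

def log_likelihood(n1, n2, m, N, p1, p2):
    """Log of the two-sample multinomial probability of (n1, n2, m)."""
    n = n1 + n2 - m  # distinct species seen at least once
    if N < n:
        return float("-inf")
    cells = [
        (m, p1 * p2),                     # seen in both samples
        (n1 - m, p1 * (1 - p2)),          # seen in the first sample only
        (n2 - m, (1 - p1) * p2),          # seen in the second sample only
        (N - n, (1 - p1) * (1 - p2)),     # never seen
    ]
    ll = lgamma(N + 1)
    for count, prob in cells:
        ll += count * log(prob) - lgamma(count + 1)
    return ll

# Hypothetical data; the log-likelihood is scanned over a few candidate N.
for N in (70, 100, 130):
    print(N, round(log_likelihood(40, 50, 20, N, 0.4, 0.5), 3))
```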

Slide 42

Slide 42 text

multiple sampling occasions

Slide 43

Slide 43 text

incidence matrix

             sample
             1    2    3    4
species 1    1    1    0    0
species 2    0    1    1    1
species 3    0    0    0    1

Slide 44

Slide 44 text

simplest model: M0

observations    probability (π)
111             $p^3$
110             $p^2 (1 - p)$
101             $p^2 (1 - p)$
100             $p (1 - p)^2$
...             ...

$P(x_{ijk} \mid N, \pi_{ijk}) = \frac{N!}{\prod_{ijk} x_{ijk}!} \prod_{ijk} \pi_{ijk}^{x_{ijk}}$
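
The cell probabilities in the table can be generated for any number of occasions, as in this sketch for three occasions. The value of `p` and the function name are hypothetical.

```python
from itertools import product

def history_prob(history, p):
    """Probability of a 0/1 detection history under M0 (constant p)."""
    k = sum(history)                     # number of occasions with a detection
    return p ** k * (1 - p) ** (len(history) - k)

p = 0.3
probs = {h: history_prob(h, p) for h in product((0, 1), repeat=3)}
for h, pr in sorted(probs.items(), reverse=True):
    print("".join(map(str, h)), round(pr, 4))
```

The history 000 is included in the dictionary; it is the cell that is never observed, which is why N has to be estimated.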

Slide 45

Slide 45 text

Mh model: individual heterogeneity
$\{p_i\} \sim F(p)$

Slide 46

Slide 46 text

Mh model: expected multinomial probabilities
$\pi_j = \int_0^1 \frac{K!}{(K - j)!\,j!} \, p^j (1 - p)^{K - j} \, dF(p)$

Slide 47

Slide 47 text

Mh model: estimation
$\hat{N}_k = \sum_{j=1}^{K} a_{jk} f_j$
($a_{jk}$: jackknife coefficients; $f_j$: capture frequencies)
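
As one concrete case, the first-order jackknife uses the coefficients $a_{11} = 1 + (K - 1)/K$ and $a_{j1} = 1$ for $j > 1$, which is the same as adding $\frac{K - 1}{K} f_1$ to the observed richness. The sketch below assumes that first-order form only; the function name and frequencies are hypothetical.

```python
def jackknife1(frequencies):
    """First-order jackknife richness estimate.

    frequencies[j - 1] is f_j, the number of species detected on
    exactly j of the K sampling occasions.
    """
    K = len(frequencies)
    coeffs = [1 + (K - 1) / K] + [1.0] * (K - 1)   # a_j1 for j = 1..K
    return sum(a * f for a, f in zip(coeffs, frequencies))

# Hypothetical capture frequencies over K = 4 occasions (20 species observed).
f = [10, 6, 3, 1]
print(jackknife1(f))
```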

Slide 48

Slide 48 text

measures of community variation

Slide 49

Slide 49 text

community 1: $x_1^{(1)}, x_2^{(1)}, \ldots, x_J^{(1)}$
community 2: $x_1^{(2)}, x_2^{(2)}, \ldots, x_J^{(2)}$

Slide 50

Slide 50 text

relative richness
$\lambda_i^{(12)} = N_i^{(1)} / N_i^{(2)}$

Slide 51

Slide 51 text

relative richness
$\hat{\lambda}_i^{(12)} = \hat{N}_i^{(1)} / \hat{N}_i^{(2)}$

Slide 52

Slide 52 text

species co-occurrence
$\hat{\phi}_i^{(12)} = \hat{M}_i^{(2) \mid R_i^{(1)}} / R_i^{(1)}$
Cam et al. 2000

Slide 53

Slide 53 text

unshared species
$\hat{B}_i^{(12)} = \hat{N}_i^{(2)} - \hat{\phi}_i^{(12)} \hat{N}_i^{(1)}$
Cam et al. 2000

Slide 54

Slide 54 text

Occupancy models
Dorazio and Royle 2005
Dorazio and Royle 2006
MacKenzie et al. 2005

Slide 55

Slide 55 text

presence-absence data
1 0 1 1 1 0 1

Slide 56

Slide 56 text

presence-absence data
1 0 1 1 1 0 1

Slide 57

Slide 57 text

sample occurrence of species

Slide 58

Slide 58 text

Pr(observe species) = Pr(species detected | species present) × Pr(species present)

Slide 59

Slide 59 text

incidence matrix

               sample locations
               1      2      ...    J
species 1      x11    x12    ...    x1J
species 2      x21    x22    ...    x2J
...
species n      xn1    xn2    ...    xnJ

$x_{ij} = 0, 1, \ldots, K_J$ ($K_J$ samples from each location)

Slide 60

Slide 60 text

               1      2      ...    J
species 1      x11    x12    ...    x1J
species 2      x21    x22    ...    x2J
...
species n      xn1    xn2    ...    xnJ
species n+1    0      0      ...    0
...
species N      0      0      ...    0

Slide 61

Slide 61 text

               1      2      ...    J
species 1      x11    x12    ...    x1J      (observed)
species 2      x21    x22    ...    x2J      (observed)
...
species n      xn1    xn2    ...    xnJ      (observed)
species n+1    0      0      ...    0        (unobserved)
...
species N      0      0      ...    0        (unobserved)

Slide 62

Slide 62 text

X

               1      2      ...    J
species 1      x11    x12    ...    x1J      (observed)
species 2      x21    x22    ...    x2J      (observed)
...
species n      xn1    xn2    ...    xnJ      (observed)
species n+1    0      0      ...    0        (unobserved)
...
species N      0      0      ...    0        (unobserved)

Slide 63

Slide 63 text

               1          2          ...    J
species 1      z11        z12        ...    z1J
species 2      z21        z22        ...    z2J
...
species n      zn1        zn2        ...    znJ
species n+1    z(n+1)1    z(n+1)2    ...    z(n+1)J
...
species N      zN1        zN2        ...    zNJ

Slide 64

Slide 64 text

               1          2          ...    J
species 1      1          z12        ...    1
species 2      z21        1          ...    z2J
...
species n      1          zn2        ...    znJ
species n+1    z(n+1)1    z(n+1)2    ...    z(n+1)J
...
species N      zN1        zN2        ...    zNJ

Slide 65

Slide 65 text

Z

               1          2          ...    J
species 1      1          z12        ...    1
species 2      z21        1          ...    z2J
...
species n      1          zn2        ...    znJ
species n+1    z(n+1)1    z(n+1)2    ...    z(n+1)J
...
species N      zN1        zN2        ...    zNJ

Slide 66

Slide 66 text

modeling occurrence
$p(z_{ij} \mid \psi_{ij}) = \psi_{ij}^{z_{ij}} (1 - \psi_{ij})^{1 - z_{ij}}$
Bernoulli model

Slide 67

Slide 67 text

modeling detection (conditional)
if $z_{ij} = 1$:
$p(x_{ij} \mid z_{ij} = 1, \theta_{ij}) = \binom{K}{x_{ij}} \theta_{ij}^{x_{ij}} (1 - \theta_{ij})^{K - x_{ij}}$

Slide 68

Slide 68 text

joint probability
$p(x_{ij}, z_{ij} \mid \psi_{ij}, \theta_{ij}) = \left[ \binom{K}{x_{ij}} \theta_{ij}^{x_{ij}} (1 - \theta_{ij})^{K - x_{ij}} \right]^{z_{ij}} \times \psi_{ij}^{z_{ij}} (1 - \psi_{ij})^{1 - z_{ij}}$

Slide 69

Slide 69 text

marginal probability of observed species
$p(x_{ij} \mid \psi_{ij}, \theta_{ij}) = \psi_{ij} \binom{K}{x_{ij}} \theta_{ij}^{x_{ij}} (1 - \theta_{ij})^{K - x_{ij}} + (1 - \psi_{ij}) \, I(x_{ij} = 0)$
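
The marginal probability is a zero-inflated binomial: a zero count can arise either from a species that is present but missed on all K samples or from one that is absent. A minimal sketch, with hypothetical parameter values and function name:

```python
from math import comb

def marginal_prob(x, K, psi, theta):
    """Probability of x detections in K samples, summing over presence and absence."""
    binom = comb(K, x) * theta ** x * (1 - theta) ** (K - x)
    absent = (1 - psi) if x == 0 else 0.0   # absent species can only yield x = 0
    return psi * binom + absent

K, psi, theta = 5, 0.6, 0.3
for x in range(K + 1):
    print(x, round(marginal_prob(x, K, psi, theta), 4))
```

The probability at x = 0 exceeds the plain binomial value because it also carries the mass for absent species.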

Slide 70

Slide 70 text

models for detection and occupancy
$\mathrm{logit}(\psi_{ij}) = u_i + \alpha_j$
$\mathrm{logit}(\theta_{ij}) = v_i + \beta_j$
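
To show how the two logit-linear models generate data, the sketch below simulates a species-by-location count matrix from species effects (u, v) and location effects (α, β). All effect values, the seed, and the function names are hypothetical; this is a forward simulation only, not the estimation procedure of Dorazio and Royle.

```python
import math
import random

def inv_logit(value):
    """Map a logit-scale value back to a probability."""
    return 1 / (1 + math.exp(-value))

def simulate_counts(u, v, alpha, beta, K, seed=0):
    """Simulate x_ij: detections of species i in K samples at location j."""
    rng = random.Random(seed)
    counts = []
    for u_i, v_i in zip(u, v):
        row = []
        for a_j, b_j in zip(alpha, beta):
            psi = inv_logit(u_i + a_j)      # occurrence probability
            theta = inv_logit(v_i + b_j)    # detection probability given presence
            present = rng.random() < psi
            x = sum(rng.random() < theta for _ in range(K)) if present else 0
            row.append(x)
        counts.append(row)
    return counts

# Hypothetical effects: 3 species, 4 locations, K = 5 samples per location.
X = simulate_counts(u=[1.0, 0.0, -2.0], v=[0.5, -1.0, -1.5],
                    alpha=[0.0, 0.3, -0.3, 0.1], beta=[0.0, 0.2, -0.2, 0.0], K=5)
for row in X:
    print(row)
```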

Slide 71

Slide 71 text

from Dorazio and Royle 2005

Slide 72

Slide 72 text

from Dorazio and Royle 2005

Slide 73

Slide 73 text

Take-home points
1. Many diversity measures ignore incomplete or heterogeneous detection
2. Detection and presence are often confounded
3. Repeated sampling is an efficient approach to allow detection and occupancy to be estimated
4. Occupancy modeling is a flexible approach for estimating diversity