Estimating Microbial Diversity

Chris Fonnesbeck
February 01, 2012

Slides from Vanderbilt Microbiome Research Meeting on 31 January, 2012

Transcript

  1. [Figure: species frequency plots (axes: Species, Frequency). Sampling takes the community to the physical sample.]

    Wednesday, February 1, 12
  2. [Figure: species frequency plots (axes: Species, Frequency). Amplification takes the physical sample to the amplified sample.]
  3. [Figure: species frequency plots (axes: Species, Frequency). Sequencing takes the amplified sample to the identified species.]
  4. Shannon-Wiener Index

    $$H = -\sum_{i=1}^{n} p_i \log p_i$$

    "Evenness": $$E = \frac{H}{\log n}$$
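The index and its evenness can be sketched in a few lines. This is a minimal illustration, assuming natural logarithms and taking n as the number of species with nonzero counts (the slide fixes neither); the function name and the sample counts are hypothetical.

```python
import math

def shannon_index(counts):
    """Return (H, E) for a list of per-species counts."""
    total = float(sum(counts))
    # Relative abundances p_i; species with zero counts contribute nothing to H.
    props = [c / total for c in counts if c > 0]
    h = -sum(p * math.log(p) for p in props)
    n = len(props)
    # Evenness E = H / log n; set to 0 here when only one species is present.
    e = h / math.log(n) if n > 1 else 0.0
    return h, e

# Hypothetical communities with the same richness but different evenness.
h_even, e_even = shannon_index([5, 5, 5, 5])
h_skew, e_skew = shannon_index([17, 1, 1, 1])
print(h_even, e_even)
print(h_skew, e_skew)
```

A perfectly even community reaches the maximum H = log n, so its evenness is 1; the skewed community scores lower on both.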
  5. Chao1 (Chao 1984)

    $$\hat{N} = S_{obs} + \frac{n_1^2}{2 n_2}$$
  6. Variance of Chao1

    $$\mathrm{Var}(\hat{S}) = n_2 \left[ \frac{(n_1/n_2)^4}{4} + \left(\frac{n_1}{n_2}\right)^3 + \frac{(n_1/n_2)^2}{2} \right]$$
  7. Abundance-based Coverage Estimator

    $$\hat{S}_{ACE} = S_{abund} + \frac{S_{rare}}{C_{ACE}} + \frac{F_1}{C_{ACE}} \gamma^2_{ACE}$$

    where $$C_{ACE} = 1 - \frac{F_1}{\sum_{i=1}^{10} i F_i}$$
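A sketch of the estimator follows. The slide does not define $\gamma^2_{ACE}$, so the code assumes the commonly used squared coefficient of variation of the rare-species counts, floored at zero; the function name, the `rare_max` parameter (set to 10 to match the sum on the slide) and the sample counts are all illustrative.

```python
def ace(counts, rare_max=10):
    """Abundance-based coverage estimate of richness from per-species counts."""
    counts = [int(c) for c in counts if c > 0]
    rare = [c for c in counts if c <= rare_max]
    s_abund = len(counts) - len(rare)
    s_rare = len(rare)
    # freq[i] = F_i, the number of species observed exactly i times.
    freq = {i: rare.count(i) for i in range(1, rare_max + 1)}
    n_rare = sum(i * f for i, f in freq.items())
    if n_rare == 0:
        return float(s_abund)
    c_ace = 1.0 - freq[1] / float(n_rare)
    if c_ace == 0:
        raise ValueError("all rare species are singletons; coverage is zero")
    # Assumption (not on the slide): gamma^2 as a squared CV, floored at 0.
    num = sum(i * (i - 1) * f for i, f in freq.items())
    gamma2 = max((s_rare / c_ace) * num / (n_rare * (n_rare - 1.0)) - 1.0, 0.0)
    return s_abund + s_rare / c_ace + (freq[1] / c_ace) * gamma2

# Hypothetical counts: six rare species and one abundant species.
estimate = ace([1, 1, 1, 2, 2, 3, 20])
print(round(estimate, 3))
```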
  8. Example

    >>> x1
    array([2., 1., 1., 1., 10., 1., 1., 1., 1., 1., 1., 29.,
           1., 1., 3., 2., 1., 31., 11., 159., 408., 23., 62., 95.,
           42., 105., 4., 7910., 702., 13., 1., 2., 7., 1., 2., 3.,
           13., 1., 1., 2., 1., 6., 2., 8., 1., 1., 15., 16.,
           13., 3., 1., 3., 4., 1., 5., 4., 1., 2., 10., 4.,
           1., 3., 1., 1., 1., 8., 1., 1., 1., 2., 1., 2.,
           2., 1., 1., 1., 1., 1., 2., 1., 1., 3., 2., 5.,
           2., 1., 2., 229., 2., 1., 1., 4., 1., 2., 1., 1.,
           3., 1., 1., 1., 12., 5., 45., 1., 1., 1., 1., 1.,
           1., 1., 1., 1., 3., 1., 1., 1., 2., 11., 1., 1.,
           9., 1., 1., 1., 2., 4., 1., 2., 1., 1., 13., 4.,
           6., 44.])
  9. Example

    >>> import numpy as np
    >>> s = len(x1)               # observed species
    >>> n1 = len(x1[x1 == 1])     # singletons
    >>> n2 = len(x1[x1 == 2])     # doubletons
    >>> s, n1, n2
    (134, 66, 19)
    >>> r = n1 / float(n2)        # force true division of the integer counts
    >>> round(s + n1**2 / (2.0 * n2), 2)   # Chao1 estimator
    248.63
    >>> round(float(np.sqrt(n2 * (r**4 / 4 + r**3 + r**2 / 2))), 2)   # SE
    40.03
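The same calculation can be wrapped in a reusable function. This is a sketch rather than the author's code: the function name is hypothetical, and the input is a synthetic count vector built only to reproduce the summary values s = 134, n1 = 66, n2 = 19. The ratios are computed with true division; truncating integer division (as in Python 2) would instead give 248 and roughly 31.13.

```python
import math

def chao1(counts):
    """Return (estimate, standard error) from per-species counts."""
    s_obs = sum(1 for c in counts if c > 0)
    n1 = sum(1 for c in counts if c == 1)   # singletons
    n2 = sum(1 for c in counts if c == 2)   # doubletons
    if n2 == 0:
        raise ValueError("this form of Chao1 needs at least one doubleton")
    r = n1 / float(n2)                      # true division, never truncated
    estimate = s_obs + n1 ** 2 / (2.0 * n2)
    se = math.sqrt(n2 * (r ** 4 / 4 + r ** 3 + r ** 2 / 2))
    return estimate, se

# Synthetic counts: 66 singletons, 19 doubletons, 49 more-abundant species.
counts = [1] * 66 + [2] * 19 + [5] * 49
est, se = chao1(counts)
print(round(est, 2), round(se, 2))
```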
  10. Chao2

    $$\hat{N} = S_{obs} + \frac{(1 - 1/t)\, Q_1^2}{2 Q_2}$$
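As a rough sketch of the incidence-based version, assuming the usual reading of the symbols (t is the number of samples, Q1 and Q2 are the numbers of species found in exactly one and exactly two samples; the slide does not define them). The function name is hypothetical, and the data are the three-species, four-sample incidence matrix shown later in the deck.

```python
def chao2(incidence):
    """incidence: one row per species, one 0/1 entry per sample."""
    t = len(incidence[0])                      # number of samples
    totals = [sum(row) for row in incidence]   # samples in which each species occurs
    s_obs = sum(1 for k in totals if k > 0)
    q1 = sum(1 for k in totals if k == 1)      # species seen in exactly one sample
    q2 = sum(1 for k in totals if k == 2)      # species seen in exactly two samples
    if q2 == 0:
        raise ValueError("this form of Chao2 needs at least one duplicate")
    return s_obs + (1 - 1.0 / t) * q1 ** 2 / (2.0 * q2)

matrix = [
    [1, 1, 0, 0],   # species 1
    [0, 1, 1, 1],   # species 2
    [0, 0, 0, 1],   # species 3
]
estimate = chao2(matrix)
print(estimate)
```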
  11. [Figure: Empirical distributions (axes: # Individuals, # Species)]
  12. detection parameter

    $$\hat{N}_i = \frac{n_i}{\hat{p}_i}$$
  13. abundance and detection

    $$p_{ij} = 1 - \prod_{k=1}^{n_j} (1 - p_{ijk})$$

    for individual k, species j, sample i
  14. abundance and detection

    $$p_{ij} = 1 - \prod_{k=1}^{n_j} (1 - p_{ijk}) = 1 - (1 - p_{ij})^{n_j}$$
  15. $n_1$: species in first sample

    $n_2$: species in second sample

    $m$: species in $n_2$ also seen in $n_1$
  16. marked proportion in 2nd sample = proportion captured in 1st sample

    $$\frac{m}{n_2} = \frac{n_1}{N} = p_1$$
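Solving the proportion above for N gives the two-sample estimate N = n1 n2 / m. A small sketch, with a hypothetical function name and made-up counts:

```python
def two_sample_estimate(n1, n2, m):
    """Estimate total richness N and first-sample detection p1."""
    if m == 0:
        raise ValueError("no species shared between samples; N is not estimable")
    n_hat = n1 * n2 / float(m)   # from m / n2 = n1 / N
    p1_hat = m / float(n2)       # proportion of the second sample already seen
    return n_hat, p1_hat

# Hypothetical data: 40 species in sample 1, 50 in sample 2, 20 in both.
n_hat, p1_hat = two_sample_estimate(n1=40, n2=50, m=20)
print(n_hat, p1_hat)
```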
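  17. multinomial model

    $$P(n_1, n_2, m \mid N, p_1, p_2) = \frac{N!}{m!\,(n_1 - m)!\,(n_2 - m)!\,(N - n)!} \times (p_1 p_2)^m \,[p_1 (1 - p_2)]^{n_1 - m} \times [(1 - p_1) p_2]^{n_2 - m} \,[(1 - p_1)(1 - p_2)]^{N - n}$$

    where $n = n_1 + n_2 - m$ is the number of distinct species seen in either sample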
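The likelihood is easier to work with on the log scale. The sketch below assumes n = n1 + n2 - m (the number of distinct species seen) and detection probabilities strictly between 0 and 1; the function name and the counts are hypothetical.

```python
import math

def log_likelihood(n1, n2, m, N, p1, p2):
    """Log of the two-sample multinomial likelihood; requires 0 < p1, p2 < 1."""
    n = n1 + n2 - m                      # distinct species seen at least once
    if N < n:
        return float("-inf")             # N cannot be below the observed count
    # Cell counts: seen in both, first only, second only, never seen.
    cells = [m, n1 - m, n2 - m, N - n]
    probs = [p1 * p2, p1 * (1 - p2), (1 - p1) * p2, (1 - p1) * (1 - p2)]
    log_coef = math.lgamma(N + 1) - sum(math.lgamma(c + 1) for c in cells)
    return log_coef + sum(c * math.log(q) for c, q in zip(cells, probs))

# Hypothetical data: compare a few candidate values of N.
for N in (80, 100, 120):
    print(N, round(log_likelihood(40, 50, 20, N, 0.4, 0.5), 3))
```

With these counts the log-likelihood is higher near N = 100, the value implied by the two-sample proportion, than at 80 or 120.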
  18. incidence matrix

                 sample
                 1    2    3    4
    species 1    1    1    0    0
    species 2    0    1    1    1
    species 3    0    0    0    1
  19. simplest model: M0

    observations    probability (π)
    111             p^3
    110             p^2 (1-p)
    101             p^2 (1-p)
    100             p (1-p)^2
    ...             ...

    $$P(x_{ijk} \mid N, \pi) = \frac{N!}{\prod_{ijk} x_{ijk}!} \prod_{ijk} \pi_{ijk}^{x_{ijk}}$$
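The cell probabilities in the table follow one rule: a factor of p for every detection and 1 - p for every miss. A minimal sketch, with a hypothetical function name and an arbitrary p:

```python
from itertools import product

def m0_history_prob(history, p):
    """Probability of a capture history such as '110' under constant detection p."""
    caught = history.count("1")
    return p ** caught * (1 - p) ** (len(history) - caught)

p = 0.3   # hypothetical detection probability
K = 3     # sampling occasions
histories = ["".join(h) for h in product("01", repeat=K)]
probs = {h: m0_history_prob(h, p) for h in histories}
for h in sorted(probs, reverse=True):
    print(h, round(probs[h], 4))
```

The all-zero history "000" is the one that is never observed; its probability is what links the observed counts to the unknown N.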
  20. Mh model: expected multinomial probabilities

    $$\pi_j = \int_0^1 \frac{K!}{(K - j)!\, j!}\, p^j (1 - p)^{K - j}\, dF(p)$$
  21. Mh model estimation

    $$\hat{N}_k = \sum_{j=1}^{K} a_{jk} f_j$$

    where $a_{jk}$ are the jackknife coefficients and $f_j$ are the capture frequencies
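The slide leaves the coefficients unspecified. As one illustration, the sketch below assumes the first- and second-order jackknife forms of Burnham and Overton, which are linear combinations of the capture frequencies with fixed coefficients; the function name and the frequencies are hypothetical.

```python
def jackknife(freqs, K, order=1):
    """freqs[j-1] = f_j, the number of species detected on exactly j of K occasions."""
    s_obs = sum(freqs)                        # every observed species has j >= 1
    f1 = freqs[0]
    f2 = freqs[1] if len(freqs) > 1 else 0
    if order == 1:
        # Assumed first-order form: a_1 = 1 + (K - 1) / K, all other a_j = 1.
        return s_obs + f1 * (K - 1.0) / K
    if order == 2:
        # Assumed second-order form.
        return (s_obs + f1 * (2.0 * K - 3) / K
                - f2 * (K - 2.0) ** 2 / (K * (K - 1.0)))
    raise ValueError("only orders 1 and 2 are sketched here")

# Hypothetical capture frequencies over K = 5 occasions.
freqs = [10, 5, 3]
first = jackknife(freqs, K=5, order=1)
second = jackknife(freqs, K=5, order=2)
print(first, second)
```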
  22. community 1: $x_1^{(1)}, x_2^{(1)}, \ldots, x_J^{(1)}$

    community 2: $x_1^{(2)}, x_2^{(2)}, \ldots, x_J^{(2)}$
  23. relative richness

    $$\hat{\lambda}_i^{(12)} = \frac{\hat{N}_i^{(1)}}{\hat{N}_i^{(2)}}$$
  24. species co-occurrence (Cam et al. 2000)

    $$\hat{\phi}_i^{(12)} = \frac{\hat{M}_i^{(2) \mid R_i^{(1)}}}{R_i^{(1)}}$$
  25. unshared species (Cam et al. 2000)

    $$\hat{B}_i^{(12)} = \hat{N}_i^{(2)} - \hat{\phi}_i^{(12)} \hat{N}_i^{(1)}$$
  26. Occupancy models

    Dorazio and Royle 2005
    Dorazio and Royle 2006
    MacKenzie et al. 2005
  27. incidence matrix

                 sample locations
                 1      2      ...    J
    species 1    x11    x12           x1J
    species 2    x21    x22    ...    x2J
    ...
    species n    xn1    xn2           xnJ

    $x_{ij} = 0, 1, \ldots, K_J$ ($K_J$ samples from each location)
  28. Incidence matrix extended with all-zero rows:

                   1      2      ...    J
    species 1      x11    x12           x1J
    species 2      x21    x22    ...    x2J
    ...
    species n      xn1    xn2           xnJ
    species n+1    0      0             0
    ...                          ...
    species N      0      0             0
  29. Same matrix, with species 1 to n labeled "observed" and species n+1 to N labeled "unobserved".
  30. Same matrix and labels, with the whole matrix denoted X.
  31. Matrix of occurrence states:

                   1          2          ...    J
    species 1      z11        z12               z1J
    species 2      z21        z22        ...    z2J
    ...
    species n      zn1        zn2               znJ
    species n+1    z(n+1)1    z(n+1)2           z(n+1)J
    ...                                  ...
    species N      zN1        zN2               zNJ
  32. Same matrix, with some entries known to be 1:

                   1          2          ...    J
    species 1      1          z12               1
    species 2      z21        1          ...    z2J
    ...
    species n      1          zn2               znJ
    species n+1    z(n+1)1    z(n+1)2           z(n+1)J
    ...                                  ...
    species N      zN1        zN2               zNJ
  33. Same matrix as the previous slide, with the whole matrix denoted Z.
  34. modeling occurrence: Bernoulli model

    $$p(z_{ij} \mid \psi_{ij}) = \psi_{ij}^{z_{ij}} (1 - \psi_{ij})^{1 - z_{ij}}$$
  35. modeling detection (conditional), if $z_{ij} = 1$

    $$p(x_{ij} \mid z_{ij} = 1, \theta_{ij}) = \binom{K}{x_{ij}} \theta_{ij}^{x_{ij}} (1 - \theta_{ij})^{K - x_{ij}}$$
  36. joint probability

    $$p(x_{ij}, z_{ij} \mid \psi_{ij}, \theta_{ij}) = \left[ \binom{K}{x_{ij}} \theta_{ij}^{x_{ij}} (1 - \theta_{ij})^{K - x_{ij}} \right]^{z_{ij}} \times \psi_{ij}^{z_{ij}} (1 - \psi_{ij})^{1 - z_{ij}}$$
  37. marginal probability of observed species

    $$p(x_{ij} \mid \psi_{ij}, \theta_{ij}) = \psi_{ij} \binom{K}{x_{ij}} \theta_{ij}^{x_{ij}} (1 - \theta_{ij})^{K - x_{ij}} + (1 - \psi_{ij})\, I(x_{ij} = 0)$$
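The marginal probability is a binomial mixed with extra mass at zero. A minimal sketch, with a hypothetical function name and arbitrary parameter values:

```python
import math

def marginal_prob(x, K, psi, theta):
    """P(x detections in K samples), summing over the unknown occurrence state."""
    coef = math.factorial(K) // (math.factorial(x) * math.factorial(K - x))
    binom = coef * theta ** x * (1 - theta) ** (K - x)
    # A zero count arises either from absence or from presence with no detections.
    return psi * binom + (1 - psi) * (1.0 if x == 0 else 0.0)

K, psi, theta = 4, 0.6, 0.25   # hypothetical values
probs = [marginal_prob(x, K, psi, theta) for x in range(K + 1)]
print([round(q, 4) for q in probs])
```

The probabilities over x = 0, ..., K sum to one, and the zero class is inflated relative to a plain binomial by the 1 - ψ term.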
  38. models for detection and occupancy

    $$\mathrm{logit}(\psi_{ij}) = u_i + \alpha_j$$

    $$\mathrm{logit}(\theta_{ij}) = v_i + \beta_j$$
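Inverting the logit turns the additive effects into probabilities. The sketch below reads i as indexing species and j as indexing locations, following the incidence matrix earlier in the deck; all effect values are made up for illustration.

```python
import math

def inv_logit(eta):
    """Map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical effects for 3 species (i) and 2 locations (j).
u = [-1.0, 0.0, 2.0]     # species effects on occurrence
alpha = [0.5, -0.5]      # location effects on occurrence
v = [0.0, -2.0, 1.0]     # species effects on detection
beta = [0.0, 1.0]        # location effects on detection

# psi[i][j] and theta[i][j]: occurrence and detection probabilities.
psi = [[inv_logit(ui + aj) for aj in alpha] for ui in u]
theta = [[inv_logit(vi + bj) for bj in beta] for vi in v]
for row in psi:
    print([round(q, 3) for q in row])
```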
  39. Take-home points

    1. Many diversity measures ignore incomplete or heterogeneous detection
    2. Detection and presence are often confounded
    3. Repeated sampling is an efficient approach to allow detection and occupancy to be estimated
    4. Occupancy modeling is a flexible approach for estimating diversity