Community DNA fingerprinting

Etienne
February 22, 2013

Getting more out of Sanger sequencing.
The idea came from looking at Sanger sequence traces from heterozygous organisms. In these traces it was possible to identify the heterozygous loci visually and determine the two types present. We thought, and still wonder, whether it would be possible to extract the species or type composition, and possibly the frequencies, from Sanger sequencing of community amplicons.
This idea is currently shelved.
https://github.com/low-decarie/community-DNA-fingerprinting

Transcript

  1. Getting more out of
    Sanger sequencing
    Etienne Low-Decarie
    Corey Chivers
    Community
    DNA
    fingerprinting

  2. Microbial community composition
    (phyto)plankton
    soil
    human microbiome
    experimental ecology and evolution
    images: NHGRI and NASA

  3. Community fingerprint
    ›  Determine community composition from a community-level characteristic that results from the proportional sum of the characteristics of its constituent types.
    ›  Syn. community profiling
    [Diagram: constituent types weighted 3x, ½x and 2x]
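
The definition on this slide can be written as a single equation. The symbols are notation assumed here for illustration and do not appear in the slides: $C$ is the community-level characteristic, $T_i$ is the same characteristic measured on type $i$ alone, and $p_i$ is the proportion (or quantity) of type $i$ among $n$ types.

```latex
C = \sum_{i=1}^{n} p_i \, T_i
```

Fingerprinting then amounts to recovering the $p_i$ from a measured $C$ and known $T_i$.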

  4. Pigment fingerprinting
    bbe FluoroProbe Manual Page 10
    DETERMINATION OF DIFFERENT ALGAE
    The division of chlorophyceae (green algae) shows a broad maximum of fluorescence at the 470nm
    LED, which is caused by chlorophyll-a and -b. The cyanophyceae (blue-green algae) have their
    maximum at 610nm due to the photosynthetic antenna pigment phycocyanin. Cyanophyceae also
    contain chlorophyll-a, but there is low intensity at 470nm. This is due to the masking effect of the
    phycocyanin. Furthermore, the high peak at the 525nm region for the bacillariophyceae originates from
    xanthophyll fucoxanthin and for the dinophyceae from peridinin. The maxima at 470nm are caused by
    chlorophyll-a and -c. In our last analysed group, cryptophyceae, a significant maximum can be found at
    570nm, which originates from phycoerythrin.
    It is obvious from the figure below that it is not possible to distinguish bacillariophyceae and
    dinophyceae by their “fluorescent fingerprints”. But it can be clearly seen that it is possible to
    distinguish five groups of algae: chlorophyceae, cyanophyceae, dinophyceae and bacillariophyceae,
    cryptophyceae.
    Additionally, it should be mentioned that sometimes the phycocyanin content per cell in cyanophyceae
    varies. Nevertheless, the average fingerprint can be used to differentiate this division.
    [Figure: relative fluorescence intensity against excitation wavelength (440 to 600 nm) for cultures of Chlorophyceae, Dinophyceae, Bacillariophyceae, Cryptophyceae and Cyanophyceae, including Microcystis strains labelled Micro2 to Micro27]
    The fluorescence intensities of the 5 divisions divided by the intensity of the LED and normalised to the maximum
    intensity of each division. In this measurement, several species of the Microcystis cyanobacteria (abbreviation
    micro) were tested as well.
    images: Academy of Natural Sciences, BBE moldaenke and others

  5. Physiological fingerprint
    ›  Sole carbon source utilization
    ›  BIOLOG eco-plates
    ›  Toxicological
    ›  Antibiotic
    ›  Toxin and metals

  6. Chemical fingerprinting
    ›  Fatty acid
    ›  Isotope
    ›  Trophic structure signal
    images: Rebecca E. Drenovsky, Roger A. Duncan, Kate M. Scow, SPC ocean fisheries programme

  7. DNA fingerprints
    ›  Non-sequencing based
    ›  Terminal Restriction Fragment Length Polymorphism (T-RFLP)
    ›  Denaturing Gradient Gel Electrophoresis (DGGE)
    ›  Ribosomal Intergenic Spacer Analysis (RISA)
    ›  Sequencing based
    ›  High-throughput sequencing
    images: Xia Zhou, Gregory W. Schmidt, Roberto Danovaro and Antonio Pusceddu,
    Stephan Gantner, Anders F. Andersson, Laura Alonso-Sáez, Stefan Bertilsson

  8. Chain-termination (Sanger) sequencing
    images: Helmut Kae
    NAGTCGTCCA

  9. Heterozygote trace
    [Figure: three trace panels titled "Homozygote A", "Homozygote B" and "Homozygote A+B"; x-axis: datum, y-axis: value]

  10. Our naïve logical leap
    If we think it has not been done,
    chances are:
    •  Wrong search terms
    •  Can’t be done
    •  Shouldn’t be done

  11. Two-type community trace
    [Figure: three trace panels titled "Species A", "Species B" and "Community A+B"; x-axis: datum, y-axis: value; legend (variable): A, T, C, G]
    type: any taxonomic level with common primers and distinguishing loci
    genotype, species, genus…

  12. Two-type trace

  13. Quantification
    ›  Sum-of-squares optimization on base ratio
    $(\text{Community} - (\text{quantity}_1 \cdot \text{Type}_1 + \text{quantity}_2 \cdot \text{Type}_2))^2$
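
A minimal sketch of this sum-of-squares fit, under stated assumptions: each trace is stored as a positions-by-4 array with one column per base (A, T, C, G), and plain unconstrained least squares is used. The slide does not say which optimizer or data layout was used, so the function name `quantify` and the toy types below are hypothetical.

```python
import numpy as np


def quantify(community, references):
    """Estimate the quantity of each reference type in a community trace.

    community:  array of shape (positions, 4), one column per base.
    references: array of shape (types, positions, 4).
    Returns the quantities that minimise the sum of squared differences
    between the community and the weighted sum of the reference types.
    """
    community = np.asarray(community, dtype=float)
    references = np.asarray(references, dtype=float)
    # Flatten every trace so that each type becomes one column of the design matrix.
    design = references.reshape(references.shape[0], -1).T
    quantities, *_ = np.linalg.lstsq(design, community.ravel(), rcond=None)
    return quantities


# Two hypothetical types that differ at the second and third positions.
type_1 = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]], dtype=float)
type_2 = np.array([[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]], dtype=float)
community = 0.7 * type_1 + 0.3 * type_2

estimate = quantify(community, [type_1, type_2])
print(estimate)
```

Unconstrained least squares can return negative quantities on noisy data; a non-negative solver would be a natural substitute if that matters.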

  14. Simulation
    ›  Random species sequences
    ›  Sum species values for community
    ›  Vary
    ›  Number of species
    ›  Number of variable loci
    ›  Proportion of species available (sequenced) for fitting
    ›  Noise
    ›  …

  15. Simulation
    [Figure: axes are number of species in the community and number of variable loci]
    •  Robust if number of variable loci > number of species
    •  Better at predicting species with larger frequencies
    •  Can withstand 10% noise with little loss

  16. Quantification of visually aligned traces
    0.519 0.453
    R² > 0.95

  17. Traces are shifted
    ›  Alignment
    ›  Segment shift through 0 padding
    ›  Discontinuous/rough surface
    ›  Grid search
    ›  Very costly (time ~ range^parameters)
    ›  potential alternatives
    ›  human eye
    ›  differential evolution
    ›  particle swarm
    ›  integer programming
    [Diagram: trace segments shifted by padding with zeros]
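
The shift-by-zero-padding and grid-search idea can be sketched as below. It is a toy version under assumptions: traces are positions-by-4 arrays, shifts are whole positions, and the sequences, offsets and function names are hypothetical. One offset is searched per reference, so the grid holds (number of offsets) to the power (number of references) points, which is the cost the slide warns about.

```python
import itertools

import numpy as np


def one_hot(sequence):
    """Encode a base string as a (positions, 4) array, columns A, T, C, G."""
    return np.eye(4)[["ATCG".index(base) for base in sequence]]


def shift_with_zeros(trace, offset):
    """Shift a trace by a whole number of positions, padding with zeros."""
    trace = np.asarray(trace, dtype=float)
    shifted = np.zeros_like(trace)
    if offset > 0:
        shifted[offset:] = trace[:-offset]
    elif offset < 0:
        shifted[:offset] = trace[-offset:]
    else:
        shifted[:] = trace
    return shifted


def grid_search_alignment(community, references, max_shift):
    """Try every combination of offsets; return (offsets, quantities, rss)."""
    target = np.asarray(community, dtype=float).ravel()
    offsets = range(-max_shift, max_shift + 1)
    best = None
    for combo in itertools.product(offsets, repeat=len(references)):
        design = np.column_stack(
            [shift_with_zeros(ref, off).ravel() for ref, off in zip(references, combo)]
        )
        quantities, *_ = np.linalg.lstsq(design, target, rcond=None)
        rss = float(np.sum((target - design @ quantities) ** 2))
        if best is None or rss < best[2]:
            best = (combo, quantities, rss)
    return best


# Hypothetical reference sequences and a community built from shifted copies.
ref_a = one_hot("AGTCGTCCATTGACGGATCA")
ref_b = one_hot("TTGACCGATAGCTAACGGTC")
community = 0.6 * shift_with_zeros(ref_a, 1) + 0.4 * shift_with_zeros(ref_b, -2)

best_offsets, best_quantities, best_rss = grid_search_alignment(
    community, [ref_a, ref_b], max_shift=3
)
print(best_offsets, best_quantities, best_rss)
```

Because the residual changes abruptly from one integer offset to the next, gradient-based optimizers are a poor fit here, which is consistent with the alternatives listed on the slide.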

  18. Traces contain drift
    ›  Only segments can be aligned
    ›  Trace segmentation
    ›  Provides a population of estimates whose distribution is a potential measure of error
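
A rough sketch of quantification by segment, assuming non-overlapping segments of a fixed size and traces that are already aligned within each segment; drift itself is not modelled here. The function name `quantify_segments` and the example sequences are hypothetical.

```python
import numpy as np


def one_hot(sequence):
    """Encode a base string as a (positions, 4) array, columns A, T, C, G."""
    return np.eye(4)[["ATCG".index(base) for base in sequence]]


def quantify_segments(community, references, segment_size):
    """Fit the quantities separately on each non-overlapping segment.

    Returns an array of shape (segments, types): one estimate per segment.
    """
    community = np.asarray(community, dtype=float)
    references = np.asarray(references, dtype=float)
    estimates = []
    for start in range(0, community.shape[0] - segment_size + 1, segment_size):
        stop = start + segment_size
        design = references[:, start:stop].reshape(references.shape[0], -1).T
        quantities, *_ = np.linalg.lstsq(
            design, community[start:stop].ravel(), rcond=None
        )
        estimates.append(quantities)
    return np.array(estimates)


# Hypothetical two-type community, 20 positions, split into segments of 10.
ref_a = one_hot("AGTCGTCCATTGACGGATCA")
ref_b = one_hot("TTGACCGATAGCTAACGGTC")
community = 0.7 * ref_a + 0.3 * ref_b

estimates = quantify_segments(community, [ref_a, ref_b], segment_size=10)
spread = estimates.std(axis=0)  # per-type spread across segments
print(estimates.mean(axis=0), spread)
```

On real traces the spread across segments would be non-zero and could serve as the error measure the slide suggests.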

  19. Distribution of quantification with
    segmentation and alignment
    [Figure; x-axis: segmentation size]

  20. Next steps
    ›  Get your expert opinion/feedback
    ›  many potential issues but worth pursuing?
    ›  …
    ›  Show proportion calculated = true proportion
    ›  Dilution series
    ›  Volunteers for trace data?
    ›  …

  21. Acknowledgements
    BELL LAB
    FUSSMANN LAB
    LEUNG LAB
    etienne.webhop.org [email protected] [email protected]
    Contacts and code
    Adam Herman
    https://github.com/edielivon/community-DNA-fingerprinting
    Thomas Bureau
    Genome Quebec
