Computational analysis of metagenomic data: delineation of compositional features and screens for desirable enzymes

Promotionskolloquium - PhD defence

Konrad Förstner

February 04, 2009

Transcript

  1. Intro GC Protein detection Nitrilases NHases PKS THM
    Computational analysis of metagenomic data: delineation of
    compositional features and screens for desirable enzymes
    Konrad U. Förstner
    Bork Group, EMBL
    PhD defence (Promotionskolloquium)
    4 February 2009

  2.
    Table of contents
    1 Introduction
    2 GC content analysis
    3 Protein detection workflow
    4 Nitrilases
    5 Nitrile hydratases
    6 Polyketide synthases I
    7 Take home messages

  3.
    Table of contents
    1 Introduction
    2 GC content analysis
    3 Protein detection workflow
    4 Nitrilases
    5 Nitrile hydratases
    6 Polyketide synthases I
    7 Take home messages

  4.
    For the microbial ecologist, what can be cultured is the basis of his
    conception of what exists. This is exactly like learning about
    animals from visiting zoos.
    Carl Woese

  9.
    Great plate count anomaly
    Less than 1% of the microbes can be cultured under standard
    conditions.

  10.
    Metagenomics = culture-independent approaches

  11.
    Workflow of metagenomics sequencing

  12.
    Selected metagenomic data sets

  13.
    Challenges
    Usually low coverage
    Dominant species
    Short sequences
    Data size
    ⇒ storage-, memory- and CPU-intensive
    ⇒ existing software was not developed for data at this scale
    No standard protocols
    ⇒ data sets are hard to compare

  14.
    Table of contents
    1 Introduction
    2 GC content analysis
    3 Protein detection workflow
    4 Nitrilases
    5 Nitrile hydratases
    6 Polyketide synthases I
    7 Take home messages

  15.
    GC analysis
    GC content = percentage of guanine-cytosine base pairs in the DNA/RNA
    Influences, among other things:
    Melting temperature of DNA/RNA
    Codon usage
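
The definition above can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis code behind the talk: the function names `gc_content` and `pooled_gc_content` and the toy reads are hypothetical, and ignoring ambiguous bases such as N is an assumption made here.

```python
from collections import Counter


def gc_content(sequence: str) -> float:
    """Return the GC percentage of a DNA/RNA sequence.

    Assumption: only unambiguous bases (A, C, G, T, U) are counted,
    so gaps and ambiguity codes such as N do not affect the result.
    """
    counts = Counter(sequence.upper())
    gc = counts["G"] + counts["C"]
    at = counts["A"] + counts["T"] + counts["U"]
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def pooled_gc_content(reads) -> float:
    """GC percentage over all reads pooled together (length-weighted)."""
    return gc_content("".join(reads))


if __name__ == "__main__":
    reads = ["ATGCGC", "ATATNN", "GGCC"]  # hypothetical toy reads
    for read in reads:
        print(read, round(gc_content(read), 1))
    print("pooled", round(pooled_gc_content(reads), 1))
```

Pooling the reads before counting weights each read by its length, which matters for metagenomic data where read lengths vary; averaging per-read percentages would be a different statistic.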

  16.
    GC analysis - huge difference between soil and ocean water
    Foerstner et al., 2005

  17.
    GC analysis - further data confirm this finding
    Raes et al., 2007

  18.
    GC analysis - possible influencing factors
    Nitrogen availability
    Genome size
    Ultraviolet light exposure and repair mechanisms
    Codon usage of pioneers

  19.
    Table of contents
    1 Introduction
    2 GC content analysis
    3 Protein detection workflow
    4 Nitrilases
    5 Nitrile hydratases
    6 Polyketide synthases I
    7 Take home messages

  20.
    Metagenomic data sets as a resource for biotech enzymes
    Many microbial enzymes are essential tools in, e.g., the chemical,
    pharmaceutical and food industries
    Searching metagenomic data sets might reveal new, potent
    members of known enzyme classes

  21.
    Protein detection and classification workflow

  22.
    Nitrilases
    Nitrile + water → carboxylic acid + ammonia
    One protein
    Application in the chemical industry
    Stereo- and regio-specific conversion of nitriles
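
Written out with stoichiometry, the net nitrilase reaction for a generic nitrile reads as follows. R stands for an arbitrary organic residue; it is a placeholder introduced here, not notation from the slide.

```latex
\[
  \mathrm{R{-}C{\equiv}N} + 2\,\mathrm{H_2O}
  \;\longrightarrow\;
  \mathrm{R{-}COOH} + \mathrm{NH_3}
\]
```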

  23.
    Nitrilases - new members and subfamilies found
    Raes et al., 2007

  24.
    Table of contents
    1 Introduction
    2 GC content analysis
    3 Protein detection workflow
    4 Nitrilases
    5 Nitrile hydratases
    6 Polyketide synthases I
    7 Take home messages

  25.
    NHases
    Nitrile hydratases (NHases)
    Nitrile + water → amide
    Two domains
    Application in the chemical industry
    Acrylamide >30,000 tons/year
    Nicotinamide >3,500 tons/year
    Waste water treatment
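
As a sketch, the NHase reaction for a generic nitrile, followed by the acrylamide case named above (hydration of acrylonitrile). R is again a placeholder for an arbitrary organic residue and is not notation from the slide.

```latex
\[
  \mathrm{R{-}C{\equiv}N} + \mathrm{H_2O}
  \;\longrightarrow\;
  \mathrm{R{-}CONH_2}
\]
\[
  \mathrm{CH_2{=}CH{-}C{\equiv}N} + \mathrm{H_2O}
  \;\longrightarrow\;
  \mathrm{CH_2{=}CH{-}CONH_2}
\]
```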

  26.
    NHases - tree of the α domain
    Foerstner et al., 2008

  27.
    NHases - taxonomy of Monosiga brevicollis

  28.
    NHases - in Monosiga brevicollis
    Foerstner et al., 2008

  29.
    Table of contents
    1 Introduction
    2 GC content analysis
    3 Protein detection workflow
    4 Nitrilases
    5 Nitrile hydratases
    6 Polyketide synthases I
    7 Take home messages

  30.
    PKS I
    Polyketide synthases (PKS) produce a heterogeneous group of
    secondary metabolites
    The synthesis is similar to fatty acid synthesis
    Multiple domains
    We focused on polyketide synthases type I (PKS I)

  31.
    PKS I - polyketide synthesis steps
    The picture on this slide was removed due to copyright restrictions.
    Jenke-Kodama et al., 2005

  32.
    PKS I - examples of polyketides
    Erythromycin Oleandomycin Aflatoxin B1

  33.
    PKS I - tree of the AT domain HMM hits
    Foerstner et al., 2008

  34.
    PKS I - tree overview
    Foerstner et al., 2008

  35.
    PKS I - hit distribution
    Foerstner et al., 2008

  36.
    PKS I - PKSs per genome
    Foerstner et al., 2008

  37.
    Table of contents
    1 Introduction
    2 GC content analysis
    3 Protein detection workflow
    4 Nitrilases
    5 Nitrile hydratases
    6 Polyketide synthases I
    7 Take home messages

  38.
    Take home messages
    Metagenomics ...
    ... might help us to explore the complete microbial world
    ... still has many technical challenges
    ... can reveal the environmental influence on genomic features
    ... can help discover new enzymes

  39.
    Acknowledgements
    Peer Bork
    Thomas Dandekar
    Lars Steinmetz
    Toby Gibson
    The whole Bork group, esp. Jeroen Raes and Takuji Yamada
    Christian von Mering
    Melly
    My friends and family

  40.
    Image sources/attribution - part 1/2
    Orangutan Houston Zoo http://flickr.com/photos/billtex48/2178056762/ by (Bill and Mavis) -
    B&M
    Opel Zoo 07.07.2007 http://flickr.com/photos/lamberty/754218458 by frijolito75
    Giraffe http://flickr.com/photos/abelle/280246250/ by A.Bell
    Snuggling http://flickr.com/photos/buckwoo/2421562192/ by Ken W!
    Delicious Dead Bee and Hungry Ants http://flickr.com/photos/hamed/176176998/ by Hamed Saber
    hundreds of fish swarm a soft coral head http://flickr.com/photos/g-na/370131126/ by g-na
    hunt is on http://flickr.com/photos/doug88888/2930690305/ by doug88888
    Long-billed Curlew http://flickr.com/photos/mikebaird/3011987508/ by mikebaird
    145ps 01087.jpg http://flickr.com/photos/ricephotos/2679758872/ by IRRI Images
    Polymicrobic biofilm epifluorescence
    http://commons.wikimedia.org/wiki/File:Polymicrobic_biofilm_epifluorescence.jpg
    The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific
    Rusch DB, Halpern AL, Sutton G, Heidelberg KB, Williamson S, et al. PLoS Biology Vol. 5, No. 3, e77
    doi:10.1371/journal.pbio.0050077
    green farm http://flickr.com/photos/nakae/204037619/ by nakae
    Acid Mine Drainage http://flickr.com/photos/savethewildup/400614071/ by savethewildup
    blue ocean http://flickr.com/photos/coolskipper/27242821/ by coolskipper
    Digestive system http://commons.wikimedia.org/wiki/File:Digestive_system_whitout_labels.svg
    by Mariana Ruiz Villarreal
    Pg166 bioreactor http://commons.wikimedia.org/wiki/File:Pg166_bioreactor.jpg

  41.
    Image sources/attribution - part 2/2
    Big Drop-Off [...] http://flickr.com/photos/ctsnow/113339176/ by ctsnow
    Sphaeroeca-colony http://commons.wikimedia.org/wiki/File:Sphaeroeca-colony.jpg by Dhzanette
    Ocean view http://flickr.com/photos/provoost/399669002/ by Sjors Provoost
    The hurdles http://flickr.com/photos/29621494N02/3060466344/ by paula fisher
    Erythromycin http://de.wikipedia.org/w/index.php?title=Datei:Erythrommycin_A_B_C.svg by Yikrazuul
    Aflatoxin B1 http://de.wikipedia.org/w/index.php?title=Datei:Aflatoxin_B1.svg&filetimestamp=20070113042046 by Bryan Derksen
    Oleandomycin http://en.wikipedia.org/wiki/File:Oleandomycin.png by Edgar181
    Tool rack http://en.wikipedia.org/wiki/File:Oleandomycin.png by L. Marie
    Collaboration http://flickr.com/photos/fncll/145149313/ by ChrisL AK
    Base pair AT http://commons.wikimedia.org/wiki/File:Base_pair_AT.svg
    Base pair GC http://commons.wikimedia.org/wiki/File:Base_pair_GC.svg

  42.
    About this document
    Created in LaTeX using the beamer class, TeX Live and Emacs.
    All these programs run on OpenBSD.
    http://www.latex-project.org
    http://latex-beamer.sourceforge.net
    http://www.tug.org/texlive/
    http://www.gnu.org/software/emacs
    http://www.gimp.org/
    http://www.openbsd.org
    Published under the Creative Commons Attribution 3.0 License
    http://creativecommons.org/licenses/by/3.0/
    Document version 1.0 2009/02/04
