Slide 1

Slide 1 text

The Protist Ribosomal Reference database
Daniel Vaulot and the PR2 team
Protist-Online - 2020-06-24

Slide 2

Slide 2 text

Outline
- The explosion of metabarcoding
- PR2
- Major uses
- A database of metabarcodes: meta PR2
- What's next?

Slide 3

Slide 3 text

Metabarcoding

Slide 4

Slide 4 text

Principle
Modified from Ruiz-Trillo, I. & Ferrer-Bonet, M. ¿Con quién compartimos el planeta? Investigación y Ciencia 56–60 (2018)

Slide 5

Slide 5 text

Target gene
- 18S rRNA
- ITS
- 16S plastid
- rbcL

Slide 6

Slide 6 text

Assignment
Reference database
- GenBank: taxonomy very bad
- Silva: OK for prokaryotes, bad for eukaryotes

Slide 7

Slide 7 text

The PR2 database

Slide 8

Slide 8 text

History

Slide 9

Slide 9 text

Team

Slide 10

Slide 10 text

Key features - Version 4.12.0 (08-2019)
- Unified taxonomy (8 ranks from kingdom to species)
- Web site: https://pr2-database.org/
- 177 934 sequences of nuclear 18S rRNA
- 6 010 sequences of plastid 16S rRNA (PhytoRef)
- Quality control (e.g. > 500 bp, N < 20, no "NNN")
- Metadata (e.g. coordinates, environment)
- Available as flat file or as R package
Guillou, L., Bachar, D., Audic, S., Bass, D., Berney, C., Bittner, L., Boutte, C. et al. 2013. The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote Small Sub-Unit rRNA sequences with curated taxonomy. Nucleic Acids Res. 41:D597–604.
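The quality-control thresholds quoted on this slide can be illustrated with a small filter. This is only a sketch in Python, not the R scripts the PR2 team uses: the function name and parameters are hypothetical, and it assumes "> 500 bp" means strictly longer than 500 bases and "N < 20" means fewer than 20 ambiguous N bases in total.

```python
def passes_quality_control(sequence, min_length=500, max_ambiguous=20, forbidden_run="NNN"):
    """Return True if a sequence meets the three example thresholds from the slide.

    Assumed reading of the slide (hypothetical helper, not PR2 code):
      - length strictly greater than min_length
      - fewer than max_ambiguous 'N' characters in total
      - no run of three consecutive 'N' characters
    """
    seq = sequence.upper()
    if len(seq) <= min_length:
        return False
    if seq.count("N") >= max_ambiguous:
        return False
    if forbidden_run in seq:
        return False
    return True


# Toy sequences, made up for illustration only.
good = "ACGT" * 150                      # 600 bp, no ambiguous bases
too_short = "ACGT" * 100                 # 400 bp
has_run = "ACGT" * 150 + "NNN"           # long enough, but contains "NNN"
too_many_n = ("ACGT" * 10 + "N") * 25    # 25 isolated N bases

for name, seq in [("good", good), ("too_short", too_short),
                  ("has_run", has_run), ("too_many_n", too_many_n)]:
    print(name, passes_quality_control(seq))
```

Keeping the thresholds as keyword arguments makes it easy to test stricter or looser settings on the same sequences.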

Slide 11

Slide 11 text

Management
- MySQL database
- R scripts for: importing, exporting, validating
Data provided
- metabarcoding (dada2, QIIME)
- fasta (phylogeny)
- R package
https://github.com/pr2database/pr2database/releases
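To make the dada2 export concrete: as far as I know, dada2's assignTaxonomy function expects a training fasta in which each header is the full lineage, with ranks separated by semicolons and a trailing semicolon. The sketch below writes one such record in Python. The function name is hypothetical, the lineage values are placeholders rather than real PR2 taxa, and the actual PR2 export is done with R scripts that are not shown in the slides.

```python
def format_dada2_record(lineage, sequence):
    """Build one fasta record with a semicolon-delimited taxonomy header.

    lineage: list of taxon names, ordered from the highest rank to the lowest.
    Hypothetical helper for illustration; not part of PR2 or dada2.
    """
    if any(";" in taxon for taxon in lineage):
        raise ValueError("taxon names must not contain ';'")
    header = ">" + ";".join(lineage) + ";"
    return header + "\n" + sequence.upper() + "\n"


# Placeholder lineage with 8 ranks (kingdom to species); names are made up.
lineage = ["rank1", "rank2", "rank3", "rank4", "rank5", "rank6", "rank7", "rank8"]
record = format_dada2_record(lineage, "acgtacgt")
print(record, end="")
```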

Slide 12

Slide 12 text

R package
https://pr2database.github.io/pr2database/articles/pr2database.html

Slide 13

Slide 13 text

Annotation - Contributions
https://pr2-database.org/documentation/pr2-taxonomic-groups

Slide 14

Slide 14 text

Annotation - EukRef (J. del Campo)
https://pr2-database.org/eukref/about/

Slide 15

Slide 15 text

Version 5.0.0 - July 2020
Groups reannotated
- Dinoflagellates
- Diatoms, Chrysophyceae, Pelagophyceae
- Foraminifera
Adl SM. et al. 2019. Revisions to the Classification, Nomenclature, and Diversity of Eukaryotes. Journal of Eukaryotic Microbiology 66:4–119.
Burki F., Roger AJ., Brown MW., Simpson AGB. 2019. The New Tree of Eukaryotes. Trends in Ecology & Evolution
Taxonomy goes from 8 to 9 levels
- kingdom -> domain
- division / subdivision / class
New sequences
- 18S nuclear: 300,000
  - Silva and GenBank sequences not yet integrated into PR2
  - assigned with dada2
- 18S nucleomorph: 250
- 16S mitochondria

Slide 16

Slide 16 text

Using PR2

Slide 17

Slide 17 text

More than 380 papers citing PR2
https://pr2-database.org/papers/papers-citing-pr2

Slide 18

Slide 18 text

Metabarcoding
Applied domains
Gutters of Paris
Human microbiome
Forensics
Marine
Freshwater
Soil

Slide 19

Slide 19 text

Primer database
In silico analysis
https://github.com/pr2database/pr2-primers/wiki/18S-rRNA-primers
Geisen, Vaulot et al. A user guide to environmental protistology: primers, metabarcoding, sequencing, and analyses. bioRxiv https://doi.org/10.1101/850610.
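An in silico primer analysis boils down to scanning reference sequences for positions where a (possibly degenerate) primer matches within a mismatch budget. The Python sketch below shows that idea with standard IUPAC ambiguity codes. It is an illustration only and not the method behind the PR2 primer database: the function names are hypothetical, the primer and target are toy sequences, and reverse primers would need reverse-complementing first.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, window):
    """Count positions where the target base is not allowed by the primer code.

    An ambiguous base in the target (e.g. 'N') is counted as a mismatch.
    """
    return sum(1 for p, t in zip(primer, window) if t not in IUPAC[p])


def find_primer_sites(primer, sequence, max_mismatches=0):
    """Return (start, mismatches) for every window within the mismatch budget."""
    primer, sequence = primer.upper(), sequence.upper()
    hits = []
    for start in range(len(sequence) - len(primer) + 1):
        window = sequence[start:start + len(primer)]
        mismatches = count_mismatches(primer, window)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Toy example: R stands for A or G, so both ACGT and ACAT match exactly.
toy_primer = "ACRT"
toy_target = "TTACGTTACATT"
exact_hits = find_primer_sites(toy_primer, toy_target)
print(exact_hits)
```

Running the same scan over every reference sequence of a taxonomic group gives the fraction of sequences a primer would amplify in principle, which is the kind of summary an in silico analysis reports.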

Slide 20

Slide 20 text

A database of metabarcodes: meta PR2

Slide 21

Slide 21 text

Many metabarcoding data sets available
- Ocean Sampling Day (OSD)
- Malaspina
- Tara Oceans
- individual studies
But hard to use together...
- Processed with different pipelines
- Different levels of similarity
- Different reference databases
- Metadata lacking

Slide 22

Slide 22 text

meta PR2
- Download public data
  - Raw sequences (fastq)
  - Metadata
- Reprocess
  - Amplicon Sequence Variant (dada2)
  - Different datasets can be merged
- Stored in MySQL database
- Processed with R scripts
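The reason ASV-based datasets can be merged is that each ASV is an exact sequence, so the sequence itself can serve as a shared key across studies (provided the same gene region and trimming are used). The sketch below illustrates this in Python with toy data. The table layout, function name, and sample names are assumptions made for illustration; the actual meta PR2 processing uses R scripts and a MySQL database.

```python
from collections import defaultdict


def merge_asv_tables(*tables):
    """Merge ASV count tables keyed by exact sequence.

    Each table maps sequence -> {sample_name: read_count}.
    Counts for the same sequence and sample are summed.
    Hypothetical helper for illustration only.
    """
    merged = defaultdict(dict)
    for table in tables:
        for sequence, sample_counts in table.items():
            for sample, count in sample_counts.items():
                merged[sequence][sample] = merged[sequence].get(sample, 0) + count
    return dict(merged)


# Toy tables from two made-up datasets sharing one ASV ("ACGT").
dataset_a = {"ACGT": {"a_sample1": 10}, "GGCC": {"a_sample1": 3}}
dataset_b = {"ACGT": {"b_sample1": 7}, "TTAA": {"b_sample1": 2}}

merged = merge_asv_tables(dataset_a, dataset_b)
for sequence, counts in sorted(merged.items()):
    print(sequence, counts)
```

With OTUs clustered at different similarity levels this kind of key-based merge is not possible, because cluster identities depend on the dataset they were built from.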

Slide 23

Slide 23 text

Status of the database
- Datasets included: 32
  - V4: OSD, Malaspina, Polar regions
  - V9: Tara Oceans
- Samples: 5,094
- ASVs: 126,669

Slide 24

Slide 24 text

Pelagophyceae
Key picophytoplankton group in oceanic waters.
Cabello AM. et al. 2018. Pelagophyte assemblages of the global ocean display low intraspecific diversity. in prep.
Andersen RA., Saunders GW., Paskind MP., Sexton J. 1993. Ultrastructure and 18S rRNA gene sequence for Pelagomonas calceolata gen. and sp. nov. and the description of a new algal class, the Pelagophyceae classis nov. Journal of Phycology 29:701–715.

Slide 25

Slide 25 text

What is next
- Full rRNA operon
- Annotation of specific groups
  - Contributors
  - EukRef
- Metadata
- Phenotypes

Slide 26

Slide 26 text

Acknowledgments
- Biomarks (EU)
- Moore Foundation (EukRef)
- CNRS
- Sorbonne Université
- Nanyang Technological University

Slide 27

Slide 27 text

pr2-database.org