PhD Europe 2018

Tools, simulations and trees for scRNA-seq
Luke Zappia @_lazappi_

Parkville precinct

MCRI Bioinformatics
Software: bpipe, Corset, Lace, Necklace, GOseq, Splatter, clinker, JAFFA, Cpipe, Ximmer, Schism, missMethyl, STRetch, scRNA-tools, clustree
Areas: Structural, Clinical, STRs, Single-cell, Pipelines, Gene sets, Fusions, Assembly, superTranscripts, Methylation, Visualisation

MCRI Kidney Development
Image: OpenStax College, CC BY 3.0, via Wikimedia Commons

Kidney organoids
Figure: differentiation timeline from iPSCs to organoid. Days: 0, 4, 7, 10, 18, 25. Steps labelled: CHIR, FGF9, FGF9, CHIR, Form pellets, No GF.

1. Tools
2. Simulations
3. Clustering trees
4. Analysis

Dataset
- 4 Organoids
- 10x Chromium
- 2 Batches (3 + 1)
- 7937 cells (6649 + 1288)
- Identify cell types

1. Tools

Svensson et al. DOI: 10.1038/nprot.2017.149

www.scRNA-tools.org
“Exploring the single-cell RNA-seq analysis landscape with the scRNA-tools database” PLoS Computational Biology (2018) DOI: 10.1371/journal.pcbi.1006245

2. Simulations

Simulations
Provide a truth to test against
BUT
- Often poorly documented and explained
- Not easily reproducible or reusable
- Don’t demonstrate similarity to real data

“Splatter: simulation of single-cell RNA sequencing data” Genome Biology (2017) DOI: 10.1186/s13059-017-1305-0

Splat
- Negative binomial
- Expression outliers
- Defined library sizes
- Mean-variance trend
- Dropout
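The ingredients listed for Splat can be illustrated with a small base R sketch. This is a deliberately simplified toy, not Splatter's implementation: every parameter value, the form of the mean-variance trend and the logistic dropout curve are assumptions chosen only to show how the pieces fit together.

```r
# Toy Splat-like simulation in base R (hypothetical values, not Splatter's code)
set.seed(1)
n_genes <- 200
n_cells <- 50

# Gene means drawn from a gamma distribution (shape and rate are made up)
gene_means <- rgamma(n_genes, shape = 0.6, rate = 0.3)

# Expression outliers: replace a small random fraction of means with
# inflated values based on the median mean
is_outlier <- rbinom(n_genes, size = 1, prob = 0.05) == 1
gene_means[is_outlier] <- median(gene_means) *
  rlnorm(sum(is_outlier), meanlog = 4, sdlog = 0.5)

# Defined library sizes: log-normal expected total counts per cell
lib_sizes <- rlnorm(n_cells, meanlog = 9, sdlog = 0.3)

# Expected counts: gene proportions scaled to each cell's library size
cell_means <- outer(gene_means / sum(gene_means), lib_sizes)

# Mean-variance trend: variability (BCV) shrinks as the mean grows
bcv <- 0.1 + 1 / sqrt(cell_means)

# Negative binomial counts as a gamma-Poisson mixture
adj_means <- matrix(
  rgamma(n_genes * n_cells, shape = 1 / bcv^2, scale = cell_means * bcv^2),
  nrow = n_genes
)
counts <- matrix(rpois(n_genes * n_cells, lambda = adj_means), nrow = n_genes)

# Dropout: logistic curve in log mean, so low means are zeroed more often
k <- -1   # assumed steepness
x0 <- 0   # assumed midpoint
drop_prob <- 1 / (1 + exp(-k * (log(cell_means) - x0)))
keep <- rbinom(n_genes * n_cells, size = 1, prob = 1 - drop_prob)
counts_dropout <- counts * keep
```

Comparing mean(counts == 0) with mean(counts_dropout == 0) shows how much of the sparsity comes from the dropout step rather than from the negative binomial itself.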

Simulation models
- Simple - Negative binomial
- Lun - NB with cell factors DOI: 10.1186/s13059-016-0947-7
- Lun2 - Sampled NB with batch effects DOI: 10.1093/biostatistics/kxw055
- scDD - NB with bimodality DOI: 10.1186/s13059-016-1077-y
- BASiCS - NB with spike-ins DOI: 10.1371/journal.pcbi.1004333
- mfa - Bifurcating pseudotime trajectory DOI: 10.12688/wellcomeopenres.11087.1
- PhenoPath - Pseudotime with gene types DOI: 10.1038/s41467-018-04696-6
- ZINB-WaVE - Sophisticated ZINB DOI: 10.1186/s13059-018-1406-4
- SparseDC - Clusters across two conditions DOI: 10.1093/nar/gkx1113

Using Splatter

library(splatter)

# 1. Estimate parameters from the real dataset (real.data)
params1 <- splatEstimate(real.data)
params2 <- simpleEstimate(real.data)

# 2. Simulate
sim1 <- splatSimulate(params1, ...)
sim2 <- simpleSimulate(params2, ...)

# 3. Compare
datasets <- list(Real = real.data, Splat = sim1, Simple = sim2)
comp <- compareSCESets(datasets)
diff <- diffSCESets(datasets, ref = "Real")

Figure: Distribution of mean expression, as mean log2(CPM + 1), for Real, Splat, Splat (Drop), Simple, Lun, Lun2, Lun2 (ZINB), scDD, BASiCS, mfa, PhenoPath, SparseDC and ZINB-WaVE.
Figure: Difference in mean expression, as rank difference in mean log2(CPM + 1), for each simulation.

Figure: Rank of MAD from real data for each simulation (Splat, Splat (Drop), Simple, Lun, Lun2, Lun2 (ZINB), scDD, BASiCS, mfa, PhenoPath, SparseDC, ZINB-WaVE) across Mean, Variance, Mean-Variance, Library size, % Zeros (Cell), % Zeros (Gene) and Mean-Zeros.
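One way to read "Rank of MAD from real data" is sketched below in base R. It assumes that the rank difference compares sorted values, and that MAD means the median absolute difference from the real data; both are assumptions about the figure, and the datasets (SimA, SimB, SimC) are made-up stand-ins rather than results.

```r
# Hypothetical sketch: rank simulations by deviation from real data
set.seed(1)
n_genes <- 500

# Made-up stand-ins for a per-gene property such as mean log2(CPM + 1)
real <- rgamma(n_genes, shape = 2, rate = 0.5)
sims <- list(
  SimA = rgamma(n_genes, shape = 2, rate = 0.5),  # same distribution as real
  SimB = rgamma(n_genes, shape = 2, rate = 0.4),  # moderately different
  SimC = rgamma(n_genes, shape = 4, rate = 0.5)   # very different
)

# Rank difference: sort both vectors so values of equal rank are compared
rank_diff <- function(sim, ref) sort(sim) - sort(ref)

# Median absolute difference from the real data for each simulation
mads <- sapply(sims, function(sim) median(abs(rank_diff(sim, real))))

# Rank 1 is the simulation closest to the real data for this property
mad_ranks <- rank(mads)
```

Repeating this for each property (variance, library size, proportion of zeros and so on) gives one rank per simulation per property, which is the layout the figure describes.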

countsimQC
- DESeq2 dispersions
- Feature correlations
Soneson et al. DOI: 10.1093/bioinformatics/btx631

https://github.com/YosefLab/SymSim
https://github.com/bvieth/powsimR

Complex simulations
- Groups
- Batches
- Paths

3. Clustering trees

Clustering methods > 25% of all tools

How many clusters?

A tree of clusters?

“Clustering trees: a visualisation for evaluating clusterings at multiple resolutions” GigaScience (2018) DOI: 10.1093/gigascience/giy083
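The idea behind a clustering tree can be sketched in base R: cluster the same samples at two resolutions, then link clusters that share samples. The toy data, the use of k-means and the column names below are assumptions for illustration; this is not the clustree implementation.

```r
# Hypothetical sketch of clustering tree edges between two resolutions
set.seed(1)

# Toy data: three well-separated groups of 50 points in two dimensions
x <- rbind(
  matrix(rnorm(100, mean = 0), ncol = 2),
  matrix(rnorm(100, mean = 4), ncol = 2),
  matrix(rnorm(100, mean = 8), ncol = 2)
)

# Cluster the same samples at two resolutions
k2 <- kmeans(x, centers = 2, nstart = 10)$cluster
k3 <- kmeans(x, centers = 3, nstart = 10)$cluster

# An edge joins a k = 2 cluster to a k = 3 cluster that shares samples
edges <- as.data.frame(table(from = k2, to = k3), stringsAsFactors = FALSE)
names(edges)[3] <- "count"
edges <- edges[edges$count > 0, ]

# In-proportion: share of the k = 3 cluster's samples arriving on this edge
to_size <- table(k3)
edges$in_prop <- edges$count / as.numeric(to_size[edges$to])
```

A k = 3 cluster whose samples arrive almost entirely from one parent has a single edge with in_prop near 1, while a cluster fed by several low-proportion edges suggests the samples are being shuffled between resolutions. The clustree package builds and draws this kind of graph across many resolutions.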

Organoid data

4. Analysis

Figure: organoid stained for GATA3, ECAD, LTL and WT1. Structures labelled: CD, DT, PT, Glo.

Analysis steps
- Alignment: CellRanger
- Quantification: CellRanger
- Quality control: scater
- Integration: Seurat
- Clustering: Seurat
- Gene detection: Seurat
- Ordering: Monocle

Stroma Endothelium Cell cycle Podocyte Epithelium

Podocyte Early podocyte Early proximal Early distal Progenitor

Progenitor Early tubule Podocyte

Human dataset
- 16 week fetal kidney
- 3178 cells
- 10x Chromium
Lindström et al. “Conserved and Divergent Features of Mesenchymal Progenitor Cell Types within the Cortical Nephrogenic Niche of the Human and Mouse Kidney” J Am Soc Nephrol (2018) DOI: 10.1681/ASN.2017080890

Stroma Endothelium Cell cycle Podocyte Nephron Progenitor Immune

Fetal kidney Organoid

Podocyte (human only) Early podocyte Proximal Distal Progenitor Stroma

Podocyte Early proximal Early distal Early podocyte Progenitor Diff. progenitor
Human pod. Stroma Fetal kidney Organoid

clustree: install.packages("clustree") Paper doi.org/10.1093/gigascience/giy083
splatter: biocLite("splatter") Paper doi.org/10.1186/s13059-017-1305-0
www.scRNA-tools.org Paper doi.org/10.1371/journal.pcbi.1006245
@_lazappi_ oshlacklab.com github.com/lazappi
