gi2017: Simulation and analysis tools for single-cell RNA sequencing data

Simulation and analysis tools for single-cell RNA sequencing data
Luke Zappia, @_lazappi_, #gi2017

Single-cell RNA-seq
[Figure: sequencing reads (ACTGACTCCA, TCAGTACTGA, CGTGTCATAG, GATTGACCTA) and a gene-by-cell matrix with columns Cell A, Cell B, Cell C and Cell D]

Svensson et al., arXiv

[Figure: gene-by-cell matrix with columns Cell A, Cell B, Cell C and Cell D]
- Bad cell?
- Low expression?
- Cell type specific?
- Cell cycle?
- Dropout?

www. .org

Simulations
Provide a truth to test against, BUT:
- Often poorly documented and explained
- Not easily reproducible or reusable
- Don’t demonstrate similarity to real data

“Splatter: simulation of single-cell RNA sequencing data.” Genome Biology

Splat
- Negative binomial
- Expression outliers
- Defined library sizes
- Mean-variance trend
- Dropout
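The components listed on this slide can be illustrated with a minimal base R sketch. This is not the Splatter implementation: the distributions, every parameter value, the outlier rule and the logistic dropout function are assumptions chosen only to show how the pieces might fit together.

```r
set.seed(1)

n_genes <- 200   # assumed sizes, for illustration only
n_cells <- 50

# Gene means drawn from a gamma distribution (assumed shape and rate)
gene_means <- rgamma(n_genes, shape = 0.6, rate = 0.3)

# Expression outliers: inflate a random subset of genes (assumed 5%)
is_outlier <- rbinom(n_genes, size = 1, prob = 0.05) == 1
gene_means[is_outlier] <- gene_means[is_outlier] *
  rlnorm(sum(is_outlier), meanlog = 4, sdlog = 0.5)

# Defined library sizes: log-normal total counts per cell (assumed values)
lib_sizes <- rlnorm(n_cells, meanlog = 11, sdlog = 0.2)

# Scale gene proportions by each cell's library size (genes x cells)
cell_means <- outer(gene_means / sum(gene_means), lib_sizes)

# Mean-variance trend: variability decreases as the mean rises (assumed form)
bcv <- 0.1 + 1 / sqrt(cell_means)

# Negative binomial counts, written as a gamma-Poisson mixture
lambda <- matrix(
  rgamma(n_genes * n_cells, shape = 1 / bcv^2, scale = cell_means * bcv^2),
  nrow = n_genes
)
true_counts <- matrix(rpois(n_genes * n_cells, lambda), nrow = n_genes)

# Dropout: a logistic function makes zeros more likely for low means
# (assumed midpoint of 2 on the log scale and slope of -1)
drop_prob <- plogis(-1 * (log(cell_means) - 2))
keep <- matrix(
  rbinom(n_genes * n_cells, size = 1, prob = 1 - drop_prob),
  nrow = n_genes
)
counts <- true_counts * keep
```

Comparing `mean(counts == 0)` with `mean(true_counts == 0)` shows how much of the sparsity comes from the dropout step rather than from low expression.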

Simulations
- Simple: Negative binomial
- Lun: NB with cell factors. Lun ATL, Bach K, Marioni JC. Genome Biology.
- Lun 2: Sampled NB with batch effects. Lun ATL, Marioni JC. Biostatistics.
- scDD: NB with bimodality. Korthauer KD, et al. Genome Biology.
- BASiCS: NB with spike-ins. Vallejos CA, Marioni JC, Richardson S. PLoS Comp. Bio.

Using Splatter

1. Estimate
    params1 <- splatEstimate(real.data)
    params2 <- simpleEstimate(real.data)

2. Simulate
    sim1 <- splatSimulate(params1, ...)
    sim2 <- simpleSimulate(params2, ...)

3. Compare
    datasets <- list(Real = real.data, Splat = sim1, Simple = sim2)
    comp <- compareSCESets(datasets)
    diff <- diffSCESets(datasets, ref = "Real")

Real data
- Tung et al. iPSCs, C capture
- HapMap individuals, plates each, random cells
- Tung P-Y et al. Sci. Rep.
[Figure: samples labelled A, B and C]

[Figure: Means; Difference in means]

[Figure: Zeros per cell; Difference in zeros]

[Figure: Mean-zeros; Difference]

[Figure: Rank; Full-length; Full-length]

Complex simulations
- Groups
- Batches
- Paths
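As a rough sketch of how these three kinds of simulation might be requested, the example below uses `splatSimulate()` from the talk together with `newSplatParams()`. It assumes a Splatter release from Bioconductor 3.6 or later. The argument names (`nGenes`, `batchCells`, `group.prob`, `method`, `verbose`) are taken from the package documentation as recalled, not from the slides, and the values are purely illustrative.

```r
library(splatter)

# Small parameter set so the example runs quickly (illustrative values)
params <- newSplatParams(nGenes = 200, seed = 1)

# Groups: two discrete groups of cells, equally likely
sim_groups <- splatSimulate(params, method = "groups",
                            group.prob = c(0.5, 0.5), verbose = FALSE)

# Batches: two batches of 50 cells each
sim_batches <- splatSimulate(params, batchCells = c(50, 50), verbose = FALSE)

# Paths: cells placed along a continuous trajectory
sim_paths <- splatSimulate(params, method = "paths", verbose = FALSE)

# Cell-level labels are stored in the column data of each result
head(colData(sim_groups))
```

The group, batch and step labels recorded for each cell are the "truth" that an analysis method can then be tested against.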

New in Splatter (Bioconductor 3.6)
- SingleCellExperiment
- Batch effects
- Simulations
  - BASiCS
  - mfa
  - PhenoPath
  - ZINB-WaVE

Summary
- Many tools for scRNA-seq analysis
- Catalogued in the scRNA-tools database
- Can be tested using synthetic datasets
- Splatter is our package for simulating scRNA-seq data

@_lazappi_, oshlacklab.com
Supervisors: Alicia Oshlack, Melissa Little
MCRI Bioinformatics: Belinda Phipson, Breon Schmidt
Everyone that makes tools and data available
www.scRNA-tools.org, @scRNAtools
“Splatter: simulation of single-cell RNA sequencing data.” Genome Biology
“Exploring the single-cell RNA-seq analysis landscape with the scRNA-tools database” bioRxiv
bioconductor.org/packages/splatter
