Slide 1

Slide 1 text

Single-cells, simulation and kidneys in a dish
Luke Zappia
MCRI Bioinformatics
@_lazappi_

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Parkville Precinct MCRI

Slide 4

Slide 4 text

MCRI Bioinformatics
Tools: bpipe, Corset, Lace, Necklace, GOseq, Splatter, clinker, JAFFA, Cpipe, Ximmer, Schism, missMethyl, STRetch, scRNA-tools
Areas: Structural, Clinical, STRs, Single-cell, Pipelines, Gene sets, Fusions, Assembly, superTranscripts, Methylation

Slide 5

Slide 5 text

Bulk RNA-seq
ACTGACTCCA TCAGTACTGA CGTGTCATAG GATTGACCTA

Gene  Sample 1
A     43
B     3
C     17
D     24

Slide 6

Slide 6 text

Single-cell RNA-seq
ACTGACTCCA TCAGTACTGA CGTGTCATAG GATTGACCTA
ACTGACTCCA TCAGTACTGA CGTGTCATAG GATTGACCTA
ACTGACTCCA TCAGTACTGA CGTGTCATAG GATTGACCTA
ACTGACTCCA TCAGTACTGA CGTGTCATAG GATTGACCTA

Gene  Cell 1  Cell 2  Cell 3  Cell 4
A     12      10      9       0
B     0       0       0       1
C     9       6       0       0
D     7       0       4       0

Slide 7

Slide 7 text

Moore’s Law
Svensson et al. arXiv 1704.01379, 2017

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

Unique Molecular Identifiers (UMIs)
5’ 3’ AAAA
(PCR){BC}[UMI]TTTT
5 4
Aligned reads
De-duplication and counting
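The de-duplication and counting step can be sketched roughly as follows. This is a minimal Python illustration, not the method of any particular pipeline. It assumes each aligned read has already been reduced to a (cell barcode, UMI, gene) triple. The barcodes, UMIs and gene names are made up, and the function name `count_umis` is hypothetical. Real pipelines also correct sequencing errors in barcodes and UMIs, which this sketch ignores.

```python
from collections import defaultdict


def count_umis(reads):
    """Count unique UMIs per (cell barcode, gene).

    `reads` is an iterable of (barcode, umi, gene) triples, one per aligned read.
    PCR duplicates share barcode, UMI and gene, so they collapse to one molecule.
    """
    molecules = defaultdict(set)
    for barcode, umi, gene in reads:
        molecules[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in molecules.items()}


# Hypothetical aligned reads (illustrative values only).
reads = [
    ("CELL1", "AACG", "GeneA"),
    ("CELL1", "AACG", "GeneA"),  # PCR duplicate of the read above
    ("CELL1", "TTGC", "GeneA"),
    ("CELL1", "GGAT", "GeneB"),
    ("CELL2", "AACG", "GeneA"),  # same UMI, different cell: a separate molecule
]
counts = count_umis(reads)
print(counts)
```

Using a set per (barcode, gene) pair means duplicates are dropped as they are read, so the reads never need to be sorted or held in memory twice.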

Slide 10

Slide 10 text

Gene  Cell 1  Cell 2  Cell 3  Cell 4
A     12      10      9       0
B     0       0       0       1
C     9       6       0       0
D     7       0       4       0
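To make the sparsity of this matrix concrete, a small Python sketch can compute the total count and the fraction of zeros for each cell. It uses plain lists and the values from the slide; the variable names are illustrative only.

```python
genes = ["A", "B", "C", "D"]
cells = ["Cell 1", "Cell 2", "Cell 3", "Cell 4"]
# Rows are genes, columns are cells (values copied from the slide).
counts = [
    [12, 10, 9, 0],  # A
    [0, 0, 0, 1],    # B
    [9, 6, 0, 0],    # C
    [7, 0, 4, 0],    # D
]

# Total count per cell (column sums).
library_sizes = [sum(row[j] for row in counts) for j in range(len(cells))]

# Fraction of genes with a zero count in each cell.
zero_fractions = [
    sum(1 for row in counts if row[j] == 0) / len(genes)
    for j in range(len(cells))
]

for cell, size, zeros in zip(cells, library_sizes, zero_fractions):
    print(f"{cell}: total count {size}, {zeros:.0%} zeros")
```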

Slide 11

Slide 11 text

Gene Cell 1 Cell 2 Cell 3 Cell 4
A 12 0 10 9 0
B 0 0 0 1
C 9 6 0 0
D 7 0 4 0

Bad cell?
Low expression?
Cell type specific?
Cell cycle?
Dropout?

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

www. .org

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

Simulation Biology Evaluation

Slide 20

Slide 20 text

Simulation

Slide 21

Slide 21 text

Simulations
Provide a truth to test against
BUT
- Often poorly documented and explained
- Not easily reproducible or reusable
- Don’t demonstrate similarity to real data

Slide 22

Slide 22 text

“Splatter: simulation of single-cell RNA sequencing data.” Genome Biology (2017) DOI: 10.1186/s13059-017-1305-0

Slide 23

Slide 23 text

Splatter
- Bioconductor package
- Collection of simulation methods
- Consistent, easy-to-use interface
- Functions for comparison

Slide 24

Slide 24 text

Negative binomial

Slide 25

Slide 25 text

Splat
- Negative binomial
- Expression outliers
- Defined library sizes
- Mean-variance trend
- Dropout
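The core idea of a negative binomial simulation with dropout can be sketched in a few lines of Python. This is not Splatter's implementation. It leaves out expression outliers, library sizes and the mean-variance trend. It also uses one fixed dropout probability for every count, which is a simplifying assumption of this sketch. All function names, parameter names and values are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    if lam <= 0:
        return 0
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample_negative_binomial(mean, dispersion, rng):
    """Gamma-Poisson mixture with variance = mean + dispersion * mean**2."""
    if mean <= 0:
        return 0
    # gammavariate(shape, scale): shape * scale equals the requested mean.
    rate = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return sample_poisson(rate, rng)


def simulate_counts(gene_means, n_cells, dispersion=0.5, dropout_prob=0.2, seed=1):
    """Return a genes x cells matrix of simulated counts (list of rows)."""
    rng = random.Random(seed)
    matrix = []
    for mean in gene_means:
        row = []
        for _ in range(n_cells):
            count = sample_negative_binomial(mean, dispersion, rng)
            if rng.random() < dropout_prob:  # assumed fixed dropout probability
                count = 0
            row.append(count)
        matrix.append(row)
    return matrix


sim = simulate_counts(gene_means=[10.0, 0.5, 5.0, 3.0], n_cells=4)
for row in sim:
    print(row)
```

Fixing the seed makes each simulated matrix reproducible, which matters when a simulation is used as a truth to test against.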

Slide 26

Slide 26 text

Simulations
Simple - Negative binomial
Lun - NB with cell factors. Lun ATL, Bach K, Marioni JC. Genome Biology (2016). DOI: 10.1186/s13059-016-0947-7.
Lun 2 - Sampled NB with batch effects. Lun ATL, Marioni JC. Biostatistics (2017). DOI: 10.1093/biostatistics/kxw055.
scDD - NB with bimodality. Korthauer KD, et al. Genome Biology (2016). DOI: 10.1186/s13059-016-1077-y.
BASiCS - NB with spike-ins. Vallejos CA, Marioni JC, Richardson S. PLoS Comp. Bio. (2015). DOI: 10.1371/journal.pcbi.1004333.

Slide 27

Slide 27 text

Using Splatter

# 1. Estimate
params1 <- splatEstimate(real.data)
params2 <- simpleEstimate(real.data)

# 2. Simulate
sim1 <- splatSimulate(params1, ...)
sim2 <- simpleSimulate(params2, ...)

# 3. Compare
datasets <- list(Real = real.data, Splat = sim1, Simple = sim2)
comp <- compareSCESets(datasets)
diff <- diffSCESets(datasets, ref = "Real")

Slide 28

Slide 28 text

Real data
Tung et al. iPSCs, C1 capture
- 3 HapMap individuals
- 3 plates each
- 200 random cells
A1 A2 A3 A
B1 B2 B3 B
C1 C2 C3 C
Tung P-Y et al. Sci. Rep. (2017) DOI: 10.1038/srep39921

Slide 29

Slide 29 text

Means Difference in Means

Slide 30

Slide 30 text

Zeros per cell Difference in zeros

Slide 31

Slide 31 text

Mean-zeros Difference

Slide 32

Slide 32 text

Rank 1 8

Slide 33

Slide 33 text

Rank 1 8 Full-length Full-length

Slide 34

Slide 34 text

Complex simulations Groups Batches Paths

Slide 35

Slide 35 text

Example evaluation
Parameters
- Estimated from Tung data
Simulation
- 400 cells
- 3 groups (60%, 25%, 15%)
- 10% DE (~1700 genes)
- 20 replicates
Method
- SC3
- k-means consensus clustering
- Differential expression
- Marker genes
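The arithmetic behind this design can be checked with a short Python sketch. It assumes group membership is drawn independently for each cell with the stated probabilities, so the realised group sizes vary between replicates. That is an assumption of this sketch, and the group names and the `draw_groups` helper are hypothetical.

```python
import random
from collections import Counter

N_CELLS = 400
GROUP_PERCENTS = {"Group1": 60, "Group2": 25, "Group3": 15}  # hypothetical names
DE_PERCENT = 10
N_DE_GENES = 1700  # approximate figure from the slide

# Expected number of cells per group.
expected_sizes = {g: N_CELLS * pct // 100 for g, pct in GROUP_PERCENTS.items()}

# Total gene count implied by "10% DE (~1700 genes)".
implied_total_genes = N_DE_GENES * 100 // DE_PERCENT


def draw_groups(n_cells, percents, seed):
    """Assign each cell to a group at random, weighted by the percentages."""
    rng = random.Random(seed)
    names = list(percents)
    weights = [percents[name] for name in names]
    return rng.choices(names, weights=weights, k=n_cells)


print("expected sizes:", expected_sizes)
print("implied total genes:", implied_total_genes)
for replicate in range(3):
    labels = draw_groups(N_CELLS, GROUP_PERCENTS, seed=replicate)
    print("replicate", replicate, dict(Counter(labels)))
```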

Slide 36

Slide 36 text

Clustering Gene identification

Slide 37

Slide 37 text

New in Splatter 1.2.0
SingleCellExperiment
Batch effects
Simulations
- BASiCS
- mfa
- PhenoPath
- ZINB-WaVE
Bioconductor 3.6

Slide 38

Slide 38 text

Simulation summary
Simulations are a great tool
But they should be:
- Reusable
- Reproducible
- Realistic
Splatter is our solution
Genome Biology 10.1186/s13059-017-1305-0

Slide 39

Slide 39 text

Biology

Slide 40

Slide 40 text

The kidney
OpenStax College, CC BY 3.0 via Wikimedia Commons

Slide 41

Slide 41 text

Organoids
Day 0 4 7 10 18 25
CHIR FGF9 FGF9 CHIR Form pellets No GF
iPSCs organoid
Takasato M et al. Nature. (2015) DOI: 10.1038/nature15695

Slide 42

Slide 42 text

GATA3 ECAD LTL WT1 CD + DT + PT + Glo

Slide 43

Slide 43 text

Fluidigm experiment
- 4 organoids
- C1 capture
- Full-length
- No spike-ins

Slide 44

Slide 44 text

Analysis
Alignment: STAR
Quantification: featureCounts
Quality control: scater
Clustering: SC3
Gene detection: SC3
Interpretation: Biologists

Slide 45

Slide 45 text

Quality control
Cells
- Alignment
- Quantification
- Expression
278 -> 155
Genes
- Expression
- Class
23388

Slide 46

Slide 46 text

Clustering

Slide 47

Slide 47 text

10x experiment
- 3 organoids
- Chromium capture
- UMI
- ~7000 cells

Slide 48

Slide 48 text

Analysis
Alignment: CellRanger
Quantification: CellRanger
Quality control: scater
Clustering: Seurat
Gene detection: Seurat
Interpretation: Biologists

Slide 49

Slide 49 text

Three clusters Vasculature Epithelium “Stroma”

Slide 50

Slide 50 text

Many clusters

Slide 51

Slide 51 text

Vasculature Proximal tubule Podocytes

Slide 52

Slide 52 text

Mesangium Renal stroma

Slide 53

Slide 53 text

Nephron? Neuronal?

Slide 54

Slide 54 text

?

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

Nodes
- Resolution (k)
- Cluster
- Size

Slide 57

Slide 57 text

Edges
- Cluster from (lower resolution)
- Cluster to (higher resolution)
- Number
- Proportion

Slide 58

Slide 58 text

Proportions
k = 1: 100
k = 2: 60, 40
Edges: n = 60, n = 40
p_from = n / size_low
p_to = n / size_high
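The two proportions can be computed from cluster labels at adjacent resolutions, as in the Python sketch below. It reads the slide's example as one cluster of 100 cells at k = 1 splitting into clusters of 60 and 40 cells at k = 2. That reading is an assumption, and the function name `tree_edges` and the dictionary keys are hypothetical. This is not the author's implementation.

```python
from collections import Counter


def tree_edges(low_labels, high_labels):
    """Edges between clusters at two adjacent resolutions.

    Both arguments give one cluster label per cell, in the same cell order.
    n is the number of cells shared by the two clusters,
    p_from = n / size of the lower-resolution cluster,
    p_to = n / size of the higher-resolution cluster.
    """
    size_low = Counter(low_labels)
    size_high = Counter(high_labels)
    overlap = Counter(zip(low_labels, high_labels))
    edges = []
    for (low, high), n in sorted(overlap.items()):
        edges.append({
            "from": low,
            "to": high,
            "n": n,
            "p_from": n / size_low[low],
            "p_to": n / size_high[high],
        })
    return edges


# Assumed reading of the slide: 100 cells in one cluster, then 60 + 40.
low = ["1"] * 100
high = ["1"] * 60 + ["2"] * 40
edges = tree_edges(low, high)
for edge in edges:
    print(edge)
```

In this example p_from separates the two edges (0.6 and 0.4), while p_to is 1.0 for both because each higher-resolution cluster draws all of its cells from a single parent.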

Slide 59

Slide 59 text

Clustering tree
Resolution: 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0

Slide 60

Slide 60 text

Resolution: 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0

Slide 61

Slide 61 text

Resolution: 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
Vasculature

Slide 62

Slide 62 text

Resolution: 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
Proximal Tubule
Podocytes

Slide 63

Slide 63 text

Resolution: 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
Mesangium

Slide 64

Slide 64 text

Resolution: 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
Renal stroma

Slide 65

Slide 65 text

Resolution: 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
Nephron/neuronal?

Slide 66

Slide 66 text

Resolution: 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
?

Slide 67

Slide 67 text

Summary
- Kidney organoids are complex
- More cells help
- Cluster relationships can be useful
- Background knowledge is vital

Slide 68

Slide 68 text

Acknowledgements
Everyone that makes tools and data available
Supervisors: Alicia Oshlack, Melissa Little
MCRI Bioinformatics: Belinda Phipson, Breon Schmidt
MCRI KDDR: Alex Combes

Slide 69

Slide 69 text

@_lazappi_
oshlacklab.com
www.scRNA-tools.org
@scRNAtools
“Splatter: simulation of single-cell RNA sequencing data.” Genome Biology (2017) DOI: 10.1186/s13059-017-1305-0
tinyurl.com/clust-tree-funcs
“Exploring the single-cell RNA-seq analysis landscape with the scRNA-tools database” bioRxiv (2017) DOI: 10.1101/206573
bioconductor.org/packages/splatter

Slide 70

Slide 70 text

No content