Slide 1

Slide 1 text

Suggestions for successful scRNA-seq analysis Luke Zappia

Slide 2

Slide 2 text

Luke Zappia
Postdoctoral researcher
- scRNA-seq methods, software, benchmarking, analysis
Theis lab
- Apply ML to biological data
- Integration, perturbations, transitions, multimodal

Slide 3

Slide 3 text

1. What is scRNA-seq?
2. Designing an scRNA-seq experiment
3. Standard scRNA-seq analysis
4. Advanced analysis topics

Slide 4

Slide 4 text

1. What is scRNA-seq?

Slide 5

Slide 5 text

single-cell RNA sequencing

Slide 6

Slide 6 text

Why single-cell?

Slide 7

Slide 7 text

Single-cell capture
Droplet-based
- More cells
- Easier
- UMI
Plate/well-based
- Fewer cells
- Custom setup
- Full length, higher depth
- More flexible

Slide 8

Slide 8 text

UMI vs full-length
Unique Molecular Identifiers (UMI)
5’ AAAA (PCR){BARCODE}[UMI]TTTT
- Better quantification
- Less sequencing
- No gene-length bias
Full-length
- Full coverage
- More sequencing
- Affected by gene length
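
A minimal sketch of why UMIs give better quantification: reads that share the same cell barcode, UMI and gene are treated as PCR copies of one original molecule and counted once. The function name `count_umis` and the example barcodes are hypothetical, and exact matching is a simplifying assumption (real pipelines typically also correct sequencing errors in barcodes and UMIs).

```python
from collections import defaultdict


def count_umis(reads):
    """Count unique UMIs per (cell barcode, gene) pair.

    `reads` is an iterable of (cell_barcode, umi, gene) tuples, one per
    aligned read. PCR duplicates share barcode, UMI and gene, so they
    collapse into a single counted molecule.
    """
    umis = defaultdict(set)
    for barcode, umi, gene in reads:
        umis[(barcode, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


# Hypothetical reads, for illustration only
reads = [
    ("AAAC", "TTGCA", "GeneA"),
    ("AAAC", "TTGCA", "GeneA"),  # PCR duplicate of the read above
    ("AAAC", "GGATC", "GeneA"),  # a second molecule of GeneA
    ("AAAC", "TTGCA", "GeneB"),  # same UMI but a different gene
    ("CCGT", "TTGCA", "GeneA"),  # same UMI but a different cell
]
counts = count_umis(reads)
```

Here five reads collapse to four molecules, because the duplicated read is only counted once.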

Slide 9

Slide 9 text

Extensions
- Protein expression (CITE-seq, feature barcoding)
- Chromatin accessibility (scATAC-seq, 10x Multiome)
- Spatial location (10x Visium, MERFISH)
- Immune receptors (TCR/BCR profiling)
- Methylation, CRISPR screens, electrophysiology,...
- Pre-sorting (FACS to enrich target cells)
- Multiplexing (Cell hashing)

Slide 10

Slide 10 text

Comparison to bulk
- Gives insight into cellular variability
- Avoids the composition problem
- Much more complex analysis
- Much noisier
- Much sparser
  - But UMI data isn’t zero inflated!

Slide 11

Slide 11 text

2. Experimental design

Slide 12

Slide 12 text

Who should be involved?
- Experimentalists
- Bioinformaticians
- PIs
- Collaborators

Slide 13

Slide 13 text

What is the question?
What do you want to answer with this experiment?
- Not necessarily a hypothesis
- Comes from experimentalists but is refined with analysts
- Discuss everything that is relevant
- Everyone needs to be on board

Slide 14

Slide 14 text

Things to consider
Cells are not replicates!
- You need multiple samples from each condition
Avoid confounding batches and conditions
- How will the samples be prepared?
What are your controls?
How rare are the cells you are interested in?
Are you using the right assay?
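
As an illustration of the confounding point, here is a small sketch that checks a sample sheet before any cells are captured. It flags batches that contain only one condition, since in such a design batch effects cannot be separated from the condition effect. The field names (`sample`, `condition`, `batch`) and the helper `confounded_batches` are assumptions made for this example.

```python
from collections import defaultdict


def confounded_batches(samples):
    """Return the batches that contain samples from only one condition."""
    conditions_by_batch = defaultdict(set)
    for sample in samples:
        conditions_by_batch[sample["batch"]].add(sample["condition"])
    return sorted(
        batch
        for batch, conditions in conditions_by_batch.items()
        if len(conditions) < 2
    )


# Hypothetical design: each condition is processed on its own day
confounded_design = [
    {"sample": "S1", "condition": "control", "batch": "day1"},
    {"sample": "S2", "condition": "control", "batch": "day1"},
    {"sample": "S3", "condition": "treated", "batch": "day2"},
    {"sample": "S4", "condition": "treated", "batch": "day2"},
]

# Hypothetical design: both conditions are processed on each day
balanced_design = [
    {"sample": "S1", "condition": "control", "batch": "day1"},
    {"sample": "S2", "condition": "treated", "batch": "day1"},
    {"sample": "S3", "condition": "control", "batch": "day2"},
    {"sample": "S4", "condition": "treated", "batch": "day2"},
]

problem_batches = confounded_batches(confounded_design)
ok_batches = confounded_batches(balanced_design)
```

Both designs have two samples per condition, but only the second lets batch and condition be told apart.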

Slide 15

Slide 15 text

Example designs Exploratory Case/control Multiple conditions Time series Cohort study Many others…

Slide 16

Slide 16 text

How long will it take?
Experiments take time, so does analysis
- Often getting results takes longer than generating data
Simpler experiments with clearer questions are quicker and easier to analyse
You will likely be competing with other projects, so good relationships are key!

Slide 17

Slide 17 text

Make a plan What is the question? What is the design (replicates!)? Who is involved? What is everyone’s role (authorship)? What if somebody leaves? What is the timeline? How is it funded? Write it down!

Slide 18

Slide 18 text

Tips for good collaborations
- Involve everyone in the process
- Good, clear communication
- Share all the (relevant) data
- Keep good records
  - Complete, consistent, machine-readable metadata

Slide 19

Slide 19 text

3. Standard analysis

Slide 20

Slide 20 text

De-multiplexing, Alignment, Quantification

@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

Gene  Cell 1  Cell 2  Cell 3  Cell 4
A     12      10      9       0
B     0       0       1       4
C     9       6       0       0
D     7       0       4       0
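
The FASTQ record on this slide can be read with a few lines of code. The sketch below splits the four lines into an ID, a sequence and per-base quality scores. It assumes the common Phred+33 encoding (score = ASCII code minus 33), and the function name `parse_fastq_record` is made up for this example.

```python
def parse_fastq_record(lines):
    """Split a four-line FASTQ record into ID, sequence and Phred scores."""
    header, sequence, separator, quality = [line.rstrip("\n") for line in lines]
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # Assumption: Phred+33 encoding, so '!' is score 0
    scores = [ord(char) - 33 for char in quality]
    return header[1:], sequence, scores


# The record shown on the slide
record = [
    "@SEQ_ID",
    "GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT",
    "+",
    "!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65",
]
read_id, sequence, scores = parse_fastq_record(record)
```

Each score is the quality of the base at the same position, with higher values meaning a more confident base call.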

Slide 21

Slide 21 text

Quality control Normalisation De-multiplexing Alignment Quantification Cell selection Cell filtering
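
To make the cell filtering and normalisation steps concrete, here is a sketch applied to the small count matrix from slide 20. It removes cells with few total counts, then scales each remaining cell to the same total and log-transforms. The threshold (`min_counts=10`) and scaling target (`target_sum=10`) are arbitrary values chosen for this toy example, not recommendations, and library-size scaling followed by log1p is only one of several normalisation approaches.

```python
import math

genes = ["A", "B", "C", "D"]
# Counts from the slide 20 matrix, one row per cell (gene order as in `genes`)
counts = {
    "Cell 1": [12, 0, 9, 7],
    "Cell 2": [10, 0, 6, 0],
    "Cell 3": [9, 1, 0, 4],
    "Cell 4": [0, 4, 0, 0],
}


def filter_cells(counts, min_counts):
    """Keep cells whose total count is at least `min_counts`."""
    return {cell: row for cell, row in counts.items() if sum(row) >= min_counts}


def log_normalise(counts, target_sum):
    """Scale each cell to `target_sum` total counts, then apply log(1 + x)."""
    normalised = {}
    for cell, row in counts.items():
        total = sum(row)
        if total == 0:
            raise ValueError(f"{cell} has no counts; filter it out first")
        normalised[cell] = [math.log1p(x / total * target_sum) for x in row]
    return normalised


kept = filter_cells(counts, min_counts=10)         # drops Cell 4 (4 counts)
normalised = log_normalise(kept, target_sum=10)
```

After this step the remaining cells are comparable despite having different sequencing depths (28, 16 and 14 total counts).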

Slide 22

Slide 22 text

Integration Remove technical effects between batches

Slide 23

Slide 23 text

- Single-cell Integration Benchmarking: Lücken et al., Nature Methods 2022
- scvi-tools (scvi-tools.org): Gayoso et al., Nature Biotechnology 2022
- scPoli: De Donno et al., Nature Methods 2023
- sysVI: Hrovatin et al., bioRxiv
- scArches: Lotfollahi et al., Nature Biotechnology 2022

Slide 24

Slide 24 text

Integration Clustering Marker genes Annotation

Slide 25

Slide 25 text

Integration Clustering Marker genes Annotation

Slide 26

Slide 26 text

Integration Clustering Marker genes Annotation Label2 Label1 Label3 Label4 Label5

Slide 27

Slide 27 text

Quality control Normalisation De-multiplexing Alignment Quantification Integration Clustering Marker genes Annotation Downstream analysis…

Slide 28

Slide 28 text

Visualisation Quality control Normalisation De-multiplexing Alignment Quantification Integration Clustering Marker genes Annotation Downstream analysis…

Slide 29

Slide 29 text

2D embedding
Most common visualisation
- t-SNE, UMAP etc.
Can be useful BUT:
- Easy to overinterpret
- Hides lots of complexity
- Potentially misleading

Slide 30

Slide 30 text

Over 1700 scRNA-seq tools www.scRNA-tools.org

Slide 31

Slide 31 text

Ecosystems scverse

Slide 32

Slide 32 text

4. Advanced analysis

Slide 33

Slide 33 text

Downstream analyses
- Differential expression (Condition 1 vs Condition 2)
- Differential abundance (Condition 1 vs Condition 2)
- Pseudotime
- RNA velocity
- Fine variation
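
One common way to compare conditions while treating samples, not cells, as the replicates is to aggregate counts per sample ("pseudobulk") before testing. This is an illustrative assumption rather than a method named on the slide, and the function and sample names are hypothetical. In practice the aggregation is usually done separately for each cell type.

```python
from collections import defaultdict


def pseudobulk(cell_counts, cell_to_sample):
    """Sum per-cell gene counts into one count profile per sample."""
    totals = defaultdict(lambda: defaultdict(int))
    for cell, gene_counts in cell_counts.items():
        sample = cell_to_sample[cell]
        for gene, count in gene_counts.items():
            totals[sample][gene] += count
    return {sample: dict(gene_totals) for sample, gene_totals in totals.items()}


# Hypothetical cells and sample labels, for illustration only
cell_counts = {
    "cell1": {"GeneA": 3, "GeneB": 0},
    "cell2": {"GeneA": 1, "GeneB": 2},
    "cell3": {"GeneA": 0, "GeneB": 5},
    "cell4": {"GeneA": 4, "GeneB": 1},
}
cell_to_sample = {
    "cell1": "control_1",
    "cell2": "control_1",
    "cell3": "treated_1",
    "cell4": "treated_1",
}
profiles = pseudobulk(cell_counts, cell_to_sample)
```

The resulting per-sample profiles can then be compared between conditions, which requires several samples per condition.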

Slide 34

Slide 34 text

Using references Classification Reference mapping “Foundation models” Label Label Embedding Condition Niche…

Slide 35

Slide 35 text

Multimodal analysis RNA Protein Accessibility Result

Slide 36

Slide 36 text

Resources
- “Current best practices in single-cell RNA-seq analysis: a tutorial”, Lücken, Theis, Molecular Systems Biology 2019
- “Orchestrating Single-Cell Analysis with Bioconductor”, bioconductor.org/books/release/OSCA
- Seurat, satijalab.org/seurat
- Scanpy, scanpy.readthedocs.io
- scverse, scverse.org
- scRNA-tools, scRNA-tools.org
- Open Problems in Single-Cell Analysis, openproblems.bio
- sc-best-practices.org, Heumos, Schaar et al., Nature Reviews Genetics 2023

Slide 37

Slide 37 text

Acknowledgements
- Theis lab
- Community 🫂
- Everyone who has written documentation, tutorials etc. 📄
- Everyone who has developed tools and made their code available 🛠

Slide 38

Slide 38 text

Suggestions for Successful scRNA-seq analysis ❓ Luke Zappia @_lazappi_ @lazappi lazappi.id.au @lazappi