Slide 1

Slide 1 text

© Murdoch Children’s Research Institute, 2019
Identifying cryptic variants in cancer transcriptomes using RNA-seq data
16.08.2019
Victorian Cancer Bioinformatics Symposium
Marek Cmero

Slide 2

Slide 2 text

Motivation
- Gene fusions and transcriptomic variants can modify gene function in cancer, e.g.:
  - BCR-ABL1 fusion
  - FLT3 internal tandem duplication
- RNA-seq can effectively identify and characterise gene fusions
- Beyond fusions, other variants are difficult to detect (cryptic variants):
  - Non-canonical fusions
  - Transcribed structural variants
  - Novel splice variants
- No method exists to detect, annotate and visualise all types of cryptic variants in RNA-seq data

Image 1: Szczepański, T., Harrison, C. J., & van Dongen, J. J. M. (2010). Genetic aberrations in paediatric acute leukaemias and implications for management of patients. The Lancet Oncology, 11(9), 880–889. https://doi.org/10.1016/S1470-2045(09)70369-9
Image 2: Tsapogas, P., Mooney, C. J., Brown, G., & Rolink, A. (2017). The cytokine Flt3-ligand in normal and malignant hematopoiesis. International Journal of Molecular Sciences, 18(6). https://doi.org/10.3390/ijms18061115

Slide 3

Slide 3 text

Fusions
Figure labels: Detected by fusion callers; Not detected by fusion callers; Typically detected in RNA

Slide 4

Slide 4 text

Transcribed structural variants (TSVs)
- Can be detected by specialised callers
- Structural variants (SVs) are typically detected in DNA, harder to detect in RNA
Figure label: (or novel intron)

Slide 5

Slide 5 text

Novel splice variants
- Not present in the DNA
- Detected by transcript assemblers
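
One way a novel splice variant can show up once an assembled contig is aligned to the genome is as a spliced gap (a CIGAR `N` operation) whose coordinates match no annotated intron. The sketch below illustrates that idea only; it is not taken from the talk or from any particular tool. The function names, the chromosome name, the coordinates and the `known` junction set are all hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def junctions_from_cigar(start, cigar):
    """Return (intron_start, intron_end) pairs implied by N operations.

    Coordinates are 0-based, half-open, on the reference.
    """
    pos = start
    junctions = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":  # skipped reference region, i.e. a spliced intron
            junctions.append((pos, pos + length))
        if op in "MDN=X":  # operations that consume reference bases
            pos += length
    return junctions

def novel_junctions(chrom, start, cigar, known):
    """Keep only junctions absent from the annotated (known) junction set."""
    return [
        (chrom, s, e)
        for s, e in junctions_from_cigar(start, cigar)
        if (chrom, s, e) not in known
    ]

# Hypothetical contig alignment and annotation, for illustration only.
known = {("chr1", 1050, 1250)}
print(novel_junctions("chr1", 1000, "50M200N30M500N20M", known))
```

Here the first gap matches the annotated intron and is dropped, while the second gap has no match and is reported as a candidate novel junction.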

Slide 6

Slide 6 text

Complex/combined variants
• Difficult for fusion callers to detect
• May be detected by some TSV callers

Slide 7

Slide 7 text


Slide 8

Slide 8 text

MINTIE pipeline
Basic idea
Find and annotate transcripts containing cryptic variants
- Assemble transcripts in case sample
  - Not biased by reference genome
- Quantify assembled transcripts using fast pseudo-alignment
  - Interested in novel contigs not present in controls (rare variants)
- Perform differential expression (DE) on assembled transcripts
  - 1 case vs. N controls
  - Identify up-regulated novel transcripts
- Align DE contigs to genome
  - Annotate variants not matching reference
- Visualise variants
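
The "1 case vs. N controls" step can be pictured with a toy calculation: scale each sample's contig counts to counts per million (CPM), then keep contigs that are expressed in the case and strongly up-regulated relative to the control mean. This is a minimal sketch under stated assumptions, not the pipeline's actual statistics; a real analysis would use a dedicated DE package. The function names, thresholds, pseudocount and counts below are all hypothetical.

```python
import math

def cpm(counts):
    """Scale raw counts for one sample to counts per million."""
    total = sum(counts.values())
    return {contig: 1e6 * n / total for contig, n in counts.items()}

def novel_upregulated(case_counts, control_counts,
                      min_case_cpm=1.0, min_log2fc=2.0, pseudo=0.5):
    """Return (contig, log2 fold change) for contigs up-regulated in the case.

    case_counts: dict of contig -> raw count for the single case sample.
    control_counts: list of such dicts, one per control sample.
    Thresholds and pseudocount are illustrative assumptions.
    """
    case = cpm(case_counts)
    controls = [cpm(c) for c in control_counts]
    hits = []
    for contig, case_val in case.items():
        # Contigs missing from a control are treated as zero expression.
        ctrl_mean = sum(c.get(contig, 0.0) for c in controls) / len(controls)
        log2fc = math.log2((case_val + pseudo) / (ctrl_mean + pseudo))
        if case_val >= min_case_cpm and log2fc >= min_log2fc:
            hits.append((contig, round(log2fc, 2)))
    # Strongest up-regulation first.
    return sorted(hits, key=lambda h: -h[1])

# Hypothetical counts: contig_a is absent from both controls.
case = {"contig_a": 900, "contig_b": 100, "ref_tx": 999000}
controls = [
    {"contig_a": 0, "contig_b": 90, "ref_tx": 999910},
    {"contig_b": 110, "ref_tx": 999890},
]
print(novel_upregulated(case, controls))
```

Only `contig_a` passes both thresholds here, because it is expressed in the case and absent from the controls; `contig_b` and `ref_tx` are expressed at similar levels in case and controls.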

Slide 9

Slide 9 text

Applying MINTIE to 1500 simulated cryptic variants (100 per category)

Slide 10

Slide 10 text

Comparison of cryptic variants detected using existing tools (targeted)

Slide 11

Slide 11 text

Cryptic variants called in a real B-ALL sample
Steps: Assemble > Annotate > DE > Filter > Merge
Counts: N = 584146 (contigs) > N = 22567 (variants) > N = 278 (variants) > N = 176 (variants) > N = 108 (variants)

Slide 12

Slide 12 text

RB1 unpartnered fusion – genome view (RNA-seq)
Figure labels: RNA-seq coverage (case); Fusion contig; Putative deletion; Fusion boundaries; Novel contigs

Slide 13

Slide 13 text

Another example in a different sample
IKZF1 partial tandem duplication
Figure labels: RNA-seq coverage (case); Novel contigs; Variant contig

Slide 14

Slide 14 text

Summary
• MINTIE detects all kinds of cryptic variants in RNA-seq cancer samples:
  • Canonical, non-canonical and unpartnered fusions
  • Novel splice variants
  • Transcribed structural variants
• Method
  • De novo assemble > quantify > DE > annotate > visualise
• Detects more variants than any other tool
• Detected RB1 unpartnered fusion and IKZF1 PTD in B-ALL samples
• We are hard at work on the visualisation component!

Slide 15

Slide 15 text

Acknowledgements
MCRI Bioinformatics
• Alicia Oshlack*
• Nadia Davidson*
• Breon Schmidt
• + Whole team!
MCRI Cell biology
• Paul Ekert
WEHI
• Ian Majewski
*supervised this work equally
https://github.com/Oshlack/MINTIE
@marekcmero