"Navigating human brain gene expression measurements at different resolutions to study psychiatric disorders" seminar on 2023-05-22 at UCL Great Ormond Street Institute of Child Health
different resolutions to study psychiatric disorders Leonardo Collado Torres, Investigator UCL Great Ormond Street Institute of Child Health May 22 2023 Slides available at speakerdeck.com/lcolladotor
= 36 Discovery data Postmortem Human Brain Samples Fetal Infant Child Teen Adult 50+ 6 / group, N = 36 Replication data Andrew E Jaffe @andrewejaffe Ph.D. co-advisor Developmental regulation of human cortex transcription and its clinical relevance at single base resolution doi.org/10.1038/nn.3898 github.com/leekgroup/libd_n36
in the Frontal Cortex and Hippocampus across Development and Schizophrenia LIBD BrainSEQ Phase 2: DLPFC + HPC eqtl.brainseq.org/phase2/ doi.org/10.1073/pnas.1617384114 qSVA framework for RNA quality correction in differential expression analysis Amy Peterson @amptrsn
(GitHub) Christopher Wilks @chrisnwilks Shannon Ellis @Shannon_E_Ellis Kasper Daniel Hansen @KasperDHansen Andrew E Jaffe @andrewejaffe Ph.D. co-advisor + LIBD former boss Jeff Leek @jtleek Ph.D. advisor
N=9,962 TCGA N=11,284 SRA N=49,848 samples expression estimates gene exon junctions ERs Answer meaningful questions about human biology and expression slide adapted from Shannon Ellis Reproducible RNA-seq analysis using #recount2 + Improving the value of public RNA-seq expression data by phenotype prediction doi.org/10.1038/nbt.3838 doi.org/10.1093/nar/gky102
Race Age
6620 NA female liver NA NA
6621 NA female liver NA NA
6622 NA female liver NA NA
6623 NA female liver NA NA
6624 NA female liver NA NA
6625 NA male liver NA NA
6626 NA male liver NA NA
6627 NA male liver NA NA
6628 NA male liver NA NA
6629 NA male liver NA NA
6630 NA male liver NA NA
6631 NA NA blood NA NA
6632 NA NA blood NA NA
6633 NA NA blood NA NA
6634 NA NA blood NA NA
6635 NA NA blood NA NA
6636 NA NA blood NA NA
slide adapted from Shannon Ellis Shannon Ellis @Shannon_E_Ellis
male 1240 Male 141 Total 3640
Even when information is provided, it’s not always clear…
sra_meta$Sex: “1 Male, 2 Female”, “2 Male, 1 Female”, “3 Female”, “DK”, “male and female”, “Male (note: ….)”, “missing”, “mixed”, “mixture”, “N/A”, “Not available”, “not applicable”, “not collected”, “not determined”, “pooled male and female”, “U”, “unknown”, “Unknown”
slide adapted from Shannon Ellis Shannon Ellis @Shannon_E_Ellis
Improving the value of public RNA-seq expression data by phenotype prediction doi.org/10.1093/nar/gky102
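The free-text values reported for sra_meta$Sex could be collapsed into a few categories before any analysis. The following is only a minimal Python sketch: the helper name normalize_sex() and the output categories ("male", "female", "mixed", None) are assumptions made for illustration and are not part of recount2 or of the cited phenotype prediction paper.

```python
import re

# Hypothetical keywords that signal a pooled or mixed sample (an assumption).
MIXED_WORDS = ("mixed", "mixture", "pooled")


def normalize_sex(label):
    """Map a free-text sex label to 'male', 'female', 'mixed' or None."""
    if label is None:
        return None
    text = label.strip().lower()
    # \b keeps "male" from matching inside "female".
    has_male = re.search(r"\bmale\b", text) is not None
    has_female = re.search(r"\bfemale\b", text) is not None
    if any(word in text for word in MIXED_WORDS) or (has_male and has_female):
        return "mixed"
    if has_male:
        return "male"
    if has_female:
        return "female"
    # "DK", "U", "N/A", "missing", "unknown", ... carry no usable information.
    return None


if __name__ == "__main__":
    for raw in ["3 Female", "Male (note: ….)", "1 Male, 2 Female", "DK"]:
        print(repr(raw), "->", normalize_sex(raw))
```

Labels that cannot be resolved are returned as None rather than guessed, which leaves room for a prediction-based approach such as the one cited on the slide to fill them in.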
and queries for large-scale RNA-seq expression and splicing Christopher Wilks @chrisnwilks research.libd.org/recount3-docs/ doi.org/10.1186/s13059-021-02533-6
and RNA content between cell types • Use smFISH with RNAscope to establish data set of: ◦ Cellular composition ◦ Nuclei sizes of major cell types ◦ Average nuclei RNA content of major cell types How do we measure total RNA content of a cell if we can only observe a few genes at a time? Use a TREG Data-driven Identification of Total RNA Expression Genes (TREGs) for Estimation of RNA Abundance in Heterogeneous Cell Types research.libd.org/TREG/ doi.org/10.1101/2022.04.28.489923 Louise A Huuki-Myers @lahuuki
Expression is proportional to the overall RNA expression in a nucleus • In smFISH the count of TREG puncta in a nucleus can estimate the RNA content Data-driven Identification of Total RNA Expression Genes (TREGs) for Estimation of RNA Abundance in Heterogeneous Cell Types research.libd.org/TREG/ doi.org/10.1101/2022.04.28.489923
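The proportionality idea can be illustrated with a toy calculation. The sketch below is a simplified proxy and not the procedure implemented in the TREG package: it ranks genes by the Pearson correlation between their per-nucleus counts and the total counts of each nucleus. The function names and the toy genes are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_by_proportionality(counts):
    """counts: dict of gene -> list of counts, one entry per nucleus.

    Returns (gene, correlation with total counts) pairs, best first.
    Note that each gene contributes to the total it is compared against.
    """
    n_nuclei = len(next(iter(counts.values())))
    totals = [sum(values[i] for values in counts.values()) for i in range(n_nuclei)]
    scores = {gene: pearson(values, totals) for gene, values in counts.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy_counts = {
        "treg_like": [10, 20, 30, 40],  # scales with nucleus RNA content
        "flat": [5, 5, 5, 5],           # constant regardless of content
        "noisy": [3, 1, 4, 2],          # unrelated to content
    }
    for gene, score in rank_by_proportionality(toy_counts):
        print(gene, round(score, 3))
```

A gene that tracks total counts closely across nuclei is the kind of candidate whose smFISH puncta count could stand in for RNA content.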
head morphogenesis; expressed in cerebral cortex Trpc4: acts upstream of or within gamma-aminobutyric acid secretion and oligodendrocyte differentiation; expressed in brain Scaf11: predicted to be involved in spliceosomal complex assembly; expressed in diencephalon lateral wall ventricular layer; midbrain ventricular layer; and telencephalon ventricular layer Daianna Gonzalez-Padilla @daianna_glez
predicted to be involved in protein modification by small protein conjugation or removal, protein neddylation, and regulation of cell growth; expressed in NS Pnisr: predicted to be active in presynaptic active zone; expressed in NS
Transcript reconstruction from read mappings to the genome (figure: alignments of isoform1 and isoform2 reads to the genome sequence across GT/AG splice sites for exon1, exon2 and exon3, followed by transcript assembly with Cufflinks or StringTie). Exons & introns do not have to be defined in the reference annotation; captures potentially "novel" isoforms
Datasets. Comparing gene-level degradation effects in the full degradation experiment (all regions) vs. the t-statistic from differential expression of schizophrenia cases vs. controls for five publicly available Lieber Institute datasets (rows TODO supp table) over six different models (columns). Backgrounds are shaded by the value of the absolute correlation. Joshua M Stolz @JoshStolz2 Hédia Tnani @TnaniHedia #qsvaR
expression. The replication rate over p-value cutoffs for all available models between A. BSP1 and BSP2 DLPFC, B. CMC and BSP1, C. CMC and BSP2 DLPFC Joshua M Stolz @JoshStolz2
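A replication rate over p-value cutoffs can be computed in several ways, and the slide does not spell out the definition used. The Python sketch below assumes one common choice: among genes significant in the discovery dataset at a given cutoff, count the fraction that are nominally significant in the replication dataset with a t-statistic of the same sign. The function name, the rep_alpha parameter and the toy values are hypothetical.

```python
def replication_rate(discovery, replication, cutoff, rep_alpha=0.05):
    """Fraction of discovery hits (p < cutoff) that replicate.

    discovery, replication: dict of gene -> (t_statistic, p_value).
    A hit replicates if its replication p-value is below rep_alpha and the
    two t-statistics share a sign (an assumed definition).
    Returns None when no gene passes the discovery cutoff.
    """
    tested = [
        gene
        for gene, (_, p_value) in discovery.items()
        if p_value < cutoff and gene in replication
    ]
    if not tested:
        return None
    replicated = sum(
        1
        for gene in tested
        if replication[gene][1] < rep_alpha
        and replication[gene][0] * discovery[gene][0] > 0
    )
    return replicated / len(tested)


if __name__ == "__main__":
    # Toy values for illustration only, not results from BSP1, BSP2 or CMC.
    toy_discovery = {
        "g1": (4.0, 1e-4), "g2": (-3.0, 0.004),
        "g3": (2.2, 0.03), "g4": (0.5, 0.6),
    }
    toy_replication = {
        "g1": (3.1, 0.002), "g2": (1.0, 0.3),
        "g3": (2.5, 0.02), "g4": (-0.2, 0.8),
    }
    for cutoff in (0.05, 0.01, 0.001):
        print(cutoff, replication_rate(toy_discovery, toy_replication, cutoff))
```

Evaluating the function over a grid of cutoffs for each model gives one curve per model, which matches the layout the slide describes.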
think adding 0, multiplying by 1 • It nearly always takes a team • Data sharing accelerates science + democratizes access to it • Zooming in allows us to reduce the heterogeneity • We can learn from each other: from uniformly processing our data & re-using it → replicate / validate?
for data skills: wrangling, visualization, analysis - LIBD itself generates results that are large data collections - Greater demand across LIBD scientists to learn how to work with data https://ceramics.org/ceramic-tech-today/supercomputer-powered-materials-database-unleashes-data-deluge … and many more
to learn, guide, build training material - You need time also for collaborating - It’s important to respect both and plan accordingly 2 20% 80% research
Weber @stephaniehicks Stephanie C Hicks @abspangler Abby Spangler @martinowk Keri Martinowich @CerceoPage Stephanie C Page @kr_maynard Kristen R Maynard @lcolladotor Leonardo Collado-Torres @Nick-Eagles (GH) Nicholas J Eagles Kelsey D Montgomery Sang Ho Kwon Image Analysis Expression Analysis Data Generation Thomas M Hyde @lahuuki Louise A Huuki-Myers @BoyiGuo Boyi Guo @mattntran Matthew N Tran @sowmyapartybun Sowmya Parthiban Slides available at speakerdeck.com/lcolladotor + Many more LIBD, JHU, and external collaborators @mgrantpeters Melissa Grant-Peters @prashanthi-ravichandran (GH) Prashanthi Ravichandran