Data-driven Identification of Total RNA Expression Genes (TREGs) for
Estimation of RNA Abundance in Heterogeneous Cell Types

Louise Huuki-Myers, Biology of Genomes 2022 poster


Louise Huuki

May 03, 2022


  1. Data-driven Identification of Total RNA Expression Genes (TREGs) for Estimation of RNA Abundance in Heterogeneous Cell Types

    L.A. Huuki-Myers(1), K.D. Montgomery(1), S.H. Kwon(1,2), S.C. Page(1), S.C. Hicks(3), K.R. Maynard(1,4)*, L. Collado-Torres(1,5)*

    1. Lieber Institute for Brain Development
    2. Department of Neuroscience, Johns Hopkins School of Medicine
    3. Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
    4. Department of Psychiatry and Behavioral Sciences, JHSOM
    5. Center for Computational Biology, Johns Hopkins University

    Presenter & Poster requests: Louise.Huuki@libd.org, @lahuuki
    Download this Poster: BioRxiv Pre-print TODO

    BAPTA

    Abstract

    Next-generation sequencing technologies have facilitated data-driven identification of gene sets with different features including genes with stable expression, cell-type specific expression, or spatially variable expression. Here, we aimed to define and identify a new class of "control" genes called Total RNA Expression Genes (TREGs), which correlate with total RNA abundance in heterogeneous cell types of different sizes and transcriptional activity. We provide a data-driven method to identify TREGs from single cell RNA-sequencing (RNA-seq) data, available as an R/Bioconductor package. We demonstrated the utility of our method in the postmortem human brain using multiplex single molecule fluorescent in situ hybridization (smFISH) and compared candidate TREGs against classic housekeeping genes. We identified AKT3 as a top TREG across five brain regions, especially in the dorsolateral prefrontal cortex.

    Experimental Design

    [Figure: Experimental Design]

    Rank Invariance Calculation

    Software tools available as R/BioC package: bioconductor.org/packages/TREG

    i. Filter out lowly expressed genes.
    ii. Compute the Expression Rank of each nucleus for each gene.
    iii. Calculate the mean expression of each gene across all nuclei of one cell type, and then the Expression Rank of that mean.
    iv. Per gene, find the difference between the Expression Rank in each nucleus of a given cell type and the mean Expression Rank.
    v. Calculate the mean of the absolute Expression Rank differences for each gene.
    vi. Rank the mean absolute Expression Rank differences.
    vii. Repeat steps ii-vi for each cell type.
    viii. Per gene, compute the sum of the previous ranks across all cell types, and then rank these sums across genes such that the highest rank is given to the gene with the smallest sum. This is the final Rank Invariance value.

    Gene Filtering & TREG Selection

    1. Filter to the top 50% expressed genes.
    2. Calculate Proportion Zero for each gene across each brain region and cell type.
    3. Filter to genes with a maximum Proportion Zero across groups < 0.75.

    Proportion Zero Equation

    $c_{i,j,k,z}$ is the number of snRNA-seq counts for nucleus z for gene i, cell type j, and brain region k, and $n_{j,k}$ is the number of nuclei for cell type j and brain region k.

    Properties of candidate TREGs in snRNA-seq data

    • Selected AKT3, ARID1B, and MALAT1 as candidate TREGs from the top Rank Invariance (RI) values
    • Observed highly rank-invariant expression compared with the housekeeping (HK) gene POLR2A
    • Candidate TREGs show less variable Expression Rank than HK POLR2A
    • Observed a strong linear relationship between TREG expression and total nuclear RNA (estimated by the log2 sum of all counts) within each cell type

    Validation of TREGs with smFISH + RNAscope

    • Three RNAscope probe combinations (TREG or POLR2A + cell type markers) were used to test the performance of the genes
    • TREG puncta are observed in 86% or more of nuclei and in dynamic ranges
    • AKT3 expression is higher in transcriptionally active gray matter than in white matter
    • MALAT1 reads are unreliable

    Gene      Prop. non-zero in DLPFC snRNA   Mean prop. non-zero   Mean n puncta
    AKT3      0.92                            0.88                  4.09
    ARID1B    0.94                            0.86                  3.08
    MALAT1    1.00                            0.98                  2.07
    POLR2A    0.30                            0.78                  2.75

    Puncta vs. Total RNA Expression

    Gene                     β            sd          Std.
    AKT3                     -5.52        5.18        -1.07
    ARID1B                   -2.63        3.42        -0.77
    MALAT1                   -1.22        1.53        -0.8
    POLR2A                   -3.49        3.34        -1.05
    All genes in snRNA-seq   -21844.07    15560.76    -1.33

    • AKT3 has the most similar slope to total expression measured by snRNA-seq over observable cell types
    • RNAscope + TREG allows the comparison of nuclear size and total RNA expression

    Conclusion

    • Rank Invariance is an effective way to find TREGs in sn/scRNA-seq data and can be used to identify TREGs relevant to a specific tissue or experimental setting
    • AKT3 is an effective TREG in the human brain, specifically in the DLPFC
    • TREGs represent an important class of genes that could be used for a variety of assays and downstream analyses

    Acknowledgements

    References

    • Huuki-Myers et al., BioRxiv, 2022, 10.1101/2022.04.28.489923
    • Tran, Maynard et al., Neuron, 2021, 10.1101/2020.10.07.329839
    • Indica Labs, HALO 3.3 FISH-IF
    • TREG paper git repo, 10.5281/zenodo.6502303

    Louise Huuki-Myers, Kelsey D. Montgomery, Sang Ho Kwon, Stephanie C. Page, Stephanie C. Hicks, Kristen R. Maynard, Leonardo Collado-Torres
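
The Proportion Zero equation appears on the poster only as an image, so this transcript carries its symbol definitions but not the formula. Read together with the gene filtering steps, those definitions suggest the following form. It is offered as a plausible reconstruction, not as the authors' exact notation; the indicator function and the label "Proportion Zero" as a subscripted quantity are assumptions.

```latex
\[
\text{Proportion Zero}_{i,j,k}
  = \frac{1}{n_{j,k}} \sum_{z=1}^{n_{j,k}} \mathbb{1}\!\left( c_{i,j,k,z} = 0 \right)
\]
```

Under this reading, a gene with a Proportion Zero of 0.75 in some group has no counts in three out of four nuclei of that cell type and brain region, which is the cutoff used in step 3 of the filtering.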
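
The three gene filtering steps listed on the poster can be sketched in plain Python. This is a minimal illustration and not the TREG package API: the function names are hypothetical, the input is a simple list-of-lists count matrix, and "top 50% expressed" is interpreted here as "mean count above the median of gene means", which is an assumption since the poster does not say how expression is summarized.

```python
from statistics import median


def proportion_zero(counts, groups):
    """Fraction of nuclei with zero counts, per gene and per group.

    counts[g][n] is the count for gene g in nucleus n.
    groups[n] is a (brain_region, cell_type) label for nucleus n.
    Returns {group: [proportion zero for gene 0, gene 1, ...]}.
    """
    result = {}
    for group in sorted(set(groups)):
        nuclei = [n for n, label in enumerate(groups) if label == group]
        result[group] = [
            sum(1 for n in nuclei if row[n] == 0) / len(nuclei) for row in counts
        ]
    return result


def filter_genes(counts, groups, max_prop_zero=0.75):
    """Return indices of genes that pass filtering steps 1-3."""
    # Step 1 (assumed interpretation): keep genes whose mean count is
    # above the median of all gene means.
    gene_means = [sum(row) / len(row) for row in counts]
    cutoff = median(gene_means)
    top_genes = [g for g, mean in enumerate(gene_means) if mean > cutoff]

    # Step 2: Proportion Zero per gene within each region/cell-type group.
    prop_zero = proportion_zero(counts, groups)

    # Step 3: keep genes whose worst (maximum) Proportion Zero is < 0.75.
    return [
        g
        for g in top_genes
        if max(per_group[g] for per_group in prop_zero.values()) < max_prop_zero
    ]


# Toy data: 4 genes x 4 nuclei, two groups of two nuclei each.
toy_counts = [
    [5, 3, 4, 6],  # expressed in every nucleus
    [9, 0, 0, 0],  # high mean, but absent from one whole group
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
toy_groups = [
    ("DLPFC", "neuron"),
    ("DLPFC", "neuron"),
    ("DLPFC", "glia"),
    ("DLPFC", "glia"),
]
kept = filter_genes(toy_counts, toy_groups)
```

Taking the maximum Proportion Zero across groups, rather than the mean, removes genes that are well expressed overall but missing from a single cell type, which is the failure the second toy gene illustrates.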
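
Steps ii-viii of the Rank Invariance calculation can likewise be sketched in plain Python. This follows the poster's wording only; it is not the implementation in the TREG R/Bioconductor package, and the function names are hypothetical. Two details are assumptions: tied values receive their average rank, and the per-cell-type ranking in step vi is ascending (smallest mean absolute difference gets rank 1), so the most stable genes end up with the smallest sums in step viii.

```python
def average_rank(values):
    """Rank values in ascending order (1 = smallest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for pos in range(start, end + 1):
            ranks[order[pos]] = tied_rank
        start = end + 1
    return ranks


def rank_invariance(expr, cell_types):
    """Rank Invariance per gene; higher values mean more rank-invariant.

    expr[g][n] is the expression of gene g in nucleus n, already filtered (step i).
    cell_types[n] is the cell type label of nucleus n.
    """
    n_genes = len(expr)
    rank_sums = [0.0] * n_genes
    for cell_type in sorted(set(cell_types)):  # step vii: repeat per cell type
        nuclei = [n for n, label in enumerate(cell_types) if label == cell_type]

        # Step ii: within each nucleus, rank the genes by expression.
        nucleus_ranks = [
            average_rank([expr[g][n] for g in range(n_genes)]) for n in nuclei
        ]

        # Step iii: rank the genes by their mean expression in this cell type.
        mean_expr = [sum(expr[g][n] for n in nuclei) / len(nuclei) for g in range(n_genes)]
        mean_rank = average_rank(mean_expr)

        # Steps iv-v: mean absolute difference between each nucleus's rank
        # and the rank of the mean, per gene.
        mean_abs_diff = [
            sum(abs(ranks[g] - mean_rank[g]) for ranks in nucleus_ranks) / len(nuclei)
            for g in range(n_genes)
        ]

        # Step vi: rank those differences (assumed ascending).
        diff_rank = average_rank(mean_abs_diff)
        for g in range(n_genes):
            rank_sums[g] += diff_rank[g]

    # Step viii: the smallest sum receives the highest final rank.
    return average_rank([-total for total in rank_sums])


# Toy data: gene 0 is always the top-ranked gene; genes 1 and 2 swap places.
toy_expr = [
    [10, 12, 30, 28],
    [5, 1, 2, 9],
    [1, 5, 9, 2],
]
toy_cell_types = ["neuron", "neuron", "glia", "glia"]
toy_ri = rank_invariance(toy_expr, toy_cell_types)
```

In the toy data the first gene keeps the same rank in every nucleus, so it receives the highest Rank Invariance value, while the two genes that swap ranks tie below it.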