


Data-driven Identification of Total RNA Expression Genes (TREGs) for
Estimation of RNA Abundance in Heterogeneous Cell Types

Louise Huuki Myers Biology of Genomes 22 Poster

Louise Huuki-Myers

May 03, 2022



  1. Data-driven Identification of Total RNA Expression Genes (TREGs) for
    Estimation of RNA Abundance in Heterogeneous Cell Types
    L.A. Huuki-Myers1, K.D. Montgomery1, S.H. Kwon1,2, S.C. Page1, S.C. Hicks3, K.R. Maynard1,4*, L. Collado-Torres1,5*
    1. Lieber Institute for Brain Development, 2. Department of Neuroscience Johns Hopkins School of Medicine, 3. Department of Biostatistics Johns Hopkins Bloomberg
    School of Public Health, 4. Department of Psychiatry and Behavioral Sciences JHSOM, 5. Center for Computational Biology Johns Hopkins University
    Validation of TREGs with smFISH +
    Presenter & poster requests: [email protected], @lahuuki
    Next-generation sequencing technologies have facilitated data-driven
    identification of gene sets with different features including genes with
    stable expression, cell-type specific expression, or spatially variable
    expression. Here, we aimed to define and identify a new class of
    "control" genes called Total RNA Expression Genes (TREGs), which
    correlate with total RNA abundance in heterogeneous cell types of
    different sizes and transcriptional activity. We provide a data-driven
    method to identify TREGs from single cell RNA-sequencing (RNA-seq)
    data, available as an R/Bioconductor package. We demonstrated the
    utility of our method in the postmortem human brain using multiplex
    single molecule fluorescent in situ hybridization (smFISH) and
    compared candidate TREGs against classic housekeeping genes. We
    identified AKT3 as a top TREG across five brain regions, especially in the
    dorsolateral prefrontal cortex.
    Rank Invariance Calculation
    Software tools available as R/BioC package:
    i. Filter out lowly expressed genes.
    ii. Compute the Expression Rank of each nucleus for each gene.
    iii. Calculate the mean gene expression across all nuclei for one cell type, and then its Rank Expression.
    iv. Per gene, find the difference between the Expression Rank in each nucleus of a given cell type and the mean Rank Expression.
    v. Calculate the mean of the absolute Expression Rank differences for each gene.
    vi. Rank the mean absolute Expression Rank differences.
    vii. Repeat steps ii-vi for each cell type.
    viii. Per gene, compute the sum of the previous ranks across all cell types, and then rank these sums across genes such that the highest rank is given to the gene with the smallest sum. This is the final Rank Invariance value.
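The steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the code of the R/Bioconductor package: the function names `rank` and `rank_invariance`, the input layout (a dict mapping each cell type to a list of nuclei, each nucleus being a list of per-gene counts), and the toy data are all assumptions made here. It also assumes that step ii ranks genes within each nucleus, that ties share their average rank, and that step i (filtering) has already been applied.

```python
from statistics import mean


def rank(values):
    """Return average ranks (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def rank_invariance(counts_by_cell_type):
    """Sketch of steps ii-viii; returns one Rank Invariance value per gene."""
    n_genes = len(next(iter(counts_by_cell_type.values()))[0])
    rank_sums = [0.0] * n_genes
    for nuclei in counts_by_cell_type.values():  # step vii: loop over cell types
        # Step ii: rank the genes within each nucleus.
        nucleus_ranks = [rank(nucleus) for nucleus in nuclei]
        # Step iii: mean expression per gene, then its rank across genes.
        gene_means = [mean(n[g] for n in nuclei) for g in range(n_genes)]
        group_ranks = rank(gene_means)
        # Steps iv-v: mean absolute rank difference per gene.
        mean_abs_diff = [
            mean(abs(nr[g] - group_ranks[g]) for nr in nucleus_ranks)
            for g in range(n_genes)
        ]
        # Step vi: rank those differences and accumulate across cell types.
        for g, r in enumerate(rank(mean_abs_diff)):
            rank_sums[g] += r
    # Step viii: the smallest sum receives the highest rank.
    return rank([-s for s in rank_sums])


# Toy data (hypothetical): gene 0 keeps the same rank in every nucleus.
toy = {
    "cell_type_x": [[10, 5, 1], [10, 1, 5], [12, 4, 2], [11, 2, 6]],
    "cell_type_y": [[20, 3, 9], [22, 8, 2], [18, 9, 3]],
}
ri = rank_invariance(toy)
```

In this toy input, gene 0 never changes rank within a nucleus, so it ends up with the highest Rank Invariance value, while the two genes that swap ranks tie below it.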
    Gene Filtering & TREG Selection
    c_{i,j,k,z} is the number of snRNA-seq counts for nucleus z, gene i, cell type j, and brain region k, and n_{j,k} is the number of nuclei for cell type j and brain region k
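The equation these definitions belong to did not survive as text. A plausible reconstruction from the definitions alone, writing c_{i,j,k,z} for the count and n_{j,k} for the number of nuclei (the symbol names and the indicator notation are assumptions), is the fraction of nuclei in a group with a zero count for gene i:

```latex
PZ_{i,j,k} = \frac{1}{n_{j,k}} \sum_{z=1}^{n_{j,k}} \mathbf{1}\left[ c_{i,j,k,z} = 0 \right]
```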
    1. Filter to the top 50% of expressed genes
    2. Calculate Proportion Zero for each gene within each brain region
    and cell type
    3. Filter to genes with a maximum Proportion Zero across groups <
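Steps 2 and 3 can be illustrated with a short Python sketch. The function names, the nested-dict input layout, the gene and group labels, and above all the cutoff value are hypothetical: the threshold is not legible in the text above, so `0.6` is used purely as a placeholder to make the example run.

```python
def proportion_zero(counts):
    """Fraction of nuclei in one group with a zero count for one gene."""
    return sum(1 for c in counts if c == 0) / len(counts)


def filter_by_max_proportion_zero(counts_by_gene, cutoff):
    """Keep genes whose maximum Proportion Zero across groups is below cutoff.

    counts_by_gene: {gene: {group: [count per nucleus, ...]}}, where a group
    stands for one brain region and cell type combination.
    Returns {gene: max Proportion Zero} for the genes that pass.
    """
    kept = {}
    for gene, groups in counts_by_gene.items():
        max_pz = max(proportion_zero(counts) for counts in groups.values())
        if max_pz < cutoff:
            kept[gene] = max_pz
    return kept


# Toy data with hypothetical gene and group names.
toy_counts = {
    "geneA": {"region1_typeX": [1, 0, 2, 3], "region1_typeY": [4, 5, 0, 0]},
    "geneB": {"region1_typeX": [0, 0, 0, 1], "region1_typeY": [2, 3, 1, 1]},
}
PLACEHOLDER_CUTOFF = 0.6  # assumption for illustration only
passing = filter_by_max_proportion_zero(toy_counts, PLACEHOLDER_CUTOFF)
```

Taking the maximum across groups means a gene is dropped as soon as it is mostly absent in any single brain region and cell type combination, even if it is well expressed elsewhere.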
    Properties of candidate TREGs in snRNA-seq data
    • Selected AKT3, ARID1B, and MALAT1 as candidate TREGs from the top Rank Invariance (RI) genes
    • Candidate TREGs showed highly rank-invariant expression compared with the housekeeping (HK) gene POLR2A
    • Candidate TREGs show less variable Expression Rank than the HK gene POLR2A
    • Observed strong linear relationship between TREG expression and
    total nuclear RNA (estimated by the log2 sum of all counts) within
    each cell type
    • Three RNAscope probe combinations (TREG or POLR2A + cell type
    markers) used to test the performance of the genes
    • TREG puncta are observed in 86% or more nuclei and in dynamic
    • AKT3 expression is higher in transcriptionally active gray matter than in
    white matter
    • MALAT1 reads are unreliable
    • Rank Invariance is an effective way to find TREGs in
    sn/scRNA-seq data and can be used to identify TREGs
    relevant to a specific tissue or experimental setting
    • AKT3 is an effective TREG in the human brain, specifically the
    • TREGs represent an important class of genes that could be
    used for a variety of assays and downstream analyses
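The linear relationship between TREG expression and total nuclear RNA noted in the snRNA-seq bullets above can be illustrated with an ordinary least-squares fit. This is a sketch under stated assumptions, not the analysis behind the poster: total RNA per nucleus is taken as the log2 sum of all counts (as described above), the gene is transformed as log2(count + 1) (an assumption), and the function names and toy counts are hypothetical.

```python
import math


def log2_total(nucleus_counts):
    """Estimate total nuclear RNA as the log2 sum of all counts in a nucleus."""
    return math.log2(sum(nucleus_counts))


def least_squares(x, y):
    """Return (slope, intercept) of the ordinary least-squares line y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy nuclei from one cell type (hypothetical); column 0 is the candidate TREG.
nuclei = [[3, 5], [7, 9], [15, 17]]
total_rna = [log2_total(n) for n in nuclei]        # log2 of 8, 16, 32
treg_expr = [math.log2(n[0] + 1) for n in nuclei]  # log2 of 4, 8, 16
slope, intercept = least_squares(total_rna, treg_expr)
```

Fitting within each cell type separately, as the bullet describes, keeps differences in cell size between types from driving the slope.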
    Puncta vs. Total RNA Expression
    Huuki-Myers et al., BioRxiv, 2022, 10.1101/2022.04.28.489923
    Tran, Maynard et al., Neuron, 2021, 10.1101/2020.10.07.329839
    Indica labs, HALO 3.3 FISH-IF
    TREG paper git repo, 10.5281/zenodo.6502303
    Gene      Prop. non-zero in     Mean n
    AKT3      0.92    0.88          4.09
    ARID1B    0.94    0.86          3.08
    MALAT1    1.00    0.98          2.07
    POLR2A    0.30    0.78          2.75
    Gene                      β            sd          Std.
    AKT3                      -5.52        5.18        -1.07
    ARID1B                    -2.63        3.42        -0.77
    MALAT1                    -1.22        1.53        -0.8
    POLR2A                    -3.49        3.34        -1.05
    All genes in snRNA-seq    -21844.07    15560.76    -1.33
    • AKT3 has the most similar slope to total expression measured by snRNA-seq over observable cell
    • RNAscope + TREG allows the comparison of nuclear size and total RNA expression
    Experimental Design
    Proportion Zero Equation
    Louise Huuki-Myers Kelsey D. Montgomery Sang Ho Kwon Stephanie C. Page Stephanie C. Hicks Kristen R. Maynard Leonardo Collado-Torres
