CDC/ATSDR R User Group 2021

2021-01-28 presentation on #spatialLIBD and #recount3 for the CDC/ATSDR R User Group

Leonardo Collado-Torres

January 28, 2021

Transcript

  1. Expanding the resolution of gene
    expression analyses: spatially
    (spatialLIBD) and in numbers (recount3)
    Leonardo Collado-Torres, Ph.D., Investigator
    Lieber Institute for Brain Development
    CDC/ATSDR R User Group
    2021-01-28
    @lcolladotor
    @LieberInstitute
    #recount3
    #spatialLIBD

  2. https://doi.org/10.1016/j.biopsych.2020.06.005

  3. The spatial architecture of the brain is
    fundamentally connected to its function
    chartdiagram.com slideshare.net

  4. Laminar position of a cell influences its gene
    expression, morphology, physiology, and
    function
    Kwan et al., 2012, Development

  5. Single nucleus RNA-sequencing & Visium technologies
    Single Cell Gene Expression
    Spatial Gene Expression

  6. Overview
    1. Identification of layer-enriched genes in human cortex using Visium.
    2. Spatial registration of single-nucleus RNA-seq data from human cortex.
    3. Layer-enriched expression of genes associated with brain disorders.
    Maynard, Collado-Torres, et al, bioRxiv, 2020
    @kr_maynard

  7. Study design for Visium experiments in
    dorsolateral prefrontal cortex (DLPFC)
    Andrew E Jaffe
    Keri Martinowich
    Stephanie C Hicks Lukas M Weber
    Cedric Uytingco Nikhil Rao
    @stephaniehicks @lmwebr @martinowk @andrewejaffe

  8. Visualizing gene expression in a histological context
    (Figure: three panels, each with a logcounts color scale)

  9. 2 pairs of spatially adjacent replicates x 3 subjects = 12 sections
    (Figure labeled PCP4: Subject 1, Subject 2, and Subject 3;
    adjacent spatial replicates (0µm) and adjacent spatial replicates (300µm))

  10. “Pseudo-bulking” collapses data: spot to layer level

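As a minimal sketch of the idea on this slide, the snippet below collapses spot-level counts into one profile per (sample, layer) pair. It assumes that "collapsing" means summing raw counts; the tuple layout, the function name `pseudo_bulk`, and the gene names are hypothetical and are not taken from the talk.

```python
from collections import defaultdict


def pseudo_bulk(spots):
    """Sum spot-level counts into one profile per (sample, layer) pair.

    `spots` is an iterable of (sample_id, layer, counts) tuples, where
    `counts` maps gene -> raw count for that spot (hypothetical layout).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, layer, counts in spots:
        for gene, n in counts.items():
            totals[(sample_id, layer)][gene] += n
    # Convert to plain dicts so the result is easy to inspect.
    return {key: dict(genes) for key, genes in totals.items()}


# Toy data: two L1 spots and one WM spot in sample1, one L1 spot in sample2.
spots = [
    ("sample1", "L1", {"GENE_A": 3, "GENE_B": 0}),
    ("sample1", "L1", {"GENE_A": 1, "GENE_B": 2}),
    ("sample1", "WM", {"GENE_A": 0, "GENE_B": 5}),
    ("sample2", "L1", {"GENE_A": 4}),
]
bulk = pseudo_bulk(spots)
print(bulk[("sample1", "L1")])  # {'GENE_A': 4, 'GENE_B': 2}
```

Each key of `bulk` corresponds to one column of the pseudo-bulked matrix, so the number of columns is at most samples times layers.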

  11. Three statistical models to assess laminar enrichment
    “ANOVA” model: Is any layer different?
    “Enrichment” model: Is one layer > the rest?
    “Pairwise” model: Is layer X > layer Y?

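The three questions above can be illustrated for a single gene with textbook statistics. This is only a sketch: the talk does not say which software or model details were used, so the one-way ANOVA F-statistic and the pooled-variance t-statistic below are stand-ins, and the function names and toy values are hypothetical.

```python
from statistics import mean


def _ss(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def anova_f(groups):
    """'ANOVA' question: one-way F-statistic across all layers."""
    all_values = [v for g in groups.values() for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = sum(_ss(g) for g in groups.values())
    return (between / (k - 1)) / (within / (n - k))


def t_stat(a, b):
    """Pooled-variance two-sample t-statistic for mean(a) - mean(b)."""
    pooled = (_ss(a) + _ss(b)) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / len(a) + 1 / len(b))) ** 0.5


def enrichment_t(groups, layer):
    """'Enrichment' question: one layer versus all other layers pooled."""
    rest = [v for name, g in groups.items() if name != layer for v in g]
    return t_stat(groups[layer], rest)


def pairwise_t(groups, x, y):
    """'Pairwise' question: layer x versus layer y."""
    return t_stat(groups[x], groups[y])


# Toy pseudo-bulked log-expression of one gene, three samples per layer.
groups = {
    "L1": [5.0, 5.2, 4.8],
    "L2": [2.0, 2.1, 1.9],
    "WM": [1.0, 1.2, 0.8],
}
print(anova_f(groups))
print(enrichment_t(groups, "L1"))
print(pairwise_t(groups, "L1", "L2"))
```

A positive enrichment statistic means the layer is higher than the rest, which matches the one-sided "L4>rest" style of result reported on later slides.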

  12. Visium replicates layer-enrichment of previously
    identified layer marker genes
    L4>rest, p=1.74e-09
    L6>WM, p=4.48e-19
    (Figure color scales: logcounts)
    ISH images courtesy of Allen Human Brain Atlas: http://human.brain-map.org/ (Hawrylycz et al., 2012)

  13. Identification & validation of novel layer-enriched genes
    L5>rest, p=4.33e-12
    L6>rest, p=5.05e-12
    L1>rest, p=1.47e-10
    L2>rest, p=9.73e-11

  14. Spatial registration of your sc/snRNA-seq data
    Your sc/snRNA-seq data
    Hodge et al, Nature, 2019
    (Figure: panels (A), (B), and (C); axis labels L1 to L4; scale 0.0 to 0.8)

  15. Spatial registration of your sc/snRNA-seq data
    Your sc/snRNA-seq data
    Our spatial data
    Hodge et al, Nature, 2019
    (Figure: panels (A), (B), and (C); axis labels L1 to L4; scale 0.0 to 0.8)

  16. Identify clusters in your sc/snRNA-seq data
    - Pre-process your sc/snRNA-seq data
    - Identify cell/nuclei clusters
    - Find data-driven marker genes and/or combine with known marker genes
    - Label clusters
    Matthew N Tran Brianna K Barry
    @mattntran

  17. “Pseudo-bulk” our spatial transcriptomics data
    # columns for us:
    12 sections * 7 layers = 84 (76)

  18. “Pseudo-bulk” your sc/snRNA-seq data
    Your sc/snRNA-seq:
    cell or nuclei clusters * subjects or other analysis variables

  19. Three statistical models to assess laminar enrichment
    “ANOVA” model: Is any layer different?
    “Enrichment” model: Is one layer > the rest?
    “Pairwise” model: Is layer X > layer Y?

  20. Spatial registration of your sc/snRNA-seq data
    (Figure (C): correlation heatmap of layers (WM, L6 to L1) versus
    cell/nuclei clusters (Oli, Ast, Mic, Opc, Per, End, Ex, and In);
    color scale −0.8 to 0.8)
    Interpretation guidelines:
    • Find strong positive correlation values (dark green) to identify
    cell/nuclei clusters enriched for a given layer
    • By row: for a given layer
    • By column: for a given cell/nuclei cluster
    Mathys et al, Nature, 2019

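The heatmap described above can be approximated with a small sketch: correlate each layer's "enrichment" t-statistics with each cluster's t-statistics over the genes they share. The slide only speaks of correlation values, so the use of Pearson correlation, the choice of all shared genes, and the names `pearson` and `register` are assumptions made for illustration.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def register(layer_t, cluster_t):
    """Correlate layer and cluster t-statistics across their shared genes.

    Both arguments map a name (layer or cluster) to {gene: t-statistic}.
    Returns {(layer, cluster): correlation}, one heatmap cell per pair.
    """
    result = {}
    for layer, lt in layer_t.items():
        for cluster, ct in cluster_t.items():
            genes = sorted(set(lt) & set(ct))
            result[(layer, cluster)] = pearson(
                [lt[g] for g in genes], [ct[g] for g in genes]
            )
    return result


# Toy t-statistics; real inputs would cover thousands of genes.
layer_t = {
    "L1": {"g1": 4.0, "g2": -1.0, "g3": 0.5},
    "WM": {"g1": -3.0, "g2": 2.0, "g3": 0.0},
}
cluster_t = {
    "Cluster1": {"g1": 3.0, "g2": -2.0, "g3": 0.3},
    "Cluster2": {"g1": -2.5, "g2": 1.5, "g3": 0.2, "g4": 1.0},
}
heatmap = register(layer_t, cluster_t)
for key in sorted(heatmap):
    print(key, round(heatmap[key], 2))
```

Reading the result by row (fixed layer) or by column (fixed cluster) follows the interpretation guidelines on the slide: a strongly positive cell suggests the cluster is enriched for that layer.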

  21. (Figure: correlation heatmap of layers (WM, Layer6 to Layer1) versus
    31 numbered cell/nuclei clusters labeled Oligo, Astro, OPC, Micro, Drop,
    Excit, and Inhib; color scale −0.8 to 0.8)
    Matthew N Tran Brianna K Barry
    @mattntran
    Interpretation guidelines:
    • Find strong positive correlation values (dark green)
    to identify cell/nuclei clusters enriched for a given
    layer
    • By row: for a given layer
    • By column: for a given cell/nuclei cluster

  22. Spatial registration of your sc/snRNA-seq data: DIY
    http://spatial.libd.org/spatialLIBD/

  23. Spatial registration of your sc/snRNA-seq data: DIY
    Save your “enrichment” t-statistics for your sc/snRNA-seq clusters
                     Cluster1  Cluster2  Cluster3
    ENSG00000104419  3         -2        0.3
    ENSG00000184007  1         0.67      4
    …                …         …         …
    Full example table
    https://github.com/LieberInstitute/spatialLIBD/blob/master/data-raw/tstats_Human_DLPFC_snRNAseq_Nguyen_topLayer.csv

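A table shaped like the one on this slide (one row per Ensembl gene ID, one column per cluster) can be written with Python's standard `csv` module. This is a sketch only: the `gene_id` header for the first column and the function name `write_tstats_csv` are assumptions, so compare the output against the full example table linked on the slide before uploading it anywhere.

```python
import csv
import io


def write_tstats_csv(tstats, clusters, handle):
    """Write one row per gene and one column per cluster."""
    writer = csv.writer(handle)
    # The "gene_id" header is an assumption, not confirmed by the slide.
    writer.writerow(["gene_id"] + clusters)
    for gene_id in sorted(tstats):
        writer.writerow([gene_id] + [tstats[gene_id][c] for c in clusters])


# The two complete rows shown on the slide.
tstats = {
    "ENSG00000104419": {"Cluster1": 3, "Cluster2": -2, "Cluster3": 0.3},
    "ENSG00000184007": {"Cluster1": 1, "Cluster2": 0.67, "Cluster3": 4},
}
buffer = io.StringIO()
write_tstats_csv(tstats, ["Cluster1", "Cluster2", "Cluster3"], buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the example self-contained; passing a handle from `open(path, "w", newline="")` instead would save the same table to disk.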

  24. Spatial registration of your sc/snRNA-seq
    data: DIY
    spatial.libd.org/spatialLIBD/
                     Cluster1  Cluster2  Cluster3
    ENSG00000104419  3         -2        0.3
    ENSG00000184007  1         0.67      4
    …                …         …         …

  25. Layer-enriched gene expression profiling
    - Curated lists
    - GWAS/TWAS hits
    - Differential expression
    - …
    Gandal et al, Science, 2018
    SFARI GENE; 2.0 by Abrahams et al, Mol Autism, 2013
    Jaffe et al, Nature Neuroscience, 2020

  26. Layer-enriched gene expression profiling
    (Figure (A), ASD: gene sets SFARI, ASC102, ASD53, DDID49, DE.Up, and
    DE.Down versus layers WM, L6 to L1; color scale 0 to 12.
    Figure (B), SCZD−DE and SCZD−TWAS: gene sets PE.Up, PE.Down, BS2.Up, and
    BS2.Down versus layers WM, L6 to L1; color scale 0 to 12.)
    DIY at
    http://spatial.libd.org/spatialLIBD/
    Autism Spectrum Disorder
    • SFARI: Abrahams et al, Mol Autism, 2013
    • ASC102: Satterstrom et al, Cell, 2020
    Break up into:
    • ASD53: ASD dominant traits
    • DDID49: neurodevelopmental delay

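One common way to ask whether a gene list overlaps a layer's enriched genes more than expected by chance is a one-sided hypergeometric test, which is equivalent to a one-sided Fisher's exact test on the 2x2 overlap table. The slide does not state which test produced the figure, so treat the sketch below as an assumption; the function name `overlap_p_value` and the gene labels are hypothetical.

```python
from math import comb


def overlap_p_value(universe, gene_set, layer_genes):
    """One-sided hypergeometric p-value for an overlap >= the observed one.

    universe: all genes tested; gene_set: e.g. a curated disorder list;
    layer_genes: genes called enriched in one layer.
    """
    universe = set(universe)
    gene_set = set(gene_set) & universe
    layer_genes = set(layer_genes) & universe
    total_genes, set_size, draws = len(universe), len(gene_set), len(layer_genes)
    observed = len(gene_set & layer_genes)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(set_size, i) * comb(total_genes - set_size, draws - i)
        for i in range(observed, min(set_size, draws) + 1)
    )
    return tail / comb(total_genes, draws)


# Toy example: 20 genes, a 5-gene list, 5 layer-enriched genes, 4 shared.
universe = [f"g{i}" for i in range(20)]
gene_set = ["g0", "g1", "g2", "g3", "g4"]
layer_genes = ["g0", "g1", "g2", "g3", "g10"]
p = overlap_p_value(universe, gene_set, layer_genes)
print(p)
```

Running this once per gene list and per layer would fill a grid like the one in the figure, with multiple-testing correction left to the reader.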

  27. Data-driven layer-enriched clustering in the DLPFC
    Spatially-varying genes
    Highly-variable genes
    Spot-level clustering
    Manual layer annotation using spatialLIBD
    • Which samples to use?
    • All samples?
    • Sample by sample then merge?
    • Use image-derived information?
    Stephanie C Hicks Lukas M Weber
    @stephaniehicks @lmwebr

  28. Data-driven layer-enriched clustering in the DLPFC
    SpatialDE by Svensson et al, Nature Methods, 2018
    Are the spatial patterns relevant?
    Remember to inspect your data!

  29. Data-driven layer-enriched clustering in the DLPFC
    SpatialDE by Svensson et al, Nature Methods, 2018
    (Figure axes: “ANOVA” model F-statistics and SpatialDE statistic)

  30. Use known marker genes only
    Use layer-enriched genes (scenario where you have more datasets)
    Only use the data
    Requires >=1 expert
    Benefits from known marker genes (if expressed) & prior knowledge

  31. Data-driven layer-enriched clustering in the DLPFC
    Using spatial coordinates does help in some cases

  32. Explore our spatial data (or adapt for yours) + perform
    spatial registration & gene enrichment analyses
    http://spatial.libd.org/spatialLIBD/

  33. Summary: transcriptome-scale spatial gene
    expression in postmortem human cortex
    http://research.libd.org/spatialLIBD
    Explore the data:

  34. SRA

  35. GTEx TCGA
    slide adapted from Shannon Ellis

  36. (Figure: a gene with exon 1 to exon 4, Isoform 1, Isoform 2, and
    Potential isoform 3; reads and coverage; junctions jx 1 to jx 6;
    Expressed region 1: potential exon 5)
    doi.org/10.12688/f1000research.12223.1
    Collado-Torres et al, F1000Research, 2017

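The figure relates reads, coverage, and expressed regions. As a rough sketch of that relationship, the snippet below computes per-base coverage from read intervals and then reports contiguous stretches at or above a cutoff. Defining an expressed region by a simple coverage cutoff is an assumption made here for illustration, and the names `coverage`, `expressed_regions`, and the toy coordinates are hypothetical.

```python
def coverage(reads, length):
    """Per-base coverage from half-open [start, end) read intervals.

    Assumes 0 <= start < end <= length for every read.
    """
    diff = [0] * (length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    cov, depth = [], 0
    for d in diff[:length]:
        depth += d
        cov.append(depth)
    return cov


def expressed_regions(cov, cutoff):
    """Maximal runs of bases with coverage >= cutoff, as [start, end)."""
    regions, start = [], None
    for pos, depth in enumerate(cov):
        if depth >= cutoff and start is None:
            start = pos
        elif depth < cutoff and start is not None:
            regions.append((start, pos))
            start = None
    if start is not None:
        regions.append((start, len(cov)))
    return regions


# Toy reads on a 15-base stretch of genome.
reads = [(0, 4), (2, 6), (3, 6), (10, 13), (11, 14)]
cov = coverage(reads, 15)
regions = expressed_regions(cov, cutoff=2)
print(cov)      # [1, 1, 2, 3, 2, 2, 0, 0, 0, 0, 1, 2, 2, 1, 0]
print(regions)  # [(2, 6), (11, 13)]
```

A region found this way that does not match any annotated exon is the kind of signal the figure labels as a potential new exon.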

  37. slide adapted from Jeff Leek

  38. https://jhubiostatistics.shinyapps.io/recount/
    Nellore, Collado-Torres, et al, Nature Biotechnology, 2017

  39. recount3: over 700,000 human and mouse RNA-seq
    samples
    http://research.libd.org/recount3-docs/
    Wilks et al, 2021

  40. (no transcribed text)

  41. https://bioconductor.org/packages/recount3

  42. Variation: mostly by tissue with some more variable
    (blood) than others (brain)
    Wilks et al, 2021

  43. Acknowledgements
    Lieber Institute
    Keri Martinowich
    Andrew E. Jaffe
    Brianna K. Barry
    Joseph L. Catallini II
    Matthew N. Tran
    Zachary Besich
    Madhavi Tippani
    Joel E. Kleinman
    Thomas M. Hyde
    Daniel R. Weinberger
    JHU Biostatistics Dept
    Stephanie C. Hicks
    Lukas M. Weber
    JHU Oncology Tissue Services (Kristen Lecksell)
    JHU SKCCC Flow Core (Jessica Gucwa)
    JHU Transcriptomics & Deep Sequencing Core (Linda Orzolek)
    10x Genomics
    Cedric Uytingco
    Stephen R. Williams
    Jennifer Chew
    Yifeng Yin
    Nikhil Rao
    @kr_maynard
    @lcolladotor
    #spatialLIBD
    Interested in working with us?
    Let us know!

  44. expression data for ~700,000 human samples
    (multiple) positions available
    This project involves the Hansen, Langmead, Leek and Battle labs at JHU & the
    Nellore lab at OHSU & the Collado-Torres lab at LIBD
    Contact:
    • Kasper D. Hansen www.hansenlab.org
    • Ben Langmead www.langmead-lab.org/
    • Leonardo Collado-Torres lcolladotor.github.io/
    • Abhinav Nellore nellore.bio/
    • Alexis Battle battlelab.jhu.edu/
    • Jeff Leek jtleek.com/
    • Andrew Jaffe aejaffe.com/
    @chrisnwilks
    #recount3

  45. (no transcribed text)