bsp2-bog18

Transcript

  1. Schizophrenia-associated expression differences between the hippocampus and the dorsolateral prefrontal cortex

    BrainSeq Phase II. L. Collado-Torres (@fellgernon), @LieberInstitute, #BOG18. Andrew Jaffe’s data science team.
  2. THE NEED: URGENT UNMET NEED

    • Neuropsychiatric disorders cost the U.S. economy > $80 billion/year.
    • Neuropsychiatric conditions are the leading cause of disability in young people worldwide.
    • 70% of youth in the juvenile justice system are living with at least one mental health condition.
    • Traumatic brain injury is the leading cause of long-term disability in children and adults younger than 35 years.
    • 1 in 4 experience mental illness in a given year.
    • More veterans die of suicide than in combat, at a rate of 20 suicides per day.
  3. THE SCIENTIFIC FRONTIER: 65+ YEARS WAITING FOR A BREAKTHROUGH

    Molecular targets of all current psychotherapeutic drugs are the same as their 1950s prototypes.
    • 1952: Discovery of the antipsychotic chlorpromazine (DRD2 blockade)
    • 1957: Sputnik I
    • 2018: Elon Musk put a Tesla in space
    • 2018: Antipsychotics for treatment of schizophrenia all work via DRD2 blockade ?
  4. THE SCIENTIFIC FRONTIER: BrainSeq, A Human Brain Genomics Consortium

    [Diagram labels: 2300+ human postmortem brains; 1000+ cell lines from individuals; Genomics + Transcriptomics + Proteomics; Clinical Genetics; Mechanisms of Illness; Animal Models; Neuronal Cell Models; Drug Discovery; New Treatments]
  5. THE SCIENTIFIC FRONTIER: BrainSeq, A Human Brain Genomics Consortium

    [Diagram labels: 2300+ human postmortem brains; 1000+ cell lines from individuals; Genomics + Transcriptomics + Proteomics; Clinical Genetics; Mechanisms of Illness; Animal Models; Neuronal Cell Models; Drug Discovery; New Treatments]
    • BrainSeq Phase I (polyA+): DLPFC, 495 samples. Jaffe et al., bioRxiv, 2017
  6. THE SCIENTIFIC FRONTIER: BrainSeq, A Human Brain Genomics Consortium

    [Diagram labels: 2300+ human postmortem brains; 1000+ cell lines from individuals; Genomics + Transcriptomics + Proteomics; Clinical Genetics; Mechanisms of Illness; Animal Models; Neuronal Cell Models; Drug Discovery; New Treatments]
    • BrainSeq Phase I (polyA+): DLPFC, 495 samples. Jaffe et al., bioRxiv, 2017
    • BrainSeq Phase II (RiboZero): DLPFC, 453 samples; HIPPO, 447 samples
  7. THE SCIENTIFIC FRONTIER: BrainSeq, A Human Brain Genomics Consortium

    [Diagram labels: 2300+ human postmortem brains; 1000+ cell lines from individuals; Genomics + Transcriptomics + Proteomics; Clinical Genetics; Mechanisms of Illness; Animal Models; Neuronal Cell Models; Drug Discovery; New Treatments]
    • BrainSeq Phase I (polyA+): DLPFC, 495 samples. Jaffe et al., Nature Neuroscience, 2018. Accepted ~400 days later.
    • BrainSeq Phase II (RiboZero): DLPFC, 453 samples; HIPPO, 447 samples
  8. DATA: BrainSeq Phase II RNA-seq samples

    All samples:

    | Age group         | DLPFC | HIPPO | Total |
    |-------------------|-------|-------|-------|
    | Adult (age >= 18) | 374   | 370   | 744   |
    | Prenatal          | 29    | 28    | 57    |
    | 0 <= age < 18     | 50    | 49    | 99    |
    | Total             | 453   | 447   | 900   |

    • Non-psychiatric control and schizophrenia-affected individuals
    • Two brain regions: dorsolateral prefrontal cortex and hippocampus
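  9. DATA: BrainSeq Phase II RNA-seq samples by case status

    Control:

    | Age group     | DLPFC | HIPPO | Total |
    |---------------|-------|-------|-------|
    | Adult         | 222   | 238   | 460   |
    | Prenatal      | 29    | 28    | 57    |
    | 0 <= age < 18 | 49    | 48    | 97    |
    | Total         | 300   | 314   | 614   |

    Schizophrenia cases:

    | Age group     | DLPFC | HIPPO | Total |
    |---------------|-------|-------|-------|
    | Adult         | 152   | 132   | 284   |
    | Prenatal      | 0     | 0     | 0     |
    | 0 <= age < 18 | 1     | 1     | 2     |
    | Total         | 153   | 133   | 286   |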
  10. DATA ANALYSIS: Focus on being conservative

    1. Use well-established processing methods
    2. Apply strict expression cutoffs
    3. Use replication when possible
    4. Adjust for RNA quality degradation confounding
       • Using the qSVA method
    5. Avoid potential batch effects
       • Drop problematic samples
    6. Take into account correlation at the individual level
  11. RNA-SEQ APPROACH: BrainSeq Phase II

    [Diagram: prenatal to adult timeline (birth marked); unaffected controls (CONT) and patients with schizophrenia (SZ); RNA sequencing + genotyping (genotypes CC, CA, AA); features: genes, exons, expressed regions, transcripts, junctions; effects studied: age, diagnosis, genotype, + region differences (DLPFC, HIPPO)]
    For public data, check recount2.
  12. DATA ANALYSIS: Main processing steps

    1. Quality check (QC) on raw reads (FastQC)
    2. Failed QC? Then trim reads (Trimmomatic)
    3. Align reads to the genome (HISAT2)
    4. Count features (featureCounts + others)
    5. Calculate coverage (bam2wig)
    6. Quantify transcripts (Salmon)
    7. Create count tables (R)
    8. Genotype samples (samtools + vcftools)
    L. Collado-Torres & Emily E. Burke
  13. DATA ANALYSIS: Main processing steps

    1. Quality check (QC) on raw reads (FastQC)
    2. Failed QC? Then trim reads (Trimmomatic)
    3. Align reads to the genome (HISAT2)
    4. Count features (featureCounts + others)
    5. Calculate coverage (bam2wig)
    6. Quantify transcripts (Salmon)
    7. Create count tables (R): RangedSummarizedExperiment objects
    8. Genotype samples (samtools + vcftools)
    L. Collado-Torres & Emily E. Burke
    Nextflow version in preparation with Winter Genomics
  14. DATA ANALYSIS: Filter features with low expression

    | Feature | Cutoff | Unit  |
    |---------|--------|-------|
    | Gene    | 0.25   | RPKM  |
    | Exon    | 0.30   | RPKM  |
    | Jxn     | 0.46   | RP10M |
    | Tx      | 0.38   | TPM   |

    • Used mean across all 900 samples
    • All analyses with filtered data
    • Cutoffs based on the segmented R package: break-point estimation
    [Plot (Gene panel): number of genes > cutoff versus mean RPKM expression cutoff]
    L. Collado-Torres
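The filtering step above can be sketched as follows. This is a minimal Python illustration, not the team's R code: the function name `filter_low_expression` and the toy matrix are assumptions made for the example, and only the 0.25 RPKM gene cutoff comes from the slide.

```python
GENE_RPKM_CUTOFF = 0.25  # gene-level cutoff quoted on the slide


def filter_low_expression(expr_rows, cutoff):
    """Keep features whose mean expression across all samples exceeds the cutoff.

    expr_rows: list of per-feature lists, one value per sample (hypothetical layout).
    Returns (kept_rows, keep_flags).
    """
    keep_flags = [sum(row) / len(row) > cutoff for row in expr_rows]
    kept_rows = [row for row, keep in zip(expr_rows, keep_flags) if keep]
    return kept_rows, keep_flags


# Toy RPKM matrix: 3 genes x 3 samples (made-up values).
toy_rpkm = [
    [0.10, 0.20, 0.15],  # mean 0.15, dropped
    [0.30, 0.40, 0.50],  # mean 0.40, kept
    [0.00, 0.90, 0.00],  # mean 0.30, kept despite two zero samples
]
filtered, keep = filter_low_expression(toy_rpkm, GENE_RPKM_CUTOFF)
```

The third toy gene shows a consequence of filtering on the mean: a feature expressed in only some samples can still pass.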
  15. DATA ANALYSIS: Differential expression models

    • Region-specific for adult or fetal ages
      • Using only adult samples or only prenatal samples
      • Test for differences between DLPFC and HIPPO
    • Development
      • Linear age splines with breakpoints at developmental stages
      • Test for interaction between age and brain region at these splines
    • Case-control
      • By brain region
      • Test for differences between non-psychiatric controls and individuals with schizophrenia
    • For the first two models, we account for the fact that an individual can have two correlated samples, one for each brain region: limma::duplicateCorrelation()
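  16. DATA ANALYSIS: Differential expression models

    • Region-specific for adult or prenatal ages
      • Alternative: $\mathrm{Expr} = \beta_0 + \mathrm{age} + \mathrm{Sex} + \sum_{i=1}^{5} \mathrm{snpPC}_i + \mathrm{mitoRate} + \mathrm{totalAssignedGene} + \mathrm{RIN} + \mathrm{Region}$
    • Development
      • Alternative: $\mathrm{Expr} = \beta_0 + \mathrm{age} \ast \mathrm{Region} + \mathrm{fetal} \ast \mathrm{Region} + \mathrm{birth} \ast \mathrm{Region} + \mathrm{infant} \ast \mathrm{Region} + \mathrm{child} \ast \mathrm{Region} + \mathrm{teen} \ast \mathrm{Region} + \mathrm{adult} \ast \mathrm{Region} + \mathrm{Sex} + \sum_{i=1}^{5} \mathrm{snpPC}_i + \mathrm{mitoRate} + \mathrm{totalAssignedGene} + \mathrm{RIN} + \mathrm{Region}$
    • Case-control
      • Alternative: $\mathrm{Expr} = \beta_0 + \mathrm{age} + \mathrm{Sex} + \mathrm{mitoRate} + \mathrm{rRNArate} + \sum_{i=1}^{5} \mathrm{snpPC}_i + \mathrm{totalAssignedGene} + \mathrm{RIN} + \mathrm{regionSpecificqSVs} + \mathrm{Diagnosis}$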
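To make the development model's "linear age splines with breakpoints" and their interaction with region more concrete, here is a minimal Python sketch. The knot ages in `HYPOTHETICAL_KNOTS` are placeholders chosen for illustration (the slides do not list the breakpoint ages), the helper names are invented, and the other covariates (Sex, snpPCs, RNA quality terms) are omitted.

```python
# Placeholder breakpoints in years; the real developmental-stage knots are not given here.
HYPOTHETICAL_KNOTS = [0.0, 1.0, 10.0, 20.0]


def linear_spline_row(age, knots):
    """Linear age term plus one hinge term max(age - knot, 0) per breakpoint."""
    return [age] + [max(age - knot, 0.0) for knot in knots]


def design_row(age, is_hippo, knots):
    """Intercept, spline terms, region main effect, and spline-by-region interactions."""
    basis = linear_spline_row(age, knots)
    region = 1.0 if is_hippo else 0.0
    return [1.0] + basis + [region] + [region * term for term in basis]


# One row per sample: (age in years, sample is from HIPPO).
samples = [(-0.5, False), (5.0, True), (30.0, False)]
design = [design_row(age, hippo, HYPOTHETICAL_KNOTS) for age, hippo in samples]
```

Testing the interaction columns jointly is what asks whether the age trajectory differs between the two regions; each hinge term lets the slope change at its breakpoint.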
  17. DATA ANALYSIS: Using BrainSpan for replication, region-specific model

    • Replication criterion: P-value < 0.05 in BrainSpan, consistent direction
    • Similar results for the development model
    [Plot: replication rate (0 to 1) versus p-value threshold (p<0.05, p<0.01, p<0.001, p<1e−04, p<1e−05, p<1e−06); panels: adult, fetal; features: exon, gene, jxn]
    L. Collado-Torres
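The replication criterion above (p < 0.05 in the replication data and the same direction of effect) can be expressed as a small function. This is an illustrative Python sketch with made-up numbers; `replication_rate` and its input layout are assumptions, not the code behind the plot.

```python
def replication_rate(discovery, replication, disc_threshold, rep_threshold=0.05):
    """Fraction of discovery hits that replicate.

    discovery / replication: lists of (p_value, effect) tuples in the same feature order.
    A hit replicates if its replication p-value is below rep_threshold and the
    effect has the same sign in both datasets. Returns None if there are no hits.
    """
    hits = [
        (rep_p, disc_effect * rep_effect > 0)
        for (disc_p, disc_effect), (rep_p, rep_effect) in zip(discovery, replication)
        if disc_p < disc_threshold
    ]
    if not hits:
        return None
    replicated = sum(1 for rep_p, same_sign in hits if rep_p < rep_threshold and same_sign)
    return replicated / len(hits)


# Toy data: four features with (p-value, log fold change) in each dataset.
toy_discovery = [(1e-6, 1.0), (1e-3, -0.5), (0.2, 0.1), (1e-4, 0.8)]
toy_replication = [(0.01, 0.7), (0.03, 0.4), (0.5, 0.2), (0.04, 0.9)]

rate_at_0_01 = replication_rate(toy_discovery, toy_replication, disc_threshold=0.01)
rate_at_1e_5 = replication_rate(toy_discovery, toy_replication, disc_threshold=1e-5)
```

In the toy data the second feature is significant in both datasets but changes sign, so it does not count as replicated.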
  18. DATA ANALYSIS: Region-specific by age results

    • For adults: 1,612 DE genes with Bonferroni < 0.01 that replicate in BrainSpan
    • For prenatal samples: 32 DE genes
    [Venn diagram: DE features grouped by gene id (adult); sets: gene, exon, jxn; counts as extracted: 328, 422, 778, 23, 839, 647, 388]
    [Venn diagram: DE features grouped by gene id (prenatal); sets: gene, exon, jxn; counts as extracted: 23, 6, 2, 0, 3, 0, 3]
    L. Collado-Torres
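For reference, the Bonferroni threshold used above amounts to multiplying each p-value by the number of tests and capping at 1. A minimal Python sketch with made-up p-values (the function name is illustrative):

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: p times the number of tests, capped at 1."""
    n_tests = len(pvalues)
    return [min(p * n_tests, 1.0) for p in pvalues]


toy_pvalues = [1e-5, 0.002, 0.04, 0.5]
adjusted = bonferroni(toy_pvalues)
significant = [p_adj < 0.01 for p_adj in adjusted]  # same cutoff as on the slide
```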
  19. DATA ANALYSIS: Region-specific by age results, examples with top results

    [Boxplots by region (DLPFC, HIPPO); y-axis: log2(CPM + 0.5) − covariate effects removed; panels: Adult, Prenatal]
    • CHCHD10 (ENSG00000250479.8): FDR 4.04e−06
    • GABRQ (ENSG00000268089.2): FDR 2.87e−84
    L. Collado-Torres
  20. DATA ANALYSIS: Region-specific by age results, adult-only enriched biological processes

    [Enrichment dot plot; ontology: BP; dot size: GeneRatio (0.02 to 0.08); color: p.adjust (0.01 to 0.05); columns: G:E:J (714), E:J (583), G:E (260), where G: gene, E: exon, J: exon-exon junction]
    Enriched terms: G−protein coupled serotonin receptor signaling pathway; serotonin receptor signaling pathway; pallium development; cGMP−mediated signaling; small GTPase mediated signal transduction; regulation of small GTPase mediated signal transduction; positive regulation of GTPase activity; behavior; regulation of GTPase activity; axon development; axonogenesis; regulation of membrane potential; regulation of hormone levels; positive regulation of neuron projection development; positive regulation of cell development; regulation of cell morphogenesis; positive regulation of neuron differentiation; regulation of neuron projection development; positive regulation of neurogenesis; positive regulation of nervous system development; cell morphogenesis involved in neuron differentiation
    L. Collado-Torres
  21. DATA ANALYSIS: Development model, similar composition in prenatal across regions by DNAm

    [Plot: cell type proportion by age group (Fetal, Infant, Child, Teens, Adult, 50+); panels: Pluripotency, Glial; region: DLPFC, Hippo]
    Stephen A. Semick
  22. DATA ANALYSIS: Development model results

    • 5,982 (~55%) genes contain differentially expressed exons and splice junctions that replicated in BrainSpan (Bonferroni < 1%)
    [Venn diagram: DE features grouped by gene id; sets: gene, exon, jxn; counts as extracted: 2354, 2260, 1762, 243, 5982, 8501, 1558]
    L. Collado-Torres
  23. DATA ANALYSIS: Development model results

    • Normalized expression over age for GABRD
    [Plot: GABRD (ENSG00000187730.7), p−bonf 0; y-axis: log2(CPM + 0.5); x-axis: age, from post-conception weeks (PCW 14 to 22) through 85 years; regions: DLPFC, HIPPO]
    L. Collado-Torres
  24. DATA ANALYSIS: Development model results

    • Normalized expression over age after removing effect of terms from the null model
    • For more check LieberInstitute/jaffelab::cleaningY() #rstats
    [Plot: GABRD (ENSG00000187730.7), p−bonf 0; y-axis: log2(CPM + 0.5) − covariate effects removed; x-axis: age, from post-conception weeks (PCW 14 to 22) through 85 years; regions: DLPFC, HIPPO]
    L. Collado-Torres
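The idea of plotting expression with "covariate effects removed" can be sketched as: fit the full linear model, then subtract only the fitted contribution of the nuisance columns while leaving the terms of interest in place. The Python below is a standard-library illustration of that idea, not the jaffelab::cleaningY() source; the function names, argument layout and toy data are assumptions.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def remove_covariate_effects(y, design, n_protected):
    """Subtract fitted effects of all design columns after the first n_protected.

    y: expression values for one feature, one per sample.
    design: one row of model terms per sample; protected terms come first.
    """
    n_terms = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(n_terms)]
           for i in range(n_terms)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(n_terms)]
    beta = solve_linear_system(xtx, xty)  # ordinary least squares coefficients
    return [yi - sum(row[k] * beta[k] for k in range(n_protected, n_terms))
            for row, yi in zip(design, y)]


# Toy example: expression = 2 + 1.5 * region + 3 * batch (batch is the nuisance term).
region = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
batch = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
y = [2.0 + 1.5 * r + 3.0 * b for r, b in zip(region, batch)]
design = [[1.0, r, b] for r, b in zip(region, batch)]

cleaned = remove_covariate_effects(y, design, n_protected=2)  # keep intercept and region
```

After cleaning, the toy values depend only on the intercept and region, which is what makes the region difference visible in a plot.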
  25. qSVA WORKFLOW

    Jaffe et al., PNAS, 2017. Slide adapted from Amy Peterson.
    [Scatter plot: Model 1 (6429 genes); x-axis: Log2 FC Degradation; y-axis: Log2 FC Dx]
  26. DEqual: HIPPO

    DEqual plots demonstrate effectiveness of statistical correction. HIPPO, 333 samples.
    • Model 1. Naïve model (6429 genes), r = 0.412:
      $E(Y) = \beta_0 + \beta_1 \mathrm{Dx}$
    [Scatter plot: Log2 FC Dx versus Log2 FC Degradation]
    Slide adapted from Amy Peterson
  27. DEqual: HIPPO

    DEqual plots demonstrate effectiveness of statistical correction. HIPPO, 333 samples.
    • Model 1. Naïve model (6429 genes), r = 0.412:
      $E(Y) = \beta_0 + \beta_1 \mathrm{Dx}$
    • Model 2. Added RNA-quality and demographic covariates (63 genes), r = 0.0712:
      $E(Y) = \beta_0 + \beta_1 \mathrm{Dx} + \beta_2 \mathrm{age} + \beta_3 \mathrm{Sex} + \beta_4 \mathrm{mitoRate} + \beta_5 \mathrm{rRNArate} + \beta_6 \mathrm{geneRate} + \beta_7 \mathrm{RIN} + \sum_{i=1}^{5} \gamma_i \mathrm{snpPC}_i$
    [Scatter plots: Log2 FC Dx (Model 1) and Log2 FC Dx, adj. cov (Model 2), each versus Log2 FC Degradation]
    Slide adapted from Amy Peterson
  28. DEqual: HIPPO

    DEqual plots demonstrate effectiveness of statistical correction. HIPPO, 333 samples.
    • Model 1. Naïve model (6429 genes), r = 0.412:
      $E(Y) = \beta_0 + \beta_1 \mathrm{Dx}$
    • Model 2. Added RNA-quality and demographic covariates (63 genes), r = 0.0712:
      $E(Y) = \beta_0 + \beta_1 \mathrm{Dx} + \beta_2 \mathrm{age} + \beta_3 \mathrm{Sex} + \beta_4 \mathrm{mitoRate} + \beta_5 \mathrm{rRNArate} + \beta_6 \mathrm{geneRate} + \beta_7 \mathrm{RIN} + \sum_{i=1}^{5} \gamma_i \mathrm{snpPC}_i$
    • Model 3. Added qSVs (48 genes), r = -0.00173:
      $E(Y) = \beta_0 + \beta_1 \mathrm{Dx} + \beta_2 \mathrm{age} + \beta_3 \mathrm{Sex} + \beta_4 \mathrm{mitoRate} + \beta_5 \mathrm{rRNArate} + \beta_6 \mathrm{geneRate} + \beta_7 \mathrm{RIN} + \sum_{i=1}^{5} \gamma_i \mathrm{snpPC}_i + \sum_{i} \eta_i \mathrm{qSV}_i$
    [Scatter plots: Log2 FC Dx (Model 1), Log2 FC Dx, adj. cov (Model 2) and Log2 FC Dx, adj qSVs (Model 3), each versus Log2 FC Degradation]
    Slide adapted from Amy Peterson
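The r values quoted for the DEqual plots are correlations between the per-gene diagnosis log2 fold changes and the per-gene degradation log2 fold changes. A minimal Python sketch of that summary, assuming a Pearson correlation (the slides do not state which coefficient is used) and using made-up fold changes:

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy per-gene log2 fold changes (illustrative values only).
log2fc_degradation = [-2.0, -1.0, 0.0, 1.0, 2.0]
log2fc_dx_naive = [-1.1, -0.4, 0.1, 0.6, 0.9]      # tracks degradation: confounded
log2fc_dx_adjusted = [0.1, -0.1, 0.0, -0.1, 0.1]   # no linear trend with degradation

r_naive = pearson_r(log2fc_dx_naive, log2fc_degradation)
r_adjusted = pearson_r(log2fc_dx_adjusted, log2fc_degradation)
```

A correlation near zero after adjustment, as in Model 3 on the slide, suggests the diagnosis effects no longer mirror the degradation signature.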
  29. DATA ANALYSIS: Case-control by region

    • 48 DE genes at FDR <5% in hippocampus and 243 in DLPFC (FDR <5%), suggesting regional heterogeneity of the molecular correlates of schizophrenia diagnosis
    • DLPFC results agree with BrainSeq Phase I
    [Venn diagram (DLPFC FDR10%, HIPPO FDR20%); sets: HIPPO_control, HIPPO_schizo, DLPFC_control, DLPFC_schizo; counts as extracted: 354, 0, 242, 25, 0, 0, 0, 11, 146, 0, 0, 0, 0, 150, 0]
    [Scatter plot: t-statistic DLPFC versus t-statistic DLPFC BSP1, r = 0.809]
    [Scatter plot: t-statistic HIPPO versus t-statistic DLPFC, r = 0.644; legend: DLPFC Ctrl > SCZD, DLPFC Ctrl < SCZD, HIPPO Ctrl < SCZD, HIPPO Ctrl > SCZD]
    L. Collado-Torres & Amy Peterson
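The FDR thresholds above refer to multiple-testing-adjusted p-values. The sketch below assumes the Benjamini-Hochberg procedure, which is a common default but is not named on the slide; the function name and p-values are illustrative.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices from smallest p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


toy_pvalues = [0.01, 0.04, 0.03, 0.5]
fdr = benjamini_hochberg(toy_pvalues)
de_at_5_percent = [q < 0.05 for q in fdr]
```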
  30. DATA ANALYSIS: Case-control by region

    • Only enrichment in Control > SCZD
    • Immune processes
    [Enrichment dot plot; ontology: BP; dot size: GeneRatio (0.04 to 0.10); color: p.adjust (0.005 to 0.025); columns: H_c (136) = HIPPO control > SCZD, D_c (294) = DLPFC control > SCZD]
    Enriched terms: regulation of lymphocyte proliferation; neutrophil chemotaxis; regulation of B cell activation; positive regulation of leukocyte activation; B cell receptor signaling pathway; myeloid leukocyte activation; positive regulation of cell activation; B cell activation; lymphocyte activation; lymphocyte migration; response to organophosphorus; positive regulation of T cell activation; leukocyte chemotaxis; positive regulation of cell adhesion; leukocyte migration; cell chemotaxis; positive regulation of cell−cell adhesion; protein folding; protein folding in endoplasmic reticulum
    L. Collado-Torres & Amy Peterson
  31. DATA ANALYSIS: HIPPO eQTLs

    • 11,237,357 eQTL associations (FDR <1%) across genes, exons and junctions, corresponding to 17,719 genes
    • Includes 26 risk SNPs from PGC2
    • rs7188697 in NDRG4 has also been associated by Watanabe et al., J Clin Psychopharmacol., 2017
    [Venn diagram: eQTLs grouped by gene id; sets: gene, exon, jxn; counts as extracted: 2061, 3183, 2163, 274, 5945, 2915, 1178]
    [Boxplot: NDRG4 chr16:58509353−58512046(+) (Jxn), rs42945; residualized expression by genotype (0, 1, 2); p=5.21e−42]
    [Boxplot: NDRG4 ENST00000565981.5 (Tx), rs42945; residualized expression by genotype (0, 1, 2); p=5.43e−30]
    Emily E. Burke
  32. DATA ANALYSIS: Region-dependent eQTLs

    • 81,837 region-dependent eQTLs (FDR <1%) corresponding to 1,484 genes
    • Includes 5 PGC2 schizophrenia risk loci
    • We will soon update our eQTL browser at http://eqtl.brainseq.org/
    [Venn diagram: eQTLs grouped by SNP id; sets: gene, exon, jxn; counts as extracted: 19394, 10734, 6493, 2622, 3937, 4398, 34259]
    [Venn diagram: eQTLs grouped by gene id; sets: gene, exon, jxn; counts as extracted: 488, 246, 319, 18, 99, 63, 251]
    Emily E. Burke
  33. WRAPPING UP: Summary

    • Used conservative methods/options to reduce false positives
    • Quantified expression at different feature levels
    • Widespread development differences between HIPPO and DLPFC in postnatal life
    • Adapted the qSVA framework for 2 brain regions and set the groundwork for N > 2
    • Results suggest regional specificity for case-control effects, with enrichment towards genes with decreased expression in schizophrenia
    • Potential need to have regionally targeted therapies for schizophrenia because schizophrenia risk seems region-specific
    In progress: pre-print
  34. Acknowledgements

    • Leonardo Collado-Torres
      o [email protected]
      o @fellgernon
    • Emily E. Burke
    • Amy Peterson (JHU MPH class 2018)
      o amy-peterson.github.io
    • Joo Heon Shin
    • Stephen A. Semick
    • Anandita Rajpurohit
    • Courtney Williams
    • Ran Tao
    • Amy Deep-Soboslay
    • Thomas M. Hyde
    • Joel E. Kleinman
    • Daniel R. Weinberger+
    • Andrew E. Jaffe+
      o [email protected]
      o @andrewjaffe
    • BrainSeq Consortium
    • LIBD @lieberinstitute
    More from our team at #BOG18:
    • Poster 48 by Emily E. Burke, Wed 2pm
    • Poster 251 by Stephen A. Semick, Fri 2pm
    Funding
    We are hiring! Multiple positions open