Slide 1

Slide 1 text

Analyzing BrainSeq Phase II and generating the recount-brain resource
Leonardo Collado-Torres, Staff Scientist II, @fellgernon
Data Science I with Andrew Jaffe

Slide 2

Slide 2 text

Questions I’ll answer today • Who am I? • What is the BrainSeq Phase II project? • What is the recount-brain resource?

Slide 3

Slide 3 text

What defines me • Bioinformatics • R and Bioconductor • Reproducibility and best practices • Outreach and community building • Back in 2005 at @LCGUNAM: I like math and coding; biology provides the challenging problems

Slide 4

Slide 4 text

History
2005-2009 Undergrad in Genomic Sciences
2009-2011
2011-2016
August 2016+
Data Science Division Leader
PIs: • Jeff Leek: 2012+ • Andrew Jaffe: 2013+
Ph.D. Biostatistics
Staff Scientist I → II, Data Science Team I, PI: Andrew Jaffe

Slide 5

Slide 5 text

Interests
2008+
• BioC 2008-2011, 2014, 2017
• useR!2013
• rOpenSci unconf 2018
• RStudio::conf 2019
@fellgernon 2010+
@LIBDrstats 2018+
@CDSBMexico 2018+
Defunct: BmoreBiostats, Biostats Cultural Mixers
Guest @RLadiesBmore
Blog: http://lcolladotor.github.io 2011+
FB: 75k, Tw: 66k weekly

Slide 6

Slide 6 text

Software tools I use every day Mercurial/git user 2009+ https://github.com/lcolladotor From: Feb 21, 2019 Communication tool for work and communities

Slide 7

Slide 7 text

My path to LIBD
2005-2009 Undergrad in Genomic Sciences
2009-2011
2011-2016
August 2016+
Data Science Division Leader
PIs: • Jeff Leek: 2012+ • Andrew Jaffe: 2013+
Ph.D. Biostatistics
Staff Scientist, Data Science Team I
Microarrays, little RNA-seq
N = 12 (bacteria)
N = 59
N = 72, potential for 1,000
BSP2: N = 900 + many more

Slide 8

Slide 8 text

Ph.D. overview The goal was to develop statistical methods and software [...] in RNA-seq […]. We applied these methods to further our understanding of neuropsychiatric disorders using the Lieber Institute for Brain Development human brains collection (> 1000 samples).

Slide 9

Slide 9 text

Lieber Institute for Brain Development
• Role: Staff Scientist II
• PI: Andrew Jaffe
Team:
• Staff Scientist: Emily Burke
• Research Associate: Madhavi Tippani
• Grad students: Matt Nguyen, Brianna Barry, Kira Perzel Mandell
• Research Assistant: Nick Eagles
• Recent Alumni: Amanda Price, Stephen Semick
• Close collaborators: Carrie Wright, Nina Rajpurohit
• Role details: like a postdoc (major role in some projects) with some support projects
[Photo: lab at Amy Peterson’s MPH capstone presentation, amy-peterson.github.io]

Slide 10

Slide 10 text

Papers since 2016

Slide 11

Slide 11 text

Papers since 2016 RNA-seq WGBS Biostatistics RNA-seq > 70,000 samples

Slide 12

Slide 12 text

Lieber Institute for Brain Development
Pre-prints:
• DNAm on WGBS across development & cell types biorxiv.org/content/early/2018/09/29/428391
• RNA-seq from stem cells biorxiv.org/content/early/2018/07/31/380758
Peer reviewed:
• RNA-seq DE in Schizophrenia disorder & 2 brain regions doi.org/10.1016/j.neuron.2019.05.013
• miRNA kit prep comparison doi.org/10.1186/s12864
• DNAm and gene DE in Alzheimer’s disease doi.org/10.1007/s00401-019-01966-5
• RNA-seq smoking during pregnancy doi.org/10.1038/s41380-018-0223-1
• RNA-seq DE in Schizophrenia disorder on DLPFC doi.org/10.1038/s41593-018-0197-y
• Histamine signaling in autism spectrum disorder doi.org/10.1038/tp.2017.87

Slide 13

Slide 13 text

Today BrainSeq Phase II brain

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

BrainSeq: A Human Brain Genomics Consortium (Neuron, 2015)
• BrainSeq Phase I: polyA+; DLPFC, 495 samples; Jaffe et al., Nature Neuroscience, 2018
• BrainSeq Phase II: RiboZero; DLPFC, 453 samples; HIPPO, 447 samples; Collado-Torres et al, bioRxiv, 2018 → Neuron, 2019

Slide 16

Slide 16 text

DATA: BrainSeq Phase II RNA-seq samples
All samples
                     DLPFC   HIPPO   total
adult (age >= 18)    374     370     744
prenatal             29      28      57
0 <= age < 18        50      49      99
total                453     447     900
• Non-psychiatric control and schizophrenia affected individuals
• Two brain regions: dorsolateral prefrontal cortex and hippocampus

Slide 17

Slide 17 text

DATA: BrainSeq Phase II RNA-seq samples: by case status
Control
                     DLPFC   HIPPO   total
adult                222     238     460
prenatal             29      28      57
0 <= age < 18        49      48      97
total                300     314     614
Schizophrenia cases
                     DLPFC   HIPPO   total
adult                152     132     284
prenatal             0       0       0
0 <= age < 18        1       1       2
total                153     133     286

Slide 18

Slide 18 text

DATA ANALYSIS: Focus on being conservative
1. Use well established processing methods
2. Apply strict expression cutoffs
3. Use replication when possible
4. Adjust for RNA quality degradation confounding
   • Using the qSVA method
5. Avoid potential batch effects
   • Drop problematic samples
6. Take into account correlation at the individual level

Slide 19

Slide 19 text

RNA-SEQ APPROACH: BrainSeq Phase II
[Diagram: prenatal and adult samples from unaffected controls and patients with schizophrenia go through RNA sequencing and genotyping. Expression is summarized as genes, exons, expressed regions, transcripts and junctions, and is related to age, genotype (CC, CA, AA), diagnosis (SZ vs CONT) and region differences (DLPFC vs HIPPO).]
For public data, check recount2

Slide 20

Slide 20 text

DATA ANALYSIS: Main processing steps
1. Quality check (QC) on raw reads (FastQC)
2. Failed QC? Then trim reads (Trimmomatic)
3. Align reads to the genome (HISAT2)
4. Count features (featureCounts + others)
5. Calculate coverage (bam2wig)
6. Quantify transcripts (Salmon)
7. Create count tables (R)
8. Genotype samples (samtools + vcftools)
Emily E. Burke
Nextflow version in preparation with Winter Genomics and Nick Eagles

Slide 21

Slide 21 text

DATA ANALYSIS: Filter features with low expression
Feature   Cutoff   Unit
Gene      0.25     RPKM
Exon      0.30     RPKM
Jxn       0.46     RP10M
Tx        0.38     TPM
• Used mean across all 900 samples
• All analyses with filtered data
[Figure (Gene): number of features > cutoff (# genes > cutoff, about 20000 to 50000) versus mean expression cutoff (mean RPKM, 0.0 to 1.0)]
Based on the segmented R package: break-point estimation
jaffelab::expression_cutoff()
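The filtering rule on this slide can be sketched in a few lines. This is only an illustration in Python, not the project's R code: the function name `filter_by_mean_expression` and the toy RPKM values are hypothetical, and only the gene-level cutoff of 0.25 RPKM is taken from the slide. Choosing the cutoff itself (the break-point estimation done by jaffelab::expression_cutoff()) is not reproduced here.

```python
from statistics import mean

# Gene-level cutoff from the slide (mean RPKM across samples).
GENE_CUTOFF_RPKM = 0.25


def filter_by_mean_expression(expr, cutoff):
    """Keep features whose mean expression across samples exceeds `cutoff`.

    `expr` maps a feature id to its list of per-sample expression values.
    """
    return {feature: values for feature, values in expr.items()
            if mean(values) > cutoff}


# Toy data (hypothetical): three genes measured in four samples.
toy_rpkm = {
    "geneA": [0.00, 0.10, 0.05, 0.20],  # mean below the cutoff, dropped
    "geneB": [0.30, 0.40, 0.20, 0.50],  # mean above the cutoff, kept
    "geneC": [5.00, 7.50, 6.00, 4.50],  # mean above the cutoff, kept
}

kept = filter_by_mean_expression(toy_rpkm, GENE_CUTOFF_RPKM)
print(sorted(kept))
```

The same function would apply to exons, junctions and transcripts by passing the corresponding cutoff and unit from the table.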

Slide 22

Slide 22 text

DATA ANALYSIS: Differential expression models
• Region-specific for adult or prenatal ages
  • Using only adult samples or only prenatal samples
  • Test for differences between DLPFC and HIPPO
• Development
  • Linear age splines with breakpoints at developmental stages
  • Test for interaction between age and brain region at these splines
• Case-control
  • By brain region
  • Test for differences between non-psychiatric controls and individuals with schizophrenia
• For the first two models, we account for the fact that an individual can have two correlated samples, one for each brain region: limma::duplicateCorrelation()

Slide 23

Slide 23 text

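DATA ANALYSIS: Differential expression models
• Region-specific for adult or prenatal ages
  • Alternative: $$\text{Expr} = \beta_0 + \text{age} + \text{Sex} + \sum_{i=1}^{5} \text{snpPC}_i + \text{mitoRate} + \text{totalAssignedGene} + \text{RIN} + \text{Region}$$
• Development
  • Alternative: $$\text{Expr} = \beta_0 + \text{age} \ast \text{Region} + \text{fetal} \ast \text{Region} + \text{birth} \ast \text{Region} + \text{infant} \ast \text{Region} + \text{child} \ast \text{Region} + \text{teen} \ast \text{Region} + \text{adult} \ast \text{Region} + \text{Sex} + \sum_{i=1}^{5} \text{snpPC}_i + \text{mitoRate} + \text{totalAssignedGene} + \text{RIN} + \text{Region}$$
• Case-control
  • Alternative: $$\text{Expr} = \beta_0 + \text{age} + \text{Sex} + \text{mitoRate} + \text{rRN…ene} + \text{RIN} + \text{regionSpecificqSVs} + \text{Diagnosis}$$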
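The development model combines linear age splines with an interaction with brain region. The sketch below shows, in Python and for illustration only, how such predictors could be built for one sample. The breakpoint ages in `KNOTS` are hypothetical (the slides name the developmental stages but do not give the ages), only three of the stages are used to keep it short, and the remaining covariates (Sex, snpPCs, mitoRate, totalAssignedGene, RIN) are left out.

```python
# Hypothetical breakpoints in years; the slides name the spline terms
# but do not state the ages used.
KNOTS = {"birth": 0.0, "child": 10.0, "adult": 20.0}


def hinge(age, knot):
    """Linear spline (hinge) term: zero before the knot, linear after it."""
    return max(0.0, age - knot)


def design_row(age, is_hippo):
    """Predictors for one sample: main effects plus age-by-region interactions."""
    region = 1.0 if is_hippo else 0.0
    row = {"intercept": 1.0, "age": age, "Region": region}
    for name, knot in KNOTS.items():
        row[name] = hinge(age, knot)
    # Interaction terms: each age-related term times the region indicator.
    for name in ["age", *KNOTS]:
        row[f"{name}:Region"] = row[name] * region
    return row


row = design_row(age=25.0, is_hippo=True)
print(row)
```

With this coding, testing the `:Region` coefficients jointly asks whether the age trajectory differs between HIPPO and DLPFC, which is the interaction test described for the development model.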

Slide 24

Slide 24 text

DATA ANALYSIS: Using BrainSpan for replication: region-specific model
• P-value < 0.05 in BrainSpan, consistent direction
• Similar results for the development model
[Figure: replication rate (0 to 1) versus p-value threshold (p<0.05, p<0.01, p<0.001, p<1e−04, p<1e−05, p<1e−06), with panels by age group (prenatal, adult) and feature type (gene, exon, jxn)]
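The replication criterion (P-value < 0.05 in BrainSpan with a consistent direction) can be written as a small function. This is a sketch in Python under stated assumptions: the function name, the `(effect, p-value)` data layout and the toy numbers are hypothetical, and the denominator is taken to be the discovery features that were also tested in the replication dataset.

```python
def replication_rate(discovery, replication, p_cutoff=0.05):
    """Fraction of discovery features, among those also tested in the
    replication set, that replicate.

    Both arguments map feature id -> (effect estimate, p-value). A feature
    replicates when its replication p-value is below `p_cutoff` and its
    effect has the same sign as in the discovery dataset.
    """
    tested = [f for f in discovery if f in replication]
    if not tested:
        return float("nan")
    hits = 0
    for f in tested:
        disc_effect, _ = discovery[f]
        rep_effect, rep_p = replication[f]
        same_direction = disc_effect * rep_effect > 0
        if rep_p < p_cutoff and same_direction:
            hits += 1
    return hits / len(tested)


# Toy data (hypothetical): four discovery genes, three tested in replication.
discovery = {"g1": (1.2, 1e-8), "g2": (-0.8, 1e-6),
             "g3": (0.5, 1e-7), "g4": (0.9, 1e-9)}
replication = {"g1": (0.7, 0.001),   # same direction, p < 0.05
               "g2": (0.3, 0.01),    # opposite direction
               "g3": (0.4, 0.20)}    # same direction, p too large

rate = replication_rate(discovery, replication)
print(rate)
```

Repeating the calculation while tightening the discovery p-value threshold would give one point per threshold, as in the figure's x-axis.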

Slide 25

Slide 25 text

DATA ANALYSIS: Region-specific by age results
• For adults: 1,612 DE genes with Bonferroni < 0.01 that replicate in BrainSpan
• For prenatal samples: 32 DE genes
[Venn diagram: DE features grouped by gene id (prenatal) across gene, exon and jxn; counts 23, 6, 2, 0, 3, 0, 3]
[Venn diagram: DE features grouped by gene id (adult) across gene, exon and jxn; counts 328, 422, 778, 23, 839, 647, 388]

Slide 26

Slide 26 text

DATA ANALYSIS: Region-specific by age results: adult enriched biological processes by region
G: gene, E: exon, J: exon-exon junction; D: DLPFC, H: HIPPO
[Figure: dot plot of enriched GO terms (ontology: BP) for G.D (609), G.H (573), E.D (1101), E.H (1087), J.D (850), J.H (823); legends: GeneRatio and p.adjust]
Terms shown: cell−matrix adhesion; extracellular matrix organization; ameboidal−type cell migration; microtubule bundle formation; axoneme assembly; cilium movement; establishment of synaptic vesicle localization; synaptic vesicle transport; synaptic vesicle localization; signal release from synapse; neurotransmitter secretion; presynaptic process involved in chemical synaptic transmission; positive regulation of GTPase activity; regulation of GTPase activity; synaptic vesicle exocytosis; neurotransmitter transport; regulation of calcium ion−dependent exocytosis; synaptic vesicle cycle; regulation of synapse organization; positive regulation of synaptic transmission; regulation of neuron projection development; regulation of small GTPase mediated signal transduction; extracellular structure organization; positive regulation of nervous system development; positive regulation of neurogenesis; positive regulation of cell development; axon development; axonogenesis; regulation of hormone levels; cardiac chamber development; cardiac septum development; kidney epithelium development; nephron development; urogenital system development; cardiac chamber morphogenesis; cardiac septum morphogenesis; renal system development; kidney development; regulation of transmembrane transport; actomyosin structure organization; regulation of trans−synaptic signaling; modulation of chemical synaptic transmission; synapse organization; regulation of cell morphogenesis; synaptic transmission, glutamatergic; regulation of synaptic transmission, glutamatergic; potassium ion transmembrane transport; cellular potassium ion transport; potassium ion transport

Slide 27

Slide 27 text

DATA ANALYSIS: Development model: similar composition in prenatal across regions by RNA fraction
[Figure: cell type proportion by age group (Prenatal, Infant, Child, Teen, Adult, 50+) in panels for FQN, Neurons and Oligodendrocytes, by region (DLPFC, HIPPO)]

Slide 28

Slide 28 text

DATA ANALYSIS: Development model: similar composition in prenatal across regions by DNAm
[Figure: cell type proportion by age group (Fetal, Infant, Child, Teens, Adult, 50+) in panels for Pluripotency and Glial, by region (DLPFC, Hippo)]
Stephen A. Semick

Slide 29

Slide 29 text

DATA ANALYSIS: Development model results
• 5,982 (~55%) genes contain differentially expressed exons and splice junctions that replicated in BrainSpan (Bonferroni < 1%)
[Venn diagram: DE features grouped by gene id across gene, exon and jxn; counts 2354, 2260, 1762, 243, 5982, 8501, 1558]

Slide 30

Slide 30 text

DATA ANALYSIS: Development model results
• Normalized expression over age for GABRD
[Figure: log2(CPM + 0.5) versus age (prenatal in PCW, then years up to 85) for ENSG00000187730.7 GABRD in DLPFC and HIPPO; p−bonf 0]

Slide 31

Slide 31 text

DATA ANALYSIS: Development model results
• Normalized expression over age after removing effect of terms from the null model
jaffelab::cleaningY() and jaffelab::agePlotter()
[Figure: log2(CPM + 0.5) with covariate effects removed versus age (prenatal in PCW, then years up to 85) for ENSG00000187730.7 GABRD in DLPFC and HIPPO; p−bonf 0]

Slide 32

Slide 32 text

qSVA WORKFLOW
Jaffe et al., PNAS, 2017
[Figure: Model 1 (6429 genes), Log2 FC Dx versus Log2 FC Degradation]
Slide adapted from Amy Peterson
http://research.libd.org/rstatsclub/2018/12/11/quality-surrogate-variable-analysis/

Slide 33

Slide 33 text

DEqual HIPPO (333 samples)
DEqual plots demonstrate effectiveness of statistical correction
Model 1. Naïve model (6429 genes): $$E(Y) = \beta_0 + \beta_1 \text{Dx}$$ r = 0.412
[Figure: Log2 FC Dx versus Log2 FC Degradation]
Slide adapted from Amy Peterson

Slide 34

Slide 34 text

DEqual HIPPO (333 samples)
DEqual plots demonstrate effectiveness of statistical correction
Model 1. Naïve model (6429 genes): $$E(Y) = \beta_0 + \beta_1 \text{Dx}$$ r = 0.412
Model 2. Added RNA-quality and demographic covariates (63 genes): $$E(Y) = \beta_0 + \beta_1 \text{Dx} + \beta_2 \text{age} + \beta_3 \text{sex} + \beta_4 \text{mitoRate} + \beta_5 \text{rRNArate} + \beta_6 \text{geneRate} + \beta_7 \text{RIN} + \sum_{i=1}^{5} \gamma_i \, \text{snpPC}_i$$ r = 0.0712
[Figure: Log2 FC Dx (Model 1) and Log2 FC Dx, adj. cov (Model 2), each versus Log2 FC Degradation]
Slide adapted from Amy Peterson

Slide 35

Slide 35 text

DEqual HIPPO (333 samples)
DEqual plots demonstrate effectiveness of statistical correction
Model 1. Naïve model (6429 genes): $$E(Y) = \beta_0 + \beta_1 \text{Dx}$$ r = 0.412
Model 2. Added RNA-quality and demographic covariates (63 genes): $$E(Y) = \beta_0 + \beta_1 \text{Dx} + \beta_2 \text{age} + \beta_3 \text{sex} + \beta_4 \text{mitoRate} + \beta_5 \text{rRNArate} + \beta_6 \text{geneRate} + \beta_7 \text{RIN} + \sum_{i=1}^{5} \gamma_i \, \text{snpPC}_i$$ r = 0.0712
Model 3. Added qSVs (48 genes): $$E(Y) = \beta_0 + \beta_1 \text{Dx} + \beta_2 \text{age} + \beta_3 \text{sex} + \beta_4 \text{mitoRate} + \beta_5 \text{rRNArate} + \beta_6 \text{geneRate} + \beta_7 \text{RIN} + \sum_{i=1}^{5} \gamma_i \, \text{snpPC}_i + \sum_{i=1}^{K} \eta_i \, \text{qSV}_i$$ r = -0.00173
[Figure: Log2 FC Dx (Model 1), Log2 FC Dx, adj. cov (Model 2) and Log2 FC Dx, adj qSVs (Model 3), each versus Log2 FC Degradation]
Slide adapted from Amy Peterson
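The r values on the DEqual slides are correlations, across genes, between the diagnosis log2 fold change and the degradation log2 fold change. A minimal sketch of that comparison in Python follows; it assumes a Pearson correlation, and the fold changes are made-up toy values rather than BrainSeq results.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy per-gene log2 fold changes (hypothetical values).
degradation_lfc = [-1.0, -0.5, 0.0, 0.5, 1.0]
dx_lfc_naive = [-0.4, -0.3, 0.1, 0.2, 0.5]      # tracks degradation
dx_lfc_adjusted = [0.1, -0.1, 0.0, -0.1, 0.1]   # no longer tracks it

r_naive = pearson(dx_lfc_naive, degradation_lfc)
r_adjusted = pearson(dx_lfc_adjusted, degradation_lfc)
print(r_naive, r_adjusted)
```

A correlation near zero after adjustment is the pattern the slide reads as effective correction: the diagnosis effects no longer look like degradation effects.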

Slide 36

Slide 36 text

DATA ANALYSIS: Case-control: by region
• 48 DE genes at FDR <5% in hippocampus, 243 in DLPFC (FDR <5%)
[Figure: t−statistic HIPPO versus t−statistic DLPFC, r = 0.276; highlighted genes: DLPFC FDR<10%, HIPPO FDR<20%]
Amy Peterson
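The gene counts above depend on an FDR threshold. The slide does not name the FDR procedure, so the sketch below assumes the widely used Benjamini-Hochberg adjustment; it is written in Python for illustration and the p-values are toy numbers.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values (hypothetical), one per gene.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
significant = [pi for pi, qi in zip(p, q) if qi < 0.05]
print(q)
print(significant)
```

Counting the entries of `significant` is the analogue of "DE genes at FDR <5%" for one region.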

Slide 37

Slide 37 text

DATA ANALYSIS: Case-control: by region
[Figure: t−statistic DLPFC versus t−statistic BSP1, r = 0.413]

Slide 38

Slide 38 text

DATA ANALYSIS: Case-control: by region
• BP enrichment in Control > SCZD at gene level
  • Immune processes
g: gene, e: exon, j: exon-exon junction; D: DLPFC, H: HIPPO; C: control, S: SCZD
[Figure: dot plot of enriched GO terms (ontology: BP) for HgC (137), HeC (306), DgC (297), DeS (250); legends: GeneRatio and p.adjust]
Terms shown: axon development; axonogenesis; regulation of axonogenesis; regulation of neuron projection development; neutrophil chemotaxis; regulation of B cell proliferation; regulation of B cell activation; positive regulation of B cell activation; B cell receptor signaling pathway; B cell activation; lymphocyte activation; regulation of leukocyte cell−cell adhesion; regulation of cell−cell adhesion; positive regulation of alpha−beta T cell activation; regulation of leukocyte activation; positive regulation of lymphocyte activation; positive regulation of cell activation; positive regulation of leukocyte activation; leukocyte migration; positive regulation of leukocyte cell−cell adhesion; positive regulation of T cell activation; leukocyte chemotaxis; response to organophosphorus; ATF6−mediated unfolded protein response; positive regulation of cell adhesion; cell chemotaxis; positive regulation of cell−cell adhesion; protein folding; protein folding in endoplasmic reticulum

Slide 39

Slide 39 text

DLPFC
Gene   Individual 1   Individual 2
1      10             15
2      5              22
…
HIPPO
Gene   Individual 1   Individual 2
1      10             17
2      6              16
…
Individual 1: correlation( (10, 5, …), (10, 6, …) ) =~ 0.9
Individual 2: correlation( (15, 22, …), (17, 16, …) ) =~ 0.7
Control vs SCZD
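The per-individual correlation illustrated above can be sketched as follows. This is an illustrative Python version, assuming a Pearson correlation across features; the first two values per individual come from the slide, while the remaining values (hidden behind the "…") are hypothetical, so the resulting correlations are not the slide's 0.9 and 0.7.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def coherence(dlpfc, hippo):
    """Per-individual correlation of expression across features between
    the two brain regions, for individuals present in both tables."""
    return {ind: pearson(dlpfc[ind], hippo[ind])
            for ind in dlpfc if ind in hippo}


# Expression per individual across four genes. Genes 1 and 2 are from the
# slide; genes 3 and 4 are hypothetical fillers.
dlpfc = {"Individual 1": [10, 5, 8, 3], "Individual 2": [15, 22, 9, 4]}
hippo = {"Individual 1": [10, 6, 9, 2], "Individual 2": [17, 16, 3, 9]}

per_individual = coherence(dlpfc, hippo)
print(per_individual)
```

Comparing the distribution of these per-individual values between controls and individuals with schizophrenia is the Control vs SCZD contrast named on the slide.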

Slide 40

Slide 40 text

Decreased coherence in SCZD
P-values: Gene: 0.0164; Exon: 0.0499; Jxn: 1.72 × 10^-5; Tx: 0.00992
[Figure: correlation (about 0.54 to 0.57) by SCZD diagnosis (Control, SCZD); cleaned expr (keeping Dx), jxnRp10m; p−value: 1.72e−05]

Slide 41

Slide 41 text

DATA ANALYSIS: HIPPO eQTLs
• 11,237,357 eQTL associations (FDR <1%) across genes, exons and junctions corresponding to 17,719 genes
• Includes 60 risk SNPs from PGC2
Emily E. Burke
[Venn diagram: eQTLs grouped by gene id across gene, exon and jxn; counts 2061, 3183, 2163, 274, 5945, 2915, 1178]
[Figure, panels E and F: residualized expression of a chr2 junction (Jxn) by genotype (0, 1, 2) for SNPs rs74563533:58250433:G:A and rs75575209:58138192:A:T; p=5.75e−34, p=3.82e−81]

Slide 42

Slide 42 text

DATA ANALYSIS: Region dependent eQTLs
• 205,618 region-dependent eQTLs (FDR <1%) corresponding to 1,484 genes
• Includes 3 PGC2 schizophrenia risk loci
Emily E. Burke
[Venn diagram: eQTLs grouped by SNP id across gene, exon and jxn; counts 19394, 10734, 6493, 2622, 3937, 4398, 34259]
[Venn diagram: eQTLs grouped by gene id across gene, exon and jxn; counts 488, 246, 319, 18, 99, 63, 251]

Slide 43

Slide 43 text

DATA ANALYSIS: Region dependent eQTLs
[Figure: residualized expression for groups 0.D, 1.D, 2.D, 0.H, 1.H, 2.H; overlaps gene NRGN, chr11:124742693−124745500(+) (Jxn), rs12293670:124612932:A:G; p=5.27e−26]

Slide 44

Slide 44 text

DATA ANALYSIS: Region dependent eQTLs
[Figure: four panels of residualized expression for groups 0.D, 1.D, 2.D, 0.H, 1.H, 2.H (D: DLPFC, H: HIPPO)]
A: LRP1 ENSE… (Exon), rs324015, p=3.91e−05
B: overlaps gene NRGN, chr… (Jxn), rs12293670:124612932:A:G, p=5.27e−26
C: overlaps gene NGEF, chr… (Jxn), rs4144797:233562197:T:C, p=3.5e−14
D: LRP1, chr… (Jxn), rs324015, p=2.02e−06
These 3 SNPs are also meQTLs (Jaffe et al, 2016)

Slide 45

Slide 45 text

Overall eQTLs at FDR < 1% - rAggr eQTL analysis with PGC2 risk SNPs
                                       Hippocampus   DLPFC
# unique SNPs (unique index SNPs)      5510 (103)    6780 (116)
# unique features                      1731          2525
# Unique genes                         123           171
# Unique transcripts                   244           332
# Unique exons                         857           1363
# Unique junctions                     507           659
[Venn diagram of index SNPs: DLPFC only 21, HIPPO only 8, both 95]
In BSP2: 163/179 (91.1%)
FDR <1%: 124/163 (76.1%)
Emily E. Burke
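The counts on this slide can be cross-checked with a little arithmetic. The Python sketch below assumes that the three Venn numbers are the index SNPs with an eQTL in DLPFC only, in HIPPO only, and in both regions; that reading is an assumption, chosen because it reproduces the per-region index SNP counts in the table.

```python
# Venn counts of index SNPs with an eQTL at FDR < 1% (assumed reading).
dlpfc_only, hippo_only, both = 21, 8, 95

dlpfc_index_snps = dlpfc_only + both               # index SNPs with a DLPFC eQTL
hippo_index_snps = hippo_only + both               # index SNPs with a HIPPO eQTL
either_region = dlpfc_only + hippo_only + both     # eQTL in at least one region

# Risk SNPs present in BSP2 out of all risk SNPs considered.
in_bsp2, total_risk_snps = 163, 179

pct_in_bsp2 = 100 * in_bsp2 / total_risk_snps
pct_with_eqtl = 100 * either_region / in_bsp2

print(dlpfc_index_snps, hippo_index_snps, either_region)
print(round(pct_in_bsp2, 1), round(pct_with_eqtl, 1))
```

The per-region feature totals can be checked the same way, since unique genes, transcripts, exons and junctions add up to the number of unique features in each column.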

Slide 46

Slide 46 text

• We will soon update our eQTL browser at http://eqtl.brainseq.org/ Bill Ulrich

Slide 47

Slide 47 text

TWAS Gusev et al, Nature Genetics, 2016

Slide 48

Slide 48 text

TWAS
LD reference
SNP ID      chromosome   position
rs101       12           1
rs102       13           10000
…
LIBD BSP2 SNPs
SNP ID      chromosome   position
rs101:A:T   12           1
rs102       13           10005
…
GWAS summary SNPs (CLOZUK+PGC2)
SNP ID      chromosome   position
rs101       12           1
rs102       13           10000
…
SNP counts shown: 1,190,321; 82,393,211; 6,500,475
https://github.com/LieberInstitute/brainseq_phase2/blob/master/twas/README.md

Slide 49

Slide 49 text

TWAS
LD reference
SNP ID      chromosome   position
rs101       12           1
rs102       13           10000
…
LIBD BSP2 SNPs
SNP ID      chromosome   position
rs101:A:T   12           1
rs102       13           10005
…
GWAS summary SNPs (CLOZUK+PGC2)
SNP ID      chromosome   position
rs101       12           1
rs102       13           10000
…
SNP counts shown: 1,190,321; 82,393,211; 6,500,475
https://github.com/LieberInstitute/brainseq_phase2/blob/master/twas/README.md

Slide 50

Slide 50 text

TWAS
LD reference
SNP ID      chromosome   position
rs101:A:T   12           1
rs102       13           10005
…
LIBD BSP2 SNPs
SNP ID      chromosome   position
rs101:A:T   12           1
rs102       13           10005
…
GWAS summary SNPs (CLOZUK+PGC2)
SNP ID      chromosome   position
rs101:A:T   12           1
rs102       13           10005
…
SNP counts shown: 1,022,527; 1,022,546; 4,316,013
https://github.com/LieberInstitute/brainseq_phase2/blob/master/twas/README.md
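The TWAS slides show three SNP tables whose ids and positions disagree at first and agree afterwards. The Python sketch below is one guess at the idea, not the procedure documented in the linked README: it assumes SNPs are matched on the rs number before any ':' suffix, that the LIBD BSP2 id and position are adopted everywhere, and that only SNPs present in all three tables are kept. The function names are hypothetical.

```python
def base_rsid(snp_id):
    """Strip any ':allele' suffix, e.g. 'rs101:A:T' -> 'rs101'."""
    return snp_id.split(":")[0]


def harmonize(libd, ld_reference, gwas):
    """Return the shared SNPs, named and positioned as in the LIBD table.

    Each table is a list of (snp_id, chromosome, position) tuples. Matching
    on the rs number alone is an assumption made for this sketch.
    """
    libd_by_rsid = {base_rsid(snp): (snp, chrom, pos)
                    for snp, chrom, pos in libd}
    shared = (set(libd_by_rsid)
              & {base_rsid(snp) for snp, _, _ in ld_reference}
              & {base_rsid(snp) for snp, _, _ in gwas})
    return [libd_by_rsid[rsid] for rsid in sorted(shared)]


# Toy rows taken from the slides.
libd = [("rs101:A:T", 12, 1), ("rs102", 13, 10005)]
ld_reference = [("rs101", 12, 1), ("rs102", 13, 10000)]
gwas = [("rs101", 12, 1), ("rs102", 13, 10000)]

harmonized = harmonize(libd, ld_reference, gwas)
print(harmonized)
```

On the toy rows this yields the same two entries for all three sources, which is the state the last of the three TWAS table slides depicts.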

Slide 51

Slide 51 text

TWAS: Features with TWAS weights
        DLPFC            HIPPO
gene    7,683 (32.8%)    5,866 (25.1%)
exon    61,678 (16.2%)   47,583 (12.5%)
jxn     33,754 (12.5%)   27,088 (10.1%)
tx      14,118 (15.9%)   11,567 (13%)
Removed sex, snp PCs 1-5, expr PCs; kept diagnosis
jaffelab::cleaningY()

Slide 52

Slide 52 text

TWAS
[Figure: TWAS Z by brain region, DLPFC versus HIPPO, in panels for gene, exon, jxn and tx and for Other versus Risk Locus; legends: in both (FALSE, TRUE) and FDR <5% (None, DLPFC, HIPPO, Both)]

Slide 53

Slide 53 text

No content

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

No content

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

No content

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

No content

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

No content

Slide 72

Slide 72 text

No content

Slide 73

Slide 73 text

No content

Slide 74

Slide 74 text

No content

Slide 75

Slide 75 text

No content

Slide 76

Slide 76 text

No content

Slide 77

Slide 77 text

No content

Slide 78

Slide 78 text

No content

Slide 79

Slide 79 text

No content

Slide 80

Slide 80 text

No content

Slide 81

Slide 81 text

No content

Slide 82

Slide 82 text

No content

Slide 83

Slide 83 text

No content

Slide 84

Slide 84 text

No content

Slide 85

Slide 85 text

No content