defense2016

Leonardo Collado-Torres

June 20, 2016

Transcript

  1. Note that the podium blocks the
    view!

  2. Annotation-agnostic differential
    expression and binding analyses
    Leonardo Collado-Torres
    @fellgernon

  3. Motivating problem: identify and validate
    regions of the genome that change expression
    during brain development

  4. Research theme
    - annotation-agnostic
    - reproducible analyses
    - easily accessible data
    - statistical tools for the genomics community

  5. Measuring gene expression: RNA-seq
    [Figure: genome (DNA), RNA transcripts (many possible variants), and RNA-seq reads]
    Adapted from @jtleek

  6. Common analysis pipelines:
    • Feature counting (gene or exon level)
    • Transcript assembly
    [Figure: reads mapped to the genome (DNA)]
    Adapted from @jtleek

  7. Challenges in counting
    http://www-huber.embl.de/users/anders/HTSeq/doc/count.html

  8. Annotation variation
    Frazee et al, Biostatistics, 2014

  9. DER finder approach
    • Find contiguous base pairs with Differential Expression (DE) signal → DE Regions, or DERs
    • Find nearest annotated feature
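
The "find nearest annotated feature" step can be illustrated with a small sketch. This is not derfinder's implementation; the function names, the inclusive coordinate convention and the toy features are assumptions made for illustration only.

```python
# Hypothetical helper names; coordinates are inclusive and on one chromosome.

def gap(region, feature):
    """Distance in bp between a region and a feature (0 if they overlap)."""
    r_start, r_end = region
    f_start, f_end, _name = feature
    return max(0, f_start - r_end, r_start - f_end)


def nearest_feature(region, features):
    """Return (distance_bp, feature_name) for the closest annotated feature."""
    best = min(features, key=lambda f: gap(region, f))
    return gap(region, best), best[2]


# Toy annotation, invented for this example.
features = [(100, 200, "geneA"), (500, 650, "geneB"), (900, 1000, "geneC")]

near_a = nearest_feature((220, 260), features)   # sits 20 bp past geneA
inside_b = nearest_feature((600, 700), features)  # overlaps geneB
```

A linear scan is used for clarity; a real annotation with many features would call for an interval index instead.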

  10. Read coverage
    [Figure: reads aligned to the genome (DNA) and the resulting coverage vector, e.g. 2 6 0 11 6]
    Adapted from @jtleek
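
A coverage vector counts, for every base, how many reads overlap it. The sketch below shows one way to compute it; the read coordinates are invented and assumed to be 0-based, half-open intervals on a single chromosome.

```python
import numpy as np


def coverage_vector(reads, length):
    """Per-base read depth for reads given as (start, end), end exclusive."""
    diff = np.zeros(length + 1, dtype=np.int64)
    for start, end in reads:
        diff[start] += 1   # depth goes up where a read starts
        diff[end] -= 1     # and down just past where it ends
    return np.cumsum(diff)[:length]


# Toy reads, invented for this example.
reads = [(0, 3), (1, 4), (1, 2), (3, 5)]
cov = coverage_vector(reads, 5)
```

The difference-array trick keeps the cost proportional to the number of reads plus the chromosome length, rather than reads times read length.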

  11. Collado-Torres et al, bioRxiv, 2016
    GTEx data
    [Figure: coverage (axis 0 to 64) in liver, heart and testis on chr6, x-axis ticks 31978500 to 31981500, with gene and transcript (Tx) tracks. Region annotation: STK19, 0 bp from tss: covers]

  12. Detect artifacts
    [Figure: mean coverage (axis 0 to 2) for groups Neo.F, Neo.A, notNeo.F, notNeo.A, CBC.F, CBC.A on chr3, x-axis ticks 57124000 to 57130000, with gene and transcript (Tx) tracks; a coverage dip is highlighted. Region annotation: IL17RD, 72831 bp from tss: overlaps 3']

  13. Input data
    n samples →
    ~348 million nt
    11.24%
    coverage
    Rows with at least 1 sample with coverage > 5
    Adapted from @jtleek
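
The row filter on this slide (keep bases where at least 1 sample has coverage > 5) can be sketched as follows. The matrix layout (bases as rows, samples as columns) follows the slide; the function name and the toy values are assumptions.

```python
import numpy as np


def filter_bases(coverage, cutoff=5):
    """Keep rows (bases) where at least one sample has coverage > cutoff."""
    keep = (coverage > cutoff).any(axis=1)
    return coverage[keep], np.flatnonzero(keep)


# Toy matrix: 4 bases x 3 samples, invented for this example.
coverage = np.array([
    [0, 1, 0],
    [6, 2, 3],
    [5, 5, 5],   # exactly 5 is not > 5, so this row is dropped
    [0, 0, 9],
])
filtered, positions = filter_bases(coverage)
```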

  14. Finding DERs by expressed regions
    • Find regions of the genome with expression data
    • Compute coverage matrix
    • Apply statistical tests
    • Null model
    • Alternative Model
    i: expressed region
    j: sample
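
The model equations are not in the transcript, so the following is only a generic sketch of comparing a null and an alternative linear model per expressed region i across samples j with an F-statistic. The design matrices and values are invented, and this is not necessarily the test used in the talk.

```python
import numpy as np


def f_statistics(y, null_design, alt_design):
    """F-statistic per region for nested linear models.

    y: regions x samples matrix (e.g. log2 coverage per expressed region).
    null_design, alt_design: samples x parameters, null nested in alt.
    """
    def rss(design):
        beta, *_ = np.linalg.lstsq(design, y.T, rcond=None)
        resid = y.T - design @ beta
        return (resid ** 2).sum(axis=0)

    n = y.shape[1]
    p0, p1 = null_design.shape[1], alt_design.shape[1]
    rss0, rss1 = rss(null_design), rss(alt_design)
    return ((rss0 - rss1) / (p1 - p0)) / (rss1 / (n - p1))


# Toy data: 2 regions, 6 samples in 2 groups (invented values).
group = np.array([0, 0, 0, 1, 1, 1])
y = np.array([
    [1.0, 2.0, 3.0, 11.0, 12.0, 13.0],  # differs between groups
    [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],     # same in both groups
])
null_design = np.ones((6, 1))                      # intercept only
alt_design = np.column_stack([np.ones(6), group])  # intercept + group
f = f_statistics(y, null_design, alt_design)
```

A large F for a region means the alternative model fits its coverage much better than the null.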

  15. Project size is increasing
    [Figure: project size in base-pairs over the years; y-axis log10 base-pairs per project (ticks 7 to 12); x-axis labels 2009 (11), 2010 (46), 2011 (121), 2012 (235), 2013 (408), 2014 (625), 2015 (548), 2016 (18)]
    https://jhubiostatistics.shinyapps.io/recount/

  16. GTEx: DERs via expressed regions
    • Data: 3 tissues (liver, testis, heart), 8 samples each
    • Align with
    • Identify expressed regions with derfinder
    – Adjust coverage (40 million)
    – Find expressed regions (cutoff 5)
    – Discard ERs < 9 bp
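
The three derfinder sub-steps listed above can be sketched in a few lines. This is an illustration, not derfinder's code: it assumes the adjustment scales each sample to a library size of 40 million reads and that the cutoff applies to the mean adjusted coverage across samples; the function name and toy data are hypothetical.

```python
import numpy as np


def expressed_regions(coverage, library_sizes, target=40e6, cutoff=5, min_width=9):
    """Return expressed regions as (start, end) with end exclusive.

    coverage: bases x samples matrix of raw per-base coverage.
    """
    # Adjust coverage to a common library size (assumed: simple scaling).
    adjusted = coverage * (target / np.asarray(library_sizes, dtype=float))
    above = adjusted.mean(axis=1) > cutoff
    # Locate run boundaries of consecutive bases above the cutoff.
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[::2], edges[1::2]
    # Discard regions shorter than min_width bp.
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_width]


# Toy data: 30 bases, 2 samples with different library sizes (invented).
raw = np.zeros((30, 2))
raw[2:13] = [4, 16]    # 11 bp run, mean adjusted coverage 8
raw[20:25] = [4, 16]   # 5 bp run, too short to keep
regions = expressed_regions(raw, library_sizes=[20e6, 80e6])
```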

  17. Presence of intronic ERs
    Can strictly intronic ERs differentiate tissues?

  18. PCs differentiate tissues

  19. Differential intronic ERs | exonic ERs

  20. Differential intronic ERs | exonic ERs
    [Figure: coverage (axis 0 to 128) in liver, heart and testis on chr22, x-axis ticks 40951000 to 40952500, with gene and transcript (Tx) tracks. Region annotation: RBX1, 395 bp from tss: inside]

  21. Simulation setup
    3 replicates:
    2 groups, each with 5 samples
    ~2 million paired-end reads for chr17
    1/6 high, 1/6 low in group 2 vs group 1
    Annotation:
    complete
    missing 20% of transcripts (8.28% exons)
    Reference set:
    3868 exons that overlap only 1 transcript
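
The group design (2 groups of 5 samples, 1/6 of transcripts high and 1/6 low in group 2 vs group 1) could be set up roughly as below. The fold-change magnitudes, the number of transcripts and the function name are assumptions; the slide does not state them.

```python
import numpy as np


def assign_status(n_transcripts, seed=0):
    """Label 1/6 of transcripts 'high', 1/6 'low', the rest 'same'."""
    rng = np.random.default_rng(seed)
    n_high = n_low = n_transcripts // 6
    n_same = n_transcripts - n_high - n_low
    status = np.array(["high"] * n_high + ["low"] * n_low + ["same"] * n_same)
    rng.shuffle(status)  # randomize which transcripts get which label
    return status


status = assign_status(600)                 # 600 transcripts is a made-up number
group = np.repeat([1, 2], 5)                # 2 groups, 5 samples each
fold_change = {"high": 2.0, "low": 0.5, "same": 1.0}  # hypothetical magnitudes
group2_multiplier = np.array([fold_change[s] for s in status])
```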

  22. Simulation results
    • Similar power to methods that have complete annotation
    • Methods with incorrect annotation lose a lot of power
    • Higher empirical FDR/FPR
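
For reference, empirical power, FDR and FPR can be computed from the simulated truth and a method's calls as in this sketch. The truth and call vectors are invented and are not results from the talk.

```python
def empirical_rates(truth, called):
    """Empirical power, FDR and FPR from boolean truth and call vectors."""
    tp = sum(t and c for t, c in zip(truth, called))
    fp = sum((not t) and c for t, c in zip(truth, called))
    fn = sum(t and (not c) for t, c in zip(truth, called))
    tn = sum((not t) and (not c) for t, c in zip(truth, called))
    power = tp / (tp + fn) if tp + fn else 0.0   # true positives among real DE
    fdr = fp / (tp + fp) if tp + fp else 0.0     # false calls among all calls
    fpr = fp / (fp + tn) if fp + tn else 0.0     # false calls among real non-DE
    return power, fdr, fpr


# Toy example: 4 truly DE exons, 6 not DE (invented values).
truth = [True, True, True, True, False, False, False, False, False, False]
called = [True, True, True, False, True, False, False, False, False, False]
power, fdr, fpr = empirical_rates(truth, called)
```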

  23. Identifying brain development DERs
    Discovery data: Fetal, Infant, Child, Teen, Adult, 50+; 6 / group, N = 36
    Models: Null and Alt [model equations were images on the slide]
    Initial results: 63,135 DERs
    Final results: 50,650 DERs replicated
    Jaffe et al, Nat. Neuroscience, 2015

  24. Age-associated DERs lack regional specificity
    in the human brain
    BrainSpan data
    Jaffe et al, Nat. Neuroscience, 2015

  25. Expression changes across development may
    represent a changing neuronal phenotype
    [Figure: proportion of cells]
    Jaffe et al, Nat. Neuroscience, 2015
    Estimation method: Houseman et al, BMC Bioinformatics, 2012

  26. Collado-Torres et al, F1000Research, 2015
    regionReport
    chr   start  end   strand  p-value
    chr1  1000   2000  +       0.9
    chr2  5000   8000  -       0.001
    chr3  2468   2668  +       0.051
    ...
    chrX  6000   6300  +       0.009
    chrX  6500   6800  -       0.5
    Genomic workflow: identify regions
    renderReport: (A) default, (B) custom
    (C) derfinderReport
    (D) DESeq2Report, (E) edgeReport
    Create HTML/PDF report

  27. Collado-Torres et al, F1000Research, 2015
    regionReport

  28. Collado-Torres et al, F1000Research, 2015
    Interactive HTML reports
    (A)
    (B) Clickable buttons:
    show/hide code

  29. Collado-Torres et al, F1000Research, 2015
    Interactive HTML reports
    (A)
    (B)

  30. Chromatin immunoprecipitation seq.
    http://assets.illumina.com/content/dam/illumina-marketing/images/techniques/large/web-graphic-chipseq-workflow-large.jpg

  31. Common ChIP-seq analysis pipeline
    [Flowchart]
    (A) Group 1: Sample 1, Sample 2, ..., Sample N_1 → call peaks per sample → number of unique peaks U_11, U_12, ..., U_1N_1 → Total: U_1
    (A) Group 2: Sample 1, Sample 2, ..., Sample N_2 → call peaks per sample → number of unique peaks U_21, U_22, ..., U_2N_2 → Total: U_2
    (B) Merge unique peaks. Total peaks: U_merged ≤ U_1 + U_2
    (C) Determine differentially bound peaks between groups 1 & 2. Total tests: U_merged
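
The merge step, and why U_merged ≤ U_1 + U_2, can be seen in a small sketch. Peaks are assumed to be (start, end) intervals on one chromosome, with touching or overlapping peaks merged; the peak coordinates are invented.

```python
def merge_peaks(peak_sets):
    """Merge overlapping peaks from several peak sets into one list."""
    intervals = sorted(p for peaks in peak_sets for p in peaks)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous merged peak: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(m) for m in merged]


# Toy unique peaks per group (invented values).
group1 = [(100, 200), (400, 500)]
group2 = [(150, 250), (700, 800)]
merged = merge_peaks([group1, group2])
```

Here two peaks collapse into one, so 4 input peaks yield 3 merged peaks, and each merged peak is then one statistical test.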

  32. Common ChIP-seq analysis pipeline
    [Figure: two coverage panels (axis 0 to 256; coverage bins (−1,0], (0,1], (1,10], (10,20], (20,30], (30,100]) on chr8, x-axis ticks 145287800 to 145288800, with gene and transcript (Tx) tracks. (A) Merged peak, (B) Non-overlapping peaks. Region annotation: MROH1, 84229 bp from tss: inside]
    Shulha et al, PLoS Genetics 2013

  33. derfinder applied to ChIP-seq data
    • Find regions of the genome with binding data
    • Smooth binding signal
    • Compute coverage matrix
    • Apply statistical tests
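
The slide does not say how the binding signal is smoothed. As an illustration only, the sketch below uses a running mean, which is an assumption and may differ from what was actually used; the window size and signal are invented.

```python
import numpy as np


def smooth(signal, window=3):
    """Running-mean smoother (edges are padded with zeros by np.convolve)."""
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="same")


# Toy per-base binding signal (invented values).
signal = np.array([0, 0, 9, 0, 0, 3, 3, 3], dtype=float)
smoothed = smooth(signal)
```

Smoothing spreads an isolated spike over its neighbours, so a later cutoff yields fewer, wider regions than the raw signal would.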

  34. Brain ChIP-seq data
    EpiMap project

  35. Regions with binding signal
    [Venn diagrams: DBRs overlap with Ensembl v75 features (exon, intergenic, intron), one panel each for H3K4me3 and H3K27ac]
    Region counts as extracted, H3K4me3 panel: 0, 6170, 5963, 0, 711, 8269, 923, 8720
    Region counts as extracted, H3K27ac panel: 0, 155897, 57577, 0, 8426, 36591, 2833, 7125

  36. Variation mostly explained
    [Figure: percentage variance explained (axis 0 to 100) in all regions, by covariate: brain region, cell type, age at death, hemisphere, PMI, pH, sex, height, BMI, chromatin amount, mapped reads, individual, flow cell batch, library batch, residual variation]
    H3K4me3 data: all regions with binding signal
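
One way to obtain a "percentage variance explained" per covariate for a single region is a sequential sum-of-squares decomposition, sketched below. The slide does not state the method, so the sequential (order-dependent) approach, the function name and the toy data are all assumptions.

```python
import numpy as np


def variance_explained(y, covariates):
    """Percent of variance explained by each covariate, added in order.

    y: 1D array of values for one region across samples.
    covariates: dict mapping name -> array with one row per sample.
    """
    n = len(y)
    design = np.ones((n, 1))                       # start with the intercept
    total = ((y - y.mean()) ** 2).sum()
    prev_rss = total
    out = {}
    for name, cols in covariates.items():
        design = np.column_stack([design, cols])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = ((y - design @ beta) ** 2).sum()
        out[name] = 100 * (prev_rss - rss) / total  # drop in residual SS
        prev_rss = rss
    out["Residual variation"] = 100 * prev_rss / total
    return out


# Toy data for one region across 6 samples (invented values).
y = np.array([1.0, 2.0, 3.0, 11.0, 12.0, 13.0])
covariates = {
    "Cell type": np.array([0, 0, 0, 1, 1, 1]),
    "Sex": np.array([0, 1, 0, 1, 0, 1]),
}
pct = variance_explained(y, covariates)
```

Because the decomposition is sequential, the share attributed to a covariate depends on where it enters the order.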

  37. Differentially bound regions
    [Figure: coverage (axis 0 to 512) for groups A:N−:−, A:N+:−, A:N−:+, A:N+:+, D:N−:−, D:N+:−, D:N−:+, D:N+:+ on chr12, x-axis ticks 11652500 to 11654500, with gene and transcript (Tx) tracks. Region annotation: PRB2, 0 bp from tss: overlaps 5']

  38. Differentially bound regions
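    [Venn diagrams: DBRs by main covariates (brain region, cell type, age at death), one panel each for H3K4me3 and H3K27ac]
    H3K4me3 DBRs by main covariates, region counts as extracted: 0, 1, 14604, 209, 0, 0, 48, 0
    H3K27ac DBRs by main covariates, region counts as extracted: 0, 2, 178869, 1235, 0, 0, 156, 0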

    View full-size slide





























































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































  39. Variation in differentially bound regions
    H3K4me3 data: only differentially bound regions
    Differential binding by brain region, cell type or age at death
    [Figure: percentage of variance explained (0 to 100) across all regions, by covariate: brain region, cell type, age at death, hemisphere, PMI, pH, sex, height, BMI, chromatin amount, mapped reads, individual, flow cell batch, library batch, residual variation]

  40. •  Resource with data from 2,040 projects
    •  Aligned with
    •  Total RNA-seq samples:
    49,657 + 9,662 = 59,319
    https://jhubiostatistics.shinyapps.io/recount/

  41. recount: via the web
    https://jhubiostatistics.shinyapps.io/recount/

  42. Motivating problem: identify and validate
    regions of the genome that change expression
    during brain development
    1.  derfinder permits discovery of novel expressed
    & differentially bound regions
    2.  we identified & validated gene expression
    changes in the developing brain
    3.  we have developed tools for reproducible,
    shareable reporting
    4.  these tools can easily be used to process
    2,040 projects via recount

  43. References + software + code
    •  Collado-Torres, et al. bioRxiv (2016) doi:10.1101/015370
    –  http://bioconductor.org/packages/derfinder
    –  http://leekgroup.github.io/derSupplement/
    •  Collado-Torres, et al. F1000Research (2015) doi:10.12688/f1000research.6379.1
    –  http://www.bioconductor.org/packages/regionReport
    –  http://leekgroup.github.io/regionReportSupp/
    •  Collado-Torres and Nellore, et al. in prep
    –  https://github.com/leekgroup/recount
    –  https://jhubiostatistics.shinyapps.io/recount
    •  Nellore, Collado-Torres, et al. bioRxiv (2015) doi:10.1101/019067
    –  rail.bio
    •  Nellore, …, Collado-Torres, et al. bioRxiv (2016) doi:10.1101/038224
    –  intropolis.rail.bio
    •  Jaffe, Shin, Collado-Torres, et al. Nat. Neurosci. (2015) doi:10.1038/nn.3898
    –  https://github.com/lcolladotor/libd_n36
    –  https://github.com/leekgroup/enrichedRanges

  44. Acknowledgements
    Committee members
    Jeffrey Leek
    Daniele Fallin
    Kasper Hansen
    Alexis Battle
    Andrew Jaffe
    Hongkai Ji
    Fernando Pineda
    Collaborators
    Alyssa Frazee
    Abhinav Nellore
    Michael Love
    Ben Langmead
    Rafael Irizarry
    Funding
    NIH
    LIBD
    JHU-Biostats
    CONACyT México

  45. Single-base F-statistics
    •  Null model
    •  Alternative model
    •  F-statistic
    i: base-pair
    j: sample
    Collado-Torres et al, bioRxiv, 2015
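
The equations on this slide did not survive in the transcript. As a rough guide only, a per-base comparison of two nested linear models usually takes the form below. The notation is an assumption and is not taken from the slide: y_{ij} is the (possibly transformed) coverage at base-pair i in sample j, x_{jl} are the covariates of interest, z_{jk} are adjustment covariates, n is the number of samples, and p_0 = K + 1 and p_1 = K + L + 1 are the numbers of parameters in the null and alternative models.

```latex
% Sketch under assumed notation; RSS_{0i} and RSS_{1i} are the residual sums
% of squares of the null and alternative fits at base-pair i.
\begin{align*}
\text{Null model:} \quad
  & y_{ij} = \alpha_i + \sum_{k=1}^{K} \gamma_{ik} z_{jk} + \varepsilon_{ij} \\
\text{Alternative model:} \quad
  & y_{ij} = \alpha_i + \sum_{l=1}^{L} \beta_{il} x_{jl}
    + \sum_{k=1}^{K} \gamma_{ik} z_{jk} + \varepsilon_{ij} \\
\text{F-statistic:} \quad
  & F_i = \frac{(\mathrm{RSS}_{0i} - \mathrm{RSS}_{1i}) / (p_1 - p_0)}
               {\mathrm{RSS}_{1i} / (n - p_1)}
\end{align*}
```

A large F_i suggests that the covariates of interest explain coverage at base-pair i beyond what the adjustment covariates already explain.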

  46. Single-base F-statistics
    Collado-Torres et al, bioRxiv, 2015
    BrainSpan data
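
To make the idea of a single-base F-statistic concrete, here is a minimal NumPy sketch that fits a null and an alternative linear model at every base-pair and compares their residual sums of squares. It is not the derfinder implementation: the function name `single_base_f_stats`, the toy coverage values, and the two-group design are all hypothetical and chosen only for illustration.

```python
import numpy as np


def single_base_f_stats(coverage, null_design, alt_design):
    """Per-base F-statistics for nested linear models (illustrative sketch).

    coverage:    array of shape (n_bases, n_samples)
    null_design: array of shape (n_samples, p0)
    alt_design:  array of shape (n_samples, p1); must contain the null model
    """
    y = np.asarray(coverage, dtype=float).T  # samples x bases
    n = y.shape[0]
    p0 = np.linalg.matrix_rank(null_design)
    p1 = np.linalg.matrix_rank(alt_design)

    def rss(design):
        # Least-squares fit for all bases at once, then residual sum of squares per base.
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return np.sum(resid ** 2, axis=0)

    rss0 = rss(null_design)
    rss1 = rss(alt_design)
    return ((rss0 - rss1) / (p1 - p0)) / (rss1 / (n - p1))


# Toy data (made up): 2 base-pairs, 6 samples in two groups of 3.
group = np.array([0, 0, 0, 1, 1, 1], dtype=float)
coverage = np.array([
    [2, 3, 2, 3, 2, 3],    # base 0: little difference between groups
    [1, 2, 1, 9, 10, 9],   # base 1: clear difference between groups
])
null_design = np.ones((6, 1))                       # intercept only
alt_design = np.column_stack([np.ones(6), group])   # intercept + group

f_stats = single_base_f_stats(coverage, null_design, alt_design)
print(f_stats)
```

In this toy setup the second base-pair gets a much larger F-statistic than the first. Runs of consecutive bases with large statistics are the kind of signal a region-level method would then group together.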

  47. Compare DERs vs annotation
    Collado-Torres et al, bioRxiv, 2015
    BrainSpan data

  48. Jaffe et al, Nat. Neuroscience, 2015

  49. Widespread differential expression of novel
    transcriptional activity
    Jaffe et al, Nat. Neuroscience, 2015

  50. DERs validate: Cytosolic vs total mRNA
    fractions
    Jaffe et al, Nat. Neuroscience, 2015

  51. Age-associated DERs are conserved in the
    developing mouse cortex
    Mouse cerebral cortex, comparing E17 (N=4) to
    adult (N=3) C57BL/6 mice
    Data from Dillman 2013
    Jaffe et al, Nat. Neuroscience, 2015

  52. BrainSpan data
    Coverage data from BrainSpan:
    http://download.alleninstitute.org/brainspan/MRF_BigWig_Gencode_v10/
    Total N samples: 487
    Samples per brain region:
    CBC: 28, MD: 24, STR: 28, AMY: 31, HIP: 32, DFC: 34, VFC: 30, MFC: 32,
    OFC: 30, M1C: 25, S1C: 26, IPC: 33, A1C: 30, STC: 35, ITC: 33, V1C: 33
