gbs2015

Leonardo Collado-Torres

October 22, 2015

Transcript

  1. Annotation-agnostic differential
    expression analysis
    Leonardo Collado-Torres
    @fellgernon

  2. motivating problem: identify and validate
    regions of the genome that change
    expression during brain development

  3. Measuring gene expression: RNA-seq
    Figure labels: Genome (DNA); RNA transcripts (many possible variants); RNA-seq reads
    Adapted from @jtleek

  4. Challenges in counting
    http://www-huber.embl.de/users/anders/HTSeq/doc/count.html

  5. Annotation variation
    Frazee et al, Biostatistics, 2014

  6. DER finder approach
    •  Find contiguous base pairs with Differential Expression signal → DE Regions or DERs
    •  Find nearest annotated feature
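
The two steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the derfinder implementation: the function names, the 0-based end-exclusive coordinates, the definition of a region's area as the sum of its per-base statistics, and the toy features are all assumptions made for the example.

```python
def find_regions(stats, cutoff):
    """Return (start, end, area) for each run of consecutive positions whose
    statistic exceeds the cutoff. Coordinates are 0-based and end-exclusive;
    area is taken here to be the sum of the statistics in the run."""
    regions = []
    start = None
    for pos, value in enumerate(stats):
        if value > cutoff:
            if start is None:
                start = pos  # a new run begins
        elif start is not None:
            regions.append((start, pos, sum(stats[start:pos])))
            start = None
    if start is not None:  # a run that reaches the end of the vector
        regions.append((start, len(stats), sum(stats[start:])))
    return regions


def nearest_feature(region, features):
    """features: list of (start, end, name), 0-based and end-exclusive.
    Return (name, distance); distance is 0 for overlapping or adjacent
    features, otherwise the number of bases separating them."""
    r_start, r_end = region[0], region[1]
    best = None
    for f_start, f_end, name in features:
        dist = max(0, f_start - r_end, r_start - f_end)
        if best is None or dist < best[1]:
            best = (name, dist)
    return best


# Toy per-base statistics and a hypothetical annotation.
stats = [1.0, 25.0, 30.0, 2.0, 0.5, 22.0, 1.0]
features = [(0, 2, "exonA"), (10, 12, "exonB")]

for region in find_regions(stats, cutoff=20.0):
    print(region, nearest_feature(region, features))
```

Because regions are built from the signal alone and the annotation is consulted only afterwards, a region far from any annotated feature is still reported.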

  7. Read coverage
    Figure labels: Genome (DNA); coverage vector (2 6 0 11 6)
    Adapted from @jtleek
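
As a rough illustration of what a coverage vector is, the sketch below counts how many aligned reads overlap each base. The function name, the toy reads, and the 0-based end-exclusive coordinates are assumptions for the example; reads are assumed to lie within the sequence.

```python
def coverage_vector(reads, length):
    """reads: list of (start, end) alignments, 0-based and end-exclusive.
    Return the number of reads overlapping each of `length` positions."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    coverage = []
    running = 0
    for delta in diff[:length]:
        running += delta
        coverage.append(running)
    return coverage


# Four toy reads over a 5-base stretch of the genome.
reads = [(0, 3), (1, 4), (1, 2), (3, 5)]
print(coverage_vector(reads, 5))  # [1, 3, 2, 2, 1]
```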

  8. Jaffe et al, Nat. Neuroscience, 2015

  9. Single-base F-statistics
    •  Null model
    •  Alternative Model
    •  F-statistic
    i: base-pair
    j: sample
    Collado-Torres et al, bioRxiv, 2015
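
The model equations on this slide did not survive in the transcript, so the sketch below only illustrates the general idea of a per-base F-statistic comparing two nested models. It assumes the simplest possible pair: an intercept-only null model and an alternative model with one mean per group. Any coverage transformation or extra covariates used in the actual analysis are left out, and the function name is hypothetical.

```python
def f_statistic(values, groups):
    """F-statistic for one base i, comparing an intercept-only null model
    against a one-mean-per-group alternative. values[j] is the (possibly
    transformed) coverage of sample j and groups[j] is its group label."""
    n = len(values)
    grand_mean = sum(values) / n
    rss_null = sum((y - grand_mean) ** 2 for y in values)

    by_group = {}
    for y, g in zip(values, groups):
        by_group.setdefault(g, []).append(y)
    rss_alt = 0.0
    for ys in by_group.values():
        mean_g = sum(ys) / len(ys)
        rss_alt += sum((y - mean_g) ** 2 for y in ys)

    p_null = 1              # parameters in the null model (intercept)
    p_alt = len(by_group)   # parameters in the alternative model
    if rss_alt == 0:
        # Perfect fit under the alternative: avoid dividing by zero.
        return float("inf") if rss_null > 0 else 0.0
    return ((rss_null - rss_alt) / (p_alt - p_null)) / (rss_alt / (n - p_alt))


groups = ["fetal", "fetal", "adult", "adult"]
coverage_by_base = [
    [1.0, 3.0, 5.0, 7.0],  # group means differ
    [4.0, 6.0, 5.0, 5.0],  # group means are equal
]
print([f_statistic(row, groups) for row in coverage_by_base])  # [8.0, 0.0]
```

Running this once per base yields a vector of F-statistics along the genome, which is the input to the region-finding step.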

  10. Single-base F-statistics
    Collado-Torres et al, bioRxiv, 2015
    BrainSpan data

  11. Compare DERs vs annotation
    Collado-Torres et al, bioRxiv, 2015
    BrainSpan data

  12. Input data
    Figure labels: coverage matrix with n samples as columns; ~348 million nt; 11.24%
    Rows with at least 1 sample with coverage > 5
    Adapted from @jtleek
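
The filter described on this slide keeps a base only if at least one sample has coverage above 5 there. A minimal sketch, with a hypothetical function name and a toy matrix whose rows are bases and whose columns are samples:

```python
def filter_bases(coverage_rows, cutoff=5):
    """Keep (position, row) pairs where at least one sample exceeds the cutoff."""
    return [
        (pos, row)
        for pos, row in enumerate(coverage_rows)
        if max(row) > cutoff
    ]


coverage_rows = [
    [0, 1, 2],  # dropped: no sample above 5
    [6, 0, 0],  # kept: one sample above 5
    [5, 5, 5],  # dropped: 5 is not strictly greater than 5
    [3, 9, 7],  # kept
]
kept = filter_bases(coverage_rows)
print([pos for pos, _ in kept])  # [1, 3]
```

Keeping the original position next to each retained row lets later steps map results back to genome coordinates.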

  13. Finding DERs by expressed-regions

  14. Simulation
    similar in power, yet allows new
    discoveries

  15. Identifying brain development DERs
    Discovery data
    Age groups: Fetal, Infant, Child, Teen, Adult, 50+
    6 per group, N = 36
    Models
    Null:
    Alt:
    Cutoff
    20.509 (corresponds to p-value 10^-8)
    Details
    •  Rank DERs by area
    •  1000 permutations
    •  Control FWER (≤ 5%) by max area per permutation
    Results
    63,135 DERs
    Jaffe et al, Nat. Neuroscience, 2015
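
One common way to control the family-wise error rate with a max-statistic permutation scheme is sketched below. It assumes the maximum region area has already been recorded for each permutation, and it computes each adjusted p-value as the fraction of permutations whose maximum area is at least the observed area. The function name and toy numbers are illustrative, and the exact formula used in the analysis may differ.

```python
def fwer_adjusted_pvalues(observed_areas, null_max_areas):
    """For each observed region area, return the fraction of permutations
    whose maximum area is greater than or equal to it."""
    n_perm = len(null_max_areas)
    return [
        sum(1 for m in null_max_areas if m >= area) / n_perm
        for area in observed_areas
    ]


observed_areas = [120.0, 40.0, 15.0]      # toy areas of three DERs
null_max_areas = [30.0, 50.0, 20.0, 10.0]  # toy maxima from four permutations

adjusted = fwer_adjusted_pvalues(observed_areas, null_max_areas)
print(adjusted)                        # [0.0, 0.25, 0.75]
print([p <= 0.05 for p in adjusted])   # [True, False, False]
```

Comparing every region against the per-permutation maximum is what makes the resulting threshold a family-wise one: a region is significant only if it beats the largest null region in at least 95% of permutations.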

  16. Replicating DERs
    Replication data
    Age groups: Fetal, Infant, Child, Teen, Adult, 50+
    6 per group, N = 36
    Models
    Null:
    Alt:
    Cutoff
    p-value < 0.05
    Details
    •  Per sample and per DER calculate average expression
    •  Single F-statistic per DER
    Results
    50,650 DERs replicated
    Jaffe et al, Nat. Neuroscience, 2015
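
The summarization step, one average expression value per sample and per DER, might look like the sketch below. The function name, the dictionary layout, and the 0-based end-exclusive region coordinates are assumptions for the example.

```python
def mean_coverage_per_region(coverage_by_sample, regions):
    """coverage_by_sample: dict mapping sample name to per-base coverage.
    regions: list of (start, end), 0-based and end-exclusive.
    Return a dict mapping sample name to one mean coverage value per region."""
    return {
        sample: [sum(cov[start:end]) / (end - start) for start, end in regions]
        for sample, cov in coverage_by_sample.items()
    }


coverage_by_sample = {
    "sample1": [0, 2, 4, 6, 8, 10],
    "sample2": [1, 1, 1, 1, 1, 1],
}
regions = [(1, 3), (3, 6)]
print(mean_coverage_per_region(coverage_by_sample, regions))
# {'sample1': [3.0, 8.0], 'sample2': [1.0, 1.0]}
```

The result is a regions-by-samples table, so a single F-statistic per DER can then be computed from one row of values instead of one test per base.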

  17. Jaffe et al, Nat. Neuroscience, 2015

  18. Widespread differential expression of novel
    transcriptional activity
    Jaffe et al, Nat. Neuroscience, 2015

  19. DERs validate: Cytosolic vs total mRNA
    fractions
    Jaffe et al, Nat. Neuroscience, 2015

  20. BrainSpan data
    Samples per brain region:
    CBC: 28, MD: 24, STR: 28, AMY: 31, HIP: 32, DFC: 34,
    VFC: 30, MFC: 32, OFC: 30, M1C: 25, S1C: 26, IPC: 33,
    A1C: 30, STC: 35, ITC: 33, V1C: 33
    Total N samples: 487
    Coverage data from BrainSpan:
    http://download.alleninstitute.org/brainspan/MRF_BigWig_Gencode_v10/

  21. Age-associated DERs lack regional specificity
    in the human brain
    BrainSpan data
    Jaffe et al, Nat. Neuroscience, 2015

  22. Proportion of Cells
    Expression changes across development may
    represent a changing neuronal phenotype
    Jaffe et al, Nat. Neuroscience, 2015
    Estimation method: Houseman et al, BMC Bioinformatics, 2012

  23. LIBD Human DLPFC Development
    •  UCSC “Track Hub”
    Jaffe et al, Nat. Neuroscience, 2015

  24. GTEx: expressed regions
    •  Data: 3 tissues, 12 samples each
    •  Align with
    •  Identify expressed regions with derfinder
    –  Adjust coverage (40 mi)
    –  Find expressed regions (cutoff 5)
    –  Discard ERs < 9 bp
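
The three sub-steps above can be sketched as follows. Several things here are assumptions rather than facts from the slide: "40 mi" is read as a target library size of 40 million reads, the adjustment is a simple rescaling by mapped reads, and the cutoff is applied to the mean adjusted coverage across samples. The function and parameter names are hypothetical and are not the derfinder API.

```python
from itertools import groupby


def expressed_regions(coverage_by_sample, mapped_reads,
                      target_size=40_000_000, cutoff=5, min_width=9):
    """Return (start, end) expressed regions, 0-based and end-exclusive.

    coverage_by_sample: dict mapping sample name to per-base coverage.
    mapped_reads: dict mapping sample name to its number of mapped reads.
    """
    # 1. Adjust each sample's coverage to a common library size (assumption).
    adjusted = [
        [c * target_size / mapped_reads[sample] for c in cov]
        for sample, cov in coverage_by_sample.items()
    ]
    # 2. Mean adjusted coverage per base across samples.
    mean_cov = [sum(column) / len(adjusted) for column in zip(*adjusted)]
    # 3. Runs of bases above the cutoff, discarding runs shorter than min_width.
    regions = []
    pos = 0
    for above, run in groupby(mean_cov, key=lambda value: value > cutoff):
        width = len(list(run))
        if above and width >= min_width:
            regions.append((pos, pos + width))
        pos += width
    return regions


coverage_by_sample = {
    "s1": [4] * 10 + [0] * 3 + [4] * 5,
    "s2": [16] * 10 + [0] * 3 + [16] * 5,
}
mapped_reads = {"s1": 20_000_000, "s2": 80_000_000}
print(expressed_regions(coverage_by_sample, mapped_reads))  # [(0, 10)]
```

In this toy input both samples rescale to a coverage of 8 in the covered stretches. The 10-base run is kept and the 5-base run is discarded for being shorter than 9 bp.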

  25. Presence of intronic ERs
    •  221,246 ERs
    –  160,817 strictly exonic (73%)
    –  26,740 exonic + intronic (12%)
    –  22,375 strictly intronic (10%)
    •  Can strictly intronic ERs differentiate tissues?
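
The percentages on this slide follow directly from the counts, as this small check shows. The dictionary keys and function name are illustrative. The slide does not say what the ERs outside the three listed categories are, so the remainder is reported without a label.

```python
def percentages(counts, total):
    """Return each count as a whole-number percentage of the total."""
    return {name: round(100 * n / total) for name, n in counts.items()}


total_ers = 221246
counts = {
    "strictly exonic": 160817,
    "exonic + intronic": 26740,
    "strictly intronic": 22375,
}

print(percentages(counts, total_ers))
# {'strictly exonic': 73, 'exonic + intronic': 12, 'strictly intronic': 10}

# ERs not covered by the three categories listed on the slide.
remainder = total_ers - sum(counts.values())
print(remainder, round(100 * remainder / total_ers))  # 11314 5
```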

  26. PCs differentiate tissues

  27. PCs differentiate tissues

  28. Differential intronic ERs adjusting for
    exonic ERs

  29. Differential intronic ERs | exonic ERs

  30. Differential intronic ERs | exonic ERs

  31. regionReport
    Collado-Torres et al, F1000Research, 2015

  32. motivating problem: identify and validate
    regions of the genome that change
    expression during brain development
    1. derfinder permits discovery of novel
    expressed regions
    2. we identified & validated gene
    expression changes in the developing
    brain
    3. we have developed tools for
    reproducible/shareable reporting

  33. Acknowledgements
    Hopkins: Jeffrey Leek, Alyssa Frazee, Abhinav Nellore, Ben Langmead
    LIBD: Andrew Jaffe, Jooheon Shin, Nikolay Ivanov, Amy Deep, Ran Tao, Yankai Jia, Thomas Hyde, Joel Kleinman, Daniel Weinberger
    Harvard: Rafael Irizarry, Michael Love
    Funding: NIH, LIBD, CONACyT México

  34. References + software + code
    •  Collado-Torres L, et al. bioRxiv (2015) doi:10.1101/015370
    –  http://bioconductor.org/packages/derfinder
    •  Collado-Torres L, et al. F1000Research (2015) doi:10.12688/f1000research.6379.1
    –  http://www.bioconductor.org/packages/regionReport
    –  http://lcolladotor.github.io/regionReportSupp/
    •  Nellore A, et al. bioRxiv (2015) doi:10.1101/019067
    –  rail.bio
    •  Jaffe AE, et al. Nat. Neurosci. (2015) doi:10.1038/nn.3898
    –  https://github.com/lcolladotor/libd_n36
    –  https://github.com/lcolladotor/enrichedRanges
    •  Frazee AC, et al. Biostatistics (2014) doi:10.1093/biostatistics/kxt053
    –  https://github.com/leekgroup/derfinder
