dbfinder2015

Leonardo Collado-Torres

November 30, 2015

Transcript

  1. dbFinder Leonardo Collado-Torres November 30, 2015

  2. DER finder approach

    • Find contiguous base pairs with differential expression signal → DE regions (DERs)
    • Find the nearest annotated feature
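The two steps on slide 2 can be sketched in a few lines. This is a minimal illustration, not the derfinder implementation: it assumes a per-base statistic and a fixed cutoff are already available, uses 0-based half-open intervals, and the names `find_regions` and `nearest_feature` as well as the toy data are hypothetical.

```python
def find_regions(stats, cutoff):
    """Return half-open (start, end) runs of contiguous bases with stats > cutoff."""
    regions = []
    start = None
    for pos, value in enumerate(stats):
        if value > cutoff:
            if start is None:
                start = pos          # a new candidate region opens here
        elif start is not None:
            regions.append((start, pos))  # the run ended at the previous base
            start = None
    if start is not None:            # a run that reaches the end of the vector
        regions.append((start, len(stats)))
    return regions


def nearest_feature(region, features):
    """Return the name of the feature closest to the region.

    features: list of (name, start, end); overlapping or adjacent
    features have distance 0.
    """
    r_start, r_end = region

    def gap(feature):
        _, f_start, f_end = feature
        return max(0, f_start - r_end, r_start - f_end)

    return min(features, key=gap)[0]


# Toy per-base statistics and a hypothetical annotation.
stats = [0.5, 3.2, 4.1, 0.1, 5.0, 6.0, 7.5]
features = [("geneA", 0, 2), ("geneB", 10, 20)]

regions = find_regions(stats, cutoff=3.0)
annotated = [(region, nearest_feature(region, features)) for region in regions]
```

Here `regions` holds the candidate DERs and `annotated` pairs each one with its closest feature name.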
  3. Read coverage

    [Figure: reads aligned to the genome (DNA) and the resulting coverage vector, e.g. 2 6 0 11 6. Adapted from @jtleek]
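To make the coverage vector on slide 3 concrete, the sketch below counts how many reads cover each base. It assumes reads are given as 0-based, end-exclusive `(start, end)` pairs on a single sequence; the function name `coverage_vector` and the toy reads are illustrative only.

```python
def coverage_vector(reads, genome_length):
    """Count the number of reads covering each base of a single sequence."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1

    # The running sum of the difference array is the per-base depth.
    coverage = []
    depth = 0
    for change in diff[:genome_length]:
        depth += change
        coverage.append(depth)
    return coverage


# Four toy reads on an 8-base sequence.
reads = [(0, 3), (1, 4), (1, 2), (6, 8)]
coverage = coverage_vector(reads, genome_length=8)
```

Each position of `coverage` corresponds to one base pair, which is the unit the later single-base statistics operate on.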
  4. Jaffe et al, Nat. Neuroscience, 2015

  5. Single-base F-statistics

    • Null model
    • Alternative model
    • F-statistic
    i: base pair, j: sample. Collado-Torres et al, bioRxiv, 2015
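Slide 5 names a null model, an alternative model and an F-statistic for each base pair i across samples j, but the formulas are not in the transcript. As a minimal sketch, assume the simplest nested pair: an intercept-only null model and an alternative that adds one covariate (for example age). The F-statistic then compares the residual sums of squares of the two fits. The name `f_statistic`, the choice of covariate, and the toy values are assumptions for illustration, not the models used in the paper.

```python
def f_statistic(y, x):
    """F-statistic for one base pair.

    y: response per sample at this base (e.g. transformed coverage; the
       exact transformation is not given on the slide).
    x: covariate per sample (e.g. age).
    Null model:        y_j = alpha + error
    Alternative model: y_j = alpha + beta * x_j + error
    """
    n = len(y)
    mean_y = sum(y) / n
    mean_x = sum(x) / n
    sxx = sum((xj - mean_x) ** 2 for xj in x)
    sxy = sum((xj - mean_x) * (yj - mean_y) for xj, yj in zip(x, y))

    rss_null = sum((yj - mean_y) ** 2 for yj in y)    # intercept-only fit
    rss_alt = max(rss_null - sxy * sxy / sxx, 0.0)    # simple linear regression fit

    if rss_alt == 0.0:
        return float("inf")  # perfect fit under the alternative model
    # One extra parameter in the alternative; n - 2 residual degrees of freedom.
    return (rss_null - rss_alt) / (rss_alt / (n - 2))


# One row per base pair, one column per sample (toy values).
covariate = [0, 1, 2, 3]
coverage_matrix = [
    [1, 3, 2, 4],   # coverage trends with the covariate
    [1, 2, 2, 1],   # no linear trend
]
f_stats = [f_statistic(row, covariate) for row in coverage_matrix]
```

Applying the function row by row gives one F-statistic per base, which is the vector that candidate regions would be called from.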
  6. Single-base F-statistics Collado-Torres et al, bioRxiv, 2015 BrainSpan data

  7. Compare DERs vs annotation Collado-Torres et al, bioRxiv, 2015 BrainSpan data
  8. Common ChIP-seq analysis pipeline

    • Peak call in each sample separately: Sample 1, Sample 2, Sample 3, …, Sample N-2, Sample N-1, Sample N
    • # Unique peaks per sample, e.g. 2100, 4230, 7654, 1236, 5400, 5954
    • Merge* → all unique peaks (40000)
    • Identify which merged peaks are differentially expressed using coverage (40000 tests)
  9. Common ChIP-seq analysis pipeline

    • Peak call in each sample separately: Sample 1, Sample 2, Sample 3, …, Sample N-2, Sample N-1, Sample N
    • # Unique peaks per sample, e.g. 2100, 4230, 7654, 1236, 5400, 5954
    • Merge* → all unique peaks (40000)
    • Identify which merged peaks are differentially expressed using coverage (40000 tests)
    • Biological variability within a group is not incorporated into finding peaks
    • Variability across peaks is not formally incorporated into the merging step
  10. Base-resolution differential binding ChIP-seq analysis pipeline

    • Samples: Sample 1, Sample 2, Sample 3, …, Sample N-2, Sample N-1, Sample N
    • Identify differentially bound peaks using single base-level derfinder analysis → single list of candidate peaks
    • Empirical p-values via permutations and FDRs → significant peaks for differential binding
    • “dbFinder”
  11. Re-analysis of H3K4me3 data from developing and aging human brain

    • Downloaded H3K4me3 data: NeuN+ fraction of postnatal frontal cortex samples (Shulha et al, PLoS Genetics 2013)
    • Modeled linear age-related changes in coverage across the genome
    • Identified 561 dbPeaks at FDR < 10% (using 100 permutations, 2.5 hours on JHPCE)
  12. Re-analysis of H3K4me3 data from developing and aging human brain

  13. Re-analysis of H3K4me3 data from developing and aging human brain

    • Post-hoc analysis on mean coverage per sample per dbPeak
  14. Re-analysis of H3K4me3 data from developing and aging human brain

    Overlap with published 1157 peaks (742 decrease across age, 415 increase):

    Published peaks overlapping significant dbPeaks
                            Down    Up
      Not in dbPeaks         605   397
      In dbPeaks             137    18

    Significant dbPeaks overlapping published peaks
                                Down    Up
      In published peaks         278    21
      Not in published peaks     262 (total)
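The two tables on slide 14 count overlaps in both directions, and the totals differ because one long published peak can contain several short dbPeaks. A minimal sketch of that asymmetry, assuming 0-based half-open intervals on one chromosome and using invented coordinates (the function name `count_overlapping` is hypothetical):

```python
def count_overlapping(query, subject):
    """Number of query intervals that overlap at least one subject interval."""
    return sum(
        any(q_start < s_end and s_start < q_end for s_start, s_end in subject)
        for q_start, q_end in query
    )


# Invented coordinates: long "published" peaks and short "dbPeaks".
published = [(0, 100), (200, 300), (500, 600)]
db_peaks = [(50, 60), (70, 80), (250, 260), (700, 710)]

published_in_db = count_overlapping(published, db_peaks)
db_in_published = count_overlapping(db_peaks, published)
```

In this toy case two published peaks overlap a dbPeak while three dbPeaks overlap a published peak. The quadratic scan is fine for a sketch; interval trees or sorted sweeps would be the usual choice at genome scale.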
  15. Re-analysis of H3K4me3 data from developing and aging human brain

    • Much shorter peaks in dbFinder analysis: median of 104 bp (IQR: 87-132) versus 2047 bp (IQR: 1490-2959) in published peaks
  16. Future directions

    • Add smoothing to test statistics prior to dbPeak finding
    • Analyze other datasets:
      – differential binding by tissue/cell type from ENCODE across multiple groups
  17. Acknowledgements

    • Hopkins: Jeffrey Leek
    • LIBD: Andrew Jaffe, Indigo Rose
    • Funding: NIH, LIBD, CONACyT México