Banff 2017

Banff International Research Station: Statistical and Computational Challenges in Large Scale Molecular Biology

James Taylor

March 30, 2017

Transcript

  1. Reminder: Hi-C for measuring chromatin interactions

    1. Crosslink protein/DNA complex 2. Restriction enzyme digest 3. Biotin fill and ligate 4. Pull down junctions 5. Sequence. The result is a count matrix, for pairs of restriction fragments or larger bins (Lieberman-Aiden et al. 2009)
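
The last step, turning sequenced junctions into a count matrix over bins, can be sketched as below. This is only an illustration: the read-pair positions and the 100 kb bin size are made-up values, and `bin_contacts` is a hypothetical helper, not part of any tool named in the talk.

```python
from collections import Counter


def bin_contacts(pairs, bin_size):
    """Count read pairs per (bin_i, bin_j), stored with bin_i <= bin_j."""
    counts = Counter()
    for pos_a, pos_b in pairs:
        # Integer division maps a base-pair position to its bin index.
        i, j = sorted((pos_a // bin_size, pos_b // bin_size))
        counts[(i, j)] += 1
    return counts


# Hypothetical mapped positions (bp) of the two ends of each ligation junction.
pairs = [(1200, 250000), (260000, 900), (5000, 7000), (410000, 20000)]
counts = bin_contacts(pairs, bin_size=100_000)
```

Storing only the upper triangle is a design choice for brevity; the matrix is symmetric, so `counts[(i, j)]` with `i <= j` carries all the information.
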
  2. Hi-C Analysis: Raw Count Matrix, Normalization

    $E_{ij} = D(i,j)\, f_i f_j$, with $c_{ij} \sim F(f_i, f_j, d_{ij})$, e.g. $c_{ij} \sim \mathrm{Poisson}(f_i f_j D(d_{ij}))$. $E_{ij}$: expected count between bins $i$ and $j$. $D(i,j)$: distance component, either explicit or empirically estimated. $f_i, f_j$: bin-specific correction factor, directly learned or param…
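
A minimal sketch of the normalization model on this slide, assuming a power-law form for the distance component. The slide does not specify the form of $D$; `power_law_decay`, its parameters, and the example correction factors are all hypothetical.

```python
import math


def power_law_decay(distance, scale=1e6, exponent=1.0):
    """Assumed distance component D(d): expected signal falls off as a power law."""
    return scale / distance ** exponent


def expected_count(f_i, f_j, distance, decay=power_law_decay):
    """E_ij = D(d_ij) * f_i * f_j."""
    return decay(distance) * f_i * f_j


def poisson_log_likelihood(count, expected):
    """log P(count | expected) under a Poisson model."""
    return count * math.log(expected) - expected - math.lgamma(count + 1)


# Hypothetical correction factors for two bins separated by 50 kb.
e_ij = expected_count(f_i=1.2, f_j=0.8, distance=50_000)
ll_close = poisson_log_likelihood(19, e_ij)  # observed count near the expectation
ll_far = poisson_log_likelihood(60, e_ij)    # observed count far above it
```

An observed count close to the expectation scores a higher log-likelihood than one far from it, which is what a fitting procedure would exploit when learning the correction factors.
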
  3. Hi-C Analysis: Raw Count Matrix, Normalization

    Calling structures (hardish); calling point-to-point interactions (helluva hard)
  4. What we think we might know from Hi-C data

    Global organization into two compartments: A and B (Lieberman-Aiden et al. 2009)
  5. What we think we might know from Hi-C data

    Global organization into two compartments: A and B (Lieberman-Aiden et al. 2009). Local organization into “Topologically Associated Domains” (Dixon et al. 2012). [Figure: Chr2, 2 Mb window, hg18, 138000000 to 140000000]
  6. What we think we might know from Hi-C data

    Global organization into two compartments: A and B (Lieberman-Aiden et al. 2009). Local organization into “Topologically Associated Domains”, ~1 Mb, CTCF enriched at boundaries (Dixon et al. 2012). And more dynamic sub-TADs, associated with different combinations of CTCF, cohesin, mediator… others? (Phillips-Cremins et al. 2013). [Figure: Chr2, 2 Mb window, hg18, 138000000 to 140000000]
  7. What we think we might know from Hi-C data

    Global organization into two compartments: A and B (Lieberman-Aiden et al. 2009). Local organization into “Topologically Associated Domains”, ~1 Mb, CTCF enriched at boundaries (Dixon et al. 2012). And more dynamic sub-TADs, associated with different combinations of CTCF, cohesin, mediator… others? (Phillips-Cremins et al. 2013). Labels: “CTCF dependent, cohesin dependent” and “CTCF independent, cohesin independent” (Nora et al. 2017 bioRxiv, Schwarzer et al. 2017 bioRxiv). [Figure: Chr2, 2 Mb window, hg18, 138000000 to 140000000]
  8. What about 3D reconstructions?

    Using the distances implied by the binned interactions we can reconstruct 3D positions (for example, using PCA)
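
One way to read "using PCA" here is classical multidimensional scaling, which amounts to PCA on the double-centred squared distance matrix. The sketch below assumes a distance matrix has already been derived from the binned interactions (that conversion is not shown and is not specified on the slide). `classical_mds` is a hypothetical helper that uses plain power iteration rather than a linear algebra library, and the test points are made up.

```python
import math


def classical_mds(dist, dims=3, iters=500):
    """Embed points in `dims` dimensions from a symmetric distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centring turns squared distances into a Gram (inner product) matrix.
    gram = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration for the leading eigenvector (deterministic start).
        v = [math.sin(i + k + 1.0) for i in range(n)]
        for _ in range(iters):
            w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(a * b for a, b in zip(v, w))
        if eigenvalue <= 1e-9:
            break  # no further positive components to use
        for i in range(n):
            coords[i][k] = v[i] * math.sqrt(eigenvalue)
        # Deflate so the next pass finds the next component.
        for i in range(n):
            for j in range(n):
                gram[i][j] -= eigenvalue * v[i] * v[j]
    return coords


# Made-up 3D points; only their pairwise distances are given to the solver.
points = [(0, 0, 0), (4, 0, 0), (0, 2, 0), (0, 0, 1), (4, 2, 1)]
dist = [[math.dist(p, q) for q in points] for p in points]
coords = classical_mds(dist)
```

The recovered coordinates match the input only up to rotation, reflection, and translation, so the meaningful check is that pairwise distances are preserved. Distances implied by real Hi-C counts are generally not exactly Euclidean, in which case the embedding is an approximation.
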
  9. Chromosome conformation data is unsatisfying for at least two reasons

    1) It only measures chromatin's relationship with itself, with no connection to the structure of the nucleus. 2) It only gives us a population average over millions of genome copies. Can we provide context with alternative approaches?
  10. Identifying DNA near a protein of interest with DamID

    Dam+POI fusion; Dam-only control (Southall and Brand 2007; Aughey and Southall 2015)
  11. Identifying DNA near a protein of interest with DamID

    Dam-fusion binding (10 kb). Sequence and compare (Southall and Brand 2007; Aughey and Southall 2015)
  12. Identifying DNA near Lamina with DamID

    [Figure: Dam at the nuclear lamina, with methylation (m) marked at GATC sites] (Southall and Brand 2007; Aughey and Southall 2015)
  13. Lamina Associated Domains (LMNB1) in Mouse Fibroblasts

    [Figure: DamID signal and LAD calls for Chr 11 and Chr 12] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  14. High resolution compartment calling

    For each bin $i$, $s_i = B$ if in compartment B, or $A$ if A. Initialize $s$ based on sign of eigenvector score. Probability of bin $i$ being in compartment A is $P(s_i = g) = \prod_{j \in A_i} P(x = c_{ij} \mid s_i = g)$, where $P(x = c_{ij} \mid s_i = g) \sim \mathrm{Poisson}(\lambda_{ijg})$ and $\lambda_{ijg} = f_{ij} D_{gg}(d_{ij})$ if $s_j = g$, $\lambda_{ijg} = f_{ij} D_{gg'}(d_{ij})$ otherwise. Here $c_{ij}$ is the read count between fragments $i$ and $j$, $f_{ij}$ is the sum of interaction normalization values for bin $ij$ as learned by HiFive, $d_{ij}$ is the interfragment distance, and $D_{gg}$ / $D_{gg'}$ is the distance decay function for interactions both in state $g$ / in different states. Update bin assignment based on $P(s_i)$, iterate and sample.
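
The update step can be sketched roughly as below. This is not HiFive's implementation. It simplifies the model in several ways that are assumptions of this sketch: one shared "same state" decay for both compartments, a product over all other bins, equal prior weight for A and B, and constant toy decay functions. All function names and data are hypothetical.

```python
import math
import random


def poisson_logpmf(count, lam):
    """log of the Poisson probability of `count` given rate `lam`."""
    return count * math.log(lam) - lam - math.lgamma(count + 1)


def state_log_likelihood(i, g, states, counts, f, dist, decay_same, decay_diff):
    """Sum over j != i of log Poisson(c_ij | lambda_ijg)."""
    total = 0.0
    for j, s_j in enumerate(states):
        if j == i:
            continue
        decay = decay_same if s_j == g else decay_diff
        total += poisson_logpmf(counts[i][j], f[i][j] * decay(dist[i][j]))
    return total


def update_states(states, counts, f, dist, decay_same, decay_diff, rng):
    """One sampling pass: resample each bin's state given the current others."""
    states = list(states)
    for i in range(len(states)):
        log_a = state_log_likelihood(i, "A", states, counts, f, dist, decay_same, decay_diff)
        log_b = state_log_likelihood(i, "B", states, counts, f, dist, decay_same, decay_diff)
        # Clamp to keep exp() from overflowing on very lopsided likelihoods.
        delta = max(-700.0, min(700.0, log_b - log_a))
        p_a = 1.0 / (1.0 + math.exp(delta))
        states[i] = "A" if rng.random() < p_a else "B"
    return states


# Toy data: bins 0-2 belong together, bins 3-5 belong together.
truth = "AAABBB"
n = len(truth)
counts = [[0 if i == j else (10 if truth[i] == truth[j] else 1) for j in range(n)]
          for i in range(n)]
f = [[1.0] * n for _ in range(n)]
dist = [[1.0] * n for _ in range(n)]
initial = list("AABBBB")  # bin 2 starts in the wrong compartment
states = update_states(initial, counts, f, dist,
                       lambda d: 10.0, lambda d: 1.0, random.Random(0))
```

In practice the pass would be repeated many times, as the slide's "iterate and sample" suggests, and the decay functions would depend on distance rather than being constants.
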
  15. LADs are the B compartment

    [Figure: DamID, compartment (Comp), and LAD tracks on chr12, with the Igh locus marked] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  16. LADs are the B compartment

    [Figure: data point density (0 to 243) of compartment score (-39.0 to 49.8) against DamID score (-5.7 to 4.4); Rrank = 0.716] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
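
A rank correlation of this kind can be computed as a Spearman coefficient, which is the Pearson correlation of the ranks. The sketch below assumes that is what Rrank denotes. The input values are made up and do not reproduce the 0.716 reported on the slide.

```python
import math


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            result[k] = average
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-bin scores, for illustration only.
compartment_score = [1.0, 2.0, 3.0, 4.0, 5.0]
damid_score = [2.0, 1.0, 4.0, 3.0, 5.0]
r_rank = spearman(compartment_score, damid_score)
```

Working on ranks makes the statistic insensitive to the very different numeric ranges of the two scores.
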
  17. LADs are the B compartment

    [Figure: DamID, compartment (Comp), and LAD tracks around the Igh locus, 110M to 118M, 1 Mb scale] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  18. LAD Boundaries

    [Figure: DamID, compartment, boundary, H3K9me2, H3K27me3, and CTCF signal from -100 to 100 Kb around LAD boundaries; x-axis: distance from boundary (Kb)]
  19. Fluorescent probes for LAD and nonLAD regions (Reddy Lab)

    LAD pool and nonLAD pool, Chr 11 and Chr 12. Remove repetitive elements; in silico selection of probes based on Tm and GC content; remove probes with high homology to off-target loci; chemical synthesis of 150 bp oligos; ULS label with Cy3 or Cy5 dyes (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  20. Chromosome Conformation Paints (Reddy Lab)

    [Figure: primary fibroblasts, Chr11 and Chr12; panels: non-LAD, LAD, LaminB1, Hoechst, merge] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  21. Chromosomes are organized into LAD and non-LAD domains

    LAD regions are constrained at the lamina. [Figure: LaminB1, LAD, and nonLAD fluorescence against distance from periphery (µm), 0.0 to 4.0, for chromosome 11, chromosome 12, and overlay] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  22. LADs preferentially interact with other LADs

    [Figure: MEF inter-domain log2 interaction enrichment (-2 to 2) for LAD by LAD, nonLAD by LAD, and nonLAD by nonLAD; two comparisons annotated P < 10^-5] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
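
One plausible way to compute an enrichment like this is the log2 ratio of summed observed to summed expected interactions for each class of domain pair. The slide does not state the exact procedure, so that is an assumption here. `enrichment_by_class` and all values are hypothetical.

```python
import math
from collections import defaultdict


def enrichment_by_class(labels, observed, expected):
    """log2(sum observed / sum expected) for each unordered pair of domain labels."""
    obs_total = defaultdict(float)
    exp_total = defaultdict(float)
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle: each domain pair once
            key = " by ".join(sorted((labels[i], labels[j])))
            obs_total[key] += observed[i][j]
            exp_total[key] += expected[i][j]
    # Skip classes where the ratio or its logarithm would be undefined.
    return {key: math.log2(obs_total[key] / exp_total[key])
            for key in obs_total
            if obs_total[key] > 0 and exp_total[key] > 0}


# Made-up domain labels and inter-domain interaction sums.
labels = ["LAD", "nonLAD", "LAD", "nonLAD"]
observed = [[0, 1, 8, 1],
            [1, 0, 1, 4],
            [8, 1, 0, 1],
            [1, 4, 1, 0]]
expected = [[2] * 4 for _ in range(4)]
enrichment = enrichment_by_class(labels, observed, expected)
```

Positive values mean a class of domain pairs interacts more than expected, negative values less.
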
  23. [Figure: MEF Chr12, 3M to 121.3M; log2 interaction enrichment (≤ -3 to ≥ 3) with LAD, Igh locus, and UCSC Genes tracks] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  24. [Figure: MEF Chr12, 3M to 121.3M; log2 interaction enrichment (≤ -3 to ≥ 3) with LAD, Igh locus, and UCSC Genes tracks] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  25. [Figure: MEF Chr12, 3M to 121.3M, with inset 112M to 120M] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  26. Developmental regulation of local and higher order organization

    [Figure: Chr12, 3M to 121.3M, with inset 112M to 120M, in MEF and ProB] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  27. Developmental regulation of local and higher order organization

    [Figure: Chr12, 70M to 121.3M, in MEF and ProB] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  28. Dips: regions in LADs with low DamID signal

    [Figure: DamID (-3.76 to 3.76), compartment (-9.58 to 9.58), boundary (-0.17 to 0.89), genes, LAD, and dip call tracks, 133M to 136M] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  29. Dips contain putative regulatory modules

    [Figure: H3K27me3, H3K27ac, H3K4me1, H3K4me3, CBP, DamID, compartment, boundary, CTCF, and cohesin signal from -100 to 100 Kb around dips (D)] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  30. [Figure: DamID, compartment, boundary, CTCF, cohesin, H3K27me3, H3K27ac, H3K4me1, H3K4me3, CBP, UCSC TSS, and log2(H3K4me1/3) signal from -100 to 100 Kb around dips (D), with TSS presence (0 or 1+) indicated; x-axis: distance from dip boundary (Kb)]
  31. Long range interaction from within dips

    [Figure: Chr2, 102.9M to 103.2M; log2 enrichment per 1 Kb bin (-2.2 to 3.3) with UCSC Genes, CTCF, LAD, and dip tracks; log2 interaction enrichment (≤ -3 to ≥ 6) for enhancer-anchored and TSS-anchored interactions] (Luperchio, Sauria et al. 2017, bioRxiv doi:10.1101/122226)
  32. Using the distances implied by the binned interactions we can reconstruct 3D positions (for example, again using PCA)

    These are ensemble models over millions of cells. Do they recapitulate the single cell organization?
  33. [Figure: Chr12 in MEF and pro-B, colored by DamID score (scales -6.32 to 6.32 and -3.32 to 3.32), with IGH marked and 1 µm scale bars]
  34. Summary

    1. LADs are the chromatin B-compartment 2. LADs (the B-compartment) are constrained at the nuclear lamina 3. LAD state / compartmentalization is developmentally coordinated 4. Small regions within LADs, much smaller than a stereotypical TAD, organize into the A compartment and contain evidence for regulatory activity
  35. [Figure: model spanning nuclear membrane to nuclear interior. Labels: A-compartment, B-compartment, TAD, gene-rich LAD, gene-poor LAD, repressed subdomain, TSS-containing DamID dip, putative enhancer-containing DamID dip, repressed gene, activated gene]
  36. ACKnowledgements

    Chromatin analysis and methods developed by Michael E. G. Sauria. Chromosome conformation paints: Teresa Luperchio and Karen Reddy. HiFive available from github.com/bxlab/hifive, or install with bioconda. Our lab: Enis Afgan, Dannon Baker, Boris Brenerman, Min Hyung Cho, Dave Clements, Peter DeFord, German Uritskiy, Mallory Freeberg, Michael E. G. Sauria, Mo Heydarian, Sam Guerler. Other collaborators: Anton Nekrutenko and the group; Craig Stewart and the group; Ross Hardison and the VISION group; Jennifer Phillips-Cremins and Victor Corces (sub-TADs and HiFive); Johnston, Kim, Hilser, and DiRuggiero labs (JHU Biology); JHU Genomics Collective. NHGRI (HG005133, HG004909, HG005542, HG005573, HG006620), NIDDK (DK065806), and NSF (DBI 0543285, DBI 0850103).