Slide 1

Slide 1 text

Unsupervised tensor decomposition-based method to extract candidate transcription factors as histone modification bookmarks in post-mitotic transcriptional reactivation
Y-h. Taguchi, Department of Physics, Chuo University, Tokyo, Japan
Turki Turki, Department of Computer Science, King Abdulaziz University, Jeddah, Saudi Arabia
bioRxiv: https://www.biorxiv.org/content/10.1101/2020.09.23.309633v1

Slide 2

Slide 2 text

Introduction
Transcription ← ⨉ → DNA replication
⇓
DNA replication ⇒ “Transcription → ⨉”
⇓
Genome state (TF binding, histone modification) → Reset?
⇓
No, it should be bookmarked!
⇓
But how?

Slide 3

Slide 3 text

Whole-genome histone modification profile: GSE141081
⇒ formatted as a tensor $x_{ijkms} \in \mathbb{R}^{N \times 2 \times 4 \times 3 \times 2}$
i: genomic regions of 25,000 bp
j: cell lines (1: RPE1, 2: USO2)
k: histone modification (1: H3K27ac, 2: H3K4me1, 3: H3K4me3, 4: input)
m: cell cycle (1: interphase, 2: prometaphase, 3: anaphase/telophase)
s: replicates

Slide 4

Slide 4 text

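Tensor decomposition (Tucker) by higher-order singular value decomposition (HOSVD)
$x_{ijkms} \simeq \sum_{l_1=1}^{2} \sum_{l_2=1}^{4} \sum_{l_3=1}^{3} \sum_{l_4=1}^{2} \sum_{l_5=1}^{N} G(l_1 l_2 l_3 l_4 l_5) \, u_{l_1 j} u_{l_2 k} u_{l_3 m} u_{l_4 s} u_{l_5 i}$
[Figure: schematic of the Tucker decomposition, with $x_{ijkms}$ approximated by the core tensor $G$ multiplied by the singular value vectors $u_{l_1 j}$, $u_{l_2 k}$, $u_{l_3 m}$]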
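The decomposition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it runs on a random toy tensor with N = 50 regions instead of GSE141081, and the helper names `unfold`, `hosvd` and `reconstruct` are hypothetical. The array axes are assumed to be ordered (i, j, k, m, s), so the core returned here has axes (l5, l1, l2, l3, l4).

```python
import numpy as np

def unfold(x, mode):
    """Matricize x along `mode`: rows index that mode, columns all other modes."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def hosvd(x):
    """Return (core, factors) of a Tucker decomposition computed by HOSVD."""
    factors = []
    for mode in range(x.ndim):
        # Columns of u are the left singular vectors of the mode unfolding.
        u, _, _ = np.linalg.svd(unfold(x, mode), full_matrices=False)
        factors.append(u)
    core = x
    for mode, u in enumerate(factors):
        # Project each mode onto its singular vectors.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors

def reconstruct(core, factors):
    """Multiply the core by every factor matrix to recover the tensor."""
    x = core
    for mode, u in enumerate(factors):
        x = np.moveaxis(np.tensordot(u, x, axes=(1, mode)), 0, mode)
    return x

# Toy stand-in for x_{ijkms}: axes (i, j, k, m, s) with N = 50 regions.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2, 4, 3, 2))
core, factors = hosvd(x)
```

Because the thin SVD is used, the region mode yields at most 2×4×3×2 = 48 singular vectors, which is enough to reconstruct the toy tensor exactly.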

Slide 5

Slide 5 text

$\sum_{l_1=1}^{2} \sum_{l_2=1}^{4} \sum_{l_3=1}^{3} \sum_{l_4=1}^{2} \sum_{l_5=1}^{N} G(l_1 l_2 l_3 l_4 l_5) \, u_{l_1 j} u_{l_2 k} u_{l_3 m} u_{l_4 s} u_{l_5 i}$
$l_1 = 1$: no dependence upon cell lines
$2 \le l_2 \le 4$: some dependence upon histone modification
$l_3 = 3$: most significant reactivation during phases
$l_4 = 1$: no dependence upon replicates
[Figure: $\sum_{l_2=2}^{4} G(1, l_2, 3, 1, l_5)^2$ as a function of $l_5$; $l_5 = 4$ is marked]
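The selection of $l_5$ can be illustrated as follows. This is a hedged sketch on a hand-made toy core tensor, not the real data: the function name `component_scores` is hypothetical, and the core is assumed to be stored with axes in the same order as the arguments of G, that is (l1, l2, l3, l4, l5). A core stored in another axis order would first need `np.moveaxis`.

```python
import numpy as np

def component_scores(core):
    """Sum over l2 = 2..4 of G(1, l2, 3, 1, l5)^2, one value per l5.

    Assumes core axes are ordered (l1, l2, l3, l4, l5).
    """
    # 1-based slide indices to 0-based: l1=1 -> 0, l2=2..4 -> 1:4, l3=3 -> 2, l4=1 -> 0
    return (core[0, 1:4, 2, 0, :] ** 2).sum(axis=0)

# Toy core with six candidate l5 components (values invented for illustration).
core = np.zeros((2, 4, 3, 2, 6))
core[0, 2, 2, 0, 3] = 5.0     # l2 = 3, l5 = 4: counted in the sum
core[0, 0, 2, 0, 1] = 100.0   # l2 = 1: excluded from the sum

scores = component_scores(core)
best_l5 = int(np.argmax(scores)) + 1   # convert back to the 1-based slide index
```

The component with the largest score is the one whose singular value vector $u_{l_5 i}$ is then used to rank genomic regions.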

Slide 6

Slide 6 text

Dependence upon cell cycle
[Figure: panels for $l_3 = 1$, $l_3 = 2$, $l_3 = 3$; annotated "Reactivation"]

Slide 7

Slide 7 text

Histone modification dependence
[Figure: panels for $l_2 = 1$, $l_2 = 2$, $l_2 = 3$, $l_2 = 4$; annotated "control"]

Slide 8

Slide 8 text

Cell line dependence and replicate dependence
[Figure: panels for $l_1 = 1$ (cell line dependence) and $l_4 = 1$ (replicate dependence)]

Slide 9

Slide 9 text

[Figure: $\sum_{l_2=2}^{4} G(1, l_2, 3, 1, l_5)^2$ as a function of $l_5$; $l_5 = 4$ is marked]

Slide 10

Slide 10 text

Attributing P-values to the ith genomic region
$P_i = P_{\chi^2}\left[ > \left( \frac{u_{4i}}{\sigma_4} \right)^2 \right]$
$P_{\chi^2}[> x]$: cumulative chi-squared distribution
Assuming that $u_{4i}$ obeys a Gaussian distribution
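A minimal sketch of this P-value step, using only the Python standard library. Two points are assumptions not stated on the slide: the chi-squared distribution is taken to have one degree of freedom (a single Gaussian component), in which case the upper tail equals erfc(|z|/√2), and $\sigma_4$ is estimated as the population standard deviation of the $u_{4i}$ values. The function name `chi2_pvalues` and the input values are hypothetical.

```python
import math
import statistics

def chi2_pvalues(u):
    """Upper-tail chi-squared P-values (1 degree of freedom, assumed) for (u_i / sigma)^2."""
    sigma = statistics.pstdev(u)   # assumption: sigma is the standard deviation of u
    # For 1 degree of freedom, P[chi2 > z^2] = erfc(|z| / sqrt(2)).
    return [math.erfc(abs(v / sigma) / math.sqrt(2.0)) for v in u]

# Invented component values for five regions, for illustration only.
u4 = [0.0, 1.0, -1.0, 4.0, -0.5]
p = chi2_pvalues(u4)
```

Regions whose component value lies far from zero in either direction receive small P-values, since the statistic depends only on the squared value.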

Slide 11

Slide 11 text

507 DNA regions have P-values, corrected by the BH criterion, of less than 0.01
⇓
525 gene symbols are included in these regions
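The correction and thresholding can be sketched as below, assuming that "BH criterion" refers to the standard Benjamini-Hochberg step-up adjustment. The function names `benjamini_hochberg` and `select_regions` and the example P-values are hypothetical; the 0.01 threshold is the one quoted above.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def select_regions(pvalues, threshold=0.01):
    """Indices of regions whose adjusted P-value is below the threshold."""
    adjusted = benjamini_hochberg(pvalues)
    return [i for i, q in enumerate(adjusted) if q < threshold]

# Invented P-values for five regions, for illustration only.
raw = [0.001, 0.008, 0.039, 0.041, 0.9]
adjusted = benjamini_hochberg(raw)
selected = select_regions(raw)
```

On real data the selected indices would then be mapped to the gene symbols located in those genomic regions.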

Slide 12

Slide 12 text

Are these genes associated with histone modification coincident with re-activation or bookmark?

Slide 13

Slide 13 text

Enrichment analysis: TF binding

Slide 14

Slide 14 text

Some biologically critical TFs are included.

Slide 15

Slide 15 text

Some biologically critical TFs are included.

Slide 16

Slide 16 text

Top 10 most frequently listed transcription factor (TF) families
These are candidate bookmark TFs!

Slide 17

Slide 17 text

Conclusion
● The bookmark mechanism of reactivation of transcription after DNA duplication was studied.
● Tensor decomposition (TD)-based unsupervised feature extraction (FE) was applied to three histone modification profiles before/after DNA replication.
● Some candidate bookmark TFs were identified.