Slide 1

Slide 1 text

Identification of Transcription Factors, Biological Pathways, and Diseases as Mediated by N6-methyladenosine using Tensor Decomposition-Based Unsupervised Feature Extraction. Y-h. Taguchi (Chuo University, Tokyo, Japan); S. Akila Parvathy Dharshini and M. Michael Gromiha (IIT Madras, Chennai, India).

Slide 2

Slide 2 text

Introduction. How gene expression (i.e. transcription, DNA → RNA) is regulated is still unclear. N6-methyladenosine (m6A for short) is one of the recently identified mechanisms that regulate gene expression. m6A is a methylation of RNA, a kind of post-transcriptional RNA modification, and has very recently attracted interest as a mechanism of transcriptional regulation. m6A is known to contribute to the regulation of gene expression in various ways.

Slide 3

Slide 3 text

A Review in Research Progress Concerning m6A Methylation and Immunoregulation, Caiyan Zhang, Jinrong Fu and Yufeng Zhou

Slide 4

Slide 4 text

Liu et al. (Science, 2020) simultaneously measured two histone modifications (H3K27ac and H3K4me3), m6A, and RNA expression in human (HEC-1-A) and mouse (mESC) cell lines with METTL3 knockout (KO). METTL3 is a methylation writer that controls m6A; thus, METTL3 KO is expected to suppress m6A. In their analysis, Liu et al. showed that m6A can regulate gene expression by recruiting histone modification to DNA. Thus, METTL3 KO is expected to affect histone modification, m6A, and gene expression simultaneously. However, which genes are affected was not specifically identified because of the small number of replicates (three for the human cell line and two for mESC).

Slide 5

Slide 5 text

In this study, we aimed to identify genes whose expression is affected by this mechanism (transcriptional regulation through the recruitment of histone modification to DNA by m6A) using tensor decomposition (TD) based unsupervised feature extraction (FE) (Taguchi, 2020).

Slide 6

Slide 6 text

TD based unsupervised FE is an unsupervised feature selection/extraction method that uses TD. TD is a mathematical method that decomposes a tensor into a sum of products of vectors, weighted by a core tensor G. Although there are multiple implementations of TD, we specifically employed higher order singular value decomposition (HOSVD).
$$x_{ijk} \simeq \sum_{l_1} \sum_{l_2} \sum_{l_3} G(l_1 l_2 l_3)\, u_{l_1 i}\, u_{l_2 j}\, u_{l_3 k}$$
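The decomposition above can be illustrated with a minimal NumPy sketch. This is not the authors' code: it assumes the standard HOSVD construction (singular vectors from the SVD of each mode unfolding, then projection to obtain the core tensor), and the helper names `unfold`, `hosvd`, and `reconstruct` as well as the random toy tensor are hypothetical. Note that the factor matrices here are stored as `factors[m][i, l]`, whereas the slide writes the same quantity as u with the index order (l, i).

```python
import numpy as np

def unfold(x, mode):
    """Matricize x: rows index `mode`, columns index all remaining modes."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def hosvd(x):
    """Return (core, factors) of a full HOSVD of tensor x."""
    factors = []
    for mode in range(x.ndim):
        # Left singular vectors of the mode unfolding are that mode's singular vectors.
        u, _, _ = np.linalg.svd(unfold(x, mode), full_matrices=False)
        factors.append(u)
    core = x
    for mode, u in enumerate(factors):
        # Project each mode onto its singular vectors to obtain G.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors

def reconstruct(core, factors):
    """Sum over l1, l2, ... of G(l1 l2 ...) times the product of singular vectors."""
    x = core
    for mode, u in enumerate(factors):
        x = np.moveaxis(np.tensordot(u, x, axes=(1, mode)), 0, mode)
    return x

# Toy tensor x_{ijk} with made-up values (not the data from the study).
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4, 3))
core, factors = hosvd(x)
x_hat = reconstruct(core, factors)
print(np.allclose(x, x_hat))
```

Because no components are truncated in this sketch, the reconstruction reproduces the toy tensor up to floating-point error.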

Slide 7

Slide 7 text

In this study, measurements are formatted as tensors $x_{ijks} \in \mathbb{R}^{N \times 4 \times K \times 2}$.
i: ith gene
j: jth measurement (j=1: H3K4me3, j=2: H3K27ac, j=3: m6A, j=4: gene expression (mRNA))
k: METTL3 KO vs control (Mouse: k=1: control, k=2: 1st KO, k=3: 2nd KO; Human: k=1: control, k=2: KO)
s: replicates
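As a rough illustration of this layout, the sketch below stacks four per-measurement arrays into one tensor with axes (i, j, k, s). The sizes, the random values, and the dictionary `per_layer` are hypothetical placeholders; how the actual measurements were preprocessed is not described on the slide.

```python
import numpy as np

# Toy sizes (assumptions): N genes, K conditions (control/KO), 2 replicates.
N, K, S = 5, 2, 2
layers = ["H3K4me3", "H3K27ac", "m6A", "mRNA"]  # order follows j = 1..4

rng = np.random.default_rng(1)
# Hypothetical per-measurement arrays, each indexed as (gene, condition, replicate).
per_layer = {name: rng.normal(size=(N, K, S)) for name in layers}

# Stack along a new second axis so that x[i, j, k, s] matches x_{ijks}.
x = np.stack([per_layer[name] for name in layers], axis=1)
print(x.shape)
```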

Slide 8

Slide 8 text

$$x_{ijks} = \sum_{l_1=1}^{4} \sum_{l_2=1}^{K} \sum_{l_3=1}^{2} \sum_{l_4=1}^{N} G(l_1 l_2 l_3 l_4)\, u_{l_1 j}\, u_{l_2 k}\, u_{l_3 s}\, u_{l_4 i}$$
Constant over histone modifications, m6A and mRNA expression: $l_1 = 1$
Distinct between METTL3 KO vs control: $l_2 = 2$
Constant over replicates: $l_3 = 1$

Slide 9

Slide 9 text

$$x_{ijks} = \sum_{l_1=1}^{4} \sum_{l_2=1}^{K} \sum_{l_3=1}^{2} \sum_{l_4=1}^{N} G(l_1 l_2 l_3 l_4)\, u_{l_1 j}\, u_{l_2 k}\, u_{l_3 s}\, u_{l_4 i}$$
Constant over histone modifications, m6A and mRNA expression: $l_1 = 1$
Distinct between METTL3 KO vs control: $l_2 = 2$ and $l_2 = 3$
Constant over replicates: $l_3 = 1$

Slide 10

Slide 10 text

Which $l_4$ is associated with the selected $l_1$, $l_2$, and $l_3$? $l_4 = 5$
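One common way to answer this question, assumed here because the slide does not state the criterion, is to pick the $l_4$ whose core tensor element $G(l_1 l_2 l_3 l_4)$ has the largest absolute value for the selected $l_1$, $l_2$, $l_3$. The function name `pick_gene_component` and the toy core tensor below are hypothetical; the toy values were chosen by hand so that the fifth component dominates.

```python
import numpy as np

def pick_gene_component(core, l1, l2, l3):
    """Return the 1-based l4 maximizing |G(l1, l2, l3, l4)| and the absolute values."""
    # Slides use 1-based indices; NumPy arrays are 0-based.
    weights = np.abs(core[l1 - 1, l2 - 1, l3 - 1, :])
    return int(np.argmax(weights)) + 1, weights

# Toy core tensor with index order (l1, l2, l3, l4); the values are made up.
core = np.zeros((4, 2, 2, 8))
core[0, 1, 0, :] = [0.1, -0.3, 0.2, 0.05, -2.5, 0.4, 0.0, 0.1]

l4, weights = pick_gene_component(core, l1=1, l2=2, l3=1)
print(l4)
```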

Slide 11

Slide 11 text

P-values are attributed to the ith gene using the cumulative χ2 distribution (i.e. assuming that $u_{l_4 i}$ obeys a Gaussian distribution). The $P_i$s were corrected for multiple comparisons. 740 human and 667 mouse genes (i) were associated with $P_i < 0.01$.
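The sketch below shows one way such P-values can be computed with the Python standard library only. Two points are assumptions, since the formula itself did not survive on the slide: the P-value is taken as the upper tail of a χ2 distribution with one degree of freedom evaluated at $(u_{l_4 i}/\sigma_{l_4})^2$, and the multiple comparison correction is taken to be Benjamini-Hochberg. The function names and the simulated vector `u` are hypothetical.

```python
import math
import random
import statistics

def chi2_pvalues(u):
    """Upper-tail chi-squared (1 d.o.f.) P-values of (u_i / sigma)^2."""
    sigma = statistics.pstdev(u)
    # For 1 d.o.f., P[chi2 > z^2] equals erfc(|z| / sqrt(2)).
    return [math.erfc(abs(v / sigma) / math.sqrt(2.0)) for v in u]

def benjamini_hochberg(p):
    """Benjamini-Hochberg adjusted P-values (assumed correction method)."""
    n = len(p)
    order = sorted(range(n), key=lambda i: p[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest P-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, p[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Simulated singular vector: mostly Gaussian noise plus five planted outliers.
random.seed(0)
u = [random.gauss(0.0, 1.0) for _ in range(1000)]
u[:5] = [8.0, -9.0, 7.5, -8.5, 10.0]

p_adj = benjamini_hochberg(chi2_pvalues(u))
selected = [i for i, q in enumerate(p_adj) if q < 0.01]
print(len(selected))
```

Genes whose component of the singular vector deviates strongly from the Gaussian bulk receive small P-values, which is how the planted outliers end up in `selected`.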

Slide 12

Slide 12 text

The genes selected for human and mouse overlap strongly, even though the human and mouse experiments are independent of each other. Odds ratio: 4.00, $P = 1.91 \times 10^{-17}$
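For readers unfamiliar with these quantities, the sketch below computes an odds ratio and a one-sided P-value from a 2x2 overlap table. It assumes Fisher's exact test, which the slide does not name, and the counts are made-up toy numbers, not the counts behind the values reported above. The function names are hypothetical.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def fisher_greater(a, b, c, d):
    """One-sided Fisher P-value: chance of an overlap >= a under the hypergeometric null."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    upper = min(row1, col1)
    favorable = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return favorable / comb(n, col1)

# Toy counts (assumptions): a = selected in both species, b = human only,
# c = mouse only, d = selected in neither.
a, b, c, d = 6, 4, 3, 12
ratio = odds_ratio(a, b, c, d)
p_value = fisher_greater(a, b, c, d)
print(ratio, p_value)
```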

Slide 13

Slide 13 text

The selected genes are enriched in many biological terms. “ENCODE and ChEA Consensus TFs from ChIP-X”: twenty-seven commonly selected TFs

Slide 14

Slide 14 text

REACTOME enrichment analysis for 27 TFs

Slide 15

Slide 15 text

Although there are many other enriched biological terms, we do not have time to discuss them.
Conclusions
We successfully applied TD based unsupervised FE to the integrated analysis of histone modification, m6A, and mRNA expression. We identified several hundred genes shared between the two species, human and mouse. TFs that commonly regulate the genes identified for the two species were also identified and are enriched in many biological features. TD based unsupervised FE is a promising method for the integrated analysis of multi-omics data sets.