Identification of Transcription Factors, Biological Pathways, and Diseases as Mediated by N6-methyladenosine using Tensor Decomposition-Based Unsupervised Feature Extraction
Y-h. Taguchi (Chuo University, Tokyo, Japan); S. Akila Parvathy Dharshini and M. Michael Gromiha (IIT Madras, Chennai, India)
→ RNA) is still unclear. N6-methyladenosine (m6A for short) is one of the recently identified mechanisms that regulate gene expression profiles. m6A is methylation of RNA, a kind of post-transcriptional RNA modification, and it has attracted interest only very recently as a mechanism of transcriptional regulation. m6A is known to contribute to the regulation of gene expression in various ways.
& H3K4me3), m6A and RNA expression simultaneously in human (HEC-1-A) and mouse (mESC) cell lines with METTL3 KO. METTL3 is a methylation writer that controls m6A; thus, METTL3 KO is expected to suppress m6A. In their analysis, Liu et al. showed that m6A can regulate gene expression by recruiting histone modification to DNA. METTL3 KO is therefore expected to affect histone modification, m6A and gene expression simultaneously. However, which genes are affected was not specifically identified because of the small number of replicates (three for human cell lines and two for mESC).
whose expression is affected by this mechanism (transcriptional regulation through recruitment of histone modification to DNA by m6A) using tensor decomposition (TD) based unsupervised feature extraction (FE) (Taguchi, 2020).
using TD. TD is a mathematical method that decomposes a tensor into a sum of products of vectors. Although there are multiple implementations of TD, we specifically employed higher-order singular value decomposition (HOSVD) to derive the TD.

$$x_{ijk} \simeq \sum_{l_1} \sum_{l_2} \sum_{l_3} G(l_1 l_2 l_3)\, u_{l_1 i}\, u_{l_2 j}\, u_{l_3 k}$$
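As an illustration of the decomposition above, here is a minimal HOSVD sketch in Python with NumPy. It is not the authors' implementation: the function names `unfold`, `hosvd` and `reconstruct` are hypothetical, and the input is a small random tensor rather than real data. Each factor matrix is obtained from the SVD of the corresponding mode unfolding, and the core tensor G is obtained by projecting the tensor onto those factors. In the code, column `l` of `factors[m]` plays the role of the singular vector $u_l$ of mode `m`.

```python
import numpy as np

def unfold(x, mode):
    """Matricize tensor x so that rows index the given mode."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def hosvd(x):
    """Return the core tensor G and one factor matrix per mode.

    Column l of factors[m] is the l-th singular vector of mode m.
    """
    factors = []
    for mode in range(x.ndim):
        u, _, _ = np.linalg.svd(unfold(x, mode), full_matrices=False)
        factors.append(u)
    core = x
    for mode, u in enumerate(factors):
        # Project mode `mode` of the tensor onto the singular vectors.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors

def reconstruct(core, factors):
    """Sum over all l of G(l1 l2 ...) times the product of singular vectors."""
    x = core
    for mode, u in enumerate(factors):
        x = np.moveaxis(np.tensordot(u, x, axes=(1, mode)), 0, mode)
    return x

# Synthetic example (random numbers, not experimental data).
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4, 3))
core, factors = hosvd(x)
x_hat = reconstruct(core, factors)
```

Without truncation the reconstruction reproduces the input tensor up to floating-point error; the same functions work unchanged for tensors with more than three modes.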
$\in \mathbb{R}^{N \times 4 \times K \times 2}$
i: ith gene
j: jth measurement (j=1: H3K4me3, j=2: H3K27ac, j=3: m6A, j=4: gene expression (mRNA))
k: METTL3 KO vs control (Human: k=1: control, k=2: KO; Mouse: k=1: control, k=2: 1st KO, k=3: 2nd KO)
s: replicates
$$x_{ijks} \simeq \sum_{l_1=1}^{4} \sum_{l_2=1}^{K} \sum_{l_3=1}^{2} \sum_{l_4=1}^{N} G(l_1 l_2 l_3 l_4)\, u_{l_1 j}\, u_{l_2 k}\, u_{l_3 s}\, u_{l_4 i}$$

Selected singular vectors:
$u_{l_1 j}$, constant over histone modifications, m6A and mRNA expression: $l_1 = 1$
$u_{l_2 k}$, distinct between METTL3 KO and control: $l_2 = 2$, $l_2 = 3$
$u_{l_3 s}$, constant over replicates: $l_3 = 1$
$u_{l_4 i}$ obeys Gaussian). $P_i$ is computed from the cumulative $\chi^2$ distribution. The $P_i$s were corrected for multiple comparisons. 740 human and 667 mouse genes ($i$) were associated with $P_i < 0.01$.
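The gene selection step can be sketched as follows. This is a hedged illustration under several assumptions that the slide text does not confirm: that $P_i$ is the upper tail of a $\chi^2$ distribution with one degree of freedom evaluated at $(u_{l_4 i}/\sigma)^2$, that $\sigma$ is the plain standard deviation of the singular vector, and that the multiple comparison correction is Benjamini-Hochberg. The function names are hypothetical and the input is synthetic.

```python
import math
import numpy as np

def chi2_upper_tail_df1(x):
    """P(chi-squared with 1 d.o.f. > x), using the identity erfc(sqrt(x / 2))."""
    return np.array([math.erfc(math.sqrt(v / 2.0)) for v in x])

def benjamini_hochberg(p):
    """Benjamini-Hochberg adjusted P-values (assumed correction method)."""
    p = np.asarray(p, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest P-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.minimum(adjusted, 1.0)
    return out

def select_genes(u, threshold=0.01):
    """Indices i whose adjusted P_i falls below the threshold."""
    sigma = np.std(u)  # assumption: sigma is the standard deviation of u
    p = chi2_upper_tail_df1((u / sigma) ** 2)
    return np.flatnonzero(benjamini_hochberg(p) < threshold)

# Synthetic singular vector: 1000 background genes plus 5 strong outliers.
rng = np.random.default_rng(0)
u = rng.normal(scale=0.01, size=1005)
u[:5] = [0.5, -0.6, 0.55, -0.45, 0.7]
selected = select_genes(u)
```

In this toy input only the five planted outliers deviate strongly from the Gaussian null, so they are the genes expected to pass the 0.01 threshold.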
time to discuss them.

Conclusions

We successfully applied TD based unsupervised FE to the integrated analysis of histone modification, m6A and mRNA expression. We successfully identified several hundred genes shared between the two species, human and mouse. TFs that commonly regulate the genes identified for the two species were also identified, and they are enriched with many biological features. TD based unsupervised FE is a promising method for the integrated analysis of multi-omics data sets.