Y-h. Taguchi, Department of Physics, Chuo University, Tokyo, Japan
Turki Turki, Department of Computer Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
the 8th Annual Congress of the European Society for Translational Medicine on novel therapeutics in solid tumors (EUSTM-2021) during 20-26 September, 2021.
international. I would be glad if the audience could buy it and learn my method.
Y-h. Taguchi, Unsupervised Feature Extraction Applied to Bioinformatics: A PCA and TD Based Approach, Springer International (2020)
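Vector $x_i$: a set of scalars aligned in a line, e.g. $(1, 2, 3, 4, \ldots)$
Matrix $x_{ij}$: a set of scalars aligned in a table (i.e. rows and columns), e.g. $\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$
Tensor $x_{ijk}$: a set of scalars aligned in an array with more than two dimensions
[Figure: the tensor $x_{ijk}$ drawn as a block with axes $i$, $j$, $k$]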
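product of vectors, $x_{ijk}$ ($i$: genes, $j$: persons, $k$: tissues)
$$x_{ijk} \simeq \sum_{l_1=1}^{L_1} \sum_{l_2=1}^{L_2} \sum_{l_3=1}^{L_3} G(l_1 l_2 l_3)\, u_{l_1 i}\, u_{l_2 j}\, u_{l_3 k}$$
[Figure: schematic of $x_{ijk}$ as the core tensor $G$ combined with the vectors $u_{l_1 i}$, $u_{l_2 j}$, $u_{l_3 k}$]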
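As a rough illustration of expanding $x_{ijk}$ into a core tensor $G$ and one set of vectors per index, the sketch below uses a higher-order SVD (HOSVD) written with NumPy. HOSVD is assumed here as the decomposition algorithm, since the slide does not name one. The helper names `unfold` and `hosvd` and the toy tensor shape are hypothetical. Note that the arrays are indexed as `u1[i, l1]`, which corresponds to $u_{l_1 i}$ in the notation of the slide.

```python
import numpy as np

def unfold(x, mode):
    """Matricize tensor x so that `mode` indexes the rows."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def hosvd(x):
    """Return the core tensor G and one factor matrix per mode (HOSVD sketch)."""
    # Columns of each factor are the left singular vectors of the unfolding.
    factors = [np.linalg.svd(unfold(x, m), full_matrices=False)[0]
               for m in range(x.ndim)]
    core = x
    for m, u in enumerate(factors):
        # Project mode m onto its singular vectors (contract with u transposed).
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, m)), 0, m)
    return core, factors

# Toy data (assumption): 6 genes x 4 persons x 3 tissues of random numbers.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4, 3))

G, (u1, u2, u3) = hosvd(x)

# Rebuild x_ijk = sum over l1, l2, l3 of G(l1 l2 l3) u_{l1 i} u_{l2 j} u_{l3 k}.
x_hat = np.einsum("abc,ia,jb,kc->ijk", G, u1, u2, u3)
print(np.allclose(x, x_hat))
```

Because no terms are dropped in this sketch, the reconstruction is exact; keeping only the leading $L_1$, $L_2$, $L_3$ vectors would turn the equality into the approximation written on the slide.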
“Dependence of $x_{ijk}$ upon $i$” → $u_{l_1 i}$: gene selection
“Dependence of $x_{ijk}$ upon $j$” → $u_{l_2 j}$: healthy control vs. patient
“Dependence of $x_{ijk}$ upon $k$” → $u_{l_3 k}$: tissue specificity
We can answer the question: which genes are differentially expressed between healthy controls and patients in a tissue-specific manner?
ChIP-seq, ATAC-seq, etc…) is always problematic, because their causal relationship is unclear. This prevents us from developing a suitable model to understand the relationship between them. In this talk, I apply tensor decomposition (TD) based unsupervised feature extraction (FE) to epigenetic multiomics data in a fully unsupervised manner.
$$x_{ijkm} \simeq \sum_{l_1=1}^{L_1} \sum_{l_2=1}^{L_2} \sum_{l_3=1}^{L_3} \sum_{l_4=1}^{L_4} G(l_1 l_2 l_3 l_4)\, u_{l_1 j}\, u_{l_2 k}\, u_{l_3 m}\, u_{l_4 i}$$
$G$: weight of the contribution of individual terms to $x_{ijkm}$
$u_{l_1 j}$: the $l_1$th unit vector, representing the $j$ (epigenetics) dependence
$u_{l_2 k}$: the $l_2$th unit vector, representing the $k$ (normal vs tumor) dependence
$u_{l_3 m}$: the $l_3$th unit vector, representing the $m$ (biological replicates) dependence
$u_{l_4 i}$: the $l_4$th unit vector, representing the $i$ (25000 NA region) dependence
region, assuming that $u_{l_4 i}$ obeys a Gaussian distribution (null hypothesis), using the cumulative $\chi^2$ distribution. The $P_i$s are corrected by the Benjamini-Hochberg criterion. 1447 regions associated with adjusted $P_i$ less than 0.01 are selected.
[Figure: distribution $P(p_i)$ plotted against $1 - p_i$, horizontal axis from 0 to 1]
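The selection step on this slide can be sketched as follows: each value of a singular vector is turned into a P-value under a Gaussian null hypothesis via the $\chi^2$ distribution, the P-values are adjusted with the Benjamini-Hochberg procedure, and entries with adjusted P-value below 0.01 are kept. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes one degree of freedom, it assumes the values are scaled by their own standard deviation, and the vector `u` is synthetic toy data. The function names are hypothetical.

```python
import math
import numpy as np

def chi2_sf_1df(x):
    """Upper tail of the chi-squared distribution with one degree of freedom."""
    # If Z is standard normal, P(Z^2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

def p_values(u):
    """P_i = P[chi2 > (u_i / sigma)^2]; sigma is the std of u (assumption)."""
    u = np.asarray(u, dtype=float)
    z2 = (u / u.std()) ** 2
    return np.array([chi2_sf_1df(v) for v in z2])

def benjamini_hochberg(p):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    p = np.asarray(p, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest P-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.minimum(adjusted, 1.0)
    return out

# Toy stand-in for a singular vector (assumption): Gaussian noise for 1000
# regions, with the first 10 regions replaced by strong outliers.
rng = np.random.default_rng(0)
u = rng.normal(size=1000)
u[:10] = 20.0

adjusted = benjamini_hochberg(p_values(u))
selected = np.flatnonzero(adjusted < 0.01)
print(selected)
```

In this toy setting only the planted outliers pass the 0.01 threshold, which mirrors the idea that regions whose values are too large to be explained by the Gaussian null are the ones selected.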
TD based unsupervised FE to the integrated analysis of prostate cancer multiomics data sets. TD based unsupervised FE selected genomic regions whose values correctly discriminate not only the kinds of epigenetic data but also normal tissues from tumors. 1785 genes are significantly related to prostate cancer. TD based unsupervised FE can identify promising compounds that can be used for prostate cancer treatment.