Slide 1

Slide 1 text

Tensor Decomposition Based Integrated Analysis of Protein-Protein Interaction with Cancer Gene Expression Can Improve the Coincidence with Clinical Labels

Y-h. Taguchi, Department of Physics, Chuo University
Turki Turki, Department of Computer Science, King Abdulaziz University

Slide 2

Slide 2 text

https://doi.org/10.1101/2023.02.26.530076

Slide 3

Slide 3 text

Motivation: Integrated analysis of protein-protein interaction (PPI) and gene expression

Gene expression profiles: N genes × M samples
PPI: N genes (proteins) × N genes (proteins)

How can we integrate two matrices with distinct dimensions?

Slide 4

Slide 4 text

Using tensor decomposition (TD) = Tucker decomposition

$$x_{ijk} \in \mathbb{R}^{N \times M \times K}, \qquad x_{ijk} = \sum_{l_1=1}^{N} \sum_{l_2=1}^{M} \sum_{l_3=1}^{K} G(l_1 l_2 l_3)\, u_{l_1 i}\, u_{l_2 j}\, u_{l_3 k}$$

How can we make use of TD to integrate PPI and gene expression?
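The Tucker form above can be computed by higher-order SVD (HOSVD), the algorithm named in the workflow figure of this talk. The following is a minimal NumPy sketch, not the authors' implementation: the helper names `unfold`, `hosvd`, and `reconstruct` are hypothetical, the tensor is random stand-in data, and no truncation is applied, so the reconstruction is exact.

```python
import numpy as np

def unfold(x, mode):
    """Matricize x along `mode`: rows index that mode, columns index all other modes."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def hosvd(x):
    """Untruncated HOSVD: returns the core tensor G and one orthogonal factor per mode."""
    # Left singular vectors of each unfolding give the factor matrix for that mode.
    factors = [np.linalg.svd(unfold(x, mode), full_matrices=True)[0]
               for mode in range(x.ndim)]
    core = x
    for mode, u in enumerate(factors):
        # Multiply mode `mode` of the tensor by u^T.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors

def reconstruct(core, factors):
    """Evaluate sum_{l1,l2,l3} G(l1 l2 l3) u_{l1 i} u_{l2 j} u_{l3 k}."""
    x = core
    for mode, u in enumerate(factors):
        x = np.moveaxis(np.tensordot(u, x, axes=(1, mode)), 0, mode)
    return x

# Hypothetical sizes N=6, M=4, K=3 with random stand-in data.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4, 3))
core, factors = hosvd(x)
x_rebuilt = reconstruct(core, factors)
```

In this sketch `factors[0][i, l1]` plays the role of u_{l_1 i}, and likewise for the other two modes.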

Slide 5

Slide 5 text

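Apply singular value decomposition (SVD) to the gene expression profile $x_{ij} \in \mathbb{R}^{N \times M}$ and the PPI $n_{ii'} \in \mathbb{R}^{N \times N}$:

$$x_{ij} = \sum_{l=1}^{L} \lambda'_l\, u'_{li}\, v_{lj} \quad \text{(gene expression)}$$

$$n_{ii'} = \sum_{l=1}^{L} \lambda_l\, u_{li}\, u_{li'} \quad \text{(PPI)}$$

Bundle $u_{li}$ and $u'_{li}$ to generate the tensor $x_{ilk} \in \mathbb{R}^{N \times L \times 2}$:

$$x_{il1} = u_{li}, \qquad x_{il2} = u'_{li}$$

Apply TD to $x_{ilk}$:

$$x_{ilk} = \sum_{l_1=1}^{N} \sum_{l_2=1}^{L} \sum_{l_3=1}^{2} G(l_1 l_2 l_3)\, \tilde{u}_{l_1 i}\, \tilde{u}_{l_2 l}\, \tilde{u}_{l_3 k}$$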
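The bundling step can be sketched in a few lines of NumPy. Everything data-related here is an assumption made for illustration: the sizes `N`, `M`, `L`, the random expression matrix, and the random symmetric 0/1 matrix standing in for the PPI network. The helper `mode_factor` is a hypothetical name for "left singular vectors of one unfolding", which is how HOSVD obtains each factor.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, L = 50, 12, 5                      # genes, samples, retained SVVs (hypothetical)

expr = rng.normal(size=(N, M))           # stand-in for x_ij
upper = np.triu(rng.random((N, N)) < 0.1, 1)
ppi = (upper + upper.T).astype(float)    # symmetric 0/1 stand-in for n_ii'

# Leading L left singular vectors of each matrix, both of shape N x L.
u_expr = np.linalg.svd(expr, full_matrices=False)[0][:, :L]   # u'_li
u_ppi = np.linalg.svd(ppi)[0][:, :L]                          # u_li

# x_il1 = u_li (PPI), x_il2 = u'_li (gene expression): an N x L x 2 tensor.
tensor = np.stack([u_ppi, u_expr], axis=2)

def mode_factor(x, mode):
    """Left singular vectors of the mode-`mode` unfolding (one HOSVD factor)."""
    mat = np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)
    return np.linalg.svd(mat, full_matrices=False)[0]

u_gene = mode_factor(tensor, 0)    # attributed to genes i
u_svv = mode_factor(tensor, 1)     # attributed to the SVV index l
u_source = mode_factor(tensor, 2)  # attributed to k (PPI vs. gene expression)
```

Because both matrices share the gene index i, stacking their gene-side singular vectors sidesteps the mismatch between the M-sample and N-gene column dimensions.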

Slide 6

Slide 6 text

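[Figure: analysis workflow. SVD is applied to gene expression (N genes × M samples) and to PPI (N genes × N genes), giving L SVVs over the N genes from each. The two sets are bundled into an N × L × 2 tensor (third mode: gene expression or network), and HOSVD yields SVVs of sizes N genes × N, L × L, and 2 × 2. The gene SVVs are projected onto the samples (M samples × L SVVs) and compared with the class labels.]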

Slide 7

Slide 7 text

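Recover the singular value vector (SVV) attributed to sample j as

$$\tilde{v}_{lj} = \sum_{i=1}^{N} x_{ij}\, \tilde{u}_{li}$$

Compare the coincidence of $v_{lj}$ or $\tilde{v}_{lj}$ with the class labels (categorical regression):

$$v_{lj} = a_l + \sum_{s=1}^{S} b_{ls}\, \delta_{js} \quad \text{(SVD)}$$

$$\tilde{v}_{lj} = a'_l + \sum_{s=1}^{S} b'_{ls}\, \delta_{js} \quad \text{(TD)}$$

$\delta_{js} = 1$ only when sample j belongs to class s; otherwise 0.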
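The projection and the categorical regression can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the data are synthetic, and one class indicator is dropped from the design matrix because an intercept plus all S indicators is collinear (the fitted values are unchanged). The sketch stops at the F statistic; a P value would follow from the upper tail of the F distribution with (S − 1, M − S) degrees of freedom.

```python
import numpy as np

def project_to_samples(expr, u_gene):
    """v~_lj = sum_i x_ij u~_li; expr is N x M, u_gene is N x L, result is L x M."""
    return u_gene.T @ expr

def categorical_regression_f(v, labels):
    """Least-squares fit of v_j = a + sum_s b_s * delta_js; returns (R^2, F statistic)."""
    classes = np.unique(labels)
    # Intercept plus indicators for every class except the first.
    design = np.column_stack(
        [np.ones(len(v))] + [(labels == c).astype(float) for c in classes[1:]]
    )
    coef, *_ = np.linalg.lstsq(design, v, rcond=None)
    resid = v - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((v - v.mean()) ** 2).sum())
    df_between, df_within = len(classes) - 1, len(v) - len(classes)
    f_stat = ((ss_tot - ss_res) / df_between) / (ss_res / df_within)
    return 1.0 - ss_res / ss_tot, f_stat

# Synthetic check: one vector that tracks the labels, one that is pure noise.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1, 2], 8)                      # 24 samples, 3 hypothetical classes
v_signal = labels + 0.1 * rng.normal(size=labels.size)
v_noise = rng.normal(size=labels.size)
r2_signal, f_signal = categorical_regression_f(v_signal, labels)
r2_noise, f_noise = categorical_regression_f(v_noise, labels)
```

A vector that is "more coincident" with the class labels gives a larger F statistic, and hence a smaller P value, than one unrelated to them.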

Slide 8

Slide 8 text

[Figure: analysis workflow, the same schematic as on Slide 6.]

Slide 9

Slide 9 text

27 cancers with 5 kinds of classes:

(1) “PATIENT.VITAL STATUS”
(2) “PATIENT.STAGE EVENT.PATHOLOGIC STAGE”
(3) “PATIENT.STAGE EVENT.TNM CATEGORIES.PATHOLOGIC CATEGORIES.PATHOLOGIC M”
(4) “PATIENT.STAGE EVENT.TNM CATEGORIES.PATHOLOGIC CATEGORIES.PATHOLOGIC T”
(5) “PATIENT.STAGE EVENT.TNM CATEGORIES.PATHOLOGIC CATEGORIES.PATHOLOGIC N”
(6) AUC for “PATIENT.VITAL STATUS”

Slide 10

Slide 10 text

Class (1)

Scatter plot of P values (ascending order) obtained by categorical regression for SVD and TD for 27 cancers in class (1).
Red triangles: significant
Blue crosses: not significant

TD (vertical axis) is more coincident with classes than SVD (horizontal axis).

Slide 11

Slide 11 text

[Figure panels: Class (2), Class (3)]

Slide 12

Slide 12 text

[Figure panels: Class (4), Class (5)]

Slide 13

Slide 13 text

Compare the P values obtained by categorical regression for SVD and TD with either the t test or the Wilcoxon test, for classes (1) to (5). TD is more coincident with the classes than SVD, except for class (3).
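The slide does not say whether the tests were paired or whether the P values were transformed first. As a sketch only, assuming a paired comparison (each cancer contributes one SVD value and one TD value), the two test statistics can be computed with the Python standard library. The function names are hypothetical, and turning the statistics into P values is left out.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(a, b):
    """t statistic of the paired differences a_i - b_i (n - 1 degrees of freedom)."""
    d = [x - y for x, y in zip(a, b)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

def signed_rank_statistic(a, b):
    """Wilcoxon signed-rank W+: sum of the ranks of |d| over pairs with d > 0.

    Zero differences are dropped and tied |d| values share their average rank.
    """
    d = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return sum(r for r, x in zip(ranks, d) if x > 0)
```

With `a` holding the SVD values and `b` the TD values for the same cancers, a positive t statistic or a large W+ indicates that the SVD P values tend to be the larger ones, that is, that TD coincides better with the class labels.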

Slide 14

Slide 14 text

AUC for the discrimination task for class (1), restricted to the 11 cancers with significant P values in categorical regression. TD (vertical axis) is almost always better than SVD.
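For reference, AUC can be computed directly from its rank interpretation: the probability that a randomly chosen positive sample scores above a randomly chosen negative one. The sketch below assumes a single numeric score per sample and a binary 0/1 label; the slides do not state how the score was formed, so both that and the function name `auc` are assumptions for illustration.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores: positives 0.9 and 0.4, negatives 0.6 and 0.1.
example_auc = auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

The pairwise loop is quadratic in the number of samples, which is fine for a sketch; a rank-based formula gives the same value more efficiently for large cohorts.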

Slide 15

Slide 15 text

Conclusions

Integrated analysis of gene expression and PPI can improve the coincidence between SVVs and class labels, although PPI does not include class information at all.

Analyses with TD can be performed with two Bioconductor packages of my own:
doi:10.18129/B9.bioc.TDbasedUFE
doi:10.18129/B9.bioc.TDbasedUFEadv