Slide 1

Slide 1 text

Tensor decomposition based unsupervised feature extraction applied to drug discovery from gene expression analysis. Y-h. Taguchi, Department of Physics, Chuo University, Tokyo, Japan

Slide 2

Slide 2 text

Three Research Projects
Project 1: Inferring drugs effective against cancers from gene expression profiles of drug-treated cancer cell lines
Project 2: Integrated analysis of gene expression profiles of drug-treated tissues and human patients
Project 3: Inferring drugs effective against Alzheimer’s disease from single-cell gene expression in mouse brain (without drug-treated gene expression)

Slide 3

Slide 3 text

Project 1: Inferring drugs effective against cancers from gene expression profiles of drug-treated cancer cell lines

Slide 4

Slide 4 text

Drug candidate identification based on gene expression of treated cells using tensor decomposition-based unsupervised feature extraction for large-scale data. Y-h. Taguchi. BMC Bioinformatics volume 19, Article number: 388 (2019). OPEN ACCESS

Slide 5

Slide 5 text

Drug discovery (DD) = dose dependence. [Figure: effect as a function of dose (density).] No high-throughput (HT) methods are available for DD, whereas gene expression can be measured by HT sequencing/microarray. Is HT DD possible using HT gene expression methods?

Slide 6

Slide 6 text

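By extending the matrix to a tensor, $x_{ijl}$, we can deal with data of “dose density ($i$) × compounds ($j$) × genes ($l$)”. → Tensors can be decomposed:
$$x_{ijl} \simeq \sum_{k_1,k_2,k_3} G_{k_1,k_2,k_3}\, u_{k_1 i}\, u_{k_2 j}\, u_{k_3 l}$$
[Figure: the tensor $x_{ijl}$ (dose density × compounds × genes) decomposed into the core tensor $G$ and the singular value vectors $u_{k_1 i}$ (dose density), $u_{k_2 j}$ (compounds) and $u_{k_3 l}$ (genes).]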
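To make the index bookkeeping of the decomposition concrete, the right-hand side can be evaluated directly. The following is only a minimal sketch in plain Python: the function name `tucker_reconstruct` and the toy numbers are illustrative assumptions, not taken from the paper, and it evaluates the sum rather than computing the decomposition itself.

```python
from itertools import product

def tucker_reconstruct(G, U1, U2, U3):
    """Evaluate x[i][j][l] = sum over k1,k2,k3 of G[k1][k2][k3] * U1[k1][i] * U2[k2][j] * U3[k3][l]."""
    K1, K2, K3 = len(G), len(G[0]), len(G[0][0])     # core tensor sizes
    N, M, L = len(U1[0]), len(U2[0]), len(U3[0])     # doses, compounds, genes
    x = [[[0.0] * L for _ in range(M)] for _ in range(N)]
    for i, j, l in product(range(N), range(M), range(L)):
        x[i][j][l] = sum(
            G[k1][k2][k3] * U1[k1][i] * U2[k2][j] * U3[k3][l]
            for k1, k2, k3 in product(range(K1), range(K2), range(K3))
        )
    return x

# Toy example (made-up numbers): 2 doses x 2 compounds x 3 genes,
# with a single component per mode.
G = [[[2.0]]]
U1 = [[1.0, 0.5]]        # u_{k1 i}: dose-density profile
U2 = [[1.0, -1.0]]       # u_{k2 j}: compound profile
U3 = [[0.0, 1.0, 2.0]]   # u_{k3 l}: gene profile
x = tucker_reconstruct(G, U1, U2, U3)
```

Here `x[1][1][2]` is the product 2.0 × 0.5 × (−1.0) × 2.0, which shows how one dose profile, one compound profile and one gene profile jointly determine a tensor element.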

Slide 7

Slide 7 text

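[Figure: the tensor $x_{ijl}$ (dose density × compounds × genes) is decomposed into the core tensor $G_{k_1,k_2,k_3}$ and the singular value vectors $u_{k_1 i}$ (dose density), $u_{k_2 j}$ (compounds) and $u_{k_3 l}$ (genes). Using the 2nd dose-density component, $G_{2,k_2,k_3}$ with $k_2 \le 6$ and $k_3 \le 6$, outlier compounds are identified from $u_{k_2 j}$ and outlier genes from $u_{k_3 l}$.]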

Slide 8

Slide 8 text

[Figure: genes associated with a compound by TD based unsupervised FE are compared with genes affected by single gene perturbations (gene A, gene B, gene C).]

Slide 9

Slide 9 text

Overall data analysis flow
Gene expression profiles with drug compound treatments
→ Identification of pairs of genes and compounds with dose dependence by tensor decomposition
→ Target protein identification by comparison with single gene KO/KI experiments
→ Validation by comparison with known drug target proteins using Fisher’s exact test

Slide 10

Slide 10 text

Results for 13 cancer cell lines (LINCS). [Table: identification by tensor decomposition, and target proteins inferred by comparison with KO/KI experiments.]

Slide 11

Slide 11 text

Evaluations: comparisons with drug2gene.com and DSigDB. ○: significant overlap by Fisher’s exact test. (1)-(13): cancer cell lines in the previous table.
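The overlap test mentioned here can be sketched as a one-sided Fisher’s exact test on a 2×2 table. This is a minimal illustration under stated assumptions: the function name and the table layout (predicted vs known targets) are hypothetical, and the slides do not say which sidedness or software was actually used.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: predicted and known, b: predicted only,
    c: known only,          d: neither.
    Returns P(overlap >= a) under the hypergeometric null.
    """
    n_predicted = a + b
    n_known = a + c
    total = a + b + c + d
    denom = comb(total, n_known)
    max_overlap = min(n_predicted, n_known)
    tail = sum(
        comb(n_predicted, k) * comb(total - n_predicted, n_known - k)
        for k in range(a, max_overlap + 1)
    )
    return tail / denom

# Toy table (made-up counts): 3 shared targets out of 8 proteins.
p = fisher_exact_greater(3, 1, 1, 3)
```

A small P-value means the predicted target proteins overlap with the known ones more than expected by chance; the toy table above is too small to be significant.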

Slide 12

Slide 12 text

Project 2: Integrated analysis of gene expression profiles of drug-treated tissues and human patients

Slide 13

Slide 13 text

Identification of candidate drugs using tensor-decomposition-based unsupervised feature extraction in integrated analysis of gene expression between diseases and DrugMatrix datasets. Y.-h. Taguchi. Scientific Reports volume 7, Article number: 13733 (2017). OPEN ACCESS

Slide 14

Slide 14 text

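[Figure: the matrices $x_{ij}$ and $x_{il}$ are multiplied to form the tensor $x_{ijl}$, which is then decomposed by tensor decomposition into $G$, $u_{k_1 i}$, $u_{k_2 j}$ and $u_{k_3 l}$.]
$$x_{ijl} = x_{ij} \times x_{il} \simeq \sum_{k_1,k_2,k_3} G_{k_1,k_2,k_3}\, u_{k_1 i}\, u_{k_2 j}\, u_{k_3 l}$$
$i$: genes, $j$: patients vs healthy controls, $l$: dose density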
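The product that turns two matrices sharing the gene index into one tensor can be sketched in a few lines. This is only an illustration: the function name `product_tensor` and the toy matrices are assumptions, and any normalization applied before taking the product is not shown on the slide.

```python
def product_tensor(x_disease, x_drug):
    """Build x[i][j][l] = x_disease[i][j] * x_drug[i][l], where i is the shared gene index."""
    if len(x_disease) != len(x_drug):
        raise ValueError("both matrices must have the same number of genes (rows)")
    return [
        [[x_ij * x_il for x_il in drug_row] for x_ij in disease_row]
        for disease_row, drug_row in zip(x_disease, x_drug)
    ]

# Toy data (made-up numbers): 2 genes, 2 patient/control samples, 3 dose densities.
x_disease = [[1.0, 2.0], [0.0, -1.0]]
x_drug = [[3.0, 4.0, 5.0], [1.0, 1.0, 2.0]]
x = product_tensor(x_disease, x_drug)
```

The result has shape genes × samples × doses, so a gene contributes a large element only when it is large in both the disease profile and the drug-treatment profile.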

Slide 15

Slide 15 text

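$$x_{j_1 j_2 j_3 i} = x_{j_1 j_2 i}\, x_{j_3 i} = \sum_{l_1,l_2,l_3,l_4} G(l_1,l_2,l_3,l_4)\, u_{l_1 j_1}\, u_{l_2 j_2}\, u_{l_3 j_3}\, u_{l_4 i}$$
$j_1$: compounds, $j_2$: time duration after treatment, $j_3$: patients vs healthy controls, $i$: genes
[Figure: the tensors $x_{j_1 j_2 i}$ and $x_{j_3 i}$ are combined into $x_{j_1 j_2 j_3 i}$ and decomposed; selected genes (gene X) are linked to target proteins.]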

Slide 16

Slide 16 text

Duration after treatment (days): the 1st to 4th singular value vectors

Slide 17

Slide 17 text

Heart disease: the 1st to 3rd singular value vectors

Slide 18

Slide 18 text

Compounds: the 2nd singular value vectors

Slide 19

Slide 19 text

Top ranked 10 vectors with larger absolute values ($l_1 = 2$)

Slide 20

Slide 20 text

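Feature extraction: real data are not separated well, so assume a Gaussian distribution and detect outliers. P-values are attributed by the $\chi^2$ distribution,
$$P_i = P_{\chi^2}\left[ > \sum_k \left( \frac{x_{ik}}{\sigma} \right)^2 \right],$$
and features with Benjamini-Hochberg corrected $P < 0.01$ are selected. [Figure: histogram $P(p)$ of $1-p$ between 0 and 1.]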
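The P-value attribution can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the degrees of freedom are taken to be the number of summed components, `sigma` is passed in by the caller because the slide does not say how it is estimated, and the function names are illustrative.

```python
from math import erfc, exp, gamma, sqrt

def chi2_sf(x, dof):
    """Upper-tail probability P[chi^2 with `dof` degrees of freedom > x], for integer dof >= 1 and x >= 0."""
    # Closed-form base cases: dof = 1 (odd) or dof = 2 (even).
    if dof % 2 == 1:
        q, k = erfc(sqrt(x / 2.0)), 1
    else:
        q, k = exp(-x / 2.0), 2
    # Recurrence: Q(k + 2, x) = Q(k, x) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < dof:
        q += (x / 2.0) ** (k / 2.0) * exp(-x / 2.0) / gamma(k / 2.0 + 1.0)
        k += 2
    return q

def outlier_p_values(rows, sigma):
    """rows[i] holds the selected components x_ik of feature i; returns one P-value per feature."""
    return [chi2_sf(sum((v / sigma) ** 2 for v in row), len(row)) for row in rows]

# Toy data (made-up numbers): the second feature is far from the origin.
p_values = outlier_p_values([[0.0, 0.0], [3.0, 4.0]], sigma=1.0)
```

Features whose components lie far from the origin, relative to `sigma`, receive small P-values and are the candidates for selection after multiple-testing correction.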

Slide 21

Slide 21 text

274 genes are selected Akt1 A2m Abcb10 Acads Accn3 Acox3 Acsl1 Acta1 Actg2 Actn1 Actr1b Acvr1c Adcy3 Adora3 Adra1b Adrb2 Agpat1 Agrn Ak3 Akap1 Alas1 Amhr2 Anxa2 Aoc3 Apob Apod Aqp4 Areg Arf4 Atf3 Atp1b1 Atp5a1 Atp6v1h Azgp1 B4galt7 Bag3 Bmpr1a Bpgm Btbd9 Btg2 Bves Bzw1 C1qa C3 Ca3 Canx Cast Ccl2 Ccnd2 Ccnl1 Ccr1 Cd36 Cd63 Cd74 Cdh23 Cebpg Ces1 Chchd4 Chdh Ciapin1 Cmklr1 Col5a1 Cryab Csda Csnk2b Csrp3 Ctf1 Cyb5b Cybb Dcps Ddit4l Dhrs1 Dlc1 Dnajc5 Dpp4 Dusp11 Ednra Eef2k Egfr Egr1 Eif2s2 Eif4a1 Ephx1 F8 Fabp5 Fbl Fbxo22 Fgf9 Flt1 Fndc5 Fos Fstl1 Fut8 Fyttd1 Gapdh Gatm Gbe1 Ghr Git2 Glul Gna12 Gnb1 Gnb2l1 Gnb3 Gosr1 Got1 Got2 Gpx3 Grwd1 Gstp1 Gucy1a3 Hapln1 Hmbs Hmgb1 Hrc Hspa5 Htr4 Idh3a Idh3g Il1rl1 Il6r Il6st Immt Ing4 Itga6 Itm2c Itpr3 Junb Kcmf1 Kcnj8 Kcnk3 Kcnt1 Kpna1 Lactb2 Laptm4b Lcat Lcp1 Ldha Lphn3 Lrp1 Lss Ltbp4 Man2c1 Map2k4 Map4k3 Mapk10 Mapk14 Mapk6 Mapk9 Mfn2 Mgat3 Mgp Mknk2 Mlycd Mme Mpp3 mrpl9 Msn Mterf Mtus1 Mvd Mxd3 Myc Myl2 Ncoa2 Ndfip1 Ndufs2 Nedd4l Nes Nexn Nf1 Nfatc4 Nfe2l2 Npr3 Nr0b2 Nr3c1 Nr3c2 Nsf Obscn Odz2 P2rx3 Pacsin2 Pccb Pdcl3 Pde4b Pdia4 Pdk2 Pdk4 Pdrg1 Pdxk Pggt1b Pi4k2a Pold4 Ppara Ppif Ppm1a Ppm1b Ppp1r14a Ppp1r14c Ppp2ca Ppp2r2d Prelp Psmb4 Psmc1 Psmd12 Ptgds Ptger2 Pvrl2 Pycr2 Rab15 Ramp2 Rbm10 Rela Rhoa Rplp1 Rps18 Rps20 Rps6 Rxrg Samm50 Sccpdh Schip1 Scn2b Sdhd Sephs2 Serpinh1 Sfrp4 Sharpin Sirt5 Slc25a4 Slc2a4 Slc38a2 Slc40a1 Slc5a1 Slc6a1 Sln Slpi Smad4 Smpd1 Sod1 Sox18 Spin2b Spp1 Stat3 Steap3 Stip1 Stx7 Suclg1 Synj1 Tarbp2 Tfam Tmem30a Tnfaip6 Tnfrsf12a Tnfrsf1a Tnni3 Tpm1 Tpsab1 Trpc4ap Ttn Txndc12 Txnip Uchl1 Uqcrc2 Usp14 Vdac2 Vezt Vim Vsnl1 Vtn Wbp4 Yme1l1 Ywhae Ywhah → Based upon gene KO experiments, 556(up)/449(down) genes are selected

Slide 22

Slide 22 text

Amitriptyline Atropine Baclofen Bezafibrate Caffeine Calcitriol Chlorambucil Cimetidine Citalopram Clemastine Clonazepam Cyclophosphamide D-Tubocurarine Chloride Dexamethasone Dexchlorpheniramine Digitoxin Diphenhydramine Doxazosin Ebastine Fenofibrate Fluphenazine Gabapentin Ifosfamide Iproniazid Lacidipine Loratadine Nevirapine Nimodipine Nitrendipine Ofloxacin Oxymetazoline Paroxetine Phenacemide Phenytoin Rosiglitazone Sparteine Stavudine Valsartan Vecuronium Bromide Venlafaxine Vinblastine Vincristine Zidovudine 43 compounds

Slide 25

Slide 25 text

Evaluation by SwissDock (cirrhosis): HNF4a vs bezafibrate, $K_i = 0.13$ μM

Slide 26

Slide 26 text

Evaluation by SwissDock (cirrhosis): CYPOR vs bezafibrate, $K_i = 79$ nM

Slide 27

Slide 27 text

Yin et al, “Systematic review and meta-analysis: bezafibrate in patients with primary biliary cirrhosis”, Drug Des Devel Ther. 2015 ;9:5407-19. CONCLUSION: Combination therapy improved liver biochemistry and the prognosis of PBC, but did not improve clinical symptoms or incidence of death. Attention should be paid to adverse events when using bezafibrate.

Slide 28

Slide 28 text

Project 3: Inferring drugs effective against Alzheimer’s disease from single-cell gene expression in mouse brain (without drug-treated gene expression)

Slide 29

Slide 29 text

Neurological Disorder Drug Discovery from Gene Expression with Tensor Decomposition. Y-h. Taguchi, Turki Turki. Current Pharmaceutical Design, Volume 25, Issue 43, 2019. OPEN ACCESS

Slide 30

Slide 30 text

Data & experiments (mice): two genotypes (APP_NL-F-G and C57Bl/6), two tissues (cortex and hippocampus), four ages (3, 6, 12, and 21 weeks), two sexes (male and female), and four 96-well plates (the number of cells). Aim: understanding Alzheimer’s disease

Slide 31

Slide 31 text

Tensor $x_{i j_1 j_2 j_3 j_4 j_5 j_6}$ represents the gene expression of the $i$th gene in
the $j_1$th cell (well),
the $j_2$th genotype ($j_2 = 1$: APP_NL-F-G and $j_2 = 2$: C57Bl/6),
the $j_3$th tissue ($j_3 = 1$: cortex and $j_3 = 2$: hippocampus),
the $j_4$th age ($j_4 = 1$: three weeks, $j_4 = 2$: six weeks, $j_4 = 3$: twelve weeks, and $j_4 = 4$: twenty-one weeks),
the $j_5$th sex ($j_5 = 1$: female and $j_5 = 2$: male), and
the $j_6$th plate.

Slide 32

Slide 32 text

$$x_{i j_1 j_2 j_3 j_4 j_5 j_6} = \sum_{l_1, l_2, l_3, l_4, l_5, l_6, l_7} G(l_1,l_2,l_3,l_4,l_5,l_6,l_7)\, u_{l_1 j_1}\, u_{l_2 j_2}\, u_{l_3 j_3}\, u_{l_4 j_4}\, u_{l_5 j_5}\, u_{l_6 j_6}\, u_{l_7 i}$$
(A) $u_{l_1 j_1}$: 96 wells (cells), $l_1 = 1$
(B) $u_{l_2 j_2}$: genotype, APP_NL-F-G vs C57Bl/6, $l_2 = 1$
(C) $u_{l_3 j_3}$: cortex vs hippocampus, $l_3 = 1$
(D) $u_{l_4 j_4}$: 3, 6, 12, 21 weeks, $l_4 = 2$
(E) $u_{l_5 j_5}$: female vs male, $l_5 = 1$
(F) $u_{l_6 j_6}$: 4 plates, $l_6 = 1$
→ $l_7 = 2$ is selected because $G(1,1,1,2,1,1,l_7)$ has the largest absolute value there
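The choice of the gene-mode component can be sketched as a search over $l_7$ with the other six indices held fixed at (1, 1, 1, 2, 1, 1). This is only an illustration: the dictionary representation of the core tensor, the function name and the toy values are assumptions, not the author's implementation.

```python
def select_gene_component(core, fixed=(1, 1, 1, 2, 1, 1)):
    """Return the 1-based l7 that maximises |G(l1, ..., l6, l7)| for the fixed l1..l6.

    `core` maps 7-tuples of 1-based indices to core-tensor values.
    """
    fixed = tuple(fixed)
    candidates = {
        idx[6]: abs(value)
        for idx, value in core.items()
        if idx[:6] == fixed
    }
    if not candidates:
        raise KeyError("no core-tensor entries found for the fixed indices")
    return max(candidates, key=candidates.get)

# Toy core tensor (made-up values); only entries matching the fixed indices matter.
core = {
    (1, 1, 1, 2, 1, 1, 1): 0.3,
    (1, 1, 1, 2, 1, 1, 2): -5.1,
    (1, 1, 1, 2, 1, 1, 3): 1.2,
    (1, 1, 1, 1, 1, 1, 1): 40.0,   # different l4, so it is ignored
}
l7 = select_gene_component(core)
```

In this toy example the entry with $l_7 = 2$ wins because the comparison uses absolute values, so the sign of the core-tensor element does not matter.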

Slide 33

Slide 33 text

Attributing P-values to genes:
$$P_i = P_{\chi^2}\left[ > \left( \frac{u_{2i}}{\sigma_2} \right)^2 \right]$$
After correcting P-values by the BH criterion, 401 genes with $P_i < 0.01$ are selected. → Evaluate how these overlap with genes affected by known Alzheimer’s drug treatments. The 401 genes are uploaded to Enrichr.
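The gene selection step can be sketched as follows, using a single singular value vector (one degree of freedom) and the Benjamini-Hochberg adjustment. This is a minimal sketch: the function names are illustrative, and `sigma` is supplied by the caller since the slide does not state how $\sigma_2$ is estimated.

```python
from math import erfc, sqrt

def chi2_1dof_p(u, sigma):
    """P[chi^2 with 1 degree of freedom > (u / sigma)^2], computed via the complementary error function."""
    return erfc(abs(u / sigma) / sqrt(2.0))

def benjamini_hochberg(p_values):
    """Return BH-adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest P-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_genes(u, sigma, threshold=0.01):
    """Indices of genes whose BH-adjusted P-value is below the threshold."""
    adjusted = benjamini_hochberg([chi2_1dof_p(v, sigma) for v in u])
    return [i for i, p in enumerate(adjusted) if p < threshold]

# Toy singular value vector (made-up numbers) with one clear outlier.
selected = select_genes([0.1, -0.2, 5.0, 0.05], sigma=1.0)
```

Only the third gene survives the correction in this toy case, since its component is five standard deviations from zero while the others are close to it.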

Slide 34

Slide 34 text

Top ranked 10 compounds listed in the “LINCS L1000 Chem Pert up” category in Enrichr. Overlap is that between the selected 401 genes and the genes selected in individual experiments. (Known Alzheimer’s drugs are marked in the table.)

Slide 35

Slide 35 text

Top ranked 10 compounds listed in the “DrugMatrix” category in Enrichr. Overlap is that between the selected 401 genes and the genes selected in individual experiments. (Known Alzheimer’s drugs are marked in the table.)

Slide 36

Slide 36 text

Top ranked 10 compounds listed in the “Drug Perturbations from GEO up” category in Enrichr. Overlap is that between the selected 401 genes and the genes selected in individual experiments. (Known Alzheimer’s drugs are marked in the table.)

Slide 37

Slide 37 text

Summary: We can infer drugs effective against diseases from gene expression profiles using TD based unsupervised FE. I have published a monograph with Springer. I would be happy if you could buy it, although it is extremely expensive.