Quantitative Estimate Index for Early-Stage Screening of Compounds Targeting Protein-Protein Interactions

Presentation slides for the 5th Tokyo Bioinformatics Meeting.

GitHub :
https://github.com/ohuelab/QEPPI

Reference :
Kosugi T, Ohue M. Quantitative Estimate Index for Early-Stage Screening of Compounds Targeting Protein-Protein Interactions. International Journal of Molecular Sciences. 2021; 22(20):10925. https://doi.org/10.3390/ijms222010925

くろたんく

November 13, 2021

Transcript

  1. Nov 13th TBM. Quantitative Estimate Index for Early-Stage Screening of Compounds Targeting Protein-Protein Interactions. Takatsugu Kosugi, Department of Computer Science, School of Computing, Tokyo Institute of Technology, Japan.

     The 5th Tokyo Bioinformatics Meeting, November 13th, 2021. Kosugi T, Ohue M. Quantitative Estimate Index for Early-Stage Screening of Compounds Targeting Protein-Protein Interactions. Int J Mol Sci. 2021; 22(20):10925. https://doi.org/10.3390/ijms222010925
  2. Agenda: Abstract; Background; Methods; Results (Peak and weight of descriptors in QEPPI; Distribution of QED and QEPPI in the PPI-targeting compounds and FDA drugs; Rule-of-Four); Discussion (Application of QEPPI to PPI-targeting compounds and other small-molecule drugs in clinical trials; Relevance of QEPPI to the complexity of PPI interfaces; QEPPI may be effective for host-pathogen PPIs; Towards the application of molecular generation by RL, GAN, and VAE; Limitations and Challenges); Conclusion.
  4. Abstract

     • There are indexes for scoring compounds in drug development; Lipinski's rule of five (RO5) and QED are the best known.
     • Protein-protein interactions (PPIs) are a recent drug discovery target, but developing drugs for them is very difficult, and there is no index for scoring whether a compound is suited to targeting PPIs.
     • Because such an index would be useful when developing PPI-targeting compounds, in this study we developed an index called QEPPI (Quantitative Estimate Index for Early-Stage Screening of Compounds Targeting Protein-Protein Interactions).

     [Figure: a PPI and a PPI-targeting compound in development, captioned "What is the score?"]

     References: Curr Opin Chem Biol; Nat Chem.
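  6. PPIs as drug targets

     In recent years, PPIs have attracted attention as drug targets among the various targets in drug development. Conventional drug targets include GPCRs, channels, enzymes, and transcription factors.

     [Figure: the number of PPIs that can be considered as drug targets, compared with the number of conventional drug targets; the numeric values are not preserved in this transcript. Source: BioGRID Database Statistics, Current Build Statistics, October. Created with BioRender.com.]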
  7. PPI drug discovery has difficult problems

     It is difficult to design drugs for PPIs based on conventional indexes such as QED, because the physicochemical properties of PPI-targeting compounds are very different from those of compounds for conventional drug targets. (Adv Appl Bioinform Chem)

     [Figure: non-PPI-targeting vs. PPI-targeting compounds. Int. J. Mol. Sci. 2021, 22, 10925, Figure 1, panels (a) MW, (b) ALogP, (c) HBD, (d) HBA, (e) TPSA, (f) ROTB, (g) AROM.]

     Figure 1. Histograms of seven molecular physicochemical properties for a set of non-redundant compounds of iPPI-DB. Molecular weight (MW) (a), LogP value estimated by Ghose-Crippen method (ALogP) (b), number of hydrogen bond donors (HBD) (c), number of hydrogen bond acceptors (HBA) (d), topological molecular polar surface area (TPSA) (e), number of rotatable bonds (ROTB) (f), and number of aromatic rings (AROM) (g). The solid red lines describe the asymmetric double sigmoid (ADS) function (1) used to model the QEPPI histograms. The black dashed lines describe the ADS function used to model the quantitative estimate of drug-likeness (QED) histograms.
  8. Rule-of-Four: a rule-based index for PPI inhibitors

     Morelli et al. proposed the "Rule-of-Four" (RO4) to evaluate PPI inhibitors. This proposal was based on a statistical analysis of PPI inhibitors in 2P2Idb. RO4 consists of the following four criteria (the numeric cutoffs are not preserved in this transcript):

     • Molecular weight
     • ALogP
     • The number of hydrogen bond acceptors
     • The number of rings

     [Figure: representative set of PPI inhibitors.]

     References: Curr Opin Chem Biol; Database.
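The four RO4 criteria can be sketched as a simple rule-based filter. This is a minimal sketch, not the authors' implementation. The names `Properties`, `ro4_violations`, and `passes_ro4` are hypothetical. The cutoffs (MW > 400, ALogP > 4, more than 4 hydrogen bond acceptors, more than 4 rings) are the values commonly quoted for the rule of Morelli et al. and are an assumption here, because the numeric thresholds did not survive in the transcript. The two example compounds are invented.

```python
from dataclasses import dataclass

# ASSUMPTION: commonly quoted RO4 cutoffs; the slide's own numbers are not
# preserved in the transcript.
RO4_CUTOFFS = {"mw": 400.0, "alogp": 4.0, "hba": 4, "rings": 4}


@dataclass
class Properties:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mw: float     # molecular weight
    alogp: float  # Ghose-Crippen LogP estimate
    hba: int      # number of hydrogen bond acceptors
    rings: int    # number of rings


def ro4_violations(p: Properties) -> int:
    """Count the criteria for which the compound does NOT exceed the cutoff."""
    values = {"mw": p.mw, "alogp": p.alogp, "hba": p.hba, "rings": p.rings}
    return sum(1 for name, cutoff in RO4_CUTOFFS.items() if values[name] <= cutoff)


def passes_ro4(p: Properties, max_violations: int = 0) -> bool:
    """Rule-based decision: allow at most `max_violations` failed criteria."""
    return ro4_violations(p) <= max_violations


# Invented example compounds, for illustration only.
ppi_like = Properties(mw=520.0, alogp=4.8, hba=6, rings=5)
oral_like = Properties(mw=305.8, alogp=2.7, hba=2, rings=2)
print(ro4_violations(ppi_like), ro4_violations(oral_like))
```

The result is a small integer (0 to 4) rather than a continuous score, which is the "rule-based, not very quantitative" limitation that motivates QEPPI.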
  9. Development of a new index to evaluate PPI-targeting compounds

     In this study, we developed an index called QEPPI (Quantitative Estimate Index for Early-Stage Screening of Compounds Targeting Protein-Protein Interactions) to solve these problems.

     Problems:
     • QED is modeled on oral drugs, so the drug-likeness it represents is oral drug-likeness.
     • RO4 is useful for filtering PPI inhibitors, but it is rule-based and not very quantitative.

     [Diagram: PPI-targeting compounds (training dataset) are used for modeling f(x); compounds of interest are then given a QEPPI score.]
  11. Modeling of QEPPI (differences from QED)

      QEPPI was calculated using essentially the same modeling procedure as the original QED, except for a few points. The differences from the original QED are as follows.

      Aim of the index: QED = quantitative estimate of oral drug-likeness; QEPPI = quantitative estimate of PPI-targeting drug-likeness.
      Dataset: QED = oral FDA-approved drugs in the ChEMBL DrugStore database; QEPPI = PPI-targeting compounds in iPPI-DB.
      Number of descriptors: QED = 8; QEPPI = 7 (did not use "ALERTS").
      How the model was built: QED = TableCurve 2D; QEPPI = Levenberg-Marquardt algorithm in SciPy.

      References: Nat Chem; Bioinformatics.
  12. Procedure to model QEPPI

      Create the dataset for the QEPPI model
      • To create a non-redundant dataset for the QEPPI model, we downloaded the SMILES and other data of the compounds registered in iPPI-DB, clustered them by Bemis-Murcko atomic frameworks, and selected from each cluster the one compound with the best activity. (J Med Chem)

      Calculate seven molecular properties
      • The following seven molecular physicochemical properties were calculated with RDKit functions.

      Table 1: RDKit functions used to calculate the molecular properties used in quantitative estimate of protein-protein interaction targeting drug-likeness (QEPPI) and Rule-of-Four (RO4)

      Property  RDKit function
      MW        Chem.rdMolDescriptors.CalcExactMolWt
      ALogP     Chem.Crippen.MolLogP
      HBD       Chem.rdMolDescriptors.CalcNumHBD
      HBA       Chem.rdMolDescriptors.CalcNumHBA
      TPSA      Chem.rdMolDescriptors.CalcTPSA
      ROTB      Chem.rdMolDescriptors.CalcNumRotatableBonds
      AROM      Chem.rdMolDescriptors.CalcNumAromaticRings
      RING      Chem.rdMolDescriptors.CalcNumRings
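The dataset step (one compound per Bemis-Murcko cluster, chosen by best activity) could be sketched as below. This is a hedged illustration, not the authors' code. The record layout, the scaffold strings, and the convention that a larger activity value is better are all assumptions made for the example. In practice the scaffold key would be computed with a cheminformatics toolkit rather than supplied by hand.

```python
from typing import Dict, Iterable, List, NamedTuple


class Compound(NamedTuple):
    smiles: str
    scaffold: str    # ASSUMPTION: precomputed Bemis-Murcko framework key
    activity: float  # ASSUMPTION: larger means more active (pIC50-like)


def select_representatives(compounds: Iterable[Compound]) -> List[Compound]:
    """Keep, for every scaffold cluster, the single most active compound."""
    best: Dict[str, Compound] = {}
    for c in compounds:
        current = best.get(c.scaffold)
        if current is None or c.activity > current.activity:
            best[c.scaffold] = c
    # Sort by scaffold key so the output order is deterministic.
    return sorted(best.values(), key=lambda c: c.scaffold)


# Invented records, for illustration only.
records = [
    Compound("c1ccccc1CC(=O)O", "c1ccccc1", 5.2),
    Compound("c1ccccc1CCN", "c1ccccc1", 6.8),
    Compound("CC1CCNCC1", "C1CCNCC1", 4.1),
]
representatives = select_representatives(records)
for rep in representatives:
    print(rep.scaffold, rep.smiles, rep.activity)
```

Selecting one compound per framework keeps heavily explored chemical series from dominating the property histograms that are fitted in the next step.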
  13. Procedure to model QEPPI (continued)

      Plotting histograms
      • The same procedure as that of the original QED.

      Fitting of the asymmetric double sigmoid (ADS) function
      • The ADS function $Q(x)$ (Eq. 1) was fitted using the Levenberg-Marquardt algorithm implemented in SciPy.

      Normalization of all fitting functions
      • All fitting functions $Q_i(x)$ ($i \in \{\mathrm{MW}, \mathrm{ALogP}, \mathrm{HBD}, \mathrm{HBA}, \mathrm{TPSA}, \mathrm{ROTB}, \mathrm{AROM}\}$) were divided by their maximum value, giving normalized functions $\tilde{Q}_i(x)$ with values from 0 to 1.
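To make the fitting and normalization steps concrete, here is a small sketch of an asymmetric double sigmoid and its max-normalization. The functional form is the one used for QED-style desirability functions, $Q(x) = a + \frac{b}{1+\exp(-(x-c+d/2)/e)}\left[1-\frac{1}{1+\exp(-(x-c-d/2)/f)}\right]$, and it is assumed here to correspond to Eq. 1. The parameter values and the evaluation grid are made up. Real parameters would come from the Levenberg-Marquardt fit, which is not reproduced in this sketch.

```python
import math
from typing import Callable, Sequence


def ads(x: float, a: float, b: float, c: float, d: float, e: float, f: float) -> float:
    """Asymmetric double sigmoid: a rising sigmoid times a falling sigmoid."""
    rise = 1.0 / (1.0 + math.exp(-(x - c + d / 2.0) / e))
    fall = 1.0 - 1.0 / (1.0 + math.exp(-(x - c - d / 2.0) / f))
    return a + b * rise * fall


def normalize(q: Callable[[float], float], grid: Sequence[float]) -> Callable[[float], float]:
    """Divide q by its maximum over `grid` so the result peaks at 1 on the grid."""
    q_max = max(q(x) for x in grid)
    return lambda x: q(x) / q_max


# ASSUMPTION: illustrative parameters, not fitted values from the paper.
params = dict(a=0.0, b=1.0, c=0.0, d=2.0, e=1.0, f=1.0)


def q(x: float) -> float:
    return ads(x, **params)


grid = [i / 10.0 for i in range(-100, 101)]
q_tilde = normalize(q, grid)
print(q_tilde(0.0), q_tilde(3.0))
```

With `e == f` the curve is symmetric around `c`. Choosing different values for `e` and `f` gives the asymmetric rise and fall that lets the function follow skewed property histograms.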
  14. Procedure to model QEPPI (continued)

      Weighted desirability functions
      • The QEPPI score of compound k was assigned as the weighted geometric mean of all desirability functions (Eq. 2).

      Assignment of weights
      • The seven weights were tested over all patterns on a fixed grid of increments, and the average of the weight combinations that resulted in the highest Shannon entropy was adopted. The Shannon entropy of the model was calculated (Eq. 3), where n represents the number of compounds used in the modeling.
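The scoring step of the modeling procedure can be written out explicitly. A weighted geometric mean of desirabilities $d_i$ with weights $w_i$ is $\exp\left(\sum_i w_i \ln d_i / \sum_i w_i\right)$. The entropy below uses the form $-\sum_{k=1}^{n} s_k \log_2 s_k$ over the scores $s_k$ of the n modeling compounds. That form follows the original QED formulation and is assumed, not confirmed by the transcript, to be the one on the slide. The weights are the QEPPI row of the paper's Table 1, and the desirability values are invented.

```python
import math
from typing import Dict, Iterable

# QEPPI weights as listed in Table 1 of the paper.
QEPPI_WEIGHTS: Dict[str, float] = {
    "MW": 0.47, "ALogP": 0.10, "HBD": 0.82, "HBA": 0.81,
    "TPSA": 0.37, "ROTB": 0.53, "AROM": 0.89,
}


def weighted_geometric_mean(desirability: Dict[str, float],
                            weights: Dict[str, float]) -> float:
    """exp(sum(w_i * ln d_i) / sum(w_i)); every d_i must be > 0."""
    total = sum(weights.values())
    log_sum = sum(weights[name] * math.log(desirability[name]) for name in weights)
    return math.exp(log_sum / total)


def shannon_entropy(scores: Iterable[float]) -> float:
    """ASSUMED form: -sum(s * log2(s)) over the scores of all modeling compounds."""
    return -sum(s * math.log2(s) for s in scores if s > 0.0)


# Invented desirabilities: perfect everywhere, then one poor descriptor each.
ideal = {name: 1.0 for name in QEPPI_WEIGHTS}
poor_alogp = dict(ideal, ALogP=0.1)
poor_arom = dict(ideal, AROM=0.1)

print(weighted_geometric_mean(ideal, QEPPI_WEIGHTS))
print(weighted_geometric_mean(poor_alogp, QEPPI_WEIGHTS))
print(weighted_geometric_mean(poor_arom, QEPPI_WEIGHTS))
```

Because ALogP carries a small weight and AROM a large one, the same poor desirability lowers the score much more for AROM than for ALogP.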
  16. Peak and weight of descriptors in QEPPI

      Interestingly, the weight of ALogP, an important descriptor of oral absorption in QED, is mostly ignored in QEPPI. On the other hand, the weights of HBA and TPSA, which are mostly ignored in QED, are given more significance in QEPPI. This suggests that QEPPI captures PPI-targeting drug-like properties better than QED and can play a different role in the seed compound discovery process. (Int. J. Mol. Sci. 2021, 22, 10925, Figure 1)

      Table 1. Distribution peaks and optimized desirability function weightings of each molecular physicochemical property.

                     MW     ALogP  HBD   HBA   TPSA  ROTB  AROM
      Peak, QED *    305.8  2.70   1.20  2.38  57.5  3.04  1.8
      Peak, QEPPI    492.7  4.78   1.61  4.79  76.9  6.37  2.8
      w_i, QED *     0.66   0.46   0.61  0.05  0.06  0.65  0.48
      w_i, QEPPI     0.47   0.10   0.82  0.81  0.37  0.53  0.89

      * QED was modeled as a function that includes ALERTS; the peak value of ALERTS in QED was 24.6, and its weight w_ALERTS was 0.95.

      Figure 1 and Table 1 show that oral drugs and PPI-targeting compounds have very different properties. Table 1 shows that the peak values of all properties were higher for QEPPI than for QED. Particularly, the major difference between QEPPI and QED is the peak value of ALogP (QEPPI: 4.78, QED: 2.70), suggesting that low lipophilicity and high hydrophilicity are important for oral drugs in terms of oral absorption. This suggests that QEPPI can capture PPI-targeting drug-like properties compared to QED and has a different role in the seed compound discovery process, which is the early stage of drug discovery.

      2.2. Evaluation of QEPPI. To evaluate whether QEPPI, which was developed in this study, is a more useful index for early-stage PPI drug discovery compared to QED, we obtained data on 321 PPI-targeting compounds from the iPPI-DB that were not used for model building (iPPI-DB dataset). In addition, we obtained data on 1596 FDA-approved drugs, excluding duplicates and approved drugs targeting PPI (FDA dataset). The QED score was calculated using these data; the distribution of these values is shown in Figure 2a. Similarly, the QEPPI score was calculated, and the distribution of the values is shown in Figure 2b.
  17. Distribution of QED and QEPPI in the PPI-targeting compounds and FDA drugs

      There are few PPI-targeting compounds in the FDA dataset, so the smaller QEPPI scores in the FDA dataset compared to those in the iPPI-DB dataset are consistent with expectations. However, the results show that, for each dataset, QED and QEPPI had almost opposite trends, and thus 1 - QED may be a similar index to QEPPI.

      Figure 2. Distribution of QED and QEPPI in the PPI-targeting compounds dataset and FDA-approved drug dataset. Each filled area extends to represent the entire data range, with optional lines at the median. The QED score was calculated for both datasets (a). The QEPPI score was calculated for both datasets (b). (Int. J. Mol. Sci. 2021, 22, 10925)

      Figure 2a shows that PPI-targeting compounds exhibit a lower distribution of QED scores compared to conventional drugs, suggesting that QED is not an appropriate measure for PPI-targeting compounds, as it typically represents oral drug-like properties rather than drug-likeness. Figure 2b shows that PPI-targeting compounds have a higher distribution of QEPPI scores compared to conventional drugs, and a QEPPI threshold of 0.5 is sufficient to identify approximately 75% of PPI-targeting compounds. Furthermore, PPI-target drugs have been removed from the FDA dataset based on the literature …
  18. Rule-of-Four (RO4)

      Both the ROC-AUC and the PR-AUC of QEPPI are higher than those of QED and QED_inv. In addition, the F-score of QEPPI is higher than that of RO4. These results suggest that QEPPI performs better than the other indexes.

      Interestingly, when each value of RO4 was plotted on the ROC and PR curves of QEPPI, they were very close to each other, suggesting that RO4, a discrete-valued index, could be extended to a continuous-valued index. The results suggest that QEPPI is a general extension of the RO4 concept.

      [Table: ROC-AUC and PR-AUC of QED, QED_inv, and QEPPI; the numeric values are not preserved in this transcript.]

      Table 3. Precision, Recall, and F-score values for one violation of RO4 and QEPPI scores with a threshold value of 0.5196. (Int. J. Mol. Sci. 2021, 22, 10925)

                Precision  Recall  F-Score
      RO4       0.405      0.508   0.451
      QEPPI     0.379      0.735   0.501

      Finally, to compare the classification performance of two different metrics, namely, RO4 (rule-based) and QEPPI (threshold-based), we compared the value of Recall between …
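As a quick arithmetic check of Table 3, the F-score is the harmonic mean of precision and recall, $F = 2PR/(P+R)$. The sketch below recomputes it from the rounded table entries, so the last digit can differ slightly from the published value, which was presumably computed from unrounded precision and recall.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall as printed in Table 3.
table3 = {"RO4": (0.405, 0.508), "QEPPI": (0.379, 0.735)}
recomputed = {name: f_score(p, r) for name, (p, r) in table3.items()}

for name, value in recomputed.items():
    print(f"{name}: F = {value:.3f}")
```

QEPPI trades a slightly lower precision for a much higher recall, and the harmonic mean rewards that balance, which is why its F-score exceeds that of RO4.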
  20. Application of QEPPI to compounds in clinical trials

      The distribution of QEPPI was found to be significantly different between approved PPI-targeted drugs and PPI-targeted compounds in the clinical phase. (RSC Med Chem; J Chem Inf Model)

      … properties more similar to those of standard drugs than those of PPI modulators currently on the market. We also applied QEPPI to the above datasets. The distribution of the QEPPI is shown in Figure 5. Our application of QEPPI to the 30 clinical candidates used by Truong et al. showed a median value of approximately 0.59, which is higher than that of commercially available PPI modulators, Figure 5. Although the physicochemical properties of the PPI-targeting compounds registered in iPPI-DB and FDA-approved drugs are different, as shown in Figure 1 and Table 1, the QEPPI modeled from iPPI-DB shows potential to be adapted to more recent PPI modulators.

      Figure 5. Distribution of QEPPI with respect to compounds in the clinical phase or approved PPI-targeting compounds dataset. The Truong clinical and Truong approved datasets represent clinical and FDA-approved PPI-targeting compound data, respectively. The iPPI-DB and Soga datasets represent positive and negative controls, respectively. The jitter overlaid on the boxplots shows the QEPPI scores for all samples in each data set.

      In addition, we also looked at when the PPI-targeted compounds included in the Truong approved data were marketed and when the PPI-targeted compounds included in the Truong clinical data were used in clinical trials. Figure 6 shows the QEPPI of PPI-targeting marketed drugs and compounds in clinical trials within the last 30 years (in detail Supplementary Table S4). Figure 6a shows the PPI-targeting drugs on the market, year the drug was first marketed (as identified in DrugBank), QEPPI value, and target PPI for each drug. PPI-targeting drugs launched in the 1990s showed lower QEPPI scores, …

      [Figure (Int J Mol Sci): QEPPI distributions for the clinical trial, approved, and non-PPI (control) datasets. Marketed examples: paclitaxel (microtubule), romidepsin, and temsirolimus (mTOR), each annotated with its QEPPI score and year of market start. Clinical trial examples: vercirnon, idasanutlin, and apabetalone (BET), each annotated with its QEPPI score and clinical phase. The numeric values are not preserved in this transcript.]
  21. Application of QEPPI to compounds in clinical trials (continued)

      Regardless of the year, the QEPPI scores of "Clinical trial" compounds have remained high. This is consistent with the recent trend of higher QEPPI scores for marketed drugs. (RSC Med Chem; J Chem Inf Model)

      [Figure (Int J Mol Sci), "Marketed" panel: QEPPI score versus year of marketing start, grouped by target PPI, for avatrombopag, cabazitaxel, dimethyl fumarate, docetaxel, eltrombopag, eribulin mesylate, everolimus, levetiracetam, lifitegrast, lusutrombopag, maraviroc, paclitaxel, pimecrolimus, plerixafor, romidepsin, selinexor, rapamycin, tacrolimus, tafamidis, temsirolimus, tirofiban, venetoclax, vinblastine, and vorinostat. "Clinical trial" panel: QEPPI score versus year of clinical trial, grouped by target PPI, for compounds including apabetalone, birinapant, carotegrast methyl, cenicriviroc, epothilone B, idasanutlin, milademetan, navarixin, navitoclax, obatoclax, pevonedistat, reparixin, serdemetan, siremadlin, vercirnon, and xevinapant, plus several code-named candidates. Axis tick values are not preserved in this transcript.]
  22. Distribution of QEPPI scores for PPI families

      QEPPI scores of compounds targeting primary epitopes, such as linear peptides, tend to be higher than those of compounds targeting secondary epitopes, such as the helix structure. The difference in the complexity of the PPI interface may affect the physicochemical properties of PPI-targeting compounds, such as molecular weight; furthermore, the complexity of the PPI interface is related to the QEPPI score. (Chem Biol; Biochemistry; J Biol Chem)

      Figure 7. Distribution of QEPPI scores for 9 PPI families with more than 50 compounds targeting each PPI family in iPPI-DB. The jitter overlaid on the boxplots shows the QEPPI scores for all samples in each dataset. Statistics related to this figure are shown in Supplementary Table S5.

      [Figure annotations: PPI families including Bromodomain/Histone, XIAP/Smac, LEDGF/IN, and TTR tetramer, labeled as primary epitopes, secondary epitopes, or smaller interface area.]
  23. Towards the application of molecular generation by RL, GAN, and VAE

      As a future prospect, QEPPI can be used as a reward in sequence-based molecular generation models using reinforcement learning, and as a condition for sequence-based molecular generation models using conditional GAN or VAE, which will enable molecular design with strongly PPI-targeting compound properties.

      [Diagram: example of combining QEPPI with a conditional VAE (encoder, latent space, decoder), using molecular physicochemical properties (MW, ALogP, TPSA, …) and the QEPPI score as conditions.]

      [Diagram: example of combining QEPPI with reinforcement learning and a GAN (generator sampling z ~ p(z), discriminator), where generated molecular graphs are evaluated with the QEPPI score.]
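One way the reward idea could look in code is sketched below. Everything here is hypothetical. `score_fn` stands in for whatever QEPPI calculator is available and is expected to return `None` for strings it cannot score, and `toy_score` is a lookup table with invented numbers. None of these names is an API of the QEPPI repository.

```python
from typing import Callable, Optional


def make_reward(score_fn: Callable[[str], Optional[float]],
                invalid_reward: float = 0.0) -> Callable[[str], float]:
    """Wrap a scorer into a reward function for a molecule generator.

    Unscorable outputs get `invalid_reward`; valid scores are clipped to
    [0, 1] so the reward stays bounded.
    """
    def reward(smiles: str) -> float:
        score = score_fn(smiles)
        if score is None:
            return invalid_reward
        return min(1.0, max(0.0, score))
    return reward


def toy_score(smiles: str) -> Optional[float]:
    """Stand-in scorer with invented values (NOT real QEPPI scores)."""
    table = {"CCO": 0.12, "c1ccccc1": 0.30}
    return table.get(smiles)


reward = make_reward(toy_score)
print(reward("CCO"), reward("not-a-molecule"))
```

Keeping the scorer behind a plain callable means the same wrapper would work whether the score is used as an episode-level reward in reinforcement learning or logged as a conditioning value for a conditional generator.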
  24. Limitations and Challenges

      • In this study, the structural alerts (ALERTS) were not used in the model-building process.
      • It is possible to filter out compounds with "unwanted group" structures, which were also used in QED, and other artifacts called PAINS (pan-assay interference compounds).
      • We focused on the physicochemical properties and did not consider the stereo coordination of the compounds targeting PPIs.
      • The investigation of methodologies that take stereo coordination into account is one of the important issues to be addressed in the future.

      References: Nature; Nat Chem.
  26. Conclusion

      • In this study, we developed a quantitative index called QEPPI (Quantitative Estimate Index for Early-Stage Screening of Compounds Targeting Protein-Protein Interactions).
      • QEPPI performs better than other conventional indexes such as RO4 and QED.
      • QEPPI was also considered to be an extension of the concept of RO4.
      • QEPPI was applied to PPI-targeting compounds in clinical trials and to approved PPI-targeting compounds; the results indicate that QEPPI has the potential to be more suitable for more recent PPI-targeting compounds.
      • The QEPPI score is related to the complexity of the PPI interface.
      • The use of QEPPI as a reward or condition in sequence-based molecular generation models will enable the design of molecules with the properties of PPI-targeting compounds.

      Kosugi T, Ohue M. Quantitative Estimate Index for Early-Stage Screening of Compounds Targeting Protein-Protein Interactions. Int J Mol Sci. 2021; 22(20):10925. https://doi.org/10.3390/ijms222010925