Slide 1

Slide 1 text

The Frontier of AI Drug Discovery, Centered on Generative Models
Elix, Inc., CEO Shinya Yuki
2019/10/22, CBI Society Annual Meeting

Slide 2

Slide 2 text

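Contents
• Introduction
• Core technologies
• Fingerprint- and SMILES-based models
• Graph-based models
• How generative models are used
• Performance evaluation of generative models
• Directions for future development
• Elix Chem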

Slide 3

Slide 3 text

Introduction

Slide 4

Slide 4 text

Restricted © Elix, Inc.
Molecular design
Sanchez-Lengeling et al. (2018)
Figure labels: experiment / simulation, predictive models, generative models
• There are roughly 10^60 drug-like molecules

Slide 5

Slide 5 text

Commonly used representations
Fingerprint, SMILES, Graph
Meter & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)

Slide 6

Slide 6 text

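Especially common representations
• Fingerprint
  • Many variants exist; ECFP is among the best known
  • Each bit corresponds to a specific substructure
  • Collisions can occur
  • Not invertible
• SMILES
  • Represents a compound as a character string
  • The string for a given compound is not unique
  • Compounds that differ only slightly can have very different SMILES strings (SMILES was not designed to express compound similarity)
• Graph
  • Represents a compound as nodes and edges
  • Appears to be a natural representation
https://arxiv.org/abs/1802.04364
https://arxiv.org/abs/1903.04388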
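The non-uniqueness of SMILES and the graph view can be illustrated with a minimal sketch. It assumes hand-written toy graphs (heavy atoms only, no bond orders) and a brute-force isomorphism check that is only practical for very small molecules; the function name same_molecule is hypothetical and not part of any library.

```python
from itertools import permutations

# Ethanol can be written as "CCO" or "OCC". Each string is hand-converted here
# to a graph: a list of atom labels plus a set of bonded index pairs.
ethanol_a = (["C", "C", "O"], {(0, 1), (1, 2)})        # from "CCO"
ethanol_b = (["O", "C", "C"], {(0, 1), (1, 2)})        # from "OCC"
methoxymethane = (["C", "O", "C"], {(0, 1), (1, 2)})   # "COC", a different molecule


def same_molecule(g1, g2):
    """Brute-force labeled-graph isomorphism (toy graphs only)."""
    atoms1, bonds1 = g1
    atoms2, bonds2 = g2
    if sorted(atoms1) != sorted(atoms2) or len(bonds1) != len(bonds2):
        return False
    target = {frozenset(b) for b in bonds2}
    for perm in permutations(range(len(atoms1))):
        # perm[i] is the atom of g2 that atom i of g1 is mapped onto
        if all(atoms1[i] == atoms2[perm[i]] for i in range(len(atoms1))):
            mapped = {frozenset((perm[i], perm[j])) for i, j in bonds1}
            if mapped == target:
                return True
    return False


print(same_molecule(ethanol_a, ethanol_b))        # True: two strings, one graph
print(same_molecule(ethanol_a, methoxymethane))   # False: same atoms, different bonds
```

The two ethanol strings differ as text but map to the same graph, which is the sense in which the graph representation is described as natural.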

Slide 7

Slide 7 text

Various predictive models
Wu et al. (2017)
• Graph-based models often give better results

Slide 8

Slide 8 text

Architectures underlying generative models
Sanchez-Lengeling & Aspuru-Guzik (2018)

Slide 9

Slide 9 text

Various combinations
Schwalbe-Koda & Gómez-Bombarelli (2019)

Slide 10

Slide 10 text

List of recent generative models
Elton et al. (2019)

Slide 11

Slide 11 text

List of commonly used public datasets
https://arxiv.org/abs/1903.04388
Elton et al. (2019)

Slide 12

Slide 12 text

Core technologies

Slide 13

Slide 13 text

Generative Adversarial Networks (GANs)
Karras et al. (2018)

Slide 14

Slide 14 text

Generative Adversarial Networks (GANs)
• A type of generative model
• Generator (G): generates fake images and tries to fool D
• Discriminator (D): tries to tell real images from fake ones
Figure: Noise → G → fake image (generated image); real image (training set); D decides "real or fake?"
Karras et al. (2017)
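The tug-of-war between G and D is often written as a minimax objective. The formula below is the original GAN formulation rather than anything stated on the slide, and the models cited in this deck may use variants of it; p_data (the training distribution) and p_z (the noise distribution) are notation introduced here.

```latex
\min_{G}\,\max_{D}\;
\mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
+ \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

D is rewarded for scoring real samples high and generated samples low, while G is rewarded when D scores its samples as real.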

Slide 15

Slide 15 text

Generative Adversarial Networks (GANs)

Slide 16

Slide 16 text

Autoencoders

Slide 17

Slide 17 text

Autoencoders

Slide 18

Slide 18 text

Variational Autoencoders (VAEs)
Figure labels: reconstruction; deviation from a normal distribution
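The two labels on this slide correspond to the two terms of the usual VAE training loss. The expression below is the standard textbook form, given as a sketch; the symbols (encoder q_φ, decoder p_θ, standard normal prior) are assumptions of this note and are not taken from the slide.

```latex
\mathcal{L}(\theta,\phi;x) =
\underbrace{-\,\mathbb{E}_{q_\phi(z\mid x)}\bigl[\log p_\theta(x\mid z)\bigr]}_{\text{reconstruction}}
+ \underbrace{D_{\mathrm{KL}}\bigl(q_\phi(z\mid x)\,\|\,\mathcal{N}(0,I)\bigr)}_{\text{deviation from a normal distribution}}
```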

Slide 19

Slide 19 text

Recurrent Neural Networks (RNNs)
Segler et al. (2017)

Slide 20

Slide 20 text

Graph Representations
Peter et al. (2018)
https://www.businessinsider.com/explainer-what-exactly-is-the-social-graph-2012-3

Slide 21

Slide 21 text

Graph Neural Networks
Peter et al. (2018)

Slide 22

Slide 22 text

Graph Neural Networks
Peter et al. (2018)

Slide 23

Slide 23 text

Graph Convolutional Networks
Figure labels: 2D Convolution, Graph Convolution
Wu et al. (2019)

Slide 24

Slide 24 text

Reinforcement Learning (RL)
Sutton & Barto (2018); Mnih et al. (2015)

Slide 25

Slide 25 text

Reinforcement Learning (RL)
Sutton & Barto (2018); Mnih et al. (2015)
e.g. QED, logP

Slide 26

Slide 26 text

Transfer Learning
• A very large unlabeled dataset
• A small amount of labeled (supervised) data
• Values such as logP are computed with RDKit and used to pre-train
Goh et al. (2017)

Slide 27

Slide 27 text

Fingerprint- and SMILES-based models

Slide 28

Slide 28 text

Molecule representation
Fingerprint, SMILES, Graph
Meter & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)

Slide 29

Slide 29 text

Kadurin et al.
The cornucopia of meaningful leads: Applying deep AAEs for new molecule development in oncology
• Inputs and outputs
  • Binary fingerprints (MACCS)
  • Log concentration (LCONC)
• Middle layer
  • Consists of 5 neurons
  • One of them is the growth inhibition percentage (GI)
  • The remaining four are trained to approach a normal distribution

Slide 30

Slide 30 text

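Kadurin et al.
Experimental workflow
1. Prepare the dataset and train
  • NCI-60, MCF-7
  • 6252 compounds
  • Each record consists of a fingerprint, LCONC, and GI
2. Sample from the model
  • Sample 640 vectors (virtual compounds)
3. Extract
  • Keep those with LCONC < -5.0 M
  • This yields 32 vectors
4. Search for compounds with similar features
  • Look up compounds with similar features in PubChem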
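Steps 3 and 4 of this workflow can be sketched as a filter followed by a nearest-neighbour lookup. This is an illustration under assumptions, not the authors' code: the slide does not say which similarity measure was used, so Tanimoto similarity on binarized fingerprints is assumed here, and the names screen, samples, and library are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def screen(samples, library, lconc_max=-5.0, top_k=1):
    """Keep sampled vectors with LCONC below lconc_max, then look up the
    most similar library compounds for each surviving vector."""
    hits = {}
    for name, (bits, lconc) in samples.items():
        if lconc >= lconc_max:
            continue  # fails the concentration filter
        ranked = sorted(library,
                        key=lambda cid: tanimoto(bits, library[cid]),
                        reverse=True)
        hits[name] = ranked[:top_k]
    return hits


# Toy data: (on-bits, LCONC) for sampled vectors, and on-bits for library compounds.
samples = {"v1": ({1, 4, 7}, -6.2), "v2": ({2, 3}, -4.1)}
library = {"cmpd_A": {1, 4, 7, 9}, "cmpd_B": {2, 3}, "cmpd_C": {5}}
print(screen(samples, library))  # v2 is dropped by the LCONC filter
```

A real search over tens of millions of PubChem entries would need an indexed similarity search rather than this linear scan.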

Slide 31

Slide 31 text

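Kadurin et al.
Experimental results
• PubChem: 72 million compounds
• Compounds with features similar to the 32 generated vectors were extracted from PubChem
• 69 compounds were obtained in the end
  • Several are already known as anticancer drugs
  • 13 are covered by patents
  • Most are anthracyclines (described as the most effective anticancer drugs currently available)
Figure legend: green: PubChem; blue: training data; red: generated vectors (virtual compounds)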

Slide 32

Slide 32 text

Molecule representation
Fingerprint, SMILES, Graph
Meter & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)

Slide 33

Slide 33 text

Segler et al.
Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks
• Generates compounds with an LSTM
• Inputs and outputs are SMILES
• Repeats the following steps (also called Hillclimb-MLE)
  1. Train the LSTM and sample from it
  2. Filter the samples with a target filtering model (this need not be a machine-learning model)
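The train, sample, filter loop can be sketched as follows. This is a minimal illustration, not the authors' implementation: the LSTM is replaced by a toy unigram character sampler, the target filtering model by a toy score, and every name (hillclimb_mle, toy_train, toy_sample, toy_score) is hypothetical.

```python
import random


def hillclimb_mle(train, sample, score, seed_data, rounds=3, n_samples=50, keep=10):
    """Alternate between fitting the generator and keeping its best samples."""
    data = list(seed_data)
    for _ in range(rounds):
        model = train(data)                       # step 1: fit by maximum likelihood
        candidates = sample(model, n_samples)     # step 1: sample from the model
        unique = sorted(set(candidates))          # sorted first for determinism
        data = sorted(unique, key=score, reverse=True)[:keep]  # step 2: filter
    return data


def toy_train(data):
    # Stand-in "model": the pool of characters seen in the data.
    return [ch for s in data for ch in s]


def toy_sample(model, n, length=6, rng=random.Random(0)):
    # Draw n strings of fixed length from the character pool.
    return ["".join(rng.choice(model) for _ in range(length)) for _ in range(n)]


def toy_score(s):
    # Pretend the target property is the number of oxygen characters.
    return s.count("O")


result = hillclimb_mle(toy_train, toy_sample, toy_score, ["CCO", "CCN", "CCC"])
print(result)
```

Because each round retrains only on the filtered samples, the sampled distribution drifts toward high-scoring strings; the toy strings here are not valid SMILES.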

Slide 34

Slide 34 text

Gómez-Bombarelli et al., CVAE
Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules
• Generates compounds with an RNN + VAE
• Inputs and outputs are SMILES
• Searches for latent codes at which the target property is large

Slide 35

Slide 35 text

Kusner et al., GVAE
Grammar Variational Autoencoder
Figure labels: Encoder, Decoder
• Generation takes a grammar (context-free grammar) into account

Slide 36

Slide 36 text

Yang et al., ChemTS
• Generates SMILES with MCTS and an RNN
• Optimizes penalized logP
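For reference, penalized logP is commonly defined in this literature roughly as below. This is stated from general knowledge rather than from the slide, and the exact normalization and ring-penalty definition vary between papers, so treat it as a sketch.

```latex
J(m) = \log P(m) - \mathrm{SA}(m) - \mathrm{RingPenalty}(m)
```

Here logP is the octanol-water partition coefficient, SA is the synthetic accessibility score, and the last term penalizes large rings (typically rings with more than six atoms).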

Slide 37

Slide 37 text

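Popova et al., ReLeaSE
https://arxiv.org/abs/1711.10907
Popova et al. (2017)
• SMILES-based generative model
• Combined with reinforcement learning to optimize target properties
• Rewards are usually computed with RDKit or similar tools, but here a SMILES-based predictive model computes the reward
• This makes it possible to optimize properties that cannot be computed with RDKit and similar tools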

Slide 38

Slide 38 text

Guimaraes et al., ORGAN
• Built on SeqGAN, an RNN-based GAN for sequential data
• Introduces reinforcement learning to optimize properties such as druglikeness

Slide 39

Slide 39 text

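All SMILES VAE
• Graph-based models
  • Most have around 3 to 7 layers
  • Each layer propagates information over a distance of one
• Molecules in ZINC250k
  • Average diameter is 11.1
  • Maximum diameter is 24
• Information therefore cannot be carried across the whole molecule
• An RNN can follow long-range information
• The SMILES string for a molecule is not unique
  • Multiple SMILES strings are used as input
Alperstein et al. (2019)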
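The argument about layer count versus molecular diameter can be checked on a toy graph. The sketch below assumes a molecule given as an adjacency list and treats one layer as one hop of information flow; the 12-atom chain and the function names are illustrative assumptions, not data from the paper.

```python
from collections import deque


def eccentricity(adj, start):
    """Largest shortest-path distance (in bonds) from start, via BFS."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return max(dist.values())


def diameter(adj):
    """Longest shortest path between any two atoms of a connected graph."""
    return max(eccentricity(adj, n) for n in adj)


def reach_after_layers(adj, start, layers):
    """Atoms whose information can have reached `start` after `layers` rounds."""
    seen = {start}
    for _ in range(layers):
        seen |= {nb for n in seen for nb in adj[n]}
    return seen


# Toy molecule: a linear chain of 12 heavy atoms, 0-1-2-...-11.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 12] for i in range(12)}
print(diameter(chain))                        # 11
print(len(reach_after_layers(chain, 0, 7)))   # 8 of the 12 atoms
```

With 7 layers, an end atom of this chain still sees only 8 of the 12 atoms, which mirrors the mismatch between 3 to 7 layers and an average diameter of about 11.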

Slide 40

Slide 40 text

Graph-based models (one-shot type)

Slide 41

Slide 41 text

Molecule representation
Fingerprint, SMILES, Graph
Meter & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)

Slide 42

Slide 42 text

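De Cao & Kipf, MolGAN
A model that generates the graph in one shot; it also uses a GAN and reinforcement learning.
• Using graph convolutions in the discriminator makes it order invariant
• Optimizing each property appears to work well
• However, uniqueness is very low, around 2% (in the goal-directed setting)
  • Neither the GAN nor RL imposes a constraint that keeps outputs diverse
• Generating the graph in one shot keeps computation time short
• Experiments use QM9; applying it to larger graphs looks difficult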

Slide 43

Slide 43 text

Pölsterl & Wachinger, LF-MolGAN
• Generates the graph in one shot, like MolGAN, and adds constraints on valency
• Does not compute a reconstruction loss explicitly, which avoids the graph isomorphism problem
• Unlike a plain GAN, the architecture includes an encoder, so it is easy to search the latent space for highly similar molecules
• Experiments use QM9

Slide 44

Slide 44 text

Graph-based models (recurrent type)

Slide 45

Slide 45 text

Li et al.
Learning Deep Generative Models of Graphs
• Adds nodes and edges one after another as a graph, rather than writing SMILES
• Better results than GrammarVAE and similar models

Slide 46

Slide 46 text

Jin et al., JT-VAE
Junction Tree Variational Autoencoder for Molecular Graph Generation
• The simplest approach would be to add nodes one at a time
• However, that can generate compounds that do not actually exist
• So the molecule is generated cluster by cluster instead

Slide 47

Slide 47 text

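Jin et al., JT-VAE
Figure annotations:
• The molecule is decomposed into a tree structure using predefined clusters
• Encoding by neural message passing
• Encoding by GRU
• A new tree structure is built from the embedding (adding nodes one at a time)
• The final compound is generated using both the resulting graph embedding and the tree structure (this step is needed because there is freedom in how the clusters are combined)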

Slide 48

Slide 48 text

You et al., GCPN
Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
• Generates a graph by adding edges one at a time
• Combines a GAN with reinforcement learning

Slide 49

Slide 49 text

Li et al., MolMP / MolRNN
• A model that is conditioned on target properties and similar inputs
• Conditional codes such as QED and SAscore are fed in

Slide 50

Slide 50 text

Comparison of GANs and VAEs
GAN
• Advantages
  • Good results when tuned well
  • No need to compute a reconstruction loss (avoids the graph isomorphism problem)
• Disadvantages
  • Hyperparameter tuning is difficult
  • Mode collapse (the model keeps generating the same outputs)
VAE
• Advantages
  • Trains more stably than a GAN
  • Hyperparameter tuning is easier
  • Mode collapse is less likely
• Disadvantages
  • Computing the reconstruction loss brings in the graph isomorphism problem

Slide 51

Slide 51 text

Comparison of Fingerprint, SMILES, and Graph
• Fingerprint-based
  • Hard to use because fingerprints are not invertible (and therefore rarely seen)
• SMILES-based
  • Stable performance
  • Validity tends to be low
  • Fragment-based generation is difficult
• Graph-based (one-shot type)
  • Fast
  • So far only small molecules with 9 or fewer heavy atoms have been generated
  • Low validity and uniqueness
• Graph-based (recurrent type)
  • High validity
  • Node and edge ordering is an issue

Slide 52

Slide 52 text

How generative models are used

Slide 53

Slide 53 text

Molecule generation
• Distribution Learning
• Predefined Scaffold
• Molecule Optimization

Slide 54

Slide 54 text

Distribution Learning
https://github.com/NVlabs/ffhq-dataset
Karras et al. (2018)
Figure labels: training data; generated data

Slide 55

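Slide 55 text

Arús-Pous et al.
Exploring the GDB-13 chemical space using deep generative models
• GDB-13: a dataset of 975 million molecules built from up to 13 heavy atoms
• Training used 1 million molecules, about 0.1% of the dataset
• A simple model that feeds SMILES to a GRU
• Sampling 2 billion molecules recovered 68.9% of GDB-13
• The model also captured the characteristics of the compounds in GDB-13
• Some types of molecules turned out to be hard to generate because of SMILES notation (for example, molecules with many rings)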

Slide 56

Slide 56 text

Molecular optimization
Choi et al. (2017)

Slide 57

Slide 57 text

Molecular optimization
• Search within the latent space
  • Gradient ascent
  • Bayesian optimization
• Reinforcement learning
• Hillclimb-MLE (train by repeated filtering)
• Conditioning code (the condition is also treated as an input)
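Gradient ascent in a latent space can be sketched in a few lines. The example assumes a differentiable property predictor defined on latent vectors; here it is replaced by a hypothetical toy function with a known optimum, and the gradient is estimated numerically so that the block needs no external library.

```python
def numerical_gradient(f, z, eps=1e-5):
    """Central-difference estimate of the gradient of f at latent point z."""
    grad = []
    for i in range(len(z)):
        hi = z[:i] + [z[i] + eps] + z[i + 1:]
        lo = z[:i] + [z[i] - eps] + z[i + 1:]
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


def gradient_ascent(f, z0, lr=0.1, steps=200):
    """Move the latent code uphill on the predicted property."""
    z = list(z0)
    for _ in range(steps):
        g = numerical_gradient(f, z)
        z = [zi + lr * gi for zi, gi in zip(z, g)]
    return z


def toy_property(z):
    # Hypothetical predictor with a single optimum at z = (1.0, -2.0).
    return -((z[0] - 1.0) ** 2 + (z[1] + 2.0) ** 2)


z_best = gradient_ascent(toy_property, [0.0, 0.0])
print([round(v, 4) for v in z_best])
```

In a real pipeline the optimized latent code would then be passed through the decoder to obtain a molecule, and the decoded result would still need a validity check.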

Slide 58

Slide 58 text

Molecular optimization (starting from a specific substructure)
• Optimizes penalized logP

Slide 59

Slide 59 text

Other uses (generating molecules highly similar to a given molecule)
Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks
1. Pre-train an LSTM on data such as ChEMBL, DrugBank, and FDB17
2. Fine-tune on a single molecule (experiments used 10 different molecules)
3. Generate SMILES
  • Retain correct SMILES
  • Remove duplicates
  • Remove undesirable functional groups
4. Select the molecules with high similarity
Awale et al. (2018)
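Steps 3 and 4 amount to a small post-processing pipeline. The sketch below is an illustration under assumptions, not the authors' code: validity checking, the undesirable-group test, and the similarity measure are passed in as callables, and the toy stand-ins (balanced parentheses, a chlorine check, character-bigram overlap) are hypothetical placeholders for a real cheminformatics toolkit.

```python
def postprocess(generated, query, is_valid, has_undesirable_group, similarity, top_k=5):
    """Filter generated SMILES and rank the survivors by similarity to query."""
    valid = [s for s in generated if is_valid(s)]           # retain correct SMILES
    unique = list(dict.fromkeys(valid))                     # remove duplicates, keep order
    clean = [s for s in unique if not has_undesirable_group(s)]
    return sorted(clean, key=lambda s: similarity(s, query), reverse=True)[:top_k]


def toy_is_valid(s):
    # Placeholder check: non-empty string with balanced parentheses.
    depth = 0
    for ch in s:
        depth += (ch == "(") - (ch == ")")
        if depth < 0:
            return False
    return bool(s) and depth == 0


def toy_similarity(a, b):
    # Placeholder similarity: Jaccard overlap of character bigrams.
    def grams(s):
        return {s[i:i + 2] for i in range(len(s) - 1)}
    ga, gb = grams(a), grams(b)
    return len(ga & gb) / len(ga | gb) if ga | gb else 0.0


generated = ["CCO", "CCO", "CC(O", "CCCl", "CCN"]
ranked = postprocess(generated, "CCO", toy_is_valid,
                     lambda s: "Cl" in s, toy_similarity)
print(ranked)  # ['CCO', 'CCN']
```

Deduplicating raw strings misses different SMILES spellings of the same molecule, so a real pipeline would canonicalize before this step.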

Slide 60

Slide 60 text

Other uses (generating molecules highly similar to a given molecule)
Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks
Awale et al. (2018)

Slide 61

Slide 61 text

Performance evaluation of generative models

Slide 62

Slide 62 text

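Why generative models are hard to evaluate
Karras et al. (2018)
• It is possible to tell qualitatively that outputs look good, but quantitative evaluation is difficult
• For compounds, even qualitative evaluation is harder than for face images and similar data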

Slide 63

Slide 63 text

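Benchmarks for generative models
• Papers use different datasets (ChEMBL, ZINC, QM9, and others) and different metrics, which makes comparison difficult
• The range of metrics used for comparison is also insufficient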

Slide 64

Slide 64 text

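Brown et al., GuacaMol: distribution-learning benchmark
• Purpose of the distribution-learning benchmark
  • Evaluates whether the model reproduces the distribution of the training data, reflecting its trends
  • A model that handles this task well should have captured the characteristics of compounds, which is expected to help with goal-directed tasks as well
• Validity
  • The fraction of generated compounds that are valid
  • Validity is checked with RDKit
• Uniqueness
  • Checks for duplicates; the fraction of unique compounds
• Novelty
  • The fraction of compounds that do not appear in the training data
• Frechet ChemNet Distance (FCD)
  • Uses features from ChemNet, trained on bioactivity prediction, to measure how close the generated distribution is to the training data
  • For images, the Frechet Inception Distance (FID) is used to compare generative models; FCD is its counterpart for compounds
• KL Divergence
  • A measure of the difference between two probability distributions
  • Emphasizes physicochemical features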
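The first three metrics are simple ratios and can be sketched directly. This is an illustration, not the GuacaMol implementation: the choice of denominators (uniqueness over valid molecules, novelty over unique valid molecules) is an assumption of this sketch, strings are compared as given rather than as canonical SMILES, and the validity check is a placeholder for the RDKit check mentioned on the slide.

```python
def distribution_metrics(generated, training, is_valid):
    """Validity, uniqueness, and novelty as fractions in [0, 1]."""
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def toy_is_valid(s):
    # Placeholder for a real validity check: parentheses must balance in count.
    return s.count("(") == s.count(")")


generated = ["CCO", "CCO", "CCN", "C(C", "CCC"]
training = ["CCO"]
metrics = distribution_metrics(generated, training, toy_is_valid)
print(metrics)
```

On this toy input, 4 of 5 strings pass the check, 3 of those 4 are distinct, and 2 of the 3 distinct strings are absent from the training set.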

Slide 65

Slide 65 text

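Goal-directed benchmark (molecular optimization)
• Purpose of the goal-directed benchmark
  • Evaluates models in a setting where a specific score is maximized
• Similarity
  • How close the model can get to a target that was removed from the training data
• Rediscovery
  • Like the above, but asks whether the exact same molecule can be generated rather than a similar one
  • Requires an exact match
• Isomers
  • How many isomers can be generated for a formula such as C7H8N2O2
  • Not directly related to drug discovery, but it evaluates the flexibility of the model
• Median molecules
  • Maximizes similarity to several molecules at the same time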

Slide 66

Slide 66 text

Measuring Compound Quality
• Purpose of measuring compound quality
  • Compounds generated by de novo design algorithms in earlier work may be unstable, highly reactive, hard to synthesize, or look wrong to a medicinal chemist
  • It is therefore necessary to check whether generated compounds are reasonable
  • Turning everything a medicinal chemist knows into rules is difficult
• rd_filters is applied here
  • https://github.com/PatWalters/rd_filters

Slide 67

Slide 67 text

Results: distribution-learning benchmark
• Random sampler: it only draws from ChEMBL, so there are no broken compounds and validity is 100%, but novelty is zero
• SMILES LSTM: good overall
• Graph MCTS: fairly good, but poor on KL and FCD
• AAE: good except for FCD
• ORGAN: poor overall
• VAE: good overall

Slide 68

Slide 68 text

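Results: goal-directed benchmark
• Best of Data Set
  • The result of picking the highest-scoring compounds from the training data; the minimum baseline a model must beat
• Graph GA
  • Best results
• SMILES LSTM
  • Good results, nearly on par with Graph GA
• Other models
  • Clearly worse than Graph GA and SMILES LSTM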

Slide 69

Slide 69 text

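Results: compound quality measurement
• Compounds generated in the goal-directed tasks were quality-checked with rd_filters
• SMILES LSTM gave clearly better results
  • SMILES LSTM is pre-trained first and then maximizes each score; the pre-training phase likely taught it the features that matter for compounds
• Graph GA, in contrast, did not do very well
  • The likely problem is that it tries to maximize the score directly, without any prior knowledge
• Since SMILES LSTM and Graph GA performed comparably on the goal-directed benchmark, SMILES LSTM is the better choice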

Slide 70

Slide 70 text

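Pölsterl & Wachinger, LF-MolGAN
• Validity, uniqueness, and novelty are widely used but are not very good metrics
• A model that picks nodes and edges at random (while respecting valency) ends up looking good
• They do not consider whether the generated molecules resemble the training data and are chemically meaningful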

Slide 71

Slide 71 text

Directions for future development

Slide 72

Slide 72 text

Multi-objective optimization: Guimaraes et al., ORGAN
• Optimizes three properties by training alternately on druglikeness, synthesizability, and solubility
• Optimizing all three gives results close to optimizing each one alone

Slide 73

Slide 73 text

Multi-objective optimization: Zhou et al., MolDQN
• A generative model that performs optimization with a DQN
• Reports experiments that optimize similarity and QED (drug-likeness) simultaneously

Slide 74

Slide 74 text

No dataset (pure RL): MolDQN, Zhou et al.
• Reinforcement learning makes training possible even without a dataset
• Skipping pre-training allows broader exploration

Slide 75

Slide 75 text

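Accounting for synthesis routes: Bradshaw et al., Molecule Chef
Figure labels: Encoder, Decoder
• A model that also considers the synthesis route; it outputs both reactants and products
• A graph neural network produces embeddings of the reactants
• Reactants are output one after another and are chosen from known compounds
• A reaction predictor then turns them into products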

Slide 76

Slide 76 text

Elix, Inc.
https://elix-inc.com