The Frontier of AI Drug Discovery, with a Focus on Generative Models / Elix CBI 2019
Elix
October 22, 2019
A summary of the various generative models used in AI drug discovery. These are the slides from a talk given at the CBI Society Annual Meeting 2019.
Transcript
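The Frontier of AI Drug Discovery, with a Focus on Generative Models 1
Elix, Inc., CEO Shinya Yuki
2019/10/22
CBI Society Annual Meeting

Contents 2
• Introduction
• Core technologies
• Fingerprint- and SMILES-based models
• Graph-based models
• How generative models are used
• Performance evaluation of generative models
• Directions for future development
• Elix Chem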
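
Introduction 3

Molecular design 4
Restricted, Elix Inc.
Sanchez-Lengeling et al. (2018)
Experiment / simulation, predictive models, generative models
Drug-like molecules: ~10^60

Commonly used representations 5
Fingerprint, SMILES, Graph
Mater & Coote (2019)
Schwalbe-Koda & Gómez-Bombarelli (2019)

Especially common representations 6
• Fingerprint
  • Many variants exist; ECFP is especially well known
  • Each bit corresponds to a particular substructure
  • Collisions can occur
  • Not invertible
• SMILES
  • Represents a compound as a character string
  • Not uniquely determined for a given compound
  • Compounds that differ only slightly can have very different SMILES (it was not designed to express similarity between compounds)
• Graph
  • Represents a compound as nodes and edges
  • Appears to be a natural representation
https://arxiv.org/abs/1802.04364
https://arxiv.org/abs/1903.04388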
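
The two fingerprint caveats on slide 6 (collisions, non-invertibility) can be illustrated with a toy sketch. This is not ECFP or any real fingerprint: the "fragments" are simply substrings of the SMILES string, the hash is CRC32, and the names `toy_fingerprint`, `n_bits`, and `max_len` are made up for this illustration. With more distinct fragments than bits, several fragments must share a bit, so the bit vector cannot be mapped back to a unique molecule.

```python
import zlib


def toy_fingerprint(smiles, n_bits=8, max_len=3):
    """Hash every substring of length 1..max_len into an n_bits-wide bit vector.

    Toy stand-in for a hashed fingerprint; real fingerprints such as ECFP
    hash atom environments, not raw SMILES substrings.
    """
    fragments = {
        smiles[i:i + k]
        for k in range(1, max_len + 1)
        for i in range(len(smiles) - k + 1)
    }
    bits = [0] * n_bits
    for frag in fragments:
        # CRC32 is deterministic across runs, unlike Python's built-in hash().
        bits[zlib.crc32(frag.encode("utf-8")) % n_bits] = 1
    return fragments, bits


fragments, bits = toy_fingerprint("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
print(len(fragments), "distinct fragments folded into", sum(bits), "set bits")
```

Because the number of fragments exceeds the number of bits, at least two fragments necessarily collide, which is the pigeonhole argument behind both caveats.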

Various predictive models 7
Wu et al. (2017)
Graph-based models often give better results.

Architectures underlying generative models 8
Sanchez-Lengeling & Aspuru-Guzik (2018)

Various combinations 9
Schwalbe-Koda & Gómez-Bombarelli (2019)

List of recent generative models 10
Elton et al. (2019)

List of commonly used public datasets 11
https://arxiv.org/abs/1903.04388
Elton et al. (2019)

Core technologies 12

Generative Adversarial Networks (GANs) 13
Karras et al. (2018)

Generative Adversarial Networks (GANs) 14
A type of generative model.
Generator (G): generates fake images and tries to fool D.
Discriminator (D): tries to tell real images from fake ones.
Diagram labels: Noise, G, D, real or fake?, fake image (generated image), real image (training set)
Karras et al. (2017)
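
The slide describes the G versus D game only in words. For reference, the standard minimax objective from the original GAN formulation can be written as below; the slide itself does not show this formula, and the symbols $p_{\text{data}}$ (training distribution) and $p_z$ (noise distribution) are notation introduced here.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

D is trained to push $D(x)$ toward 1 on real images and $D(G(z))$ toward 0 on generated ones, while G is trained to push $D(G(z))$ toward 1.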

Generative Adversarial Networks (GANs) 15

Autoencoders 16

Autoencoders 17

Variational Autoencoders (VAEs) 18
Loss terms: reconstruction; deviation from a normal distribution
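
The two labels on the VAE slide correspond to the two terms of the usual VAE training loss. A sketch in standard notation, assuming a diagonal Gaussian encoder $q_\phi(z \mid x) = \mathcal{N}(\mu, \operatorname{diag}(\sigma^2))$ and a standard normal prior (the slide does not state these choices explicitly):

```latex
\mathcal{L}(x)
  = \underbrace{-\,\mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]}_{\text{reconstruction}}
  + \underbrace{D_{\mathrm{KL}}\bigl(q_\phi(z \mid x)\,\|\,\mathcal{N}(0, I)\bigr)}_{\text{deviation from the normal distribution}},
\qquad
D_{\mathrm{KL}} = \frac{1}{2} \sum_{j} \bigl(\mu_j^2 + \sigma_j^2 - \log \sigma_j^2 - 1\bigr)
```

The KL term is zero exactly when $\mu_j = 0$ and $\sigma_j = 1$ for every latent dimension $j$.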

Recurrent Neural Networks (RNNs) 19
Segler et al. (2017)

Graph Representations 20
Peter et al. (2018)
https://www.businessinsider.com/explainer-what-exactly-is-the-social-graph-2012-3

Graph Neural Networks 21
Peter et al. (2018)

Graph Neural Networks 22
Peter et al. (2018)

Graph Convolutional Networks 23
2D Convolution, Graph Convolution
Wu et al. (2019)

Reinforcement Learning (RL) 24
Sutton & Barto (2018)
Mnih et al. (2015)

Reinforcement Learning (RL) 25
Sutton & Barto (2018)
Mnih et al. (2015)
Rewards, e.g. QED, logP

Transfer Learning 26
• A very large unlabeled dataset
• A small amount of labeled data
• Compute properties such as logP with RDKit and pre-train on them
Goh et al. (2017)

Fingerprint- and SMILES-based models 27

Molecule representation 28
Fingerprint, SMILES, Graph
Mater & Coote (2019)
Schwalbe-Koda & Gómez-Bombarelli (2019)

Kadurin et al. 29
The cornucopia of meaningful leads: Applying deep AAEs for new molecule development in oncology
• Input and output
  • Binary fingerprints (MACCS)
  • Log concentration (LCONC)
• Latent layer
  • Consists of 5 neurons
  • One is the Growth Inhibition percentage (GI)
  • The remaining four are trained to approach a normal distribution

Kadurin et al. 30
Experimental workflow
1. Prepare a dataset and train
  • NCI-60, MCF-7
  • 6252 compounds
  • Each record consists of fingerprint, LCONC, and GI
2. Sample from the model
  • Sample 640 vectors (virtual compounds)
3. Filter
  • Keep those with LCONC < -5.0 M
  • This yields 32 vectors
4. Search for compounds with similar features
  • Look up compounds with similar features in PubChem

Kadurin et al. 31
Experimental results
• PubChem: 72 million compounds
• Compounds with features similar to the 32 generated vectors were extracted from PubChem
• 69 compounds were obtained in the end
• Several are already known as anticancer agents
• 13 are patented
• Most are anthracyclines (currently among the most effective anticancer agents)
Figure legend: PubChem; training data (blue); generated vectors (virtual compounds)

Molecule representation 32
Fingerprint, SMILES, Graph
Mater & Coote (2019)
Schwalbe-Koda & Gómez-Bombarelli (2019)

Segler et al. 33
Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks
• Generates compounds with an LSTM
• Input and output are SMILES
• Repeats the following steps (called Hillclimb-MLE)
  1. Train the LSTM and sample from it
  2. Filter the samples with a target filtering model (which does not have to be a machine learning model)
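
The sample, filter, retrain loop on slide 33 can be sketched as follows. This is a toy stand-in under loud assumptions: the "model" is just an empirical distribution over a fixed pool of strings rather than an LSTM, "retraining" replaces the model with the distribution of the kept samples rather than fine-tuning, and the names `fit`, `sample`, `hillclimb`, and `keep` are hypothetical.

```python
import random
from collections import Counter


def fit(corpus):
    """Toy 'training': the model is the empirical distribution over strings."""
    counts = Counter(corpus)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def sample(model, n, rng):
    """Draw n strings from the toy model."""
    strings = list(model)
    weights = [model[s] for s in strings]
    return rng.choices(strings, weights=weights, k=n)


def hillclimb(corpus, keep, rounds=5, n_samples=200, seed=0):
    """Repeat: sample from the model, filter with keep(), retrain on survivors."""
    rng = random.Random(seed)
    model = fit(corpus)
    for _ in range(rounds):
        candidates = sample(model, n_samples, rng)
        kept = [s for s in candidates if keep(s)]
        if not kept:  # nothing passed the filter; stop rather than fit on nothing
            break
        model = fit(kept)
    return model


corpus = ["CCO", "CCN", "CCCl", "c1ccccc1", "CC(=O)O", "CCOC"]
# Hypothetical target filter: keep strings that contain an oxygen atom symbol.
model = hillclimb(corpus, keep=lambda s: "O" in s)
print(sorted(model))
```

The filter here is a plain Python predicate, which mirrors the slide's remark that the filtering step need not be a machine learning model.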
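
Gómez-Bombarelli et al. (CVAE) 34
Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules
• Generates compounds with an RNN + VAE
• Input and output are SMILES
• Searches for latent codes whose target property is large

Kusner et al. (GVAE) 35
Grammar Variational Autoencoder
Encoder, Decoder
Generation takes the grammar (a context-free grammar) into account.

Yang et al. (ChemTS) 36
Generates SMILES with MCTS and an RNN.
Optimizes penalized logP.

Popova et al. (ReLeaSE) 37
https://arxiv.org/abs/1711.10907
Popova et al. (2017)
• A SMILES-based generative model
• Combined with reinforcement learning to optimize target properties
• The reward is usually computed with RDKit, but here it is computed by a SMILES-based predictive model
• This makes it possible to optimize properties that RDKit cannot compute

Guimaraes et al. (ORGAN) 38
• Based on SeqGAN, an RNN-based GAN for sequential data
• Introduces reinforcement learning to optimize properties such as drug-likeness

All SMILES VAE 39
• Graph-based models
  • Most have around 3 to 7 layers
  • Each layer propagates information over a distance of one
• Molecules in ZINC250k
  • Average diameter is 11.1
  • Maximum diameter is 24
• So information cannot be propagated across the whole molecule
  • Use an RNN to carry long-range information
• SMILES is not uniquely determined
  • Use multiple SMILES strings of the same molecule as input
Alperstein et al. (2019)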
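
The argument on slide 39 is that a message passing network with k layers only moves information k bonds away, which is less than a typical molecular diameter. A minimal sketch of that arithmetic, assuming a connected graph given as an adjacency dictionary; the 12-atom chain is a hypothetical example chosen because its diameter (11) is close to the quoted average of 11.1, and the function names are made up here.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: number of bonds from source to each reachable atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Longest shortest path in a connected graph."""
    return max(max(hop_distances(adj, s).values()) for s in adj)


def reachable_within(adj, source, n_layers):
    """Atoms whose information can reach source after n_layers one-hop updates."""
    return {v for v, d in hop_distances(adj, source).items() if d <= n_layers}


n_atoms = 12
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < n_atoms] for i in range(n_atoms)}

print(diameter(chain))                         # graph diameter in bonds
print(len(reachable_within(chain, 0, 3)))      # atoms seen from atom 0 with 3 layers
print(len(reachable_within(chain, 0, 7)))      # atoms seen from atom 0 with 7 layers
```

Even with 7 layers, the end atom of this chain sees only 8 of the 12 atoms, which is the gap the slide proposes to close with an RNN.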
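
Graph-based models (one-shot type) 40

Molecule representation 41
Fingerprint, SMILES, Graph
Mater & Coote (2019)
Schwalbe-Koda & Gómez-Bombarelli (2019)

De Cao & Kipf (MolGAN) 42
A model that generates the whole graph in one shot. It also uses a GAN and reinforcement learning.
• Becomes order invariant by using graph convolutions in the discriminator
• Optimizing each property appears to work well
• However, uniqueness is very low, around 2% (in the goal-directed case)
  • Neither the GAN nor the RL component has a constraint that encourages diverse outputs
• Computation is fast because the graph is generated in one shot
• Experiments on QM9; applying it to larger graphs looks difficult

Pölsterl & Wachinger (LF-MolGAN) 43
• Generates the graph in one shot, like MolGAN; this model introduces constraints on valency
• Does not compute the reconstruction loss explicitly, which avoids the graph isomorphism problem
• Unlike an ordinary GAN, the architecture includes an encoder, so it is easy to search the latent space for molecules with high similarity
• Experiments on QM9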
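
Graph-based models (recurrent type) 44

Li et al. 45
Learning Deep Generative Models of Graphs
Adds nodes and edges one after another as a graph, rather than generating SMILES.
Better results than GrammarVAE and similar models.

Jin et al. (JT-VAE) 46
Junction Tree Variational Autoencoder for Molecular Graph Generation
• The straightforward approach is to add nodes one at a time
• However, this can generate compounds that do not actually exist
• So the molecule is generated cluster by cluster instead

Jin et al. (JT-VAE) 47
• Decompose the molecule into a tree structure using predefined clusters
• Build a new tree structure from the embedding (adding nodes one at a time)
• Encode with neural message passing
• Generate the final compound from both the graph embedding and the tree structure (this step is needed because there is freedom in how the clusters are combined)
• Encode with a GRU

You et al. (GCPN) 48
Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
Generates the graph by adding edges one at a time.
A model that combines a GAN with reinforcement learning.

Li et al. (MolMP, MolRNN) 49
Feeds in a conditional code for properties such as QED and SA score.
A model of the type that is conditioned on target properties.

Comparison of GAN and VAE 50
GAN
• Advantages
  • Good results when tuned well
  • No need to compute a reconstruction loss (avoids the graph isomorphism problem)
• Disadvantages
  • Hyperparameter tuning is difficult
  • Mode collapse (the model keeps generating the same things)
VAE
• Advantages
  • Runs more stably than a GAN
  • Hyperparameter tuning is easier
  • Mode collapse is less likely
• Disadvantages
  • The reconstruction loss has to be computed, so the graph isomorphism problem appears

Comparison of Fingerprint, SMILES, and Graph 51
• Fingerprint-based
  • Hard to use because fingerprints are not invertible (and therefore rarely seen)
• SMILES-based
  • Stable performance
  • Validity tends to be low
  • Fragment-based generation is difficult
• Graph-based (one-shot type)
  • Fast
  • Has only produced small molecules with 9 or fewer heavy atoms
  • Low validity and uniqueness
• Graph-based (recurrent type)
  • High validity
  • The ordering of nodes and edges is an issue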

How generative models are used 52

Molecule generation 53
Distribution Learning, Predefined Scaffold, Molecule Optimization

Distribution Learning 54
https://github.com/NVlabs/ffhq-dataset
Karras et al. (2018)
Training data, generated data

Arús-Pous et al. 55
Exploring the GDB-13 chemical space using deep generative models
• GDB-13: a dataset of 975 million molecules made of up to 13 heavy atoms
• Trained on 1 million molecules, about 0.1% of the dataset
• A simple model that feeds SMILES to a GRU
• Sampling 2 billion molecules recovered 68.9% of GDB-13
• The model also captured the characteristics of the compounds in GDB-13
• Some types of molecules turned out to be hard to generate because of the SMILES notation (for example, molecules with many rings)

Molecular optimization 56
Choi et al. (2017)

Molecular optimization 57
• Search the latent space
  • Gradient ascent
  • Bayesian optimization
• Reinforcement learning
• Hillclimb-MLE (train repeatedly on filtered samples)
• Conditioning code (the condition is also treated as an input)

Molecular optimization (starting from a given substructure) 58
Optimizes penalized logP.

Other uses (generating molecules highly similar to a given molecule) 59
Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks
Awale et al. (2018)
1. Pre-train an LSTM on data from ChEMBL, DrugBank, and FDB17
2. Then fine-tune on a single molecule (experiments with 10 different molecules)
3. Generate SMILES
  • Retain correct SMILES
  • Remove duplicates
  • Remove undesirable functional groups
4. Select the molecules with high similarity

Other uses (generating molecules highly similar to a given molecule) 60
Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks
Awale et al. (2018)
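
Performance evaluation of generative models 61

Why evaluating generative models is hard 62
Karras et al. (2018)
• It is possible to tell qualitatively that results look good, but quantitative evaluation is difficult
• For compounds, even qualitative evaluation is harder than for face images

Benchmarks for generative models 63
• Papers use different datasets (ChEMBL, ZINC, QM9, and so on) and different metrics, so results are hard to compare
• The set of metrics used for comparison is also insufficient

Brown et al. (GuacaMol): Distribution-Learning benchmark 64
• Purpose of the distribution-learning benchmark
  • Evaluates how well the model reproduces the distribution of the training data
  • A model that does well on this task should have captured the characteristics of the compounds, which is expected to help with goal-directed tasks as well
• Validity
  • The fraction of generated compounds that are valid
  • Validity is checked with RDKit
• Uniqueness
  • Checks for duplicates: the fraction of unique compounds
• Novelty
  • The fraction of compounds that do not appear in the training data
• Frechet ChemNet Distance (FCD)
  • Uses features of ChemNet, a network trained for bioactivity prediction, to measure how close the generated set is to the training data distribution
  • For images, the Frechet Inception Distance (FID) is used to compare generative models; FCD is its counterpart for compounds
• KL Divergence
  • A measure of the difference between two probability distributions
  • Focuses on physicochemical properties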
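
The first three metrics on slide 64 are simple ratios and can be sketched in a few lines. The sketch below makes assumptions the slide does not spell out: uniqueness is taken over valid molecules and novelty over unique valid molecules, and the validity check is passed in as a function. The slide says validity is checked with RDKit; to keep the example dependency-free, a balanced-parentheses test is used as a crude stand-in, and `distribution_metrics`, `is_valid`, and `canonicalize` are hypothetical names.

```python
def balanced_parentheses(smiles):
    """Crude stand-in for a real validity check (the slide uses RDKit)."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def distribution_metrics(generated, training, is_valid, canonicalize=lambda s: s):
    """Validity, uniqueness, and novelty as fractions in [0, 1].

    canonicalize should map equivalent SMILES to one string; the identity
    default is only adequate for this toy example.
    """
    valid = [canonicalize(s) for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - {canonicalize(s) for s in training}
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


generated = ["CCO", "CCO", "CC(=O)O", "CC(C", "CCN"]
training = ["CCO"]
metrics = distribution_metrics(generated, training, balanced_parentheses)
print(metrics)
```

In this toy run, 4 of 5 strings pass the check, 3 of those 4 are distinct, and 2 of those 3 are absent from the training set. Because SMILES is not unique per compound, a real implementation would need a proper canonicalization step before counting duplicates.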
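
Goal-Directed benchmark (molecular optimization) 65
• Purpose of the goal-directed benchmark
  • Evaluates models in a setting where a specific score is maximized
• Similarity
  • How close the model can get to a target that was removed from the training data
• Rediscovery
  • Similar to the above, but asks whether exactly the same molecule can be generated rather than a similar one
  • Requires an exact match
• Isomers
  • How many isomers can be generated for a molecular formula such as C7H8N2O2
  • Not directly related to drug discovery, but evaluates the flexibility of the model
• Median molecules
  • Maximizes similarity to several molecules at the same time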
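
The slide does not say which similarity measure is used. A common choice for fingerprints is the Tanimoto coefficient, so the sketch below assumes it; fingerprints are represented as sets of on-bit indices, the bit values are invented for illustration, and returning 0.0 for two empty fingerprints is an arbitrary convention chosen here.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        return 0.0  # arbitrary convention for two empty fingerprints
    return len(a & b) / union


target = {1, 4, 7, 9}        # hypothetical on-bits of the held-out target
candidate = {1, 4, 8, 9}     # hypothetical on-bits of a generated molecule

print(tanimoto(candidate, target))  # 3 shared bits out of 5 distinct bits
print(tanimoto(target, target))     # identical fingerprints
```

A similarity of 1.0 is necessary for the rediscovery task but not sufficient, because different molecules can share a fingerprint; this is why rediscovery is stated as an exact match on the molecule itself.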
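
Measuring Compound Quality 66
• Purpose of measuring compound quality
  • Compounds generated by earlier de novo design algorithms may be unstable, highly reactive, hard to synthesize, or look wrong to a medicinal chemist
  • It is therefore necessary to check whether the compounds are reasonable
  • Turning everything a medicinal chemist knows into rules to check is difficult
  • Here rd_filters is applied
  • https://github.com/PatWalters/rd_filters

Results: Distribution-learning benchmark 67
• Random sampler: it only draws from ChEMBL, so there are no broken compounds and validity is 100%, but novelty is zero
• SMILES LSTM: good overall
• Graph MCTS: fairly good, but KL and FCD are poor
• AAE: good except for FCD
• ORGAN: poor overall
• VAE: good overall

Results: Goal-directed benchmark 68
• Best of Data Set
  • Picks the highest-scoring compounds from the training data; this is the baseline a model must at least exceed
• Graph GA
  • Best results
• SMILES LSTM
  • Good results, almost on par with Graph GA
• Other models
  • Clearly worse than Graph GA and SMILES LSTM

Results: Compound Quality Measurement 69
• Compounds generated in the goal-directed tasks were quality-checked with rd_filters
• SMILES LSTM gives clearly better results
• SMILES LSTM is first pre-trained and then maximizes each score. The pre-training phase probably let it learn the features that matter for a compound.
• Graph GA, in contrast, gives rather poor results. The likely problem is that it tries to maximize the score directly, without any prior knowledge.
• Since SMILES LSTM and Graph GA performed about equally on the goal-directed benchmark, SMILES LSTM is the better choice.

Pölsterl & Wachinger (LF-MolGAN) 70
• Validity, uniqueness, and novelty are widely used but are not very good metrics
• A model that picks nodes and edges at random (while respecting valency) ends up looking good
• They do not consider whether the generated molecules resemble the training data and are chemically meaningful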
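
Directions for future development 71

Multi-objective optimization: Guimaraes et al. (ORGAN) 72
• Optimizes three properties by training on drug-likeness, synthesizability, and solubility in alternation
• Optimizing all three gives results close to those obtained when optimizing each one alone

Multi-objective optimization: Zhou et al. (MolDQN) 73
• A generative model that performs optimization with a DQN
• Reports experiments that optimize similarity and QED (drug-likeness) at the same time

No dataset (pure RL): MolDQN, Zhou et al. 74
• Learns without a dataset by using reinforcement learning
• Because there is no pre-training, a broad search is possible

Taking synthesis routes into account: Bradshaw et al. (Molecule Chef) 75
Encoder, Decoder
• A model that also considers the synthesis route; it outputs both reactants and products
• Obtains reactant embeddings with a graph neural network
• Outputs reactants one after another; reactants are chosen from known molecules
• A reaction predictor then turns them into the product

Elix, Inc. https://elix-inc.com 76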