The Frontier of AI Drug Discovery, Centered on Generative Models / Elix CBI 2019
Elix
October 22, 2019
Technology
A summary of the various generative models used in AI drug discovery. These are the slides from a talk given at the CBI Annual Meeting 2019.
Transcript
Slide 1: The Frontier of AI Drug Discovery, Centered on Generative Models. Elix, Inc., CEO Shinya Yuki. 2019/10/22, CBI Annual Meeting.
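Slide 2: Contents
• Introduction
• Core technologies
• Fingerprint- and SMILES-based models
• Graph-based models
• How generative models are used
• Evaluating generative model performance
• Directions for future development
• Elix Chem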
Slide 3: Introduction
Slide 4: Molecular design (Restricted Elix, Inc.). Sanchez-Lengeling et al. (2018)
• Experiment / simulation
• Predictive models
• Generative models
• Drug-like molecules: roughly 10^60
Slide 5: Commonly used representations. Mater & Coote (2019), Schwalbe-Koda & Gómez-Bombarelli (2019)
• Fingerprint
• SMILES
• Graph
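Slide 6: The most commonly used representations. https://arxiv.org/abs/1802.04364, https://arxiv.org/abs/1903.04388
• Fingerprint
  • Many variants exist; ECFP and similar are especially well known
  • Each bit corresponds to a specific substructure
  • Collisions can occur
  • Not invertible
• SMILES
  • Represents a compound as a character string
  • The string for a given compound is not unique
  • Compounds that differ only slightly can have very different SMILES (SMILES was not designed to express similarity between compounds)
• Graph
  • Represents a compound as nodes and edges
  • Appears to be the natural representation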
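To make the collision and non-invertibility points concrete, here is a minimal pure-Python sketch of a hashed, folded bit-vector fingerprint. It is not ECFP: the substructure labels, the `fold_fingerprint` helper, and the bit length are illustrative assumptions, and a real fingerprint would normally come from a cheminformatics toolkit.

```python
import hashlib


def fold_fingerprint(features, n_bits=8):
    """Hash each substructure label onto one of n_bits positions (hypothetical helper)."""
    bits = [0] * n_bits
    owners = {}  # bit index -> set of labels that were hashed onto it
    for feat in features:
        digest = hashlib.sha256(feat.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "big") % n_bits
        bits[idx] = 1
        owners.setdefault(idx, set()).add(feat)
    return bits, owners


# Illustrative substructure labels, not the output of a real ECFP run.
features = ["C-C", "C=O", "C-N", "c1ccccc1", "C-O", "N-H", "O-H", "C#N", "C-Cl"]
bits, owners = fold_fingerprint(features, n_bits=8)

# Nine distinct labels folded into eight bits must share at least one bit.
collisions = {i: labels for i, labels in owners.items() if len(labels) > 1}
print(bits)
print(collisions)
```

Because several labels can set the same bit, the bit vector alone does not say which substructures were present, which is one reason the mapping cannot be inverted.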
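Slide 7: Various predictive models. Wu et al. (2017)
• Graph-based models often give better results
Slide 8: Architectures underlying generative models. Sanchez-Lengeling & Aspuru-Guzik (2018)
Slide 9: Various combinations. Schwalbe-Koda & Gómez-Bombarelli (2019)
Slide 10: List of recent generative models. Elton et al. (2019)
Slide 11: List of commonly used public datasets. Elton et al. (2019), https://arxiv.org/abs/1903.04388
Slide 12: Core technologies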
Slide 13: Generative Adversarial Networks (GANs). Karras et al. (2018)
Slide 14: Generative Adversarial Networks (GANs). Karras et al. (2017)
• A type of generative model
• Generator (G): generates fake images and tries to fool D
• Discriminator (D): tries to tell real images from fake ones
• Diagram: noise → G → fake image (generated image) → D → real or fake? Real images (training set) are also fed to D
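The opposing goals of G and D can be written as two loss functions. The sketch below uses the standard binary cross-entropy GAN objective with the non-saturating generator loss; the slide itself does not state a loss, so this choice and the function names are assumptions made for illustration.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for D: real images are labelled 1, generated images 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating loss for G: low when D scores a generated image as real."""
    return -math.log(d_fake)


# d_real and d_fake are D's estimated probabilities that an input is real.
undecided_d = discriminator_loss(d_real=0.5, d_fake=0.5)
confident_d = discriminator_loss(d_real=0.9, d_fake=0.1)
fooled = generator_loss(d_fake=0.9)   # D was fooled by the generated image
caught = generator_loss(d_fake=0.1)   # D spotted the generated image
print(undecided_d, confident_d, fooled, caught)
```

D's loss falls as it separates real from generated inputs, while G's loss falls as D is fooled, which is the tug-of-war the diagram describes.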
Slide 15: Generative Adversarial Networks (GANs)
Slide 16: Autoencoders
Slide 17: Autoencoders
Slide 18: Variational Autoencoders (VAEs)
• Loss terms: reconstruction, and deviation from the normal distribution
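The two terms named on the VAE slide can be written out as follows. This is a minimal sketch assuming a diagonal Gaussian encoder and a standard normal prior, for which the deviation term is the closed-form KL divergence; the `beta` weight is a hypothetical knob that is not mentioned in the slides.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def vae_loss(reconstruction_error, mu, log_var, beta=1.0):
    """Reconstruction term plus the (optionally weighted) deviation-from-normal term."""
    return reconstruction_error + beta * kl_to_standard_normal(mu, log_var)


matched = kl_to_standard_normal(mu=[0.0, 0.0], log_var=[0.0, 0.0])
shifted = kl_to_standard_normal(mu=[1.0, -1.0], log_var=[0.0, 0.0])
total = vae_loss(reconstruction_error=2.5, mu=[1.0, -1.0], log_var=[0.0, 0.0])
print(matched, shifted, total)
```

The penalty is zero only when the encoder output already matches the standard normal, and grows as the mean or variance drifts away from it.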
Slide 19: Recurrent Neural Networks (RNNs). Segler et al. (2017)
Slide 20: Graph Representations. Peter et al. (2018), https://www.businessinsider.com/explainer-what-exactly-is-the-social-graph-2012-3
Slide 21: Graph Neural Networks. Peter et al. (2018)
Slide 22: Graph Neural Networks. Peter et al. (2018)
Slide 23: Graph Convolutional Networks. Wu et al. (2019)
• 2D convolution compared with graph convolution
Slide 24: Reinforcement Learning (RL). Sutton & Barto (2018), Mnih et al. (2015)
Slide 25: Reinforcement Learning (RL). Sutton & Barto (2018), Mnih et al. (2015)
• e.g. QED, logP
Slide 26: Transfer Learning. Goh et al. (2017)
• A very large unlabeled dataset
• A small amount of labeled data
• Compute logP and similar properties with RDKit, then pre-train on them
Slide 27: Fingerprint- and SMILES-based models
Slide 28: Molecule representation. Mater & Coote (2019), Schwalbe-Koda & Gómez-Bombarelli (2019)
• Fingerprint
• SMILES
• Graph
Slide 29: Kadurin et al., "The cornucopia of meaningful leads: Applying deep AAEs for new molecule development in oncology"
• Input and output
  • Binary fingerprints (MACCS)
  • Log concentration (LCONC)
• Middle layer
  • Consists of 5 neurons
  • 1 is the Growth Inhibition percentage (GI)
  • The remaining 4 are trained to approach a normal distribution
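Slide 30: Kadurin et al., experimental workflow
1. Prepare the dataset and train
  • NCI-60, MCF-7
  • 6252 compounds
  • Data consisting of fingerprint, LCONC, and GI
2. Sample from the model
  • Sample 640 vectors (virtual compounds)
3. Extract
  • Keep those with LCONC < -5.0 M
  • This yields 32 vectors
4. Search for compounds with similar features
  • Look up compounds with similar features in PubChem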
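Slide 31: Kadurin et al., experimental results
• PubChem: 72 million compounds
• Compounds with features similar to the 32 generated vectors were extracted from PubChem
• 69 compounds were obtained in the end
• Several are already known anticancer agents
• 13 are patented
• Most are anthracyclines (currently the most effective anticancer agents)
• Figure legend: PubChem; training data (blue); generated vectors (virtual compounds)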
Slide 32: Molecule representation. Mater & Coote (2019), Schwalbe-Koda & Gómez-Bombarelli (2019)
• Fingerprint
• SMILES
• Graph
Slide 33: Segler et al., "Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks"
• Generates compounds with an LSTM
• Input and output are SMILES
• Repeats the following steps (known as Hillclimb-MLE)
  1. Train the LSTM and sample from it
  2. Filter with a target filtering model (this need not be a machine learning model)
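The sample, filter, and retrain loop can be sketched with a toy stand-in for the LSTM. Here the "generator" is only a categorical distribution over four placeholder SMILES strings, refitting is a smoothed frequency count, and the filter is a substring check; all of these, including the `hillclimb_mle` name, are assumptions chosen to keep the loop visible, not the setup used by Segler et al.

```python
import random
from collections import Counter


def hillclimb_mle(candidates, passes_filter, rounds=3, n_samples=200, seed=0):
    """Toy sample -> filter -> refit loop over a categorical 'generator'."""
    rng = random.Random(seed)
    probs = {c: 1.0 / len(candidates) for c in candidates}  # initial model
    history = [dict(probs)]
    for _ in range(rounds):
        # 1. Sample from the current model.
        weights = [probs[c] for c in candidates]
        samples = rng.choices(candidates, weights=weights, k=n_samples)
        # 2. Keep only the samples accepted by the target filter.
        kept = [s for s in samples if passes_filter(s)]
        if not kept:
            continue  # nothing to learn from in this round
        # 3. Refit to the kept samples (frequency estimate with add-one smoothing).
        counts = Counter(kept)
        total = len(kept) + len(candidates)
        probs = {c: (counts[c] + 1) / total for c in candidates}
        history.append(dict(probs))
    return probs, history


# Placeholder SMILES and a stand-in filter; a real filter would predict activity.
candidates = ["CCO", "c1ccccc1", "CC(=O)O", "CCN"]


def target_filter(smiles):
    return "c1ccccc1" in smiles


final_probs, history = hillclimb_mle(candidates, target_filter)
print(final_probs)
```

After each round the model puts more probability on whatever the filter accepts, which is the hill-climbing behaviour the slide refers to.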
Slide 34: Gómez-Bombarelli et al. (CVAE), "Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules"
• Generates compounds with an RNN plus a VAE
• Input and output are SMILES
• Searches for latent codes where the target property is large
Slide 35: Kusner et al. (GVAE), Grammar Variational Autoencoder
• Encoder and decoder
• Generation takes a grammar (context free grammar) into account
Slide 36: Yang et al. (ChemTS)
• Generates SMILES with MCTS and an RNN
• Optimizes penalized logP
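Slide 37: Popova et al. (ReLeaSE). Popova et al. (2017), https://arxiv.org/abs/1711.10907
• A SMILES-based generative model
• Combined with reinforcement learning to optimize a target property
• The reward is usually computed with RDKit, but here it is computed by a SMILES-based predictive model
• This makes it possible to optimize properties that RDKit cannot compute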
Slide 38: Guimaraes et al. (ORGAN)
• Built on SeqGAN, an RNN-based GAN for sequential data
• Introduces reinforcement learning to optimize properties such as druglikeness
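Slide 39: All SMILES VAE. Alperstein et al. (2019)
• Graph-based models
  • Most use roughly 3 to 7 layers
  • Each layer propagates information across a distance of one
• Molecules in ZINC250k
  • Mean diameter is 11.1
  • Maximum diameter is 24
• Information therefore cannot be propagated across the whole molecule
• An RNN is used to carry information over longer distances
• SMILES is not unique for a given molecule
• Multiple SMILES strings are used as input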
Slide 40: Graph-based models (one-shot type)
Slide 41: Molecule representation. Mater & Coote (2019), Schwalbe-Koda & Gómez-Bombarelli (2019)
• Fingerprint
• SMILES
• Graph
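Slide 42: De Cao & Kipf (MolGAN)
• A model that generates the whole graph in one shot; also uses a GAN and reinforcement learning
• Using graph convolution in the discriminator makes it order invariant
• Optimizing each property appears to work well
• However, uniqueness is very low, around 2% (in the goal-directed setting)
• This is because neither the GAN nor the RL component has a constraint that encourages diverse outputs
• Computation time is short because the graph is generated in one shot
• Experiments use QM9; applying it to larger graphs looks difficult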
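Slide 43: Pölsterl & Wachinger (LF-MolGAN)
• Generates the graph in one shot, like MolGAN; this model introduces constraints on valency
• Does not compute a reconstruction loss explicitly, which avoids the graph isomorphism problem
• Unlike an ordinary GAN, the architecture includes an encoder, so it is easy to search the latent space for molecules with high similarity
• Experiments use QM9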
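Slide 44: Graph-based models (recurrent type)
Slide 45: Li et al., "Learning Deep Generative Models of Graphs"
• Adds nodes and edges one after another as a graph, rather than generating SMILES
• Gives better results than GrammarVAE and similar models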
Slide 46: Jin et al. (JT-VAE), "Junction Tree Variational Autoencoder for Molecular Graph Generation"
• The simple approach would be to add nodes one at a time
• However, that can generate compounds that do not actually exist
• The model therefore generates cluster by cluster
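Slide 47: Jin et al. (JT-VAE), architecture
• Decompose the molecule into a tree structure using predefined clusters
• Encode with neural message passing
• Encode with a GRU
• Build a new tree structure from the embedding (adding nodes one at a time)
• Generate the final compound using both the resulting graph embedding and the tree structure (this step is needed because there is freedom in how the clusters are combined)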
Slide 48: You et al. (GCPN), "Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation"
• Generates a graph by adding edges one at a time
• Combines a GAN with reinforcement learning
Slide 49: Li et al. (MolMP, MolRNN)
• A model that conditions on target properties and similar inputs
• Takes a conditional code such as QED or SAscore
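Slide 50: Comparing GANs and VAEs
• GAN
  • Advantages
    • Good results when tuned well
    • No need to compute a reconstruction loss (avoids the graph isomorphism problem)
  • Disadvantages
    • Hyperparameter tuning is difficult
    • Mode collapse (the model keeps generating the same things)
• VAE
  • Advantages
    • Runs more stably than a GAN
    • Hyperparameter tuning is easy
    • Mode collapse is less likely
  • Disadvantages
    • Computing the reconstruction loss brings in the graph isomorphism problem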
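Slide 51: Comparing fingerprint, SMILES, and graph approaches
• Fingerprint-based
  • Hard to use because fingerprints are not invertible (and therefore rarely seen)
• SMILES-based
  • Stable performance
  • Validity tends to be low
  • Fragment-based generation is difficult
• Graph-based (one-shot type)
  • Fast
  • Has only produced small molecules with 9 or fewer heavy atoms
  • Validity and uniqueness are low
• Graph-based (recurrent type)
  • Validity is high
  • Ordering of nodes and edges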
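Slide 52: How generative models are used
Slide 53: Molecule generation
• Distribution learning
• Predefined scaffold
• Molecule optimization
Slide 54: Distribution Learning. Karras et al. (2018), https://github.com/NVlabs/ffhq-dataset
• Training data compared with generated data
Slide 55: Arús-Pous et al., "Exploring the GDB-13 chemical space using deep generative models"
• GDB-13: a dataset of 975 million molecules built from up to 13 heavy atoms
• Trained on 1 million molecules, about 0.1% of the dataset
• A simple model that feeds SMILES to a GRU
• Sampling 2 billion molecules recovered 68.9% of GDB-13
• The model captured the characteristics of the compounds in GDB-13
• Some types of molecule turned out to be hard to generate because of SMILES notation (for example, molecules containing many rings)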
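For scale, the 68.9% figure can be compared with what a hypothetical ideal sampler would achieve. The sketch below assumes uniform sampling with replacement over the whole of GDB-13, for which the expected coverage after n draws from N items is about 1 - exp(-n/N); that uniform baseline is an assumption added here, while the three numbers are taken from the slide.

```python
import math

gdb13_size = 975_000_000     # "975 million molecules" from the slide
n_samples = 2_000_000_000    # "2 billion molecules" sampled
reported_coverage = 0.689    # fraction of GDB-13 recovered, from the slide

# Expected fraction of distinct items seen after n uniform draws with
# replacement from N items (large-N approximation of 1 - (1 - 1/N)**n).
ideal_coverage = 1.0 - math.exp(-n_samples / gdb13_size)

print(f"ideal uniform sampler: {ideal_coverage:.3f}")
print(f"reported model:        {reported_coverage:.3f}")
```

Even a perfectly uniform sampler would not reach 100% with 2 billion draws, so the reported figure is best read against that ceiling rather than against full coverage.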
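Slide 56: Molecular optimization. Choi et al. (2017)
Slide 57: Molecular optimization
• Search the latent space
  • Gradient ascent
  • Bayesian optimization
• Reinforcement learning
• Hillclimb-MLE (train by repeated filtering)
• Conditioning code (treated as a conditional input)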
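As an illustration of the first option, the sketch below runs gradient ascent on a latent code. The property predictor is a hypothetical quadratic with a known peak so that the gradient can be written by hand; in practice it would be a trained model on the latent space, and the final code would be decoded back into a molecule.

```python
def property_score(z, optimum):
    """Stand-in differentiable property predictor that peaks at `optimum` (hypothetical)."""
    return -sum((zi - oi) ** 2 for zi, oi in zip(z, optimum))


def property_gradient(z, optimum):
    """Analytic gradient of property_score with respect to z."""
    return [-2.0 * (zi - oi) for zi, oi in zip(z, optimum)]


def gradient_ascent(z0, optimum, step_size=0.1, n_steps=50):
    """Move a latent code uphill on the predicted property."""
    z = list(z0)
    for _ in range(n_steps):
        grad = property_gradient(z, optimum)
        z = [zi + step_size * gi for zi, gi in zip(z, grad)]
    return z


optimum = [1.0, -2.0]
z_start = [0.0, 0.0]
z_end = gradient_ascent(z_start, optimum)
print(property_score(z_start, optimum), property_score(z_end, optimum))
```

Bayesian optimization targets the same search problem but does not need gradients of the predictor.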
Slide 58: Molecular optimization (starting from a specific substructure)
• Optimizes penalized logP
Slide 59: Other uses (generating molecules highly similar to a given molecule). Awale et al. (2018), "Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks"
1. Pre-train an LSTM on data from ChEMBL, DrugBank, and FDB17
2. Fine-tune on a single molecule (experiments cover 10 different molecules)
3. Generate SMILES
  • Retain correct SMILES
  • Remove duplicates
  • Remove undesirable functional groups
4. Select the molecules with high similarity
Slide 60: Other uses (generating molecules highly similar to a given molecule). Awale et al. (2018), "Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks"
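Slide 61: Evaluating generative model performance
Slide 62: Why evaluating generative models is hard. Karras et al. (2018)
• It is possible to tell qualitatively that results look good, but quantitative evaluation is difficult
• For compounds, even qualitative evaluation is harder than for face images and the like
Slide 63: Benchmarks for generative models
• Each paper uses different datasets (ChEMBL, ZINC, QM9, and so on) and different metrics, so comparison is difficult
• The range of metrics used for comparison is also insufficient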
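Slide 64: Brown et al. (GuacaMol), distribution-learning benchmark
• Purpose of the distribution-learning benchmark
  • Evaluates how well the model reproduces the distribution of the training data
  • A model that handles this task well should have captured the characteristics of compounds, which is expected to help with goal-directed tasks too
• Validity
  • The fraction of generated compounds that are valid
  • Validity is checked with RDKit
• Uniqueness
  • Checks for duplicates; the fraction of unique compounds
• Novelty
  • The fraction of compounds that are not in the training data
• Frechet ChemNet Distance (FCD)
  • Uses features from ChemNet, trained on bioactivity prediction, to measure how close the generated set is to the training data distribution
  • For images, the Frechet Inception Distance (FID) is widely used to compare generative models; FCD is its compound counterpart
• KL Divergence
  • A measure of the difference between two probability distributions
  • Emphasizes physicochemical properties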
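The first three metrics reduce to simple set arithmetic once a validity check is available. The sketch below computes them as nested fractions, which is one common convention and not necessarily the exact GuacaMol definition; the parenthesis-balance check is a toy stand-in for an RDKit parse, and the strings are assumed to be canonical already.

```python
def distribution_metrics(generated, training_set, is_valid):
    """Validity, uniqueness, and novelty as nested fractions (one common convention)."""
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def toy_is_valid(smiles):
    """Toy check standing in for a real parser: parentheses must balance."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


generated = ["CCO", "CCO", "CC(=O)O", "CC(C", "CCN"]
training = ["CCO"]
metrics = distribution_metrics(generated, training, toy_is_valid)
print(metrics)
```

A model that only copies its training set would score perfectly on validity and uniqueness here but zero on novelty, which is why the three are reported together.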
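Slide 65: Goal-directed benchmark (molecular optimization)
• Purpose of the goal-directed benchmark
  • Evaluates models in a setting where a specific score is to be maximized
• Similarity
  • How close the model can get to a target that was removed from the training data
• Rediscovery
  • Like the above, but asks whether exactly the same molecule can be generated
  • Requires an exact match
• Isomers
  • How many isomers can be generated for a molecular formula such as C7H8N2O2
  • Not directly relevant to drug discovery, but evaluates model flexibility
• Median molecules
  • Maximizes similarity to several molecules at once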
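The difference between the similarity and rediscovery tasks can be shown with a small example. This sketch uses Tanimoto similarity on sets of "on" fingerprint bits and an exact string match; the bit sets and helper names are made up for illustration, and the benchmark's own scoring functions may differ in detail.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def best_similarity(generated_fps, target_fp):
    """Similarity-style score: how close the best generated molecule gets."""
    return max(tanimoto(fp, target_fp) for fp in generated_fps)


def rediscovered(generated_smiles, target_smiles):
    """Rediscovery-style check: exact match on (assumed canonical) SMILES."""
    return target_smiles in set(generated_smiles)


target_fp = {1, 4, 7, 9}
generated_fps = [{1, 4}, {1, 4, 7, 8}, {2, 3}]
best = best_similarity(generated_fps, target_fp)
found = rediscovered(["CCO", "CCN"], "CCN")
print(best, found)
```

A high similarity score rewards near misses, whereas rediscovery gives credit only when the target itself appears among the outputs.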
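Slide 66: Measuring Compound Quality
• Purpose of measuring compound quality
  • Compounds generated by earlier de novo design algorithms may be unstable, highly reactive, hard to synthesize, or look wrong to a medicinal chemist
  • It is therefore necessary to check whether the compounds are reasonable
  • Turning everything a medicinal chemist knows into rules is difficult
  • Here rd_filters is applied
  • https://github.com/PatWalters/rd_filters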
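Slide 67: Results: distribution-learning benchmark
• Random sampler: it only draws from ChEMBL, so no compounds are broken and validity is 100%; novelty, however, is zero
• SMILES LSTM: good overall
• Graph MCTS: fairly good, but KL and FCD are poor
• AAE: good except for FCD
• ORGAN: poor overall
• VAE: good overall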
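Slide 68: Results: goal-directed benchmark
• Best of Data Set
  • Picks the highest-scoring compounds from the training data; a baseline that any model must at least beat
• Graph GA
  • The best results
• SMILES LSTM
  • Good results, roughly on par with Graph GA
• Other models
  • Clearly worse than Graph GA and SMILES LSTM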
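Slide 69: Results: compound quality measurement
• Compounds generated in the goal-directed tasks were quality-checked with rd_filters
• SMILES LSTM gives clearly good results
• SMILES LSTM is pre-trained first and then maximizes each score; the pre-training phase probably lets it learn the features that matter for a compound
• Graph GA, by contrast, does not do well; trying to maximize the score straight away without any prior knowledge is the likely problem
• SMILES LSTM and Graph GA were comparable on the goal-directed benchmark, so SMILES LSTM is the better choice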
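Slide 70: Pölsterl & Wachinger (LF-MolGAN)
• Validity, uniqueness, and novelty are widely used but are not very good metrics
• A model that picks nodes and edges at random (while respecting valency) ends up looking good
• They do not consider whether the generated molecules resemble the training data and are chemically meaningful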
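Slide 71: Directions for future development
Slide 72: Multi-objective optimization. Guimaraes et al. (ORGAN)
• Optimizes three properties by training alternately on druglikeness, synthesizability, and solubility
• Optimizing all three gives results close to optimizing each one alone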
Slide 73: Multi-objective optimization. Zhou et al. (MolDQN)
• A generative model that performs optimization with DQN
• Experiments optimize similarity and QED (drug-likeness) simultaneously
Slide 74: No dataset, pure RL. MolDQN (Zhou et al.)
• Learns without a dataset by using reinforcement learning
• Because there is no pre-training, a broad search is possible
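Slide 75: Taking synthesis routes into account. Bradshaw et al. (Molecule Chef)
• An encoder-decoder model that takes the synthesis route into account and outputs both reactants and products
• A graph neural network produces the reactant embeddings
• Reactants are output one after another and are chosen from known ones
• A reaction predictor then turns them into products
Slide 76: Elix, Inc. https://elix-inc.com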