The Frontier of AI Drug Discovery, with a Focus on Generative Models / Elix CBI 2019
Elix
October 22, 2019
Technology
A summary of the various generative models used in AI drug discovery. These are the slides from a talk given at the CBI Annual Meeting 2019.
Transcript
[Slide 1] The Frontier of AI Drug Discovery, with a Focus on Generative Models. Shinya Yuki, CEO, Elix, Inc. CBI Annual Meeting, 2019/10/22.
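[Slide 2] Contents
• Introduction
• Core techniques
• Fingerprint- and SMILES-based models
• Graph-based models
• How generative models are used
• Evaluating the performance of generative models
• Directions for future development
• Elix Chem
[Slide 3] Introduction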
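[Slide 4] Molecular design (Restricted © Elix, Inc.)
• Figure labels: experiment/simulation, predictive model, generative model. Sanchez-Lengeling et al. (2018)
• There are roughly 10^60 drug-like molecules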
[Slide 5] Commonly used representations
• Fingerprint, SMILES, Graph
• Mater & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)
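[Slide 6] The most commonly used representations
• Fingerprint
  • Many kinds exist; ECFP is among the best known
  • Each bit corresponds to a specific substructure
  • Collisions can occur
  • Not invertible
• SMILES
  • Represents a compound as a character string
  • The string for a given compound is not unique
  • Compounds that differ only slightly can have very different SMILES strings (the notation was not designed to express similarity between compounds)
• Graph
  • Represents a compound with atoms as nodes and bonds as edges
  • Seems the most natural representation
• https://arxiv.org/abs/1802.04364 https://arxiv.org/abs/1903.04388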
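The collision and non-invertibility points above can be illustrated with a toy folded fingerprint. This is only a sketch: the fragment labels are hypothetical strings, and `zlib.crc32` stands in for whatever hashing a real ECFP implementation uses, which is an assumption and not something stated on the slides.

```python
import zlib

def fold_fingerprint(fragments, n_bits):
    """Hash each fragment label into one of n_bits positions (toy fingerprint)."""
    bits = [0] * n_bits
    buckets = {}  # bit index -> set of fragment labels that set this bit
    for frag in set(fragments):
        idx = zlib.crc32(frag.encode("utf-8")) % n_bits
        bits[idx] = 1
        buckets.setdefault(idx, set()).add(frag)
    return bits, buckets

# Hypothetical substructure labels; a real ECFP derives identifiers from atom environments.
fragments = ["C-C", "C=O", "C-N", "C-O", "c:c", "N-H", "O-H", "C-Cl", "C-F", "S=O"]
bits, buckets = fold_fingerprint(fragments, n_bits=8)

# Ten distinct fragments cannot fit into eight bits without sharing a position.
collisions = {i: sorted(f) for i, f in buckets.items() if len(f) > 1}
print(bits)
print(collisions)
```

Because several fragments can set the same bit, the bit vector alone does not say which fragments were present, which is one way to see why a fingerprint cannot be decoded back into a molecule.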
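[Slide 7] Various predictive models
• Graph-based models often give better results. Wu et al. (2017)
[Slide 8] Base architectures for generative models. Sanchez-Lengeling & Aspuru-Guzik (2018)
[Slide 9] Various combinations. Schwalbe-Koda & Gómez-Bombarelli (2019)
[Slide 10] List of recent generative models. Elton et al. (2019)
[Slide 11] List of commonly used public datasets. Elton et al. (2019), https://arxiv.org/abs/1903.04388
[Slide 12] Core techniques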
[Slide 13] Generative Adversarial Networks (GANs). Karras et al. (2018)
[Slide 14] Generative Adversarial Networks (GANs)
• A type of generative model
• Generator (G): generates fake images and tries to fool D
• Discriminator (D): tries to tell real images from fake ones
• Diagram labels: Noise, G, D, "real or fake?", fake images (generated images), real images (training set)
• Karras et al. (2017)
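The game between G and D described above is commonly written as the minimax objective of the original GAN formulation. The notation ($p_{\text{data}}$ for the training distribution, $p_z$ for the noise prior) is assumed here and does not come from the slides.

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
```

D is trained to raise both terms (label real samples as real and generated samples as fake), while G is trained to lower the second term by making $D(G(z))$ approach 1.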
[Slide 15] Generative Adversarial Networks (GANs)
[Slide 16] Autoencoders
[Slide 17] Autoencoders
[Slide 18] Variational Autoencoders (VAEs)
• Loss terms: reconstruction, and deviation from the normal distribution
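The two loss terms named on this slide correspond to the standard VAE objective. The form below is a sketch in assumed notation (encoder $q_\phi(z \mid x)$, decoder $p_\theta(x \mid z)$, standard normal prior); the slide itself gives only the two labels.

```latex
\mathcal{L}(\theta, \phi; x)
  = \underbrace{-\,\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]}_{\text{reconstruction}}
  + \underbrace{D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\|\,\mathcal{N}(0, I)\right)}_{\text{deviation from the normal distribution}}
```

For a diagonal Gaussian encoder with mean $\mu_j$ and variance $\sigma_j^2$ in each latent dimension, the second term has the closed form $\tfrac{1}{2}\sum_j \left(\mu_j^2 + \sigma_j^2 - \log \sigma_j^2 - 1\right)$.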
[Slide 19] Recurrent Neural Networks (RNNs). Segler et al. (2017)
[Slide 20] Graph Representations. Peter et al. (2018), https://www.businessinsider.com/explainer-what-exactly-is-the-social-graph-2012-3
[Slide 21] Graph Neural Networks. Peter et al. (2018)
[Slide 22] Graph Neural Networks. Peter et al. (2018)
[Slide 23] Graph Convolutional Networks
• Figure labels: 2D Convolution, Graph Convolution. Wu et al. (2019)
[Slide 24] Reinforcement Learning (RL). Sutton & Barto (2018); Mnih et al. (2015)
[Slide 25] Reinforcement Learning (RL)
• Example rewards: QED, logP
• Sutton & Barto (2018); Mnih et al. (2015)
[Slide 26] Transfer Learning
• A very large unlabeled dataset
• A small amount of labeled data
• Properties such as logP are computed with RDKit and used for pre-training
• Goh et al. (2017)
[Slide 27] Fingerprint- and SMILES-based models
[Slide 28] Molecule representation
• Fingerprint, SMILES, Graph
• Mater & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)
[Slide 29] Kadurin et al.: The cornucopia of meaningful leads: Applying deep AAEs for new molecule development in oncology
• Inputs and outputs
  • Binary fingerprints (MACCS)
  • Log concentration (LCONC)
• Latent layer
  • Consists of 5 neurons
  • One encodes the Growth Inhibition percentage (GI)
  • The remaining four are trained to approach a normal distribution
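[Slide 30] Kadurin et al.: experimental workflow
• Prepare the dataset and train
  • NCI-60, MCF-7
  • 6252 compounds
  • Data consisting of fingerprint, LCONC and GI
• Sample from the model
  • Sample 640 vectors (virtual compounds)
• Extract
  • Keep those with LCONC < -5.0 M
  • This yields 32 vectors
• Search for compounds with similar features
  • Look up compounds with similar features in PubChem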
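[Slide 31] Kadurin et al.: experimental results
• PubChem: 72 million compounds
• Compounds with features similar to the 32 generated vectors were extracted from PubChem
• 69 compounds were obtained in the end
  • Several are already known as anticancer agents
  • 13 are covered by patents
  • Most are anthracyclines (currently among the most effective anticancer agents)
• Plot legend: PubChem; training data (blue); generated vectors (virtual compounds)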
[Slide 32] Molecule representation
• Fingerprint, SMILES, Graph
• Mater & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)
[Slide 33] Segler et al.: Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks
• Generates compounds with an LSTM
• Inputs and outputs are SMILES
• Repeats the following steps (called Hillclimb-MLE)
  1. Train the LSTM and sample from it
  2. Filter the samples with a target filtering model (this need not be a machine learning model)
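The sample, filter and retrain loop above can be sketched with a toy model. Everything here is a stand-in chosen for illustration: a weighted table over a fixed candidate list replaces the LSTM, re-estimating the weights from the kept samples replaces fine-tuning, and the scoring function (string length) is hypothetical.

```python
import random
from collections import Counter

def hillclimb_mle(candidates, score, keep, rounds=5, n_samples=200, seed=0):
    """Toy Hillclimb-MLE: sample, filter by score, refit by maximum likelihood."""
    rng = random.Random(seed)
    # Uniform weights stand in for a pretrained generative model.
    weights = {c: 1.0 for c in candidates}
    items = list(candidates)
    for _ in range(rounds):
        samples = rng.choices(items, weights=[weights[c] for c in items], k=n_samples)
        kept = [s for s in samples if keep(score(s))]
        if not kept:
            continue  # nothing passed the filter; keep the current model
        counts = Counter(kept)
        # Refit on the kept samples (count-based MLE with a little smoothing).
        weights = {c: counts.get(c, 0) + 0.01 for c in items}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

candidates = ["C", "CC", "CCC", "CCCC", "CCCCC"]
probs = hillclimb_mle(candidates, score=len, keep=lambda s: s >= 4)
print(probs)
```

After a few rounds nearly all probability mass sits on the candidates that pass the filter. The real method has the same loop structure, but the model can also generate strings outside any fixed candidate list.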
[Slide 34] Gomez-Bombarelli et al. (CVAE): Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules
• Generates compounds with an RNN + VAE
• Inputs and outputs are SMILES
• Searches for latent codes with a high value of the target property
[Slide 35] Kusner et al. (GVAE): Grammar Variational Autoencoder
• Diagram labels: Encoder, Decoder
• Generation takes the grammar (a context-free grammar) into account
[Slide 36] Yang et al. (ChemTS)
• Generates SMILES with MCTS and an RNN
• Optimizes penalized logP
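[Slide 37] Popova et al. (ReLeaSE). Popova et al. (2017), https://arxiv.org/abs/1711.10907
• A SMILES-based generative model
• Combined with reinforcement learning to optimize target properties
• Rewards are often computed with RDKit, but here a SMILES-based predictive model computes the reward
• This allows properties that RDKit cannot compute to be optimized as well
[Slide 38] Guimaraes et al. (ORGAN)
• Built on SeqGAN, an RNN-based GAN for sequential data
• Introduces reinforcement learning to optimize properties such as drug-likeness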
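[Slide 39] All SMILES VAE. Alperstein et al. (2019)
• Graph-based models
  • Most have about 3 to 7 layers
  • Each layer propagates information one hop further
• Molecules in ZINC250k
  • Mean diameter is 11.1
  • Maximum diameter is 24
• Information therefore cannot be propagated across the whole molecule
• Use an RNN to carry long-range information
• SMILES strings are not unique
  • Use multiple SMILES strings of the same molecule as input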
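The mismatch between layer count and molecular diameter can be made concrete with a small graph example. This is an illustrative sketch, not the authors' code: a 12-atom chain (diameter 11, close to the mean diameter quoted above) stands in for a molecule, and "one layer = one hop" is the simplification stated on the slide.

```python
from collections import deque

def hops_from(adj, start):
    """Breadth-first search: number of bonds from start to every reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(adj):
    """Longest shortest path in a connected graph."""
    return max(max(hops_from(adj, s).values()) for s in adj)

def reachable_within(adj, start, n_layers):
    """Atoms whose information can reach start after n_layers message-passing steps."""
    return {v for v, d in hops_from(adj, start).items() if d <= n_layers}

# Toy molecule: a linear chain of 12 atoms.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 12] for i in range(12)}
print(diameter(chain))
print(sorted(reachable_within(chain, 0, 3)))
print(len(reachable_within(chain, 0, 7)), "of", len(chain), "atoms seen with 7 layers")
```

Even with 7 layers, an end atom of this chain sees only 8 of the 12 atoms. That gap is the motivation given on the slide for letting an RNN over SMILES carry the long-range information.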
[Slide 40] Graph-based models (one-shot type)
[Slide 41] Molecule representation
• Fingerprint, SMILES, Graph
• Mater & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)
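[Slide 42] De Cao & Kipf (MolGAN)
• A model that generates the whole graph in one shot; it also uses a GAN and reinforcement learning
• Using graph convolutions in the discriminator makes it order invariant
• Optimizing individual properties appears to work well
• However, uniqueness is very low, around 2% (in the goal-directed setting)
  • Neither the GAN nor the RL objective contains a constraint that encourages diverse outputs
• Computation is fast because the graph is generated in one shot
• Experiments use QM9; applying the model to larger graphs looks difficult
[Slide 43] Pölsterl & Wachinger (LF-MolGAN)
• Generates the graph in one shot, like MolGAN, but introduces constraints on valency
• Avoids the graph isomorphism problem because the reconstruction loss is never computed explicitly
• Unlike an ordinary GAN, the architecture includes an encoder, so it is easy to search the latent space for molecules with high similarity
• Experiments use QM9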
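[Slide 44] Graph-based models (recurrent type)
[Slide 45] Li et al.: Learning Deep Generative Models of Graphs
• Adds nodes and edges one at a time as a graph, instead of generating SMILES
• Gives better results than GrammarVAE and similar models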
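[Slide 46] Jin et al. (JT-VAE): Junction Tree Variational Autoencoder for Molecular Graph Generation
• A simple approach would be to add nodes one at a time
• However, this can generate compounds that do not actually exist
• JT-VAE therefore generates the molecule cluster by cluster
[Slide 47] Jin et al. (JT-VAE): architecture
• The molecule is decomposed into a tree structure using predefined clusters
• The molecular graph is encoded by neural message passing
• The tree is encoded by a GRU
• A new tree structure is built from the embedding (nodes are added one at a time)
• The final compound is generated using both the graph embedding and the tree structure (this step is needed because there is freedom in how clusters are joined)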
[Slide 48] You et al. (GCPN): Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
• Generates a graph by adding edges one at a time
• Combines a GAN with reinforcement learning
[Slide 49] Li et al. (MolMP, MolRNN)
• A model conditioned on target properties and similar inputs
• A conditional code for QED or SAscore is fed in
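[Slide 50] Comparison of GANs and VAEs
• GAN
  • Advantages
    • Gives good results when tuned well
    • No need to compute a reconstruction loss (avoids the graph isomorphism problem)
  • Disadvantages
    • Hyperparameter tuning is difficult
    • Mode collapse (the model keeps generating the same outputs)
• VAE
  • Advantages
    • Trains more stably than a GAN
    • Hyperparameter tuning is easier
    • Mode collapse is less likely
  • Disadvantages
    • The reconstruction loss must be computed, so the graph isomorphism problem arises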
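[Slide 51] Comparison of fingerprint-, SMILES- and graph-based models
• Fingerprint-based
  • Hard to use because fingerprints are not invertible (and therefore rarely seen)
• SMILES-based
  • Stable performance
  • Validity tends to be low
  • Fragment-based generation is difficult
• Graph-based (one-shot type)
  • Fast
  • Only small molecules with 9 or fewer heavy atoms have been generated so far
  • Validity and uniqueness are low
• Graph-based (recurrent type)
  • High validity
  • Node and edge ordering is an issue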
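[Slide 52] How generative models are used
[Slide 53] Molecule generation
• Distribution Learning
• Predefined Scaffold
• Molecule Optimization
[Slide 54] Distribution Learning
• Figure: training data and generated data. Karras et al. (2018), https://github.com/NVlabs/ffhq-dataset
[Slide 55] Arús-Pous et al.: Exploring the GDB-13 chemical space using deep generative models
• GDB-13: a dataset of 975 million molecules made of up to 13 heavy atoms
• Trained on 1 million molecules, about 0.1% of the dataset
• A simple model that feeds SMILES to a GRU
• Sampling 2 billion molecules recovered 68.9% of GDB-13
• The model captured the characteristics of the compounds in GDB-13
• Some types of molecules turned out to be hard to generate because of the SMILES notation (for example, molecules with many rings)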
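[Slide 56] Molecular optimization. Choi et al. (2017)
[Slide 57] Molecular optimization
• Searching the latent space
  • Gradient ascent
  • Bayesian optimization
• Reinforcement learning
• Hillclimb-MLE (repeated filtering and retraining)
• Conditioning code (the condition is treated as an additional input)
[Slide 58] Molecular optimization (starting from a specific substructure)
• Optimizes penalized logP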
[Slide 59] Other uses (generating molecules highly similar to a given molecule)
Awale et al. (2018): Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks
1. Pre-train an LSTM on data from ChEMBL, DrugBank and FDB17
2. Fine-tune on a single molecule (experiments cover 10 different molecules)
3. Generate SMILES
  • Retain correct SMILES
  • Remove duplicates
  • Remove undesirable functional groups
4. Select the molecules with high similarity
[Slide 60] Other uses (generating molecules highly similar to a given molecule)
Awale et al. (2018): Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks
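[Slide 61] Evaluating the performance of generative models
[Slide 62] Why evaluating generative models is difficult. Karras et al. (2018)
• It is easy to see qualitatively that results look good, but hard to evaluate them quantitatively
• For compounds, even qualitative evaluation is harder than for face images
[Slide 63] Benchmarks for generative models
• Each paper uses different datasets (ChEMBL, ZINC, QM9, etc.) and different metrics, which makes comparison difficult
• In addition, the range of metrics used for comparison is insufficient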
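[Slide 64] Brown et al. (GuacaMol): Distribution-Learning benchmark
• Purpose of the distribution-learning benchmark
  • Evaluates whether the model reproduces the distribution of the training data
  • A model that does well on this task should have captured the characteristics of compounds, which is expected to help on goal-directed tasks too
• Validity
  • The fraction of generated compounds that are valid
  • Validity is checked with RDKit
• Uniqueness
  • Checks for duplicates: the fraction of compounds that are unique
• Novelty
  • The fraction of compounds not present in the training data
• Frechet ChemNet Distance (FCD)
  • Uses the features of ChemNet, a network trained for biological activity prediction, to measure how close the generated distribution is to the training distribution
  • For images, the Frechet Inception Distance (FID) is used to compare generative models; FCD is its counterpart for compounds
• KL Divergence
  • A measure of the difference between two probability distributions
  • Focuses on physicochemical properties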
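The first three metrics reduce to simple set arithmetic, sketched below. The denominators follow one common convention (uniqueness over valid molecules, novelty over unique valid molecules), which is an assumption about the exact definitions. The validity check is passed in as a callable, and the parenthesis-balance check used here is only a toy stand-in for the RDKit check mentioned on the slide.

```python
def distribution_metrics(generated, training, is_valid):
    """Validity, uniqueness and novelty of a list of generated strings."""
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }

def balanced_parentheses(smiles):
    """Toy validity check: non-empty with balanced branch parentheses."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and len(smiles) > 0

generated = ["CCO", "CCO", "C(C", "CCN"]
training = ["CCO"]
metrics = distribution_metrics(generated, training, balanced_parentheses)
print(metrics)
```

Because a molecule can have more than one SMILES string, a real evaluation would canonicalize the strings before comparing them. This sketch compares raw strings.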
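[Slide 65] Goal-Directed benchmark (molecular optimization)
• Purpose of the goal-directed benchmark
  • Evaluates models in a setting where a specific score is maximized
• Similarity
  • How close the model can get to a target that was removed from the training data
• Rediscovery
  • Like the above, but asks whether exactly the same molecule can be generated rather than a similar one
  • Requires an exact match
• Isomers
  • How many isomers the model can generate for a molecular formula such as C7H8N2O2
  • Not directly related to drug discovery, but measures the flexibility of the model
• Median molecules
  • Maximizes similarity to several molecules simultaneously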
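The difference between the similarity and rediscovery tasks can be sketched with set-based fingerprints. Tanimoto similarity over "on" bits is assumed here as the similarity measure, since the slide does not name one. The bit sets and identifiers are made-up illustration data.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    # Two empty fingerprints are treated as identical by convention in this sketch.
    return len(a & b) / len(union) if union else 1.0

def similarity_score(generated_fps, target_fp):
    """Similarity task: how close the best generated molecule gets to the target."""
    return max((tanimoto(fp, target_fp) for fp in generated_fps), default=0.0)

def rediscovered(generated_ids, target_id):
    """Rediscovery task: succeeds only on an exact match with the target."""
    return target_id in set(generated_ids)

target_fp = {2, 3, 4}
generated_fps = [{1, 2, 3}, {2, 3, 4, 5}]
print(similarity_score(generated_fps, target_fp))
print(rediscovered(["CCO", "CCN"], "CCN"))
```

A model can score well on similarity without ever reproducing the target, which is why rediscovery is listed as a separate and stricter task.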
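[Slide 66] Measuring Compound Quality
• Purpose of measuring compound quality
  • Compounds generated by earlier de novo design algorithms may be unstable, highly reactive, hard to synthesize, or look wrong to a medicinal chemist
  • It is therefore necessary to check whether the generated compounds are reasonable
• It is difficult to turn everything a medicinal chemist knows into rules and check against them
• Here rd_filter is applied
  • https://github.com/PatWalters/rd_filters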
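[Slide 67] Results: Distribution-learning benchmark
• Random sampler: it only draws compounds from ChEMBL, so none are broken and validity is 100%; however, novelty is zero
• SMILES LSTM: good overall
• Graph MCTS: fairly good, but KL and FCD are poor
• AAE: good except for FCD
• ORGAN: poor overall
• VAE: good overall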
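[Slide 68] Results: Goal-directed benchmark
• Best of Data Set
  • Picks the highest-scoring compounds from the training data; this is the minimum baseline a model must beat
• Graph GA
  • Best results
• SMILES LSTM
  • Good results, almost on par with Graph GA
• Other models
  • Clearly worse than Graph GA and SMILES LSTM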
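[Slide 69] Results: Compound Quality Measurement
• Compounds generated in the goal-directed tasks were quality-checked with rd_filter
• SMILES LSTM gives clearly better results
  • SMILES LSTM is pre-trained first and only then maximizes each score; the pre-training phase probably lets it learn the features that matter for a compound
• Graph GA, in contrast, does not do well; the likely issue is that it tries to maximize the score directly, without any prior knowledge
• Since SMILES LSTM and Graph GA were comparable on the goal-directed benchmark, SMILES LSTM is the better choice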
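[Slide 70] Pölsterl & Wachinger (LF-MolGAN)
• Validity, uniqueness and novelty are widely used but are not very good metrics
  • A model that picks nodes and edges at random (while respecting valency) ends up looking good
  • They do not consider whether the generated molecules resemble the training data and are chemically meaningful
[Slide 71] Directions for future development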
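[Slide 72] Multi-objective optimization: Guimaraes et al. (ORGAN)
• Optimizes three properties by alternating training on druglikeness, synthesizability and solubility
• Optimizing all three gives results close to those obtained when each is optimized alone
[Slide 73] Multi-objective optimization: Zhou et al. (MolDQN)
• A generative model that performs optimization with a DQN
• Experiments optimize similarity and QED (drug-likeness) simultaneously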
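[Slide 74] No dataset (pure RL): Zhou et al. (MolDQN)
• Learns without a dataset by using reinforcement learning
• Because there is no pre-training, a broad search is possible
[Slide 75] Taking the synthesis route into account: Bradshaw et al. (Molecule Chef)
• A model that also considers the synthesis route; it outputs both the reactants and the product
• Diagram labels: Encoder, Decoder
• A graph neural network produces embeddings of the reactants
• Reactants are output one after another, chosen from a set of known reactants
• A reaction predictor then turns them into the product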
[Slide 76] © Elix, Inc. https://elix-inc.com