The Front Line of AI Drug Discovery, Centered on Generative Models / Elix CBI 2019
Elix
October 22, 2019
A summary of the various generative models used in AI drug discovery. These are the slides from a talk given at the CBI Annual Meeting 2019.
Transcript
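Slide 1: The Front Line of AI Drug Discovery, Centered on Generative Models
Shinya Yuki, CEO, Elix, Inc.
2019/10/22, CBI Annual Meeting

Slide 2: Contents
• Introduction
• Core technologies
• Fingerprint- and SMILES-based models
• Graph-based models
• How generative models are used
• Evaluating the performance of generative models
• Directions for future development
• Elix Chem

Slide 3: Introduction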
Slide 4: Molecular design
Sanchez-Lengeling et al. (2018)
• Experiment / simulation
• Predictive models
• Generative models
There are roughly 10^60 drug-like molecules.
Restricted Elix, Inc.

Slide 5: Commonly used representations
Fingerprint, SMILES, Graph
Mater & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)
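Slide 6: The most commonly used representations
• Fingerprint
  • Many variants exist; ECFP is especially well known
  • Each bit corresponds to a specific substructure
  • Collisions can occur
  • Not invertible
• SMILES
  • Represents a compound as a character string
  • A single compound does not have a unique SMILES string
  • Slightly different compounds can have very different SMILES strings (the notation was not designed to express similarity between compounds)
• Graph
  • Represents a compound as nodes and edges
  • Feels like a natural representation
https://arxiv.org/abs/1802.04364 https://arxiv.org/abs/1903.04388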
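The two fingerprint caveats above (collisions, non-invertibility) can be illustrated with a toy sketch. This is not ECFP: as a loudly stated assumption, character 2-grams of the SMILES string stand in for substructures, and `toy_fingerprint` and `count_fragments` are hypothetical helper names. With more distinct fragments than bits, at least two fragments must share a bit, so the original fragments cannot be recovered from the bit vector.

```python
import zlib


def toy_fingerprint(smiles: str, n_bits: int = 8, width: int = 2) -> list:
    """Fold hashed character n-grams of a SMILES string into n_bits bits.

    Assumption: n-grams are only a stand-in for real substructures.
    """
    bits = [0] * n_bits
    for i in range(len(smiles) - width + 1):
        fragment = smiles[i:i + width]
        # crc32 is deterministic across runs, unlike the built-in hash().
        bits[zlib.crc32(fragment.encode()) % n_bits] = 1
    return bits


def count_fragments(smiles: str, width: int = 2) -> int:
    """Number of distinct n-grams that were hashed into the fingerprint."""
    return len({smiles[i:i + width] for i in range(len(smiles) - width + 1)})


aspirin = "CC(=O)Oc1ccccc1C(=O)O"
fp = toy_fingerprint(aspirin, n_bits=8)
n_fragments = count_fragments(aspirin)

# More distinct fragments than bits: a collision is unavoidable.
print(fp, "bits set:", sum(fp), "distinct fragments:", n_fragments)
```

Real fingerprints use far more bits (often 1024 or 2048), which makes collisions rarer but does not remove them.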
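Slide 7: Various predictive models
Wu et al. (2017)
Graph-based models often give the better results.

Slide 8: Architectures underlying generative models
Sanchez-Lengeling & Aspuru-Guzik (2018)

Slide 9: Various combinations
Schwalbe-Koda & Gómez-Bombarelli (2019)

Slide 10: List of recent generative models
Elton et al. (2019)

Slide 11: List of commonly used public datasets
https://arxiv.org/abs/1903.04388 Elton et al. (2019)

Slide 12: Core technologies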
Slide 13: Generative Adversarial Networks (GANs)
Karras et al. (2018)

Slide 14: Generative Adversarial Networks (GANs)
A type of generative model.
• Generator (G): generates fake images and tries to fool D
• Discriminator (D): tries to tell real images from fake ones
Diagram: Noise → G → fake (generated) images; real images (training set); D decides "real or fake?"
Karras et al. (2017)
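The generator/discriminator game described above is usually written as the following minimax objective. This is the standard textbook form, given here as a reference sketch; the slide itself does not state a formula. Here $p_{\text{data}}$ is the training-data distribution and $p_z$ is the noise distribution fed to G.

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

D is rewarded for assigning high probability to real samples and low probability to generated ones, while G is rewarded when D misclassifies its outputs.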
Slide 15: Generative Adversarial Networks (GANs)

Slide 16: Autoencoders

Slide 17: Autoencoders

Slide 18: Variational Autoencoders (VAEs)
Loss terms: reconstruction; deviation from the normal distribution
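The two terms named on the VAE slide correspond to the usual VAE training loss. The notation below ($q_\phi$ for the encoder, $p_\theta$ for the decoder, a standard normal prior) is the conventional one and is an assumption, since the slide shows only the term labels.

```latex
\mathcal{L}(x)
  = \underbrace{-\,\mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]}_{\text{reconstruction}}
  + \underbrace{D_{\mathrm{KL}}\bigl(q_\phi(z \mid x)\,\|\,\mathcal{N}(0, I)\bigr)}_{\text{deviation from the normal distribution}}
```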
Slide 19: Recurrent Neural Networks (RNNs)
Segler et al. (2017)

Slide 20: Graph Representations
Peter et al. (2018)
https://www.businessinsider.com/explainer-what-exactly-is-the-social-graph-2012-3

Slide 21: Graph Neural Networks
Peter et al. (2018)

Slide 22: Graph Neural Networks
Peter et al. (2018)

Slide 23: Graph Convolutional Networks
2D Convolution vs. Graph Convolution
Wu et al. (2019)
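As a minimal sketch of what one graph-convolution step does, the code below averages each node's features with those of its neighbours, applies a linear map, and then a ReLU. This is a simplified mean-aggregation variant chosen for readability, not the exact layer from any cited paper; `graph_conv` and the ethanol example are illustrative assumptions.

```python
def graph_conv(adj, feats, weight):
    """One simplified graph-convolution step.

    adj:    adjacency list, adj[i] = neighbours of node i
    feats:  feats[i] = feature vector of node i
    weight: matrix (list of rows) applied after neighbourhood averaging
    """
    n_in, n_out = len(weight), len(weight[0])
    out = []
    for i, neighbours in enumerate(adj):
        group = [i] + list(neighbours)  # self-loop plus neighbours
        mean = [sum(feats[j][k] for j in group) / len(group) for k in range(n_in)]
        lin = [sum(mean[k] * weight[k][o] for k in range(n_in)) for o in range(n_out)]
        out.append([max(0.0, v) for v in lin])  # ReLU
    return out


# Heavy-atom graph of ethanol, C-C-O, as an adjacency list.
adj = [[1], [0, 2], [1]]
feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # one-hot: [carbon, oxygen]
identity = [[1.0, 0.0], [0.0, 1.0]]

h1 = graph_conv(adj, feats, identity)  # after one layer
h2 = graph_conv(adj, h1, identity)     # after two layers
print(h1[0], h2[0])
```

After one layer the terminal carbon (node 0) still carries no oxygen signal; after two layers it does, because each layer moves information by one bond.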
Slide 24: Reinforcement Learning (RL)
Sutton & Barto (2018); Mnih et al. (2015)

Slide 25: Reinforcement Learning (RL)
Sutton & Barto (2018); Mnih et al. (2015)
Example rewards: QED, logP

Slide 26: Transfer Learning
• A very large unlabeled dataset
• A small amount of labeled data
• Compute properties such as logP with RDKit and pre-train on them
Goh et al. (2017)
Slide 27: Fingerprint- and SMILES-based models

Slide 28: Molecule representation
Fingerprint, SMILES, Graph
Mater & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)
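Slide 29: Kadurin et al.
The cornucopia of meaningful leads: Applying deep AAEs for new molecule development in oncology
• Inputs and outputs
  • Binary fingerprints (MACCS)
  • Log concentration (LCONC)
• Latent (middle) layer
  • Consists of 5 neurons
  • One is the Growth Inhibition percentage (GI)
  • The remaining four are trained to approach a normal distribution

Slide 30: Kadurin et al.: experimental workflow
1. Prepare a dataset and train
  • NCI-60, MCF-7
  • 6252 compounds
  • Each record consists of fingerprint, LCONC, and GI
2. Sample from the model
  • Sample 640 vectors (virtual compounds)
3. Filter
  • Keep those with LCONC < -5.0 M
  • This yields 32 vectors
4. Search for compounds with similar features
  • Look up compounds with similar features in PubChem

Slide 31: Kadurin et al.: experimental results
• PubChem: 72 million compounds
• Compounds with features similar to the 32 generated vectors were extracted from PubChem
• 69 compounds were obtained in the end
• Several are already known as anticancer drugs
• 13 are covered by patents
• Most are anthracyclines (currently the most effective anticancer drugs)
Figure legend: PubChem; training data (blue); generated vectors (virtual compounds)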
Slide 32: Molecule representation
Fingerprint, SMILES, Graph
Mater & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)

Slide 33: Segler et al.
Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks
• Generates compounds with an LSTM
• Inputs and outputs are SMILES
• Repeats the following (also called Hillclimb-MLE)
  1. Train the LSTM and sample from it
  2. Filter the samples with a target filtering model (this does not have to be a machine-learning model)
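The sample, filter, retrain loop above can be sketched with toy stand-ins. The assumptions are loud ones: a unigram character model replaces the LSTM, "number of N characters" replaces the target filtering model, and `hillclimb_mle` and its parameters are hypothetical names. Only the shape of the loop is meant to carry over.

```python
import random

ALPHABET = "CNO"


def sample(probs, n, length, rng):
    """Draw n strings from a unigram model (toy stand-in for the LSTM)."""
    return ["".join(rng.choices(ALPHABET, weights=probs, k=length)) for _ in range(n)]


def fit_mle(strings):
    """Maximum-likelihood fit of a unigram model: relative character frequencies."""
    total = sum(len(s) for s in strings)
    return [sum(s.count(ch) for s in strings) / total for ch in ALPHABET]


def hillclimb_mle(score, rounds=5, n=500, length=10, keep=0.2, seed=0):
    rng = random.Random(seed)
    probs = [1 / 3] * 3
    history = [probs]
    for _ in range(rounds):
        samples = sample(probs, n, length, rng)                # 1. sample from the model
        best = sorted(samples, key=score, reverse=True)[: int(n * keep)]  # 2. filter
        probs = fit_mle(best)                                  # retrain on what survived
        history.append(probs)
    return history


# Toy "target filter": prefer strings containing many N characters.
history = hillclimb_mle(score=lambda s: s.count("N"))
print([round(p[1], 2) for p in history])  # probability of "N" per round
```

Each round retrains only on the samples that pass the filter, so the model drifts toward the region the filter favours.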
Slide 34: Gómez-Bombarelli et al.: CVAE
Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules
• Generates compounds with an RNN + VAE
• Inputs and outputs are SMILES
• Searches for latent codes with a large value of the target property

Slide 35: Kusner et al.: GVAE
Grammar Variational Autoencoder (Encoder, Decoder)
Generation takes the grammar (a context-free grammar) into account.

Slide 36: Yang et al.: ChemTS
Generates SMILES with MCTS and an RNN.
Optimizes penalized logP.
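For readers unfamiliar with the objective, penalized logP is commonly defined in the molecular-generation literature roughly as below. The slide does not spell it out, so treat this as a hedged reminder rather than the exact form used in ChemTS; papers differ in details, for instance whether each term is standardized on the training set.

```latex
J(m) = \log P(m) - \mathrm{SA}(m) - \mathrm{RingPenalty}(m)
```

Here $\log P(m)$ is the octanol-water partition coefficient, $\mathrm{SA}(m)$ is a synthetic accessibility score, and $\mathrm{RingPenalty}(m)$ penalizes rings with more than six atoms.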
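Slide 37: Popova et al.: ReLeaSE
https://arxiv.org/abs/1711.10907 Popova et al. (2017)
• A SMILES-based generative model
• Combined with reinforcement learning to optimize a target property
• The reward is usually computed with RDKit, but here it is computed by a SMILES-based predictive model
• This makes it possible to optimize properties that RDKit cannot compute

Slide 38: Guimaraes et al.: ORGAN
• Based on SeqGAN, an RNN-based GAN for sequential data
• Introduces reinforcement learning to optimize properties such as druglikeness

Slide 39: All SMILES VAE
• Graph-based models
  • Most have about 3 to 7 layers
  • Each layer propagates information from one step further away
• Molecules in ZINC250k
  • The average diameter is 11.1
  • The maximum diameter is 24
• Information therefore cannot reach across the whole molecule
  • An RNN is used to carry long-range information
• SMILES strings are not unique
  • Multiple SMILES strings are used as input
Alperstein et al. (2019)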
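Slide 40: Graph-based models (one-shot type)

Slide 41: Molecule representation
Fingerprint, SMILES, Graph
Mater & Coote (2019); Schwalbe-Koda & Gómez-Bombarelli (2019)

Slide 42: De Cao & Kipf: MolGAN
A model that generates the whole graph in one shot. It uses a GAN together with reinforcement learning.
• Using graph convolution in the discriminator makes it order invariant
• Optimizing each property appears to work well
• However, uniqueness is very low, around 2% (in the goal-directed setting)
  • GANs and RL impose no constraint that makes the outputs diverse
• Computation is fast because the graph is generated in one shot
• Experiments use QM9; applying the method to larger graphs looks difficult

Slide 43: Pölsterl & Wachinger: LF-MolGAN
• Generates the graph in one shot, like MolGAN; this model introduces constraints on valency
• It never computes a reconstruction loss explicitly, which avoids the graph isomorphism problem
• Unlike an ordinary GAN it also includes an encoder, so it is easy to search the latent space for molecules with high similarity
• Experiments use QM9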
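Slide 44: Graph-based models (recurrent type)

Slide 45: Li et al.
Learning Deep Generative Models of Graphs
• Adds nodes and edges one after another as a graph, rather than emitting SMILES
• Better results than GrammarVAE and similar models

Slide 46: Jin et al.: JT-VAE
Junction Tree Variational Autoencoder for Molecular Graph Generation
• The simplest approach would be to add nodes one at a time
• However, that can generate compounds that do not actually exist
• So the molecule is generated cluster by cluster instead

Slide 47: Jin et al.: JT-VAE
Diagram annotations:
• Decompose the molecule into a tree structure using predefined clusters
• Encode with neural message passing
• Encode with a GRU
• Build a new tree structure from the embedding (adding nodes one at a time)
• Generate the final compound using both the graph embedding and the tree structure (this step is needed because there is freedom in how the clusters are combined)

Slide 48: You et al.: GCPN
Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
• Generates the graph by adding edges one at a time
• A model that combines a GAN with reinforcement learning

Slide 49: Li et al.: MolMP, MolRNN
• Feeds in a conditional code for properties such as QED and SAscore
• A model of the type that conditions on the target property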
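Slide 50: Comparing GANs and VAEs
GAN
• Advantages
  • Good results when tuned well
  • No need to compute a reconstruction loss (avoids the graph isomorphism problem)
• Disadvantages
  • Hyperparameter tuning is difficult
  • Mode collapse (the model keeps generating the same things)
VAE
• Advantages
  • Trains more stably than a GAN
  • Hyperparameter tuning is easy
  • Mode collapse is unlikely
• Disadvantages
  • A reconstruction loss must be computed, so the graph isomorphism problem arises

Slide 51: Comparing fingerprint-, SMILES-, and graph-based models
• Fingerprint-based
  • Hard to use because fingerprints are not invertible (and therefore rarely seen)
• SMILES-based
  • Stable performance
  • Validity tends to be low
  • Fragment-based generation is difficult
• Graph-based (one-shot type)
  • Fast
  • So far only small molecules with 9 or fewer heavy atoms have been generated
  • Validity and uniqueness are low
• Graph-based (recurrent type)
  • High validity
  • The ordering of nodes and edges is an issue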
Slide 52: How generative models are used

Slide 53: Molecule generation
• Distribution Learning
• Predefined Scaffold
• Molecule Optimization
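Slide 54: Distribution Learning
https://github.com/NVlabs/ffhq-dataset Karras et al. (2018)
Training data vs. generated data

Slide 55: Arús-Pous et al.: Exploring the GDB-13 chemical space using deep generative models
• GDB-13: a dataset of 975 million molecules built from up to 13 heavy atoms
• Trained on 1 million molecules, about 0.1% of the dataset
• A simple model that feeds SMILES into a GRU
• Sampling 2 billion molecules recovered 68.9% of GDB-13
• The model captured the characteristics of the compounds in GDB-13
• Some types of molecule turned out to be hard to generate because of the SMILES notation (for example, those containing many rings)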
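Slide 56: Molecular optimization
Choi et al. (2017)

Slide 57: Molecular optimization
• Search the latent space
  • Gradient ascent
  • Bayesian optimization
• Reinforcement learning
• Hillclimb-MLE (train repeatedly on filtered samples)
• Conditioning code (the condition is treated as an input)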
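The first option above, searching the latent space by gradient ascent, can be sketched as follows. Everything here is a toy assumption: `property_model` is a hypothetical smooth predictor on a 2-D latent space with a known optimum, and the gradient is taken numerically. In a real system the predictor would be a trained network and the optimized latent vector would be passed to the decoder.

```python
def property_model(z, target=(1.0, -2.0)):
    """Hypothetical property predictor on a 2-D latent space (toy stand-in)."""
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, target))


def numerical_gradient(f, z, eps=1e-5):
    """Central-difference estimate of the gradient of f at z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def gradient_ascent(f, z0, lr=0.1, steps=100):
    """Move the latent vector uphill on the predicted property."""
    z = list(z0)
    for _ in range(steps):
        g = numerical_gradient(f, z)
        z = [zi + lr * gi for zi, gi in zip(z, g)]
    return z


z_start = [0.0, 0.0]
z_best = gradient_ascent(property_model, z_start)
print(z_best, property_model(z_best))
```

Bayesian optimization targets the same goal without needing gradients, which helps when the property predictor is expensive or not differentiable.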
Slide 58: Molecular optimization (starting from a specific substructure)
Optimizes penalized logP.

Slide 59: Other uses (generating molecules highly similar to a given molecule)
Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks
Awale et al. (2018)
1. Pre-train an LSTM on data from ChEMBL, DrugBank, and FDB17
2. Then fine-tune on a single molecule (experiments with 10 different molecules)
3. Generate SMILES
  • Retain correct SMILES
  • Remove duplicates
  • Remove undesirable functional groups
4. Select the molecules with high similarity

Slide 60: Other uses (generating molecules highly similar to a given molecule)
Drug Analogs from Fragment Based Long Short-Term Memory Generative Neural Networks
Awale et al. (2018)
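Slide 61: Evaluating the performance of generative models

Slide 62: Why generative models are hard to evaluate
Karras et al. (2018)
• It is possible to tell qualitatively that results look good, but quantitative evaluation is difficult
• For compounds, even qualitative evaluation is harder than for face images

Slide 63: Benchmarks for generative models
• Papers use different datasets (ChEMBL, ZINC, QM9, and others) and different metrics, so comparison is difficult
• The range of metrics used for comparison is also insufficient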
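Slide 64: Brown et al.: GuacaMol, distribution-learning benchmark
• Purpose of the distribution-learning benchmark
  • Evaluates how well the model reproduces the distribution of the training data
  • A model that does well on this task should have captured the characteristics of compounds, which is expected to help on goal-directed tasks as well
• Validity
  • The fraction of generated compounds that are valid
  • Validity is checked with RDKit
• Uniqueness
  • Checks for duplicates: the fraction of unique compounds
• Novelty
  • The fraction of compounds that do not appear in the training data
• Frechet ChemNet Distance (FCD)
  • Uses features from ChemNet, trained on bioactivity prediction, to measure how close the generated set is to the training-data distribution
  • For images, the Frechet Inception Distance (FID) is used to compare generative models; FCD is the compound version
• KL Divergence
  • A measure of the difference between two probability distributions
  • Emphasizes physicochemical features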
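The simpler metrics above reduce to set arithmetic, sketched below. The assumptions should be read carefully: the validity check is passed in as a callable (the slides use RDKit; here a hypothetical `balanced_parentheses` toy check stands in for it), duplicates are detected by plain string comparison (a real benchmark would canonicalize SMILES first, since SMILES are not unique), and the denominators follow one reasonable convention that may differ from the benchmark's own code.

```python
import math


def distribution_metrics(generated, training, is_valid):
    """Validity, uniqueness, and novelty as simple ratios."""
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training)
    return {
        "validity": len(valid) / len(generated),
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as aligned lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def balanced_parentheses(smiles):
    """Toy validity check (assumption): only verifies that branches close."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


generated = ["CCO", "CCO", "C(C", "CCN", "c1ccccc1"]
training = ["CCO"]
metrics = distribution_metrics(generated, training, balanced_parentheses)
print(metrics)
print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

In the benchmark, KL divergence is applied to distributions of physicochemical descriptors rather than to hand-written probabilities as in this toy call.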
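Slide 65: Goal-directed benchmark (molecular optimization)
• Purpose of the goal-directed benchmark
  • Evaluates models in a setting where a specific score is maximized
• Similarity
  • How close the model can get to a target that was removed from the training data
• Rediscovery
  • Like the above, but asks whether the exact same molecule can be generated rather than a similar one
  • This requires an exact match
• Isomers
  • How many isomers the model can generate for a molecular formula such as C7H8N2O2
  • Not directly related to drug discovery, but it evaluates the flexibility of the model
• Median molecules
  • Maximizes similarity to several molecules at once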
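The difference between the similarity and rediscovery tasks can be made concrete with Tanimoto similarity on fingerprint bit sets. This is a minimal sketch under stated assumptions: fingerprints are represented as Python sets of on-bit indices, and `similarity_score` and `rediscovered` are hypothetical helper names, not the benchmark's own functions.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def similarity_score(generated, target):
    """Similarity task: how close the best generated molecule gets to the target."""
    return max(tanimoto(fp, target) for fp in generated)


def rediscovered(generated, target):
    """Rediscovery task: only an exact match counts."""
    return any(fp == target for fp in generated)


target = {1, 4, 7, 9}
generated = [{1, 4}, {1, 4, 7}, {2, 3}]

best = similarity_score(generated, target)
found = rediscovered(generated, target)
print(best, found)
```

A model can score well on similarity while still failing rediscovery, which is why the benchmark reports them separately.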
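Slide 66: Measuring Compound Quality
• Purpose of measuring compound quality
  • Compounds generated by earlier de novo design algorithms may be unstable, highly reactive, hard to synthesize, or look wrong to a medicinal chemist
  • So it is necessary to check whether the generated compounds are reasonable
  • Turning everything a medicinal chemist knows into rules and checking against them is difficult
  • Here rd_filter is applied
  • https://github.com/PatWalters/rd_filters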
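Slide 67: Results: distribution-learning benchmark
• Random sampler: it only draws from ChEMBL, so there are no broken compounds and validity is 100%; however, novelty is zero
• SMILES LSTM: good overall
• Graph MCTS: fairly good, but KL and FCD are poor
• AAE: good except for FCD
• ORGAN: poor overall
• VAE: good overall

Slide 68: Results: goal-directed benchmark
• Best of Data Set
  • Picks the highest-scoring compound from the training data; this is the minimum baseline a model must beat
• Graph GA
  • The best results
• SMILES LSTM
  • Good results, nearly equal to Graph GA
• Other models
  • Clearly worse than Graph GA and SMILES LSTM

Slide 69: Results: compound quality measurement
• Compounds generated in the goal-directed tasks were quality-checked with rd_filter
• SMILES LSTM gives clearly better results
• SMILES LSTM is first pre-trained and only then maximizes each score; the pre-training phase probably let it learn the features that matter for compounds
• Graph GA, in contrast, gives rather poor results; the problem seems to lie in maximizing the score directly without any prior knowledge
• Since SMILES LSTM and Graph GA performed equally on the goal-directed benchmark, SMILES LSTM is the better choice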
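Slide 70: Pölsterl & Wachinger: LF-MolGAN
• Validity, uniqueness, and novelty are widely used, but they are not very good metrics
• Even a model that picks nodes and edges at random (while respecting valency) looks good on them
• They do not consider whether the generated molecules resemble the training data and are chemically meaningful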
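Slide 71: Directions for future development

Slide 72: Multi-objective optimization: Guimaraes et al., ORGAN
• Optimizes three properties by alternating training on druglikeness, synthesizability, and solubility
• Even when all three are optimized, the results are close to those obtained when optimizing each one alone

Slide 73: Multi-objective optimization: Zhou et al., MolDQN
• A generative model that performs optimization with a DQN
• Runs experiments that optimize similarity and QED (drug-likeness) simultaneously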
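One common way to optimize two objectives at once is to scalarize them into a single weighted reward. The form below is offered as an illustrative assumption about how similarity and QED can be traded off; the slide does not state the formula, and $m_0$ (the reference molecule) and $w$ (the trade-off weight) are notation introduced here.

```latex
r(m) = w \cdot \mathrm{SIM}(m, m_{0}) + (1 - w) \cdot \mathrm{QED}(m), \qquad 0 \le w \le 1
```

Sweeping $w$ from 0 to 1 moves the optimum from pure drug-likeness toward pure similarity to the reference molecule.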
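Slide 74: No dataset (pure RL): MolDQN, Zhou et al.
• Learns without a dataset by using reinforcement learning
• Because there is no pre-training, it can explore broadly

Slide 75: Taking the synthesis route into account: Bradshaw et al., Molecule Chef
A model that also considers the synthesis route. It outputs both the reactants and the product.
• Encoder: a graph neural network produces embeddings of the reactants
• Decoder: outputs the reactants one after another; the reactants are chosen from a set of known compounds
• A reaction predictor then turns them into the product

Slide 76: Elix, Inc.
https://elix-inc.com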