
Elix
October 31, 2024

SynthFormer: A Customizable Framework for Virtual Synthesis-Based Molecule Generation, Elix, CBI2024


Transcript

  1. SynthFormer: A Customizable Framework for Virtual Synthesis-Based Molecule Generation

    Elix, Inc.
    Chem-Bio Informatics Society (CBI) Annual Meeting 2024, Tokyo, Japan | October 31, 2024
    Joshua Owoyemi, Ph.D. & Tasuku Ishida, Ph.D.
  2. Introduction

    • Evaluating the synthetic accessibility of generated compounds is challenging.
    • Clients have preferences about which building blocks appear in generated compounds.
    • Virtual forward synthesis could help generate compounds while also suggesting synthetic paths.

    [Diagram: Generative - Retrosynthesis Approach (Generative Model, Retrosynthesis, Building Blocks)]
    Retrosynthesis:
    • Expensive
    • May require proprietary data
    Building Blocks:
    • May not be readily available
    • Many steps may be required to achieve the desired compounds
  3. Method: Virtual Synthesis-Based Molecule Generation

    • A synthetic path is obtained together with the generated compounds
    • Better control of building blocks
    • Reaction preferences can be streamlined
  4. Method: Virtual Synthesis-Based Molecule Generation

    • Reaction Template Library: A set of reaction templates available to the user. We experimented with templates from DINGOS (design of innovative new synthetic chemical entities generated by optimization strategies) [1] and our own in-house selections.

    [1] Alexander Button, Daniel Merk, Jan A. Hiss, Gisbert Schneider. Automated de novo molecular design by hybrid machine intelligence and rule-driven chemical synthesis. Nat Mach Intell 1, 307–315 (2019). https://doi.org/10.1038/s42256-019-0067-7
  5. Method: Virtual Synthesis-Based Molecule Generation

    • Building blocks library and starting compounds: The set of reagents available to the user. We experimented with a subset of the Enamine Building Blocks catalogue (Global Stock) [1] containing 150K compounds and the Namiki building blocks [2] containing 15K compounds.

    [1] https://enamine.net/building-blocks
    [2] https://www.namiki-s.co.jp/compound/database.php
  6. Method: Virtual Synthesis-Based Molecule Generation

    • Reaction Template Selector: A method to select a reaction from the reaction template library.
        ◦ Random Selector or Substructure Aware Selector
    • Reactant Selector: A method to select a reagent from the building blocks library.
        ◦ Random Selector or Substructure Aware Selector
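The two selectors on this slide can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: functional-group tags stand in for real substructure matching, and the template names, building-block IDs, and function names are hypothetical rather than taken from the framework.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Toy reaction template: one required group on each reaction partner."""
    name: str
    needs_on_molecule: str   # group required on the current molecule
    needs_on_reactant: str   # group required on the building block


# Hypothetical libraries; the tags replace real substructure queries.
TEMPLATES = [
    Template("Amide Formation", "carboxylic_acid", "amine"),
    Template("Reductive Amination-Ketone", "ketone", "amine"),
    Template("Williamson Ether", "alcohol", "alkyl_halide"),
]
BUILDING_BLOCKS = {
    "BB-001": {"amine"},
    "BB-002": {"alkyl_halide"},
    "BB-003": {"amine", "alcohol"},
}


def random_selector(options, rng):
    """Random Selector: pick any option uniformly."""
    return rng.choice(list(options))


def aware_templates(groups):
    """Substructure-aware filter: templates the current molecule can undergo."""
    return [t for t in TEMPLATES if t.needs_on_molecule in groups]


def aware_reactants(template):
    """Substructure-aware filter: building blocks that fit the template."""
    return sorted(
        bb for bb, groups in BUILDING_BLOCKS.items()
        if template.needs_on_reactant in groups
    )


def synthesis_step(groups, rng):
    """Pick one compatible (template, reactant) pair, or None if none exists."""
    templates = aware_templates(groups)
    if not templates:
        return None
    template = random_selector(templates, rng)
    reactants = aware_reactants(template)
    if not reactants:
        return None
    return template.name, random_selector(reactants, rng)
```

Filtering before sampling means no step is spent on a template or reagent that cannot react, whereas a purely random selector would have to discard such picks afterwards.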
  7. Method: Virtual Synthesis-Based Molecule Generation

    • Optimizer: A method to score and improve generated compounds across multiple iterations:
        ◦ Monte Carlo Tree Search (MCTS), useful where exhaustive search is not feasible due to the vast number of possibilities.
        ◦ Twin Delayed Deep Deterministic Policy Gradient (TD3), a reinforcement learning algorithm suited to large action spaces.
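As a rough illustration of the MCTS option only (TD3 is not sketched), the loop below runs the four standard MCTS phases on a toy problem. The action set, depth, target, and scoring function are all hypothetical stand-ins; in the framework the actions would be reaction and reagent choices and the score would come from the product molecule.

```python
import math
import random

ACTIONS = [1, 2, 3, 4]   # hypothetical stand-ins for building-block choices
DEPTH = 3                # number of virtual reaction steps per route
TARGET = 10              # toy objective: the sum of choices should hit TARGET


def score(path):
    """Toy reward in [0, 1]; a real run would score the product molecule."""
    return 1.0 - abs(TARGET - sum(path)) / TARGET


class Node:
    def __init__(self, path, parent=None):
        self.path, self.parent = path, parent
        self.children = {}
        self.visits, self.total = 0, 0.0

    def ucb_child(self, c=1.4):
        """Return the child with the highest UCB1 value."""
        return max(
            self.children.values(),
            key=lambda n: n.total / n.visits
            + c * math.sqrt(math.log(self.visits) / n.visits),
        )


def mcts(iterations, rng):
    root = Node(())
    best_path, best_score = (), -1.0
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while len(node.path) < DEPTH and len(node.children) == len(ACTIONS):
            node = node.ucb_child()
        # 2. Expansion: add one untried action below a non-terminal node.
        if len(node.path) < DEPTH:
            action = rng.choice([a for a in ACTIONS if a not in node.children])
            node.children[action] = Node(node.path + (action,), node)
            node = node.children[action]
        # 3. Rollout: complete the route with random choices.
        remaining = DEPTH - len(node.path)
        path = node.path + tuple(rng.choice(ACTIONS) for _ in range(remaining))
        reward = score(path)
        if reward > best_score:
            best_path, best_score = path, reward
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best_path, best_score
```

Calling `mcts(500, random.Random(0))` returns the best route found and its score. No model is trained at any point, which is consistent with the convenience argument made for MCTS in the results.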
  8. Experiments

    • Lipinski’s Rule of 5 Distribution: Evaluation of the potential of compounds generated by our framework to be orally active or viable drugs [1].
    • GuacaMol’s Goal-directed Benchmark: Standard benchmark for de novo molecular design, comprising 13 tasks including rediscovery, similarity and multi-objective optimization [2].
    • DRD2 Actives Generation: Generation of active compounds for the Dopamine Receptor D2 (DRD2) target, a primary target for drugs used to treat schizophrenia and Parkinson’s disease [3].
    • Donepezil Rediscovery: An attempt to rediscover donepezil, a known acetylcholinesterase inhibitor, and obtain possible synthetic paths for it [4].
    • Patent Compounds Rediscovery: An attempt to rediscover and obtain synthetic paths for 3 selected patented compounds not present in public training datasets.

    [1] Christopher A. Lipinski, Franco Lombardo, Beryl W. Dominy, and Paul J. Feeney. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. 1997. Advanced Drug Delivery Reviews. 46 (1–3): 3–26. doi:10.1016/S0169-409X(00)00129-0.
    [2] Nathan Brown, Marco Fiscato, Marwin Segler, and Alain Vaucher. GuacaMol: Benchmarking Models for de Novo Molecular Design. Journal of Chemical Information and Modeling. 2019. 59. doi:10.1021/acs.jcim.8b00839.
    [3] Sheng Wang, Tao Che, Anat Levit, Brian K. Shoichet, Daniel Wacker, and Bryan L. Roth. Structure of the D2 dopamine receptor bound to the atypical antipsychotic drug risperidone. Nature. 2018 Mar 8;555(7695):269-273. doi:10.1038/nature25758. Epub 2018 Jan 24. PMID: 29466326; PMCID: PMC5843546.
    [4] Hachiro Sugimoto, Hiroo Ogura, Yasuo Arai, Youichi Iimura, and Yoshiharu Yamanishi. Research and development of donepezil hydrochloride, a new type of acetylcholinesterase inhibitor. Japanese Journal of Pharmacology, 89(1):7–20, 2002. ISSN 0021-5198. doi:10.1254/jjp.89.7. URL: https://www.sciencedirect.com/science/article/pii/S0021519819301428.
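For the Rule of 5 experiment, the check itself is simple once descriptors are available. The sketch below assumes the four descriptors have already been computed by a cheminformatics toolkit; the `Descriptors` class, the function names, and the "at most one violation" default are illustrative choices, and the slides do not say how the framework turns the rule into an optimization score.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one compound (values supplied by the caller)."""
    mol_weight: float       # molecular weight in Da
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def ro5_violations(d):
    """Count how many of Lipinski's four thresholds are exceeded."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_ro5(d, max_violations=1):
    """Common reading of the rule: tolerate at most one violation."""
    return ro5_violations(d) <= max_violations


# Hypothetical descriptor values, for illustration only.
small = Descriptors(mol_weight=350.0, logp=3.2, h_bond_donors=1, h_bond_acceptors=4)
large = Descriptors(mol_weight=620.0, logp=6.1, h_bond_donors=2, h_bond_acceptors=8)
```

Here `small` exceeds no threshold, while `large` exceeds the weight and logP thresholds and therefore fails under the default setting.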
  9. Results: GuacaMol’s Goal-Directed Benchmark

    | Task                     | Best of dataset | MCTS  | TD3   |
    |--------------------------|-----------------|-------|-------|
    | Celecoxib rediscovery    | 0.505           | 0.556 | 0.591 |
    | Troglitazone rediscovery | 0.419           | 0.474 | 0.496 |
    | Thiothixene rediscovery  | 0.456           | 0.528 | 0.567 |
    | Aripiprazole similarity  | 0.595           | 1.000 | 1.000 |
    | Albuterol similarity     | 0.719           | 1.000 | 1.000 |
    | Mestranol similarity     | 0.629           | 1.000 | 0.908 |
    | Osimertinib MPO          | 0.839           | 0.819 | 0.812 |
    | Fexofenadine MPO         | 0.817           | 0.803 | 0.769 |
    | Ranolazine MPO           | 0.792           | 0.764 | 0.778 |
    | Perindopril MPO          | 0.575           | 0.638 | 0.619 |
    | Amlodipine MPO           | 0.696           | 0.727 | 0.727 |
    | Sitagliptin MPO          | 0.509           | 0.366 | 0.305 |
    | Zaleplon MPO             | 0.547           | 0.469 | 0.418 |
    | Average                  | 0.703           | 0.703 | 0.692 |

    We compared the performance of the MCTS and TD3 reinforcement learning optimizers. While there is no significant difference between the two optimizers, we found MCTS capable and more convenient, since it needs no learning.
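The Average row can be recomputed from the per-task scores as a sanity check. This sketch assumes the three numbers after each task name are the Best of dataset, MCTS, and TD3 columns in that order; the variable and function names are illustrative.

```python
# Per-task scores as transcribed: (best of dataset, MCTS, TD3).
SCORES = {
    "Celecoxib rediscovery": (0.505, 0.556, 0.591),
    "Troglitazone rediscovery": (0.419, 0.474, 0.496),
    "Thiothixene rediscovery": (0.456, 0.528, 0.567),
    "Aripiprazole similarity": (0.595, 1.000, 1.000),
    "Albuterol similarity": (0.719, 1.000, 1.000),
    "Mestranol similarity": (0.629, 1.000, 0.908),
    "Osimertinib MPO": (0.839, 0.819, 0.812),
    "Fexofenadine MPO": (0.817, 0.803, 0.769),
    "Ranolazine MPO": (0.792, 0.764, 0.778),
    "Perindopril MPO": (0.575, 0.638, 0.619),
    "Amlodipine MPO": (0.696, 0.727, 0.727),
    "Sitagliptin MPO": (0.509, 0.366, 0.305),
    "Zaleplon MPO": (0.547, 0.469, 0.418),
}


def column_means(scores):
    """Mean of each column across all tasks, rounded to three decimals."""
    columns = zip(*scores.values())
    return tuple(round(sum(col) / len(scores), 3) for col in columns)


best_mean, mcts_mean, td3_mean = column_means(SCORES)
```

Under this reading the MCTS and TD3 means reproduce the reported 0.703 and 0.692. The Best of dataset mean computed from the listed rows comes out at about 0.623 rather than the 0.703 shown, so that cell may have been carried over incorrectly somewhere between the slide and this transcript.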
  10. Results: DRD2 Actives Generation

    We performed de novo generation to find potentially active molecules for the Dopamine Receptor D2 (DRD2) target.

    | Optimizer | Top 1 Score | Top 10 Score | QED*  | SA Score* |
    |-----------|-------------|--------------|-------|-----------|
    | MCTS      | 0.997       | 0.996        | 0.590 | 3.385     |
    | TD3       | 0.995       | 0.916        | 0.680 | 3.010     |

    *Average of Top 10 scores

    [Figure: example synthesis routes; panel labels: 3 Reaction Steps, 2 Reaction Steps, 1 Reaction Step]
    • Reactions: Reductive Amination-Ketone, Williamson Ether, FGI Chlorination. DRD2 Score: 0.997, QED: 0.587, SA: 2.74
    • Reactions: Reductive Amination-Ketone, Reductive Amination-Ketone, Reductive Amination-Ketone. DRD2 Score: 0.994, QED: 0.894, SA: 2.856
  11. Results: DRD2 Actives Generation

    We performed de novo generation to find potentially active molecules for the Dopamine Receptor D2 (DRD2) target.

    | Optimizer | Top 1 Score | Top 10 Score* | QED*  | SA Score* |
    |-----------|-------------|---------------|-------|-----------|
    | MCTS      | 0.997       | 0.996         | 0.590 | 3.385     |
    | TD3       | 0.995       | 0.916         | 0.680 | 3.010     |

    [Figure: TD3 examples; panel labels: 3 Reaction Steps, 2 Reaction Steps, 1 Reaction Step]
    • Reaction: Reductive Amination-Ketone. DRD2 Score: 0.909, QED: 0.484, SA: 3.645
    • Reaction: FGI Chlorination. DRD2 Score: 0.990, QED: 0.495, SA: 3.593
    • Reaction: FGI Chlorination. DRD2 Score: 0.995, QED: 0.433, SA: 3.537
    • Reaction: Reductive Amination-Ketone. DRD2 Score: 0.950, QED: 0.917, SA: 2.950
    • Reaction: FGI Chlorination. DRD2 Score: 0.958, QED: 0.805, SA: 2.875
    • Reaction: Reductive Amination-Ketone. DRD2 Score: 0.971, QED: 0.794, SA: 3.106
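The starred columns are averages over the ten best-scoring compounds. One way such a summary row could be produced is sketched below; the record layout and function name are assumptions, and the sample values are made up so that the arithmetic is easy to follow.

```python
def top_k_summary(records, k=10):
    """Rank records by score and average score, QED and SA over the top k.

    Each record is a dict with 'score', 'qed' and 'sa' keys (assumed layout).
    """
    if not records:
        raise ValueError("no compounds to summarize")
    top = sorted(records, key=lambda r: r["score"], reverse=True)[:k]
    n = len(top)
    return {
        "top1_score": top[0]["score"],
        "topk_score": sum(r["score"] for r in top) / n,
        "qed": sum(r["qed"] for r in top) / n,
        "sa": sum(r["sa"] for r in top) / n,
    }


# Made-up values chosen for easy arithmetic, not taken from the slides.
sample = [
    {"score": 0.25, "qed": 0.25, "sa": 4.0},
    {"score": 1.0, "qed": 0.5, "sa": 2.0},
    {"score": 0.5, "qed": 0.75, "sa": 3.0},
]
summary = top_k_summary(sample, k=2)
```

With `k=2` the summary covers the compounds scoring 1.0 and 0.5, giving a top-1 score of 1.0 and averages of 0.75 (score), 0.625 (QED) and 2.5 (SA).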
  12. Results: RO5 Samples Distribution

    We sampled compounds using the Enamine and the Namiki building blocks. The compounds were optimized for Lipinski’s Rule of 5, and the distributions of their physicochemical properties are shown.

    [Figure: property distributions; panels: Enamine, Namiki]
  13. Results: Donepezil Rediscovery

    We compared the Enamine and Namiki building blocks. We also compared with Isomol [1], an MCTS-based combinatorial fragment method, and Reinvent [2], a popular generative model tool for de novo drug design. Our framework seems to work better with a building blocks library that has a larger number of reagents.

    | Method   | Top 1 Score | Top 10 Score | QED*  | SA Score* |
    |----------|-------------|--------------|-------|-----------|
    | Enamine  | 0.913       | 0.885        | 0.748 | 3.185     |
    | Namiki   | 0.735       | 0.722        | 0.660 | 2.307     |
    | Isomol   | 0.670       | 0.625        | 0.616 | 3.438     |
    | Reinvent | 1.000       | 0.980        | 0.750 | 2.680     |

    [Figure: example routes toward Donepezil; panel labels: 2 Reaction Steps, 1 Reaction Step]
    • Reactions: Cross Claisen, Cross Claisen. Similarity score: 0.836, QED: 0.724, SA: 3.539
    • Reaction: Cross Claisen. Similarity score: 0.913, QED: 0.724, SA: 3.539

    [1] Isomol, Elix Discovery Platform. https://www.elix-inc.com/platform/
    [2] Thomas Blaschke, Josep Arús-Pous, Hongming Chen, Christian Margreitter, Christian Tyrchan, Ola Engkvist, Kostas Papadopoulos, and Atanas Patronov. REINVENT 2.0: An AI Tool for De Novo Drug Design. Journal of Chemical Information and Modeling 2020 60 (12), 5918-5922. DOI: 10.1021/acs.jcim.0c00915
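The rediscovery results are reported as similarity scores to the target compound. The slides do not name the metric; Tanimoto similarity over fingerprint bits is a common choice for this kind of task and is assumed in the sketch below. The fingerprints are represented as plain sets of bit indices, and the bit values are made up.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical on-bit sets; real fingerprints would come from a toolkit.
target_bits = {1, 2, 3}
candidate_bits = {2, 3, 4}
similarity = tanimoto(target_bits, candidate_bits)
```

The two example sets share 2 of 4 distinct bits, so the similarity is 0.5. A score of 1.000 means the fingerprints are identical under the chosen representation.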
  14. Results: Patent Compounds Rediscovery

    The following patent compounds were selected for rediscovery through de novo generation.

    | SMILES | Target | Indication | Patent ID | Compound ID | Developer | Candidate name |
    |--------|--------|------------|-----------|-------------|-----------|----------------|
    | Cc1nc[n](-c2c3[n](c(nc2)NCc2c(F)ccc4OCCc24)cc(n3)C#N)c1 | PRC2 inhibitor | Cancer | WO2023049724 | 139 | ORIC Pharmaceuticals | ORIC-944 |
    | CN1C(=O)N(c2cnc(N[C@]3([H])C[C@](Nc4ncc(SC)cn4)([H])CC3)cc2)CC1=O | PCSK9 inhibitor | Dyslipidemia | WO2020150473 | 463 | AstraZeneca | AZD0780 |
    | c1(N2c3nc(C)ncc3NC2=O)cnc(Oc2cccc3OCC(C)(C)c23)c(Cl)c1C | Kv3.1 modulator | Neurological disorder | WO2024086061 | 3 | MSD | N/A |
  15. Results: Patent Compound Rediscovery 1

    | Method   | Top 1 Score | Top 10 Score* | QED*  | SA Score* |
    |----------|-------------|---------------|-------|-----------|
    | Enamine  | 0.754       | 0.737         | 0.629 | 4.121     |
    | Namiki   | 0.723       | 0.695         | 0.537 | 3.458     |
    | Isomol   | 0.720       | 0.711         | 0.599 | 5.339     |
    | Reinvent | 0.740       | 0.722         | 0.564 | 3.402     |

    [Figure: target compound and two example routes, each labeled 3 Reaction Steps]
    • Reactions: Cross Claisen, Cross Claisen, Cross Claisen. Similarity: 0.754, QED: 0.627, SA: 4.159
    • Reactions: Amide Formation, FGI Bromination, Negishi. Similarity: 0.725, QED: 0.406, SA: 3.202
  16. Results: Patent Compound Rediscovery 2

    | Method   | Top 1 Score | Top 10 Score* | QED*  | SA Score* |
    |----------|-------------|---------------|-------|-----------|
    | Enamine  | 0.700       | 0.669         | 0.744 | 2.910     |
    | Namiki   | 0.652       | 0.642         | 0.520 | 3.177     |
    | Isomol   | 0.670       | 0.625         | 0.616 | 3.438     |
    | Reinvent | 0.850       | 0.836         | 0.633 | 2.464     |

    [Figure: target compound and example routes; panel labels: 2 Reaction Steps, 1 Reaction Step]
    • Reaction: Cross Claisen. Similarity score: 0.683, QED: 0.655, SA: 2.879
    • Reactions: Reductive Amination Ketone, Reductive Amination Ketone. Similarity score: 0.700, QED: 0.724, SA: 3.539
  17. Results: Patent Compound Rediscovery 3

    | Method   | Top 1 Score | Top 10 Score* | QED*  | SA Score* |
    |----------|-------------|---------------|-------|-----------|
    | Enamine  | 0.812       | 0.805         | 0.677 | 4.041     |
    | Namiki   | 0.810       | 0.797         | 0.377 | 3.166     |
    | Isomol   | 0.780       | 0.776         | 0.488 | 5.522     |
    | Reinvent | 0.800       | 0.781         | 0.427 | 2.535     |

    [Figure: target compound and one example route labeled 4 Reaction Steps]
    • Reactions: Ar-Imidazole formation, FGI Rosenmund-von Braun Cl revised, Red nitrile to amine, Amide formation. Similarity score: 0.8099, QED: 0.280, SA: 3.383
  18. Conclusions

    • We proposed a customizable framework that generates compounds by performing virtual forward synthesis.
    • We performed experiments to demonstrate that the framework can be used in de novo drug design.
    • Our framework performs on par with popular reinforcement-learning-based generative models such as Reinvent.

    Future Directions:
    • We plan to extend the framework to lead optimization tasks.