Synthon-GA: Searching make-on-demand libraries with genetic algorithms

Jan Jensen
August 10, 2022

Talk at "Ultra-Large Chemical Libraries", London, August 10, 2022

Transcript

  1. Synthon-GA: Searching make-on-demand libraries with genetic algorithms

    Jan H. Jensen (Department of Chemistry, University of Copenhagen, @janhjensen); Casper Steinmann (Aalborg University)
  2. Genetic Algorithms for Molecules

    Chem. Sci. 2019; github.com/jensengroup/GB-GA

    [Diagram labels: Mating; Molecule; Unfit Molecule; New Molecule; Kill Unfit Molecules; Mate & Mutate Survivors]
  3. Fitness = docking scores (minimizing Glide htvs_ds score)

    Population = 400, 50 generations, 20 GA searches

    | Target | GA: score < -9.0 | GA: score < -10.0 | GA: % SA | ZINC: score < -9.0 | ZINC: score < -10.0 | ZINC: % SA |
    |---|---|---|---|---|---|---|
    | β2 AR | 164 | 10 | 76% | 86 | 1 | 84% |
    | DDR1 | 378 | 38 | 88% | 199 | 8 | 82% |

    DOI: 10.7717/peerj-pchem.18

    SA: synthetic accessibility by
  4. How to use GA to search Enamine REAL Space?

    Synthons and combination rules from Synt-On (formerly SynthI). Mutation: random choice. Crossover: random choice.
  5. REAL Space is only a small fraction of possible genes

    “The REAL Space comprises 21 billion make-on-demand molecules and is currently the largest offer of commercially available compounds. The REAL compounds in the Space are assembled via more than 170 well-validated parallel synthesis protocols applied to over 112 000 qualified reagents and building blocks.”

    129K reagents => 91K 1-synthons and 41K 2-synthons; with 24 possible reactions = 28 trillion genes

    [Diagram labels: 2-synthon; 1-synthons]
  6. Workflow: Minimizing Glide XP score

    Population = 400, 100 generations, 20 GA searches (8 million docking calculations)

    [Workflow labels: Random genes; Synthon-GA; Similarity Search; Final populations; Redock]
  7. GA-2

    Are these 90 molecules available from Enamine? (No) If not, what are the closest analogs and what are their docking scores?

    GA-2: 90 mols

    Postera similarity search (1B?) (API, instantaneous): 8,856 mols with sim > 0.2

    FTrees similarity search (~20 B) (command line, 4 min/mol): 90,000 mols (1000 per mol)

    SmallWorld similarity search (2.5 B) (instantaneous, web GUI): 27,640 mols with sim > 0.2
  8. GA-2 Redocking

    Are these 90 molecules available from Enamine? (No) If not, what are the closest analogs and what are their docking scores?

    GA-2: 90 mols

    Postera: 8,856 mols with sim > 0.2. FTrees: 90,000 mols (1000 per mol). SmallWorld: 27,640 mols.

    Filter: Docking Score ≤ -8.5, MW ≤ 400, Nrot ≤ 7, No PAINS

    After filtering: 17 mols, 222 mols, 65 mols (304 mols in total); 150 mols
  9. GA-2 Redocking Top Scores

    [Figure labels: SmallWorld; FTrees; Score = -10.9; Score = -10.9; Score = -10.3; Score = -11.5]
  10. GA-2 Redocking Top Scores

    [Figure labels: SmallWorld; FTrees; Score = -10.9; Score = -10.9; Score = -10.3; Score = -11.5]

    Protonation state? Chirality?

    FTrees does not report on chirality. Synt-On removes chirality. Enamine sells some compounds as racemates and some as pure
  11. GA-2 Redocking Top Scores

    All possible chiralities plus reasonable protonation states: 720 chiral/prot mols (from 304 mols; 150 short list)

    Carteblanche: The CACHE organisers suggested using Carteblanche (CB) to check which chiral isomers are purchasable. CB can't find all the molecules that FTrees found. For the ones CB finds, we pick the enantiomer with the best score and update the catalogue ID. For the rest, we use the catalogue ID that FTrees provided, with unknown chirality (51).

    New 150 short list. Check for duplicates: 149 short list (submitted). Total price < $10K: 77 short list (submitted).
  12. Adjusted Workflow

    [Workflow labels: Random genes; Synthon-GA; Similarity Search; Final populations; Redock]

    Adjustments: Chiral synthons. Remove duplicates from population. Estimate protonation state. FTrees: What chiral isomers are in library?
  13. Summary

    Can identify molecules with good docking scores.

    The molecules found by Synthon-GA are generally not similar to those available in the library.

    While the synthons are known, the combination rules are proprietary. Combination rules could probably be “reverse engineered” with ML.

    Experimental verification is underway.

    Use regular GA instead? Regular GA penalises chiral molecules. Are Synthon-GA molecules easier to synthesize?

    Synthon-GA is not on GitHub yet. Contact me for access.
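
The random-choice mutation and crossover on synthon genes described in slides 2 and 4 can be sketched roughly as below. This is a minimal illustration, not the Synthon-GA code (which the talk says is not yet public). The `Gene` layout (one reaction index plus two synthon labels), the toy pools, the `toy_score` function standing in for a docking score, and all parameter values are assumptions made for the example.

```python
import random
from typing import Callable, List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene: a reaction index plus two synthon labels."""
    reaction: int
    synthon_a: str
    synthon_b: str


def random_gene(pools: dict, rng: random.Random) -> Gene:
    # Draw every component independently from its pool.
    return Gene(*(rng.choice(pools[field]) for field in Gene._fields))


def crossover(p1: Gene, p2: Gene, rng: random.Random) -> Gene:
    # Random-choice crossover: each component comes from one of the parents.
    return Gene(*(rng.choice(pair) for pair in zip(p1, p2)))


def mutate(gene: Gene, pools: dict, rng: random.Random) -> Gene:
    # Random-choice mutation: redraw one randomly chosen component.
    field = rng.choice(Gene._fields)
    return gene._replace(**{field: rng.choice(pools[field])})


def run_ga(score: Callable[[Gene], float], pools: dict, population_size: int = 20,
           generations: int = 10, mutation_rate: float = 0.5, seed: int = 0) -> List[Gene]:
    """Minimise `score`; returns the final population, best gene first."""
    rng = random.Random(seed)
    population = [random_gene(pools, rng) for _ in range(population_size)]
    for _ in range(generations):
        children = []
        for _ in range(population_size):
            child = crossover(rng.choice(population), rng.choice(population), rng)
            if rng.random() < mutation_rate:
                child = mutate(child, pools, rng)
            children.append(child)
        # Drop duplicates (order-preserving), then keep only the fittest genes.
        unique = list(dict.fromkeys(population + children))
        population = sorted(unique, key=score)[:population_size]
    return sorted(population, key=score)


# Toy pools and a toy score (lower is better, like a docking score).
pools = {
    "reaction": list(range(3)),
    "synthon_a": [f"A{i}" for i in range(20)],
    "synthon_b": [f"B{i}" for i in range(20)],
}


def toy_score(g: Gene) -> float:
    return abs(g.reaction - 2) + abs(int(g.synthon_a[1:]) - 7) + abs(int(g.synthon_b[1:]) - 3)


final_population = run_ga(toy_score, pools)
```

In a real search the score function would call a docking program on the molecule assembled from the gene, and the pools would hold the actual synthons and reaction rules.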
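
The post-processing in slides 8 and 11 (property filter, best-scoring isomer per compound, duplicate removal, price cap) could look something like the sketch below. The thresholds come from the slides; everything else is an assumption for illustration: the `Hit` fields, the `parent_id` key used to group isomers, the greedy best-score-first budget selection (the talk does not say how the 77 compounds were chosen), and the made-up IDs, scores, and prices.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Hit:
    catalogue_id: str   # hypothetical vendor ID
    parent_id: str      # hypothetical key shared by isomers of one compound
    score: float        # docking score, lower is better
    mol_weight: float
    n_rot: int          # number of rotatable bonds
    has_pains: bool
    price_usd: float


def passes_filter(hit: Hit) -> bool:
    # Thresholds from slide 8: score <= -8.5, MW <= 400, Nrot <= 7, no PAINS.
    return (hit.score <= -8.5 and hit.mol_weight <= 400
            and hit.n_rot <= 7 and not hit.has_pains)


def best_isomer_per_parent(hits: Iterable[Hit]) -> List[Hit]:
    # Keep the best-scoring isomer of each compound; this also removes duplicates.
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.parent_id)
        if current is None or hit.score < current.score:
            best[hit.parent_id] = hit
    return list(best.values())


def select_within_budget(hits: Iterable[Hit], budget_usd: float) -> List[Hit]:
    # Assumed strategy: take hits best score first while the total stays under budget.
    chosen: List[Hit] = []
    total = 0.0
    for hit in sorted(hits, key=lambda h: h.score):
        if total + hit.price_usd < budget_usd:
            chosen.append(hit)
            total += hit.price_usd
    return chosen


# Made-up example data.
hits = [
    Hit("X1-R", "X1", -10.9, 350.0, 5, False, 4000.0),
    Hit("X1-S", "X1", -9.0, 350.0, 5, False, 4000.0),
    Hit("X2", "X2", -11.5, 420.0, 6, False, 3000.0),   # fails MW <= 400
    Hit("X3", "X3", -10.3, 390.0, 7, False, 5000.0),
    Hit("X4", "X4", -8.6, 300.0, 3, False, 2000.0),
    Hit("X5", "X5", -9.5, 380.0, 4, True, 1000.0),     # flagged as PAINS
]

filtered = [h for h in hits if passes_filter(h)]
shortlist = select_within_budget(best_isomer_per_parent(filtered), budget_usd=10_000)
```

A greedy pass is only one way to respect a price cap; picking by score per dollar or by chemical diversity would give a different short list.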