Do machines dream of atoms? Crippen’s logP as a quantitative molecular benchmark for explainable AI heatmaps

Jan Jensen
November 01, 2022

M2D2 on-line seminar 1.11.2022

Transcript

  1. Do machines dream of atoms? Crippen’s logP as a quantitative molecular benchmark for explainable AI heatmaps

    Maria H. Rasmussen, Diana S. Christensen, and Jan H. Jensen. Department of Chemistry, University of Copenhagen. @janhjensen. M2D2, 2022.11.01. Preprint: 10.26434/chemrxiv-2022-gnq3w
  2. Molecular heatmaps: atomic contributions to ML predictions

    Figure legend: predicted activity decreases when the atom is “removed”; predicted activity increases when the atom is “removed”.
  3. Uses of heatmaps

    Understand the ML model: right answer for the right reasons? Understand errors and improve models. Build trust in the model. Understand the chemistry and guide discovery.
  4. But how to validate the heatmaps?

    Do expert chemists agree with them or find them useful? (Non-numerical and imprecise; hard to automate.) Toy models where the ground-truth heatmap is known. (Mainly for classification; how to compare heatmaps?)
  5. 2048-bit ECFP4 / Random Forest

    Trained on 150K molecules from ZINC, with a 5K test set. XAI: the “atom attribution from fingerprints” method developed by Riniker and Landrum (DOI: 10.1186/1758-2946-5-43).
  6. 2048-bit ECFP4 / Random Forest

    Fingerprint-adapted ground truth: the sum of the logP contributions of each of the ten fragments is assigned to the atom.
  7. Comparing XAI methods

    Graph-convolution NN + remove-atom approach: overall worse, but better for some molecules.
  8. Comparing heatmaps

    Figure columns: GCNN, RF, Ground Truth.
    First molecule: overlap 0.20, logPpred = 3.47; overlap 0.32, logPpred = 3.79; logP = 3.47.
    Second molecule: overlap 0.76, logPpred = 1.21; overlap 0.48, logPpred = 1.06; logP = 1.07.
  9. Conclusions / Outlook

    Good heatmaps are possible for regression (especially if the oracle heatmap is adapted to reflect the molecular representation). A very simple model + XAI method gives good heatmaps. (That doesn’t mean logP is “too easy”; “atoms” don’t come naturally to ML.) New models/methods must beat this. Heatmaps can guide adversarial models. DOI: 10.26434/chemrxiv-2022-gnq3w
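
The attribution idea behind items 2 and 5 can be illustrated with a small sketch: an atom's weight is taken as the change in the prediction when the fingerprint bits that atom contributes to are switched off. This is only an illustration of the general idea, not the authors' code. The linear `toy_predict` model, the `BIT_WEIGHTS` table, and the hand-written `atom_to_bits` map are hypothetical stand-ins for the random forest and the ECFP4 bit bookkeeping.

```python
from typing import Callable, Dict, List, Set


def atom_attributions(
    on_bits: Set[int],
    atom_to_bits: Dict[int, Set[int]],
    predict: Callable[[Set[int]], float],
) -> List[float]:
    """Return one weight per atom: predict(all bits) - predict(bits without that atom's bits).

    A positive weight means the prediction decreases when the atom is "removed".
    """
    base = predict(on_bits)
    weights = []
    for atom in sorted(atom_to_bits):
        reduced = on_bits - atom_to_bits[atom]  # switch off every bit this atom is part of
        weights.append(base - predict(reduced))
    return weights


# Hypothetical toy model (assumption): a linear score over fingerprint bits.
BIT_WEIGHTS = {0: 0.5, 1: -0.25, 2: 1.0}


def toy_predict(bits: Set[int]) -> float:
    return sum(BIT_WEIGHTS.get(b, 0.0) for b in bits)


# Hypothetical three-atom molecule (assumption): which bits each atom contributes to.
on_bits = {0, 1, 2}
atom_to_bits = {0: {0}, 1: {0, 1}, 2: {2}}

weights = atom_attributions(on_bits, atom_to_bits, toy_predict)
print(weights)
```

With a real model, `predict` would wrap the trained regressor and `atom_to_bits` would come from the fingerprint generator's record of which atoms set which bits.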
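
The fingerprint-adapted ground truth in item 6 can be read as follows: for each atom, sum the logP contributions of every fingerprint fragment that contains that atom. The sketch below implements that reading and should be treated as an interpretation of the slide text rather than the authors' exact procedure. The per-atom contributions and the fragment atom sets are made-up numbers; in practice they would come from Crippen's atomic logP contributions and the ECFP4 fragments of the molecule.

```python
from typing import List, Sequence, Set


def adapt_ground_truth(
    atom_contribs: Sequence[float],
    fragments: Sequence[Set[int]],
) -> List[float]:
    """Assign to each atom the summed logP of all fragments that contain it.

    atom_contribs: per-atom logP contributions (assumed given).
    fragments: each fragment is the set of atom indices it covers.
    """
    adapted = [0.0] * len(atom_contribs)
    for frag in fragments:
        frag_logp = sum(atom_contribs[i] for i in frag)  # logP contribution of this fragment
        for i in frag:
            adapted[i] += frag_logp  # credit the fragment to every atom in it
    return adapted


# Hypothetical three-atom example (assumption): radius-0 and radius-1 style fragments.
contribs = [0.5, -0.25, 0.125]
fragments = [{0}, {1}, {2}, {0, 1}, {1, 2}]

adapted = adapt_ground_truth(contribs, fragments)
print(adapted)
```

Note that the adapted values no longer sum to the molecular logP, because overlapping fragments count the same atom several times; the adapted heatmap is meant for comparison with fingerprint-based attributions, not as a decomposition of logP.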
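
Item 8 reports an overlap between each predicted heatmap and the ground truth, but the transcript does not define it. One plausible choice, assumed here purely for illustration, is the normalized dot product (cosine similarity) of the two per-atom vectors, which is 1 for identical heatmaps and 0 for orthogonal ones.

```python
import math
from typing import Sequence


def overlap(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized dot product of two per-atom heatmaps (an assumed definition).

    Returns 0.0 if either heatmap is all zeros.
    """
    if len(a) != len(b):
        raise ValueError("heatmaps must have one value per atom in the same order")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up heatmaps for a three-atom molecule (assumption).
predicted = [0.4, -0.1, 0.3]
ground_truth = [0.5, -0.2, 0.1]
print(round(overlap(predicted, ground_truth), 2))
```

Because the measure is scale-invariant, it compares the pattern of atomic contributions rather than their absolute size, so it is reported alongside the predicted logP rather than replacing it.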