'Why Should I Trust You?' Explaining the Predictions of Any Classifier

fatml

November 18, 2016

Transcript

  1. "Why Should I Trust You?" Explaining the Predictions of Any

    Classifier FATML 2016 November 18th 2016 Sameer Singh Marco Tulio Ribeiro Carlos Guestrin
  2. Visual Question Answering. Is there a moustache in the picture? > Yes. What is the moustache made of? > Banana.
  3. Is it doing what we want? Or a neural network with more than a thousand layers?
  4. They are essentially black boxes. Trust: how can we trust that the predictions are correct? Predict: how can we understand and predict the behavior? Improve: how do we improve it to prevent potential mistakes?
  5. Fairness in ML (currently): "I would like to apply for a loan…" Machine Learning: "Your loan has been denied." "What? Why?" Cannot explain… [0.25, -4.5, 3.5, -10.4, …]
  7. Visual Question Answering. What is the moustache made of? > Banana. What are the eyes made of? > Bananas. What? > Banana. What is? > Banana.
  8. What an explanation looks like. Why did this happen? From: Keith Richards / Subject: Christianity is the answer / NNTP-Posting-Host: x.x.com / "I think Christianity is the one true religion. If you’d like to know more, send me a note."
  9. Three must-haves for a good explanation. Interpretability: humans can easily interpret the reasoning ("definitely not interpretable" vs. "potentially interpretable").
  10. Three must-haves for a good explanation. Interpretability: humans can easily interpret the reasoning. Faithful: describes how the model actually works (the plot contrasts the learned model with an explanation that is not faithful to it).
  11. Three must-haves for a good explanation. Interpretability: humans can easily interpret the reasoning. Faithful: describes how the model actually works. Model-agnostic: can explain any classifier.
  12. Local, Interpretable, Model-Agnostic Explanations (LIME). 1) Pick a model class interpretable by humans (a line, a shallow decision tree, sparse features, …); it may not be globally faithful. 2) Locally approximate the global (black-box) model; the simple model is globally bad, but locally good. A locally faithful simple decision boundary ➔ a good explanation for the prediction.
  13. LIME: general framework. Instance x. Explanation family: the universe of possible explanations to search over. Faithfulness of explanation: is the explanation faithful to the model in the context of x? Interpretability: is the explanation simple enough to read? (This trade-off is restated as an optimization below.)
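
The accompanying paper writes this framework as an optimization over the explanation family G; restated roughly, with f the black-box model, π_x a proximity measure around the instance x, L a measure of local infidelity, and Ω(g) the complexity of an explanation g:

```latex
% LIME objective: pick, from the interpretable family G, the explanation g that is
% locally faithful to f in the neighborhood defined by \pi_x while staying simple.
\xi(x) = \operatorname*{arg\,min}_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)
```
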
  14. Sparse linear explanations. 1) Sample points around xi. 2) Use the complex model to predict labels for each sample. 3) Weight the samples according to their distance to xi. 4) Learn a new simple model on the weighted samples. 5) Use the simple model to explain. (A minimal sketch of these steps follows this item.)
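
A minimal sketch of these five steps for a standardized tabular input, assuming scikit-learn and a predict_fn that returns the probability of the class being explained; the function name and defaults here are illustrative, not the lime library's API:

```python
import numpy as np
from sklearn.linear_model import Ridge

def sparse_linear_explanation(x_i, predict_fn, num_samples=5000,
                              kernel_width=0.75, num_features=5):
    """Illustrative LIME-style explanation of predict_fn around the instance x_i."""
    rng = np.random.default_rng(0)
    d = len(x_i)
    # 1. Sample points around x_i (Gaussian perturbations of the instance).
    X = x_i + rng.normal(size=(num_samples, d))
    # 2. Use the complex (black-box) model to label each sample.
    y = predict_fn(X)
    # 3. Weight samples according to their distance to x_i (exponential kernel).
    dist = np.linalg.norm(X - x_i, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Learn a new simple model on the weighted samples.
    simple = Ridge(alpha=1.0)
    simple.fit(X, y, sample_weight=weights)
    # 5. Use the simple model to explain: keep the largest-magnitude coefficients.
    top = np.argsort(-np.abs(simple.coef_))[:num_features]
    return [(int(j), float(simple.coef_[j])) for j in top]
```
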
  15. Sampling example (images). The model assigns P(labrador) to the original image and to each perturbed instance (values on the slide: 0.92, 0.001, 0.34, 0.21); a locally weighted regression over these samples yields the explanation.
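
For images, the authors' library (github.com/marcotcr/lime, linked on the last slide) packages this perturb-and-fit loop behind an image explainer. A usage sketch follows, with `image` and `classify_fn` as placeholders for an RGB array and a function mapping a batch of images to class probabilities; exact argument names may differ across library versions:

```python
from lime import lime_image

explainer = lime_image.LimeImageExplainer()
# Perturb the image by masking superpixels, score perturbations with classify_fn,
# and fit a locally weighted sparse linear model over superpixel indicators.
explanation = explainer.explain_instance(image, classify_fn,
                                         top_labels=5, hide_color=0, num_samples=1000)
# Show the superpixels that most support the top predicted label.
img, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                           positive_only=True, num_features=5,
                                           hide_rest=True)
```
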
  16. Did ML experts notice it? (Bar chart: % of subjects, out of 27, who "didn't trust the model" and who had the "snow insight", before vs. after explanations.)
  17. Explanation for a bad classifier. From: Keith Richards / Subject: Christianity is the answer / NNTP-Posting-Host: x.x.com / "I think Christianity is the one true religion. If you’d like to know more, send me a note." After looking at the explanation, we shouldn’t trust the model! (A usage sketch with the released library follows this item.)
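
For text classifiers like this one, the same library offers a text explainer. A hedged sketch, with `document` standing in for the raw post, `pipeline` for any fitted classifier exposing predict_proba, and class names chosen to match the atheism-vs-christianity task shown in the talk:

```python
from lime.lime_text import LimeTextExplainer

explainer = LimeTextExplainer(class_names=["atheism", "christianity"])
# Perturb the document by dropping words, score perturbations with the classifier,
# and fit a locally weighted sparse linear model over word presence.
exp = explainer.explain_instance(document, pipeline.predict_proba, num_features=6)
print(exp.as_list())   # words with their weights for/against the predicted class
```
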
  18. Explanation for a good classifier. It seems to be picking up on more reasonable things… good!
  19. Comparing models. If we picked based on accuracy, we would get it wrong. (Bar chart: % who picked the better model, guessing vs. with explanations.) 89% of users identify the more trustworthy model.
  20. Problem with linear explanations: they give a “general idea”, but are not precise. What can I change, and not change, to keep the same prediction?
  21. Prediction invariance. Instance x. Set of constraints on x: a conjunction of constraints on individual features. Quality of constraints: whether all instances that satisfy the constraints e have the same prediction. Interpretability: is the set of constraints small? Find a set of constraints so that any other change to the input does not change the prediction. (A toy search sketch follows this item.)
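
A toy sketch of such a search, assuming a sample_fn that draws perturbations of x_i and a predict_fn that returns class labels; everything here (the names, the greedy strategy, the precision threshold) is illustrative rather than the authors' method:

```python
import numpy as np

def invariant_constraints(x_i, predict_fn, sample_fn,
                          num_samples=1000, target_precision=0.95):
    """Greedily grow a conjunction of constraints 'feature j keeps the value x_i[j]'
    until (almost) every perturbed instance satisfying them keeps x_i's prediction."""
    label = predict_fn(x_i[None, :])[0]
    fixed = []                                    # constrained feature indices
    candidates = list(range(len(x_i)))

    def precision(fix):
        X = sample_fn(num_samples).copy()         # perturbations of x_i
        if fix:
            X[:, fix] = x_i[fix]                  # keep instances inside the constraints
        return np.mean(predict_fn(X) == label)    # fraction keeping x_i's prediction

    while candidates and precision(fixed) < target_precision:
        # Add the single constraint that best preserves the prediction.
        best = max(candidates, key=lambda j: precision(fixed + [j]))
        fixed.append(best)
        candidates.remove(best)
    return [(j, x_i[j]) for j in fixed]           # explanation: (feature, value) pairs
```
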
  22. “Programs” as explanations. Instance x. Set of programs: the universe of all possible programs/functions on x. Quality of program: does the function e behave like the model on instances similar to x? Interpretability: is the program short? There are so many choices for interpretable models: rules, decision trees, decision sets, falling rule lists, sparse linear models, etc. Why decide between them? Search over ALL possible “programs”!
  23. Model-agnostic explanations are critical for understanding ML! They can be applied to future models as well. LIME: Local, Interpretable, Model-Agnostic Explanations. Linear explanations: useful for complex text and image classifiers. Prediction-invariant explanations: a more precise definition. “Programs” as explanations: generalize over any representation.
  24. Thank you! LIME: Local, Interpretable, Model-Agnostic Explanations. Linear explanations: useful for complex text and image classifiers. Prediction-invariant explanations: a more precise definition. “Programs” as explanations: generalize over any representation. www.sameersingh.org | github.com/marcotcr/lime