
'Why Should I Trust You?' Explaining the Predictions of Any Classifier

fatml
November 18, 2016


Transcript

  1. "Why Should I Trust You?" Explaining the Predictions of Any

    Classifier FATML 2016 November 18th 2016 Sameer Singh Marco Tulio Ribeiro Carlos Guestrin
  2. Visual Question Answering. Is there a moustache in the picture? > Yes. What is the moustache made of? > Banana
  3. Is it doing what we want? Or a neural network with more than a thousand layers?
  4. They are essentially black boxes. Trust: how can we trust that the predictions are correct? Predict: how can we understand and predict the model's behavior? Improve: how do we improve it to prevent potential mistakes?
  5. Fairness in ML, currently: "I would like to apply for a loan." "Your loan has been denied." "What? Why?" The machine learning system cannot explain; it only produces numbers like [0.25, -4.5, 3.5, -10.4, …].
  6. [image-only slide]
  7. Visual Question Answering. What is the moustache made of? > Banana. What are the eyes made of? > Bananas. What? > Banana. What is? > Banana
  8. What an explanation looks like. Why did this happen? From: Keith Richards / Subject: Christianity is the answer / NNTP-Posting-Host: x.x.com / "I think Christianity is the one true religion. If you'd like to know more, send me a note."
  9. Three must-haves for a good explanation. Interpretability: humans can easily interpret the reasoning (the slide's examples range from "definitely not interpretable" to "potentially interpretable").
  10. Three must-haves for a good explanation. Interpretability: humans can easily interpret the reasoning. Faithfulness: describes how the model actually works (the slide contrasts a learned model with an explanation that is not faithful to it).
  11. Three must-haves for a good explanation. Interpretability: humans can easily interpret the reasoning. Faithfulness: describes how the model actually works. Model-agnostic: can explain any classifier.
  12. Local, Interpretable, Model-Agnostic Explanations (LIME). 1. Pick a model class interpretable by humans (a line, a shallow decision tree, sparse features, …); it may not be globally faithful. 2. Locally approximate the global (black-box) model: the simple model is globally bad, but locally good. A locally faithful simple decision boundary is a good explanation for the prediction.
  13. LIME: general framework. Given an instance x: Explanation family: the universe of possible explanations to search over. Faithfulness of the explanation: is the explanation faithful to the model in the context of x? Interpretability: is the explanation simple enough to read? (The objective this corresponds to is written out in the notes after the transcript.)
  14. Sparse linear explanations: 1. Sample points around xi. 2. Use the complex model to predict labels for each sample. 3. Weight the samples according to their distance to xi. 4. Learn a new simple model on the weighted samples. 5. Use the simple model to explain. (A Python sketch of these steps is in the notes after the transcript.)
  15. Sampling example with images. Original image (P(labrador) = 0.21); perturbed instances scored by the model (P(labrador) = 0.92, 0.001, 0.34); locally weighted regression; explanation. (A usage sketch with the released lime package is in the notes after the transcript.)
  16. Did ML experts notice it? Bar chart: % of subjects (out of 27) who didn't trust the model and who had the "snow insight", before vs. after explanations.
  17. Explanation for a bad classifier. From: Keith Richards / Subject: Christianity is the answer / NNTP-Posting-Host: x.x.com / "I think Christianity is the one true religion. If you'd like to know more, send me a note." After looking at the explanation, we shouldn't trust the model!
  18. Explanation for a good classifier. It seems to be picking up on more reasonable things. Good!
  19. Comparing models: if we picked based on accuracy, we would get it wrong. Bar chart: % who picked the better model, guessing vs. with explanations; with explanations, 89% of users identify the more trustworthy model.
  20. Problem with linear explanations: they give a "general idea", but are not precise. What can I change, and not change, to keep the same prediction?
  21. Prediction invariance. Given an instance x: Explanation family: sets of constraints on x, i.e. conjunctions of constraints on individual features. Quality of the constraints: do all instances that satisfy the constraints e receive the same prediction? Interpretability: is the set of constraints small? Goal: find a set of constraints such that any other change to the input does not change the prediction. (A sampling-based check of this criterion is sketched in the notes after the transcript.)
  22. "Programs" as explanations. Given an instance x: Explanation family: the universe of all possible programs/functions on x. Quality of a program: does the function e behave like the model on instances similar to x? Interpretability: is the program short? There are so many choices for interpretable models (rules, decision trees, decision sets, falling rule lists, sparse linear models, etc.); rather than decide between them, search over ALL possible "programs"!
  23. Model-agnostic explanations are critical for understanding ML, and can be applied to future models as well. LIME: Local, Interpretable Model-Agnostic Explanations. Linear explanations: useful for complex text and image classifiers. Prediction-invariant explanations: a more precise definition. "Programs" as explanations: generalize over any representation.
  24. Thank you! LIME: Local, Interpretable Model-Agnostic Explanations. Linear explanations: useful for complex text and image classifiers. Prediction-invariant explanations: a more precise definition. "Programs" as explanations: generalize over any representation. www.sameersingh.org | github.com/marcotcr/lime
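
Notes on transcript item 13: the LIME paper formalizes this framework as a single objective. With f the black-box model, G the explanation family, pi_x the locality (sample weighting) around the instance x, L a measure of how unfaithful an explanation g is to f within that locality, and Omega(g) a measure of g's complexity, the explanation is, roughly,

    \xi(x) = \arg\min_{g \in G} L(f, g, \pi_x) + \Omega(g)

so the search trades off local faithfulness (the L term) against interpretability (the Omega term) over the family G.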
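
Notes on transcript item 14: a minimal sketch of the five steps for tabular data, assuming a black-box predict_proba function, Gaussian perturbations, an exponential distance kernel, and scikit-learn's Lasso as the sparse linear model. These choices are illustrative assumptions, not the authors' implementation (see github.com/marcotcr/lime for that).

    # Minimal sketch of the five-step recipe (not the authors' code): explain
    # one instance x_i of a black-box binary classifier with a sparse linear model.
    import numpy as np
    from sklearn.linear_model import Lasso

    def explain_instance(x_i, predict_proba, num_samples=5000,
                         kernel_width=0.75, alpha=0.01, seed=None):
        rng = np.random.default_rng(seed)
        d = x_i.shape[0]

        # 1. Sample points around x_i (here: simple Gaussian perturbations).
        Z = x_i + rng.normal(scale=1.0, size=(num_samples, d))

        # 2. Use the complex model to predict labels for each sample
        #    (probability of the class being explained).
        y = predict_proba(Z)[:, 1]

        # 3. Weight samples according to their distance to x_i.
        dist = np.linalg.norm(Z - x_i, axis=1)
        weights = np.exp(-(dist ** 2) / (kernel_width ** 2))

        # 4. Learn a new, sparse linear model on the weighted samples.
        simple = Lasso(alpha=alpha)
        simple.fit(Z, y, sample_weight=weights)

        # 5. Use the simple model to explain: the nonzero coefficients say
        #    which features push the prediction up or down locally.
        return dict(enumerate(simple.coef_))

The returned {feature index: weight} map is the explanation; features whose weight is zero were dropped by the sparsity penalty.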
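
Notes on transcript item 15: for images, the released lime package (github.com/marcotcr/lime) uses superpixels as the interpretable representation and follows the same sample, score, weight, and fit recipe. Usage looks roughly like the sketch below; the image and classifier here are dummy placeholders, and the argument names reflect the package's documented examples, so check the repository for current signatures.

    # Rough usage sketch of lime's image explainer; image/classifier_fn are dummies.
    import numpy as np
    from lime import lime_image

    rng = np.random.default_rng(0)
    image = rng.random((64, 64, 3))          # stand-in for a real photo

    def classifier_fn(batch):
        # Stand-in for a real model: returns [P(other), P(labrador)]
        # for every perturbed image in the batch.
        return np.tile([0.8, 0.2], (len(batch), 1))

    explainer = lime_image.LimeImageExplainer()
    explanation = explainer.explain_instance(
        image, classifier_fn, top_labels=2, num_samples=200)

    # Superpixels that locally support the top predicted label.
    img, mask = explanation.get_image_and_mask(
        explanation.top_labels[0], positive_only=True, num_features=5)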
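
Notes on transcript item 21: the talk states the criterion but not a procedure; one simple way to test a candidate constraint set (an assumed sampling check, not the authors' method) is to fix the constrained features, resample everything else, and measure how often the prediction stays the same.

    # Minimal sketch: empirically check whether a conjunction of per-feature
    # constraints keeps the model's predicted label fixed.
    import numpy as np

    def prediction_invariance(x, predict, constraints, feature_ranges,
                              num_samples=1000, seed=None):
        rng = np.random.default_rng(seed)
        target = predict(x.reshape(1, -1))[0]   # prediction to preserve

        lo, hi = feature_ranges                 # per-feature lower/upper bounds
        Z = rng.uniform(lo, hi, size=(num_samples, x.shape[0]))

        # Enforce the constraints: hold each constrained feature at its value.
        for j, value in constraints.items():
            Z[:, j] = value

        # Fraction of constrained samples whose prediction matches the original.
        return float(np.mean(predict(Z) == target))

A score of 1.0 over many samples suggests the constraint set is (empirically) prediction-invariant; the interpretability criterion then prefers the smallest such set.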