
MACHINE LEARNING INTERPRETABILITY: WHY AND HOW!

OmaymaS
December 10, 2019

With the increasing adoption of machine learning solutions across domains, black-box algorithms are being used more often with the promise of higher accuracy. However, this accuracy comes at the cost of interpretability, which creates a barrier to wider adoption of such algorithms in critical areas and raises skepticism among the individuals they affect. This talk focuses on the importance of interpretable machine learning, why it is crucial from technical and ethical perspectives, and its current limitations. In addition, it gives an overview of some of the relevant tools and packages (e.g. LIME, Shapley values).


Transcript

  1. RULES: • You are a classifier. • You will be given two labels with the corresponding mapping to 0/1. • You will be asked to classify some images. HOW WOULD YOUR MENTAL MODEL LABEL THIS IMAGE?
  2. CAT → 0 OR DOG → 1? HOW WOULD YOUR MENTAL MODEL LABEL THIS IMAGE?
  3. CAT → 0 OR DOG → 1? HOW WOULD YOUR MENTAL MODEL LABEL THIS IMAGE?
  4. DUCK → 0 OR RABBIT → 1? HOW WOULD YOUR MENTAL MODEL LABEL THIS IMAGE?
  5. “Uses AI to give you more insight into candidates, so you can make better decisions.” $#&T NONSENSE. 25,000 FEATURES → INSIGHT SCORE. Source: Business Insider Video (2017)
  6. IT IS HUMANS WHO COLLECT/LABEL DATA, WRITE ALGORITHMS, DEFINE METRICS. Bias in: REPRESENTATION, DISTRIBUTION, LABELS, AND MORE…
  7. IT IS HUMANS WHO COLLECT/LABEL DATA, WRITE ALGORITHMS, DEFINE METRICS: TRAIN/TEST SPLIT, FEATURES/PROXIES, MODEL COMPLEXITY, AND MORE…
  8. IT IS HUMANS WHO COLLECT/LABEL DATA, WRITE ALGORITHMS, DEFINE METRICS: WHAT IS THE IMPACT OF DIFFERENT ERROR TYPES ON DIFFERENT GROUPS? WHAT DO YOU OPTIMIZE FOR?
  9. PRACTITIONERS CONSISTENTLY: - OVERESTIMATE THEIR MODEL’S ACCURACY. - PROPAGATE FEEDBACK LOOPS. - FAIL TO NOTICE DATA LEAKS. Source: “Why Should I Trust You?”: Explaining the Predictions of Any Classifier
  10. LIME (Tabular Data), step 1: Select a point to explain (red). Based on an example in the “Interpretable Machine Learning” book by Christoph Molnar.
  11. LIME (Tabular Data), step 2: Sample data points. Based on an example in the “Interpretable Machine Learning” book by Christoph Molnar.
  12. LIME (Tabular Data), step 3: Weight the sampled points according to their proximity to the selected point. Based on an example in the “Interpretable Machine Learning” book by Christoph Molnar.
  13. LIME (Tabular Data), step 4: Train a weighted, interpretable local model. Based on an example in the “Interpretable Machine Learning” book by Christoph Molnar.
  14. LIME (Tabular Data), step 5: Explain the black-box model prediction using the local model. Based on an example in the “Interpretable Machine Learning” book by Christoph Molnar.
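To make the five tabular steps above concrete, here is a minimal Python sketch using the `lime` package; the dataset, model, and the row chosen for explanation are placeholders, not the example from the talk or from Molnar's book.

```python
# Minimal LIME (tabular) sketch. The dataset and model are placeholders,
# not the example used in the talk.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

data = load_breast_cancer()
X, y = data.data, data.target

# The "black-box" model whose prediction we want to explain.
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Steps 2-4 (sampling points, weighting them by proximity, and fitting a
# weighted interpretable model) all happen inside explain_instance().
explainer = LimeTabularExplainer(
    X,
    feature_names=data.feature_names,
    class_names=data.target_names,
    discretize_continuous=True,
)

# Step 1: pick the instance to explain. Step 5: read off the local model.
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print(exp.as_list())  # local feature contributions
print(exp.score)      # fidelity of the local surrogate model
```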
  15. LIME (Images). Original model: pre-trained ImageNet model. Label: tabby, tabby cat (probability: 0.29, explanation fit: 0.77); Label: Egyptian cat (probability: 0.28, explanation fit: 0.69).
  16. LIME (Images). Original model: pre-trained ImageNet model. Label: tabby, tabby cat (probability: 0.29, explanation fit: 0.77); Label: Egyptian cat (probability: 0.28, explanation fit: 0.69). Highlighted regions marked as Type: Supports or Type: Contradicts.
  17. LIME (Images). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. https://arxiv.org/pdf/1602.04938.pdf
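A hedged sketch of the same workflow for images with the Python `lime` package; the pre-trained network and the stand-in image below are illustrative assumptions, and the positive/negative superpixels correspond to the “Supports”/“Contradicts” regions on the slides.

```python
# LIME (images) sketch. The network and the stand-in image are placeholders.
import numpy as np
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from lime import lime_image
from skimage.segmentation import mark_boundaries

model = InceptionV3()  # pre-trained ImageNet classifier

def predict_fn(images):
    # lime passes batches of RGB arrays; preprocess and return class probabilities
    return model.predict(preprocess_input(np.array(images, dtype=np.float32)))

image = np.random.randint(0, 256, (299, 299, 3)).astype(np.double)  # stand-in image

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(image, predict_fn, top_labels=2, num_samples=1000)

# Superpixels that support or contradict the top predicted label.
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=False, num_features=10)
overlay = mark_boundaries(img / 255.0, mask)
```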
  18. LIME (Text). Original model: Keras model (CNN+LSTM). Label predicted: negative sentiment. Highlighted words: Boring, Stupid, Dumb, Waste, information.
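A hedged sketch of a LIME text explanation in Python; the tiny TF-IDF + logistic-regression pipeline below is only a stand-in for the talk's Keras CNN+LSTM sentiment model.

```python
# LIME (text) sketch. The classifier is a stand-in for the talk's CNN+LSTM model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

train_texts = [
    "great movie, loved it", "wonderful acting, would watch again",
    "boring and dumb, a waste of time", "stupid plot, terrible film",
]
train_labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(train_texts, train_labels)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
exp = explainer.explain_instance(
    "boring stupid dumb waste of information",
    model.predict_proba,
    num_features=5,
)
print(exp.as_list())  # word-level contributions to the predicted sentiment
```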
  19. LIME Pros: - Provides human-friendly explanations. - Gives a fidelity measure. - Can use features other than those used by the black-box model.
  20. LIME Pros: - Provides human-friendly explanations. - Gives a fidelity measure. - Can use features other than those used by the original model. Cons: - The definition of proximity is not fully resolved for tabular data. - Instability of explanations.
  21. LIME Pros: - Provides human-friendly explanations. - Gives a fidelity measure. - Can use features other than those used by the original model. Cons: - The definition of proximity is not fully resolved for tabular data. - Instability of explanations.
  22. SHAPLEY VALUES (from coalitional game theory): Explain the difference between the actual prediction and the average/baseline prediction of the black-box model.
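For reference (not shown on the slide), the Shapley value of feature j is its average marginal contribution over all coalitions S of the remaining features, where val(S) is the model's prediction when only the features in S are known. In LaTeX notation:

    \phi_j = \sum_{S \subseteq \{1,\dots,p\} \setminus \{j\}} \frac{|S|!\,(p - |S| - 1)!}{p!} \left( \mathrm{val}(S \cup \{j\}) - \mathrm{val}(S) \right)

The \phi_j of an instance sum to the difference between its actual prediction and the average/baseline prediction, which is the fair-distribution property the next slides list as a pro.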
  23. SHAPLEY VALUES Pros: - Solid theory. - The difference between the prediction and the average prediction is fairly distributed among the feature values of the instance.
  24. SHAPLEY VALUES Pros: - Solid theory. - The difference between the prediction and the average prediction is fairly distributed among the feature values of the instance. Cons: - Computationally expensive. - Can be misinterpreted. - Uses all the features (not ideal when you want explanations that contain only a few features).
  25. SHAPLEY VALUES Pros: - Solid theory. - The difference between the prediction and the average prediction is fairly distributed among the feature values of the instance. Cons: - Computationally expensive. - Can be misinterpreted. - Uses all the features (not ideal when you want explanations that contain only a few features).
  26. SHAP: Bar chart (left) and SHAP summary plot (right) for a gradient boosted decision tree model trained on the mortality dataset. Source: Explainable AI for Trees: From Local Explanations to Global Understanding.
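A minimal sketch of producing such plots with the Python `shap` package; the XGBoost model and dataset below are placeholders, not the mortality model from the paper.

```python
# SHAP for trees sketch. Model and data are placeholders, not the mortality
# dataset from the paper.
import shap
import xgboost
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

model = xgboost.XGBClassifier(n_estimators=200).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

shap.summary_plot(shap_values, X, plot_type="bar")  # global mean |SHAP| bar chart
shap.summary_plot(shap_values, X)                   # beeswarm summary plot
```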
  27. PREDICTION AUDITING AND MODEL MONITORING: • Deploy the model. • Log feature attributions for certain predictions (e.g. rare classes). • Monitor model training/live skew. • Monitor attribution skew and focus on the most influential features. Sources: Explainable AI for Trees: From Local Explanations to Global Understanding; Google AI Explainability Whitepaper.
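One way to act on the attribution-skew point is to compare the mean absolute attribution per feature between training data and live traffic and flag large shifts. The sketch below is illustrative; the threshold and helper names are assumptions, not taken from the whitepaper.

```python
# Attribution-skew monitoring sketch. Threshold and structure are illustrative,
# not prescribed by the Google AI Explainability Whitepaper.
import numpy as np
import pandas as pd

def mean_abs_attribution(explainer, X):
    """Mean |SHAP value| per feature over a dataset."""
    shap_values = explainer.shap_values(X)
    return pd.Series(np.abs(shap_values).mean(axis=0), index=X.columns)

def attribution_skew(explainer, X_train, X_live, threshold=0.25):
    """Return features whose mean attribution shifted by more than `threshold` (relative)."""
    train_attr = mean_abs_attribution(explainer, X_train)
    live_attr = mean_abs_attribution(explainer, X_live)
    rel_shift = (live_attr - train_attr).abs() / (train_attr + 1e-9)
    return rel_shift[rel_shift > threshold].sort_values(ascending=False)

# Usage (with the TreeExplainer from the previous sketch):
# skewed = attribution_skew(explainer, X_train, X_live)
# if not skewed.empty: alert on the most influential shifted features
```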
  28. LIMITATIONS: • Explanations are LOCAL (each attribution only shows how much the feature affected the prediction for that particular example). • Explanations/feature attributions are subject to adversarial attacks, just as predictions in complex models are. • Explanations alone cannot tell you whether your model is fair, unbiased, or of sound quality. • The different methods are complementary tools, to be combined with other approaches and the practitioners’ best judgement. • Explanations might be misinterpreted in some cases. Sources: Google AI Explainability Whitepaper (https://cloud.google.com/ml-engine/docs/ai-explanations/limitations); Limitations of Interpretable Machine Learning Methods.