[PyCon JP 2018] Interpretable Machine Learning, making black box models explainable with Python!

Interpretable Machine Learning: Making black-box models explainable with Python!
David Low
Co-founder / CDS

Bio

Overview

Machine Learning is everywhere! Source: Nvidia

Source: xkcd

Is there a need to explain ML models?
• Safety: Make sure the system is making sound decisions.
• Debugging: Understand why a system doesn't work, so we can fix it.
• Science: Enable new discoveries.
• Mismatched objectives and multi-objective trade-offs: The system may not be optimizing the true objective.
• Legal / Ethics: An explanation may be legally required, and the system must avoid discriminating against particular groups due to bias in the data.

EU General Data Protection Regulation (GDPR)
• Articles 15 and 22 of the GDPR suggest that a customer/user has the right to request information about, or an explanation of, a decision made by an automated system.
• Such a right would enable people to ask how a specific decision (e.g. being declined insurance or being denied a promotion) was reached.

A highly accurate model to classify Wolf and Husky

A highly accurate model BUT...
• Does the model learn the RIGHT things? The answer is “NO”. Instead of learning the appearance features of the animals, the model picks up the signal from the background (snow, in this case).

Machine Learning Model
• An ML model is a function that takes an input and produces an output: x (Data) → f(x) → y (Output)

Complexity of learned function
1. Linear + Monotonic function
2. Non-linear + Monotonic function
3. Non-linear + Non-monotonic function
Complexity increases down the list, so the function becomes harder to interpret.
E.g. linear models:
• Linear Regression, Logistic Regression, Naive Bayes...
E.g. non-linear models:
• Neural Network, Tree-based (Random Forest, Gradient Boosting)...

Monotonicity
[Figure: two plots of y against x, one non-monotonic and one monotonic]
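To make the distinction concrete, here is a small sketch that checks monotonicity numerically on a grid. The helper `is_monotonic` and the three example functions are illustrative assumptions, not part of the talk, and a grid check can only suggest monotonicity over the sampled range.

```python
def is_monotonic(f, xs):
    """Return True if f never changes direction over the sorted grid xs."""
    ys = [f(x) for x in sorted(xs)]
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


# Illustrative grid from -3.0 to 3.0 in steps of 0.1.
grid = [i / 10 for i in range(-30, 31)]


def linear(x):
    return 2 * x + 1   # linear and monotonic


def cubic(x):
    return x ** 3      # non-linear but still monotonic


def parabola(x):
    return x ** 2      # non-linear and non-monotonic


for name, f in [("linear", linear), ("cubic", cubic), ("parabola", parabola)]:
    print(name, is_monotonic(f, grid))
```

The three functions correspond to the three complexity classes listed on the previous slide.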

Scope of Interpretability
• Global Interpretability
◦ How do parts of the model influence predictions?
• Local Interpretability
◦ Why did the model make a specific decision for an instance?
◦ What factors contributed to a particular decision impacting a specific person?

Approaches
• Interpretable Models
• Model-Agnostic Methods
◦ Partial Dependence Plot (PDP)
◦ Individual Conditional Expectation (ICE)
◦ Feature Importance
◦ Surrogate Models

Partial Dependence Plot (PDP) I
• Shows the marginal effect of a feature on the predicted outcome of a previously fit model
• Steps
◦ Select a feature: Temperature
◦ Identify a list of values for that feature: 0 to 35 Celsius
◦ Iterate over the list
▪ Replace Temperature in every instance with one value at a time
▪ Take the average of the prediction outputs
◦ Repeat for other features
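The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the talk: `model_predict`, the humidity feature and the five data rows are made-up stand-ins for a fitted model and its data.

```python
def model_predict(row):
    """Hypothetical black-box model: output peaks near 25 C and falls with humidity."""
    temperature, humidity = row
    return 100 - 0.5 * (temperature - 25) ** 2 - 0.3 * humidity


def partial_dependence(predict, rows, feature_index, grid):
    """Average the predictions over all rows with one feature forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # replace the feature, keep the rest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


# Illustrative (temperature, humidity) rows; not real data.
data = [(5, 80), (12, 60), (20, 50), (28, 40), (33, 30)]
temperature_grid = list(range(0, 36, 5))  # 0, 5, ..., 35 Celsius

pdp = partial_dependence(model_predict, data, 0, temperature_grid)
for temperature, average in zip(temperature_grid, pdp):
    print(temperature, round(average, 1))
```

Plotting `pdp` against `temperature_grid` gives the partial dependence curve for Temperature; repeating the call with a different `feature_index` covers the other features.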

Partial Dependence Plot (PDP) II

Visualize in 3D

Individual Conditional Expectation (ICE)
• Visualizes the dependence of the predicted response on a feature for EACH instance separately, resulting in multiple lines, one for each instance.
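A minimal sketch of ICE, assuming a hypothetical model with an interaction between temperature and a working-day flag (both the model and the four rows are invented for illustration). Each instance gets its own line, and averaging the lines point by point recovers the partial dependence curve.

```python
def model_predict(row):
    """Hypothetical model: temperature raises the output on working days, lowers it otherwise."""
    temperature, working_day = row
    slope = 2.0 if working_day else -0.5
    return 10 + slope * temperature


def ice_curves(predict, rows, feature_index, grid):
    """Return one prediction line per row, varying only the chosen feature."""
    curves = []
    for row in rows:
        line = []
        for value in grid:
            modified = list(row)
            modified[feature_index] = value
            line.append(predict(modified))
        curves.append(line)
    return curves


# Illustrative (temperature, working_day) rows; not real data.
data = [(10, 1), (20, 1), (15, 0), (30, 0)]
grid = [0, 10, 20, 30]

curves = ice_curves(model_predict, data, 0, grid)
# The PDP is the point-wise average of the ICE lines.
pdp = [sum(column) / len(column) for column in zip(*curves)]

for row, line in zip(data, curves):
    print(row, line)
print("average:", pdp)
```

In this toy case the averaged curve rises gently, while the individual lines show that the effect actually has opposite signs for the two groups, which is the kind of heterogeneity ICE is meant to reveal.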

Feature Importance I
• Determined by the change in the model’s prediction error after permuting the feature’s values
• Steps
◦ For each feature
▪ Replace the selected feature with noise (random) values
▪ Measure the change in prediction error
• Important feature → Decline in accuracy
• Less important feature → No change / Increase in accuracy
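The procedure above can be sketched as follows. This is an illustrative implementation that shuffles each column in turn and reports the average increase in mean squared error; `model_predict`, the synthetic rows and the choice of error metric are assumptions made for the example.

```python
import random


def model_predict(row):
    """Hypothetical fitted model: relies on feature 0, slightly on feature 1, ignores feature 2."""
    return 3.0 * row[0] + 0.5 * row[1]


def mean_squared_error(predict, rows, targets):
    return sum((predict(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Average increase in error after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(predict, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data: 200 rows with 3 features drawn uniformly from [0, 10].
data_rng = random.Random(42)
rows = [[data_rng.uniform(0, 10) for _ in range(3)] for _ in range(200)]
targets = [model_predict(r) for r in rows]  # noise-free, so the baseline error is 0

scores = permutation_importance(model_predict, rows, targets)
for index, score in enumerate(scores):
    print(f"feature {index}: error increase {score:.2f}")
```

A feature the model ignores gets a score of zero, because shuffling it leaves every prediction unchanged.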

Feature Importance II

Local Interpretable Model-Agnostic Explanations (LIME)
• Local surrogate models that can explain single predictions of any black-box machine learning model
• Surrogate models are interpretable models (Linear model / Decision Tree) that are learned on the predictions of the original black-box model.
• How it works
◦ Generate an artificial dataset from the example we’re going to explain.
◦ Use the original model to get target values for each example in the generated dataset.
◦ Train a new interpretable model, using the generated dataset and generated labels as training data.
◦ Explain the original example through the weights/rules of this new model.
*The prediction quality of the white-box classifier shows how well it approximates the original model. If the quality is low, the explanation shouldn’t be trusted.
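The four steps can be illustrated with a local linear surrogate written in plain Python. This sketch does not use the LIME library or reproduce its exact sampling and kernel choices; `black_box`, the Gaussian perturbations, the exponential proximity kernel and all parameter values are assumptions chosen for the example.

```python
import math
import random


def black_box(x):
    """Hypothetical non-linear model standing in for the real black box."""
    return x[0] ** 2 + math.sin(x[1])


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def explain_locally(predict, instance, n_samples=500, scale=0.3,
                    kernel_width=0.5, seed=0):
    """Fit a proximity-weighted linear model around one instance."""
    rng = random.Random(seed)
    d = len(instance)
    # 1. Generate an artificial dataset by perturbing the instance.
    offsets = [[rng.gauss(0, scale) for _ in range(d)] for _ in range(n_samples)]
    # 2. Label the generated examples with the original model.
    labels = [predict([x + o for x, o in zip(instance, off)]) for off in offsets]
    # Weight each example by its proximity to the instance.
    weights = [math.exp(-sum(o * o for o in off) / kernel_width ** 2)
               for off in offsets]
    # 3. Train the surrogate: weighted least squares on [1, offset_1, ..., offset_d].
    design = [[1.0] + off for off in offsets]
    k = d + 1
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(weights, design))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * r[i] * y for w, r, y in zip(weights, design, labels))
            for i in range(k)]
    intercept, *coefficients = solve_linear_system(xtwx, xtwy)
    # 4. The coefficients are the explanation for this instance.
    return intercept, coefficients


intercept, coefficients = explain_locally(black_box, [2.0, 0.0])
print("local weights:", [round(c, 2) for c in coefficients])
```

Around the point (2, 0) the weights should land close to the local slopes of the toy function (about 4 for the first feature and about 1 for the second), which is what "explaining the example through the weights" means here. Measuring how well the surrogate reproduces the black-box outputs on the generated samples would be the natural next check before trusting the explanation.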

No Perfect Solution

Do it with Python

Libraries
• Scikit-Learn
◦ https://scikit-learn.org/
• Local Interpretable Model-Agnostic Explanations (LIME)
◦ https://github.com/marcotcr/lime
• ELI5
◦ https://github.com/TeamHG-Memex/eli5

Dog Breeds Classification
• Stanford Dogs Dataset (subset of ImageNet)
◦ http://vision.stanford.edu/aditya86/ImageNetDogs/
• Summary
◦ 120 dog breeds
◦ Around 150 images per class
◦ In total, 20580 images

Further readings

We’re hiring Front-end Developer and SW Engineer!
Send your CV to [email protected]

DEMO
