Slide 1

Machine Learning Interpretability in Python
AbdulMajedRaja RS (@1littlecoder)

Slide 2

Outline:
● What is Interpretable ML / Explainable AI?
● Why is interpretability required in Machine Learning?
● How is it relevant to me?
● Types of Interpretability (Global & Local)
● IML with Python (LIME, SHAP)
● References
(Less ML → More ML)

Slide 3

Heavily borrowed from:

Slide 4

The mandatory joke before the presentation: https://xkcd.com/1838/

Slide 5

Interpretable ML / eXplainable AI
Interpretable Machine Learning refers to methods and models that make the behavior and predictions of machine learning systems understandable to humans.
In other words, it is the ability to interpret / explain your ML model's or algorithm's predictions.

Slide 6

Why IML?
● Fairness: Ensuring that predictions are unbiased and do not implicitly or explicitly discriminate against protected groups.
● Privacy: Ensuring that sensitive information in the data is protected.
● Reliability or Robustness: Ensuring that small changes in the input do not lead to large changes in the prediction.
● Causality: Checking that only causal relationships are picked up.
● Trust: It is easier for humans to trust a system that explains its decisions than a black box.
● Legal: Regulations such as the GDPR emphasise a Right to Explanation.

Slide 7

Simply put: “An interpretable model can tell you why it has decided that a certain person should not get a loan, and it becomes easier for a human to judge whether the decision is based on a learned demographic (e.g. racial, gender) bias.”

Slide 8

When things go wrong!

Slide 9

The Big B Value - BRAND
DHH, creator of Ruby on Rails: https://twitter.com/dhh/status/1192946583832158208

Slide 10

How is it relevant to me?
● You are a Data Scientist who takes pride in your work. But what is that pride worth when you have no clue why your model does what it does?
● You may not be a Data Scientist or ML Engineer, but as a technologist / team member on a given project, you want to validate what is inside it.
● A lot of the time, the data points used for an ML model are real human beings. That could be you and me, today or some day!

Slide 11

Types of IML
● Model-Specific
● Model-Agnostic
● Global Interpretability
● Local Interpretability

Slide 12

LIME: Local Interpretable Model-agnostic Explanations
Surrogate models are trained to approximate the predictions of the underlying black-box model.
Works for Tabular, Text, and Image datasets.

Slide 13

LIME - pip install lime
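The code shown on this slide is not preserved in the transcript. Below is a minimal sketch of how LIME is typically applied to tabular data; the Iris dataset, the random forest, and the parameter choices are illustrative assumptions, not the exact code from the talk.

    # Illustrative sketch, not the talk's original code.
    # Train any black-box classifier, then explain one prediction with LIME.
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from lime.lime_tabular import LimeTabularExplainer

    data = load_iris()
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(data.data, data.target)

    # LIME samples perturbed points around the instance and fits a simple
    # (surrogate) linear model locally to approximate the black box.
    explainer = LimeTabularExplainer(
        training_data=data.data,
        feature_names=data.feature_names,
        class_names=list(data.target_names),
        mode="classification",
    )

    exp = explainer.explain_instance(data.data[0], model.predict_proba, num_features=4)
    print(exp.as_list())        # (feature, weight) pairs for this one prediction
    # exp.show_in_notebook()    # interactive view inside Jupyter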

Slide 14

LIME - pip install lime
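Since LIME also works for text (as noted earlier), here is a similarly hedged sketch for a text classifier. The 20-newsgroups pipeline is an assumption used only to make the example self-contained.

    # Illustrative sketch: LIME highlights which words pushed the prediction.
    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from lime.lime_text import LimeTextExplainer

    cats = ["sci.med", "sci.space"]
    train = fetch_20newsgroups(subset="train", categories=cats)

    pipe = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    pipe.fit(train.data, train.target)

    explainer = LimeTextExplainer(class_names=cats)
    exp = explainer.explain_instance(train.data[0], pipe.predict_proba, num_features=6)
    print(exp.as_list())   # word -> contribution for this one document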

Slide 15

LIME - Pros
● Model-agnostic, handles different data types (Tabular/Text)
● Easy to interpret for non-ML users (even non-techies)
● Relatively faster than SHAP in many cases
● The local fidelity measure helps us understand how well LIME is approximating the black-box model

Slide 16

LIME - Cons
● Interpretability can be unstable (since it is based on sampling of data)
● The local model approximation can be poor, making LIME's explanations unreliable
● First-time setup can be tedious for newcomers

Slide 17

SHAP: SHapley Additive exPlanations
Based on Shapley values from game theory.
Shapley values, a method from coalitional game theory, tell us how to fairly distribute the “payout” among the features.
https://datascience.sia-partners.com/en/blog/interpretable-machine-learning

Slide 18

SHAP: SHapley Additive exPlanations
Based on Shapley values from game theory.
Shapley values, a method from coalitional game theory, tell us how to fairly distribute the “payout” among the features.
Features (columns) of a data instance act as players in a coalition.
https://datascience.sia-partners.com/en/blog/interpretable-machine-learning
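Not on the original slide, but for reference: the Shapley value that SHAP builds on can be written (in LaTeX notation) as follows, where N is the set of all features, S is a coalition of features not containing feature i, and v(S) is the “payout” (the model's prediction) when only the features in S are known:

    \phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!} \bigl( v(S \cup \{i\}) - v(S) \bigr)

In words, feature i's attribution is its average marginal contribution to the prediction over all possible orders in which features could be added to the coalition.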

Slide 19

SHAP - pip install shap
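Again, the original code slide is not in this transcript. A minimal sketch of a local explanation with SHAP's TreeExplainer might look like the following; the diabetes dataset and the random forest regressor are illustrative assumptions.

    # Illustrative sketch: explain one prediction of a tree-based model with SHAP.
    import shap
    from sklearn.datasets import load_diabetes
    from sklearn.ensemble import RandomForestRegressor

    X, y = load_diabetes(return_X_y=True, as_frame=True)
    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

    explainer = shap.TreeExplainer(model)    # TreeSHAP: fast and exact for trees
    shap_values = explainer.shap_values(X)   # one row of Shapley values per sample

    # Local explanation: how each feature pushes this one prediction away
    # from the dataset's average prediction (the "base value").
    shap.force_plot(explainer.expected_value, shap_values[0, :], X.iloc[0, :], matplotlib=True)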

Slide 20

SHAP - Global Explanation
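The plot on this slide is not preserved; below is a hedged sketch of how such a global view is usually produced, aggregating per-sample Shapley values across the whole dataset (the setup mirrors the assumed example on the previous slide so it runs on its own).

    # Illustrative sketch: global SHAP explanation via a summary plot.
    import shap
    from sklearn.datasets import load_diabetes
    from sklearn.ensemble import RandomForestRegressor

    X, y = load_diabetes(return_X_y=True, as_frame=True)
    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    shap_values = shap.TreeExplainer(model).shap_values(X)

    # Each dot is one sample; features are ranked by overall impact on the output.
    shap.summary_plot(shap_values, X)

    # Or the mean |SHAP value| per feature, as a simple bar chart.
    shap.summary_plot(shap_values, X, plot_type="bar")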

Slide 21

SHAP - Pros
● Quite easy to get up and running
● Strong theoretical foundation (in game theory) makes SHAP a nice candidate for legal/compliance requirements
● TreeExplainer (TreeSHAP) is fast for tree-based models
● SHAP is a unified package: Local and Global interpretability, all based on the same Shapley values (whereas LIME is Local only)
● Interactive visualizations in notebooks are more intuitive for business users

Slide 22

SHAP - Cons
● KernelExplainer (the model-agnostic explainer) is slow and ignores feature dependence (TreeExplainer addresses both)

Slide 23

Thanks! Please share your feedback, NOW!
“Hey Abdul @1littlecoder, your talk _______________”