Machine Learning Interpretability in Python

AbdulMajedRaja RS

February 22, 2020

Transcript

  1. Outline:
     • What is Interpretable ML / Explainable AI?
     • Why is Interpretability required in Machine Learning?
     • How is it relevant to me?
     • Types of Interpretability (Global & Local)
     • IML with Python References (LIME, SHAP)
     (Less ML → More ML)

  2. Interpretable ML / eXplainable AI
     Interpretable Machine Learning refers to methods and models that make the behavior and predictions of machine learning systems understandable to humans.
     In other words: the ability to interpret / explain your ML model's or algorithm's predictions.

  3. Why IML?
     • Fairness: Ensuring that predictions are unbiased and do not implicitly or explicitly discriminate against protected groups.
     • Privacy: Ensuring that sensitive information in the data is protected.
     • Reliability or Robustness: Ensuring that small changes in the input do not lead to large changes in the prediction.
     • Causality: Checking that only causal relationships are picked up.
     • Trust: It is easier for humans to trust a system that explains its decisions than a black box.
     • Legal: Compliance requirements (such as the GDPR) emphasise the Right to Explanation.

  4. Simply put:
     “An interpretable model can tell you why it has decided that a certain person should not get a loan, and it becomes easier for a human to judge whether the decision is based on a learned demographic (e.g. racial, gender) bias.”

  5. How is it relevant to me?
     • You are a Data Scientist who takes pride in your work, but what is that pride worth when you have no clue why your model does what it does?
     • You may not be a Data Scientist or ML Engineer, but as a technologist or team member on a project, you still want to validate what is inside it.
     • A lot of the time, the data points used for an ML model are real human beings. It could be you and I, today or some day!

  6. LIME - Local Interpretable Model-agnostic Explanations
     Surrogate models are trained to approximate the predictions of the underlying black-box model.
     Works for tabular, text and image datasets.

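For context, a minimal sketch of what a LIME explanation looks like in Python for tabular data, assuming a scikit-learn classifier; the dataset, model and parameter choices below are illustrative, not taken from the talk:

```python
# A minimal LIME sketch for tabular data (assumes: pip install lime scikit-learn)
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from lime.lime_tabular import LimeTabularExplainer

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=42)

# The black-box model whose predictions we want to explain
model = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)

# LIME fits a simple surrogate model around one instance at a time
explainer = LimeTabularExplainer(
    training_data=X_train,
    feature_names=data.feature_names,
    class_names=data.target_names,
    mode="classification",
)
exp = explainer.explain_instance(
    data_row=X_test[0],
    predict_fn=model.predict_proba,   # LIME only needs a prediction function
    num_features=5,                   # top features to show in the local explanation
)
print(exp.as_list())      # (feature condition, weight) pairs for this one prediction
# exp.show_in_notebook()  # interactive view inside a Jupyter notebook
```
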
  7. LIME - Pros
     • Model-agnostic; works with different data types (Tabular/Text)
     • Easy to interpret for non-ML users (even non-techies)
     • Relatively faster than SHAP in many cases
     • The local fidelity measure tells us how well LIME is approximating the black-box model (see the snippet below)

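Continuing the sketch above: the local surrogate exposes an R² score that is usually read as the local fidelity measure. The attribute name below follows the lime package; treat it as something to verify against your installed version:

```python
# Local fidelity: R^2 of the surrogate model LIME fitted around this instance
# (`exp` is the Explanation object from the sketch above)
print("local fidelity (R^2):", exp.score)
# Close to 1 -> the surrogate tracks the black box well near this instance
# Close to 0 -> treat this particular explanation with caution
```
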
  8. LIME - Cons
     • Explanations can be unstable, since they are based on sampling of the data (see the sketch after this list)
     • The local model approximation can be poor, making the LIME explanation unreliable
     • First-time setup can be tedious for newcomers

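One way to see (and partly tame) the sampling-related instability is to explain the same instance with two differently seeded explainers; a fixed random_state makes an individual run reproducible. A sketch, reusing the objects from the earlier LIME example:

```python
# Instability check: explain the same instance with two differently seeded explainers
def top_features(seed):
    explainer = LimeTabularExplainer(
        training_data=X_train,
        feature_names=data.feature_names,
        class_names=data.target_names,
        mode="classification",
        random_state=seed,   # fix the sampling seed so a single run is reproducible
    )
    exp = explainer.explain_instance(X_test[0], model.predict_proba, num_features=5)
    return [name for name, _ in exp.as_list()]

print(top_features(0))
print(top_features(1))   # differing lists = an unstable explanation for this instance
```
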
  9. SHAP - SHapley Additive exPlanations
     Based on Shapley values from game theory.
     Shapley values, a method from coalitional game theory, tell us how to fairly distribute the “payout” among the features.
     https://datascience.sia-partners.com/en/blog/interpretable-machine-learning

  10. SHAP - SHapley Additive exPlanations
      Based on Shapley values from game theory.
      Shapley values, a method from coalitional game theory, tell us how to fairly distribute the “payout” among the features.
      The features (columns) of a data instance act as the players in the coalition.
      https://datascience.sia-partners.com/en/blog/interpretable-machine-learning

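A minimal, self-contained sketch of computing Shapley values with the shap package on a tree model; the dataset and model below are illustrative, and return shapes can differ slightly between shap versions:

```python
# A self-contained SHAP sketch (assumes: pip install shap scikit-learn)
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=200, random_state=42).fit(X, y)

# TreeSHAP: fast, exact Shapley values for tree-based models
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)          # shape: (n_samples, n_features)

# For one instance, the Shapley values distribute the "payout"
# (this prediction minus the average prediction) among the features (the players)
i = 0
print("average prediction:", explainer.expected_value)
print("this prediction   :", model.predict(X.iloc[[i]])[0])
for name, value in zip(X.columns, shap_values[i]):
    print(f"  {name:>4}: {value:+.2f}")
```
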
  11. SHAP - Pros
      • Quite easy to get up and running
      • A strong theoretical foundation (in game theory) makes SHAP a good candidate for legal/compliance requirements
      • TreeExplainer (TreeSHAP) is fast for tree-based models
      • SHAP is a unified package: local and global interpretability, both based on the same Shapley values (with LIME you only get local explanations); see the sketch after this list
      • Interactive visualizations in notebooks are more intuitive for business users

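Continuing the sketch above, the same Shapley values feed both a local explanation of one prediction and a global summary over the whole dataset. The plot function names follow the shap package; the visuals depend on your shap version and on running inside a notebook:

```python
# Local: explain a single prediction (interactive force plot in a notebook)
shap.initjs()                                    # load the JS visualisation support
shap.force_plot(explainer.expected_value, shap_values[0], X.iloc[0])

# Global: aggregate the same Shapley values over all rows
shap.summary_plot(shap_values, X)                # importance + direction of effect
```
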
  12. SHAP - Cons
      • KernelExplainer (the model-agnostic explainer) is slow and ignores feature dependence; TreeExplainer addresses both (see the sketch below)

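A sketch of the model-agnostic KernelExplainer with the usual mitigations for its cost: summarise the background data and cap the number of samples. Parameter names follow the shap package; return types may vary slightly across versions:

```python
# Model-agnostic SHAP: works with any predict function, but is much slower
background = shap.kmeans(X, 10)                  # summarise background data to 10 centroids
kernel_explainer = shap.KernelExplainer(model.predict, background)

# Explain only a few rows and cap the number of model evaluations per row
kernel_values = kernel_explainer.shap_values(X.iloc[:5], nsamples=200)
print(kernel_values.shape)                       # (5, n_features) for a single-output model
```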