
Demystifying complex models: Learnings from using SHAP explainers in the real world at GoCardless (London Python meetup)

Complex algorithms (GBDTs, deep neural nets, etc.) can perform far better than linear models because they capture non-linear behaviour and interaction effects. However, interpreting these models is typically harder, and in some cases (e.g. neural nets) practically impossible. For ML applications where explainability at the local (individual) level is key, this has typically meant being limited to simpler, more explainable models like Logistic Regression.

Recently we have seen advances in using simpler, locally interpretable models that are trained on top of the outputs of complex models. SHAP (SHapley Additive exPlanations) is a unified approach to explain the output of any machine learning model.

In this talk, we will share our experience of using SHAP in a real-world ML application, the changes we made to both our training and prediction phases, and the considerations to take into account when using SHAP.

Yasir Ekinci

September 27, 2018


Transcript

  1. Start with why: Why is this score high? How do we know if the model is right? If this is not fraud, why did the model make a mistake?
  2. WTF is SHAP? SHapley Additive exPlanations: a unified approach to explain the output of any machine learning model [kind of]
  3. Diagram: how SHAP fits in the ML process. Input data goes into the complex model, which produces the prediction; the SHAP model sits on top of the complex model and produces the explanation. https://github.com/slundberg/shap/blob/master/README.md
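
A minimal sketch of that flow in Python, assuming a scikit-learn gradient-boosted tree classifier and the shap package (the synthetic data and feature names below are illustrative, not the actual GoCardless setup):

    import pandas as pd
    import shap
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    # Input data -> complex model -> prediction
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(20)])
    model = GradientBoostingClassifier().fit(X, y)
    prediction = model.predict_proba(X.iloc[:1])

    # Trained complex model -> SHAP explainer -> explanation
    explainer = shap.TreeExplainer(model)
    explanation = explainer.shap_values(X.iloc[:1])  # one SHAP value per feature, in log odds
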
  4. Estimator: wrapping a classifier + explainer. The estimator holds the untrained complex model and a placeholder for the explainer. https://gist.github.com/yoziru-desu/8093eada2b612ca144b233d538488340
  5. Training phase: adding the explainer to the fit function. Fit the model with the training data; then fit the SHAP TreeExplainer with the trained complex model.
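
The gist linked under slide 4 has the actual implementation; as a rough, hypothetical illustration of slides 4 and 5 (class and attribute names here are placeholders, not GoCardless's code), the estimator holds an untrained complex model plus a placeholder for the explainer, and fit() first trains the model and then fits a shap.TreeExplainer on it:

    import shap
    from sklearn.base import BaseEstimator, ClassifierMixin

    class ExplainableEstimator(BaseEstimator, ClassifierMixin):
        """Hypothetical wrapper: an untrained complex model plus a
        placeholder for the SHAP explainer, filled in during fit()."""

        def __init__(self, model):
            self.model = model       # untrained complex model
            self.explainer_ = None   # placeholder for the explainer

        def fit(self, X, y):
            # Training phase: fit the model with the training data ...
            self.model.fit(X, y)
            # ... then fit the explainer with the trained complex model.
            self.explainer_ = shap.TreeExplainer(self.model)
            return self

        def predict_proba(self, X):
            return self.model.predict_proba(X)

        def explain(self, X):
            # One SHAP value per feature per row, in log odds for tree classifiers.
            return self.explainer_.shap_values(X)

Something like ExplainableEstimator(GradientBoostingClassifier()).fit(X, y) then serves predictions and explanations from a single object.
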
  6. What do SHAP explanations look like?
     Feature      Input value   Explanation (log odds)
     feature_2    7.81804       +5.74192
     feature_7    TRUE          +0.717551
     feature_18   FALSE         -0.33685
     feature_28   0.208333      +0.554768
     feature_17   0.87623       -0.27466
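
A table like the one above can be assembled from the explainer output; a sketch (the helper name and pandas layout are ours, assuming numeric features and a binary tree classifier whose shap_values come back as a single array):

    import pandas as pd

    def explanation_table(explainer, x):
        # x is a single row as a pandas Series (index = feature names).
        # Note: for some model types shap_values returns a list, one array per class.
        shap_values = explainer.shap_values(x.to_frame().T)[0]
        return pd.DataFrame({
            "Feature": x.index,
            "Input value": x.values,
            "Explanation (log odds)": shap_values,
        }).sort_values("Explanation (log odds)", key=abs, ascending=False)

    print(explanation_table(explainer, X.iloc[0]))  # explainer and X from the slide-3 sketch
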
  7. Core properties of explainability on the local level: positive vs negative impact, relative importance, and SHAP values are additive.
     Feature      Explanation (log odds)
     feature_2    +5.74192
     feature_7    +0.717551
     feature_28   +0.554768
     feature_17   -0.27466
     feature_18   -0.33685
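
The additivity property can be checked directly: the explainer's expected value (the base rate in log odds) plus a row's SHAP values adds up to the model's raw log-odds output for that row. A sketch, reusing explainer, model and X from the slide-3 snippet:

    import numpy as np
    from scipy.special import expit  # logistic function: log odds -> probability

    row = X.iloc[:1]
    shap_values = explainer.shap_values(row)[0]
    # expected_value is a scalar for this model type, an array for some others
    base_value = np.ravel(explainer.expected_value)[0]

    log_odds = base_value + shap_values.sum()   # SHAP values are additive
    print(expit(log_odds))                      # probability reconstructed from the explanation
    print(model.predict_proba(row)[0, 1])       # should match up to numerical precision
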
  8. Global SHAP summary: what does it tell us? Positive vs negative impact and relative importance across the whole dataset. But what about categorical features? Missing or non-numeric values?
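
The global summary comes from computing SHAP values for the whole dataset and feeding them to shap's built-in summary plot. Continuing the earlier snippet:

    # Each point is one row's SHAP value for one feature: position = impact
    # on the output (log odds), colour = the feature's value, features sorted
    # by overall importance. Non-numeric / categorical values cannot be
    # colour-coded, which is the caveat raised on the slide.
    shap_values_all = explainer.shap_values(X)
    shap.summary_plot(shap_values_all, X)
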
  9. Visualising a single feature can uncover relationships: is the relationship between the output and this feature linear / exponential / ...? Density of the impact: how likely is a certain feature value to occur?
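
That single-feature view corresponds to shap's dependence plot: the feature's value on the x-axis, its SHAP value (impact in log odds) on the y-axis, with the scatter density showing how often each value occurs. Continuing the earlier snippet:

    # Relationship between one feature's value and its impact on the output.
    shap.dependence_plot("feature_2", shap_values_all, X, interaction_index=None)
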
  10. Interaction effects: still hard to explain. Doable for 2 features (2D), but the human brain is limited to 3 dimensions. What’s the interaction with the other 20 features? (20D) E.g. feature_13 = TRUE vs feature_13 = FALSE.
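
For pairs of features, TreeExplainer can also compute SHAP interaction values, and the dependence plot can colour points by a second feature; beyond two features this stops being something a human can read, which is the point of the slide. Continuing the earlier snippet (feature names are illustrative):

    # Pairwise interaction effects: one matrix per row
    # (shape: n_samples x n_features x n_features).
    interaction_values = explainer.shap_interaction_values(X)

    # 2D view: one feature's impact, coloured by a second feature
    # (e.g. splitting by feature_13 = TRUE vs FALSE as on the slide).
    shap.dependence_plot("feature_2", shap_values_all, X, interaction_index="feature_13")
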
  11. Summary: 1) SHAP is relatively easy to integrate. 2) It comes with a computational cost. 3) It works well for explaining local predictions. 4) It makes the model more explainable globally. 5) But interaction effects are still hard to understand.
  12. We’re hiring! Want to know more about the job opportunities within our data and engineering teams? Speak with Haron or email [email protected]