
Demystifying complex models: Learnings from using SHAP explainers in the real world at GoCardless (London Python meetup)

Yasir Ekinci
September 27, 2018


Complex algorithms (GBDTs, deep neural nets, etc.) can perform far better than linear models because they capture non-linear behaviour and interaction effects. However, interpreting these models is typically much harder, and in some cases (e.g. neural nets) close to impossible. For ML applications where explainability at the local (individual prediction) level is key, this has typically meant being limited to simpler, more explainable models like Logistic Regression.

Recently we have seen advances in using simpler, locally interpretable models that are trained on top of the outputs of complex models. SHAP (SHapley Additive exPlanations) is a unified approach to explain the output of any machine learning model.

In this talk, we will share our experience of using SHAP in a real-world ML application, the changes we made to both our training and prediction phases, and the considerations to take into account when using SHAP.


Transcript

  1. Demystifying complex models
     What we learned using SHAP explainers in the field
     Yasir Ekinci


  2. Start with why


  3. Start with why
     Why is this score high?
     How do we know if the model is right?
     If this is not fraud, why did the model make a mistake?


  4. The dilemma: simple vs complex models
     Simple model: explainable
     Complex model: accurate


  5. WTF is SHAP?
     SHapley Additive exPlanations
     a unified approach to explain the output of any machine learning model [kind of]


  6. Diagram: how SHAP fits in the ML process
     [Diagram elements: input data, complex model, prediction, SHAP model, explanation]
     https://github.com/slundberg/shap/blob/master/README.md
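     For reference, a minimal sketch of that flow, assuming a tree-based complex model (sklearn's GradientBoostingClassifier here) and shap's TreeExplainer; the data and feature names are synthetic, not the talk's.

     import pandas as pd
     import shap
     from sklearn.datasets import make_classification
     from sklearn.ensemble import GradientBoostingClassifier

     # Synthetic input data standing in for the real feature matrix
     X, y = make_classification(n_samples=500, n_features=5, random_state=0)
     X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(5)])

     model = GradientBoostingClassifier().fit(X, y)   # the complex model
     explainer = shap.TreeExplainer(model)            # the SHAP model built on top of it

     prediction = model.predict_proba(X.iloc[[0]])[0, 1]    # score for one record
     explanation = explainer.shap_values(X.iloc[[0]])[0]    # per-feature contributions (log odds)
     print(prediction, dict(zip(X.columns, explanation)))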


  7. How we use SHAP


  8. Estimator: wrapping a classifier + explainer
     Estimator = untrained complex model + placeholder for explainer
     https://gist.github.com/yoziru-desu/8093eada2b612ca144b233d538488340
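     The gist above holds the real wrapper; below is an illustrative sketch of the same idea as an sklearn-style estimator. The class and attribute names are assumptions made for this transcript, and fit/explain are sketched under the next two slides.

     from sklearn.base import BaseEstimator, ClassifierMixin
     from sklearn.ensemble import GradientBoostingClassifier

     class ExplainableEstimator(BaseEstimator, ClassifierMixin):
         """Bundles an untrained complex model with a placeholder for its explainer."""

         def __init__(self, model=None):
             self.model = model if model is not None else GradientBoostingClassifier()
             self.explainer = None   # placeholder, filled in during fit (next slide)

         def predict_proba(self, X):
             return self.model.predict_proba(X)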


  9. Training phase: adding explainer to fit function
     Fit model with training data → trained complex model
     Fit explainer with trained model → SHAP TreeExplainer
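     A sketch of that training step as a standalone function (function name is an assumption): fit the complex model first, then build the SHAP TreeExplainer from the trained model.

     import shap
     from sklearn.ensemble import GradientBoostingClassifier

     def fit_with_explainer(X_train, y_train):
         """Fit the complex model, then fit a TreeExplainer on the trained model."""
         model = GradientBoostingClassifier().fit(X_train, y_train)  # 1. fit complex model with training data
         explainer = shap.TreeExplainer(model)                       # 2. explainer needs the trained model
         return model, explainer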


  10. Prediction phase: using a new explain method
     Input data → SHAP TreeExplainer → explanation
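     And a matching explain step, sketched under the assumption that X is a pandas DataFrame and the model is a binary tree model (so shap_values returns a single matrix).

     import pandas as pd

     def explain(explainer, X):
         """Per-feature SHAP contributions (log odds), one row per input record."""
         shap_values = explainer.shap_values(X)   # shape: (n_records, n_features)
         return pd.DataFrame(shap_values, columns=X.columns, index=X.index)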


  11. What do SHAP explanations look like?
     Feature      Input value   Explanation (log odds)
     feature_2    7.81804       +5.74192
     feature_7    TRUE          +0.717551
     feature_18   FALSE         -0.33685
     feature_28   0.208333      +0.554768
     feature_17   0.87623       -0.27466
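     A table like the one above can be assembled by pairing each input value with its SHAP value and sorting by absolute impact; a sketch with hypothetical helper names, not the talk's code.

     import pandas as pd

     def explanation_table(explainer, X, row):
         """Input values and SHAP contributions (log odds) for one record, biggest impact first."""
         contributions = explainer.shap_values(X)[row]
         table = pd.DataFrame({
             "feature": X.columns,
             "input_value": X.iloc[row].values,
             "explanation_log_odds": contributions,
         })
         order = table["explanation_log_odds"].abs().sort_values(ascending=False).index
         return table.reindex(order)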


  12. What we learned


  13. 1
     SHAP adds additional computation and doesn’t scale as well


  14. SHAP explanations don’t come for free
    6.8x explanation time
    2.6x prediction time
    10x data
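     The exact ratios depend on the model, the explainer and the data size; a rough way to measure the overhead on your own setup (a sketch, not the benchmark behind the numbers above).

     import time

     def explanation_overhead(model, explainer, X):
         """Ratio of SHAP explanation time to plain prediction time on the same data."""
         start = time.perf_counter()
         model.predict_proba(X)
         predict_seconds = time.perf_counter() - start

         start = time.perf_counter()
         explainer.shap_values(X)
         explain_seconds = time.perf_counter() - start
         return explain_seconds / predict_seconds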


  15. 2
     SHAP values work well for explaining local predictions


  16. Core properties of explainability on the local level
     Positive vs negative impact
     Relative importance
     SHAP values are additive
     Feature      Explanation (log odds)
     feature_2    +5.74192
     feature_7    +0.717551
     feature_28   +0.554768
     feature_17   -0.27466
     feature_18   -0.33685
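     The additivity property can be checked directly: the explainer's expected value plus the per-feature SHAP values reproduces the model's raw log-odds output for every record. The sketch assumes the binary sklearn GradientBoostingClassifier and TreeExplainer from the earlier examples.

     import numpy as np

     def check_additivity(model, explainer, X):
         """base value + sum of SHAP values == model's log-odds output, per record."""
         shap_values = explainer.shap_values(X)                      # (n_records, n_features)
         base_value = float(np.ravel(explainer.expected_value)[0])   # expected_value may be a scalar or 1-element array
         reconstructed = base_value + shap_values.sum(axis=1)
         return np.allclose(reconstructed, model.decision_function(X), atol=1e-6)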


  17. Diagnosing score changes by using SHAP values
     Why did the score suddenly change?
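     One way to answer that question: explain the same record at both points in time and look at how each feature's SHAP value moved. A hypothetical sketch; row_before and row_after are one-row DataFrames for the same record at the two times.

     import pandas as pd

     def explain_score_change(explainer, row_before, row_after):
         """Change in each feature's SHAP contribution between two snapshots of one record."""
         before = explainer.shap_values(row_before)[0]
         after = explainer.shap_values(row_after)[0]
         delta = pd.Series(after - before, index=row_before.columns)
         return delta.reindex(delta.abs().sort_values(ascending=False).index)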


  18. 3
     Adding SHAP to your complex model doesn’t solve global explainability completely


  19. Global SHAP summary: what does it tell us?
     Positive vs negative impact
     Relative importance
     But what about categorical features? Missing or non-numeric
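     The global view on this slide corresponds to shap.summary_plot, which shows every record's SHAP value for every feature, ordered by overall absolute impact; values that cannot be mapped to a numeric colour scale are the limitation the slide points at. A sketch, reusing the explainer and X from the earlier examples.

     import shap

     def plot_global_summary(explainer, X):
         """Beeswarm summary of SHAP values across all records (global view of the model)."""
         shap_values = explainer.shap_values(X)
         shap.summary_plot(shap_values, X)   # colour encodes the (numeric) feature value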


  20. Visualising a single feature can uncover relationships
     Is the relationship between the output and this feature linear / exponential / ...?
     Density of the impact: how likely is a certain feature value to occur?
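     Per-feature views like this come from shap.dependence_plot: the feature's value on the x-axis, its SHAP value on the y-axis, with one dot per record so dense regions show common values. A sketch with assumed argument names.

     import shap

     def plot_feature_dependence(explainer, X, feature):
         """Relationship between one feature's value and its SHAP contribution."""
         shap_values = explainer.shap_values(X)
         shap.dependence_plot(feature, shap_values, X)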


  21. Interaction effects: still hard to explain
     Doable for 2 features (2D)
     Human brain is limited to 3 dimensions
     What’s the interaction with the other 20 features? (20D)
     [Plot split by feature_13 = TRUE vs feature_13 = FALSE]
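     For the 2-feature case, shap.dependence_plot can colour the points by a second feature via interaction_index, which is how a split like feature_13 = TRUE vs FALSE becomes visible; beyond two or three features there is no equivalent picture. A sketch with assumed names.

     import shap

     def plot_two_feature_interaction(explainer, X, feature, other_feature):
         """Dependence plot for `feature`, coloured by `other_feature` to surface their interaction."""
         shap_values = explainer.shap_values(X)
         shap.dependence_plot(feature, shap_values, X, interaction_index=other_feature)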


  22. Summary
    1 SHAP is relatively easy to integrate
    2 It comes with a computational cost
    3 Works well for explaining local predictions
    4 Makes the model more explainable globally
    5 But interaction effects are still hard to understand


  23. We’re hiring!
     Want to know more about the job opportunities within our data and engineering teams?
     Speak with Haron or email [email protected]
