Slide 1

Slide 1 text

Demystifying complex models: What we learned using SHAP explainers in the field. Yasir Ekinci

Slide 2

Slide 2 text

Start with why

Slide 3

Slide 3 text

Start with why: Why is this score high? How do we know if the model is right? If this is not fraud, why did the model make a mistake?

Slide 4

Slide 4 text

The dilemma: simple vs. complex models. Simple model: explainable. Complex model: accurate.

Slide 5

Slide 5 text

WTF is SHAP? SHapley Additive exPlanations: a unified approach to explain the output of any machine learning model [kind of]

Slide 6

Slide 6 text

Diagram: how SHAP fits in the ML process (input data, complex model, prediction, SHAP model, explanation). Source: https://github.com/slundberg/shap/blob/master/README.md

Slide 7

Slide 7 text

How we use SHAP

Slide 8

Slide 8 text

Estimator: wrapping a classifier + explainer. Diagram: the Estimator holds the untrained complex model and a placeholder for the Explainer. Code: https://gist.github.com/yoziru-desu/8093eada2b612ca144b233d538488340
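
The gist linked above presumably contains the actual implementation; what follows is only a minimal sketch of the same idea, assuming a scikit-learn-style wrapper around an XGBoost classifier (the class name ExplainableClassifier is illustrative, not from the talk):

from sklearn.base import BaseEstimator, ClassifierMixin
from xgboost import XGBClassifier

class ExplainableClassifier(BaseEstimator, ClassifierMixin):
    """Wraps an untrained complex model together with a placeholder for a SHAP explainer."""

    def __init__(self, model=None):
        self.model = model or XGBClassifier()  # untrained complex model
        self.explainer = None                  # placeholder, filled in during fit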

Slide 9

Slide 9 text

Training phase: adding the explainer to the fit function. Fit the model with the training data, then fit the SHAP TreeExplainer with the trained complex model.
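
Sketched in isolation (the function name is mine, not from the gist), the training phase boils down to fitting the model and then building a TreeExplainer from the trained model:

import shap
from xgboost import XGBClassifier

def fit_with_explainer(X_train, y_train):
    model = XGBClassifier()
    model.fit(X_train, y_train)            # trained complex model
    explainer = shap.TreeExplainer(model)  # "fit" the explainer with the trained model
    return model, explainer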

Slide 10

Slide 10 text

Prediction phase: using a new explain method. Input data is passed to the SHAP TreeExplainer, which produces the explanation.
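
The prediction phase then gains an explain step next to the usual predict step; a sketch (again with made-up names), assuming the binary XGBoost classifier and TreeExplainer from above:

def predict_and_explain(model, explainer, X):
    scores = model.predict_proba(X)[:, 1]   # normal prediction
    explanation = explainer.shap_values(X)  # one SHAP value (log odds) per feature per row
    return scores, explanation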

Slide 11

Slide 11 text

What do SHAP explanations look like?
Feature      Input value   Explanation (log odds)
feature_2    7.81804       +5.74192
feature_7    TRUE          +0.717551
feature_18   FALSE         -0.33685
feature_28   0.208333      +0.554768
feature_17   0.87623       -0.27466
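
A table like this can be assembled straight from the raw outputs; a minimal sketch, assuming one pandas row of input and its SHAP values from the explain step above (the helper name is mine):

import pandas as pd

def explanation_table(x_row, shap_row):
    # x_row: one input row (pandas Series); shap_row: its SHAP values, in the same feature order.
    table = pd.DataFrame({
        "Feature": x_row.index,
        "Input value": x_row.values,
        "Explanation (log odds)": shap_row,
    })
    # Most influential features first, by absolute impact.
    order = table["Explanation (log odds)"].abs().sort_values(ascending=False).index
    return table.loc[order]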

Slide 12

Slide 12 text

What we learned

Slide 13

Slide 13 text

1 SHAP adds extra computation and doesn't scale as well

Slide 14

Slide 14 text

SHAP explanations don't come for free: at 10x the data, explanation time grew 6.8x while prediction time grew 2.6x.

Slide 15

Slide 15 text

2 SHAP values work well for explaining local predictions

Slide 16

Slide 16 text

Core properties of explainability at the local level: positive vs. negative impact, relative importance, and SHAP values are additive.
Feature      Explanation (log odds)
feature_2    +5.74192
feature_7    +0.717551
feature_28   +0.554768
feature_17   -0.27466
feature_18   -0.33685
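
The additivity property can be checked directly: the explainer's expected value plus a row's SHAP values should reproduce the model's raw log-odds output. A minimal sketch, assuming the XGBoost model and TreeExplainer from the earlier sketches:

import numpy as np

def check_additivity(model, explainer, x):
    # x: a single input row with shape (1, n_features)
    shap_row = explainer.shap_values(x)                 # SHAP values for that row
    raw_output = model.predict(x, output_margin=True)   # XGBoost's raw log-odds output
    return np.allclose(explainer.expected_value + shap_row.sum(), raw_output)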

Slide 17

Slide 17 text

Diagnosing score changes by using SHAP values: why did the score suddenly change?
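
One way to answer that question is to diff the explanations of the same record at two points in time; a sketch under the same assumptions as above (helper and variable names are mine):

import pandas as pd

def diff_explanation(explainer, x_before, x_after):
    # x_before / x_after: the same record as single-row DataFrames at two points in time.
    delta = explainer.shap_values(x_after)[0] - explainer.shap_values(x_before)[0]
    delta = pd.Series(delta, index=x_after.columns)
    # The features whose contribution changed the most explain the score change.
    return delta.reindex(delta.abs().sort_values(ascending=False).index)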

Slide 18

Slide 18 text

3 Adding SHAP to your complex model doesn’t solve global explainability completely

Slide 19

Slide 19 text

Global SHAP summary: what does it tell us? Positive vs. negative impact and relative importance. But what about categorical features, whose values are missing or non-numeric?
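
A global summary like this is typically produced with shap's summary (beeswarm) plot; a minimal sketch, assuming a pandas DataFrame X and the TreeExplainer from the earlier sketches:

import shap

def global_summary(explainer, X):
    # One dot per row and feature: position shows the impact (log odds), colour the feature value.
    shap_values = explainer.shap_values(X)
    shap.summary_plot(shap_values, X)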

Slide 20

Slide 20 text

Visualising a single feature can uncover relationships. Is the relationship between the output and this feature linear, exponential, ...? The density of the impact shows how likely a certain feature value is to occur.
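
shap's dependence plot gives this single-feature view; a minimal sketch, using "feature_2" as an assumed example feature name:

import shap

def single_feature_view(shap_values, X, feature="feature_2"):
    # Scatter of feature value vs. SHAP value: the shape shows whether the relationship is
    # linear, exponential, etc., and the density of dots shows how often each value occurs.
    shap.dependence_plot(feature, shap_values, X, interaction_index=None)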

Slide 21

Slide 21 text

Interaction effects: still hard to explain. Doable for 2 features (2D), but the human brain is limited to 3 dimensions, so what about the interaction with the other 20 features (20D)? (Plot legend: feature_13 = TRUE, feature_13 = FALSE.)
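
For the two-feature (2D) case, the same dependence plot can be coloured by a second feature; a sketch, reusing the assumed feature names from the slides (feature_2 against feature_13):

import shap

def pairwise_interaction_view(shap_values, X):
    # Colour the feature_2 dependence plot by feature_13 (TRUE vs. FALSE) to reveal their
    # interaction; beyond a pair of features this kind of visualisation stops scaling.
    shap.dependence_plot("feature_2", shap_values, X, interaction_index="feature_13")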

Slide 22

Slide 22 text

Summary
1 SHAP is relatively easy to integrate
2 It comes with a computational cost
3 Works well for explaining local predictions
4 Makes the model more explainable globally
5 But interaction effects are still hard to understand

Slide 23

Slide 23 text

Questions?

Slide 24

Slide 24 text

We’re hiring! Want to know more about the job opportunities within our data and engineering teams? Speak with Haron or email [email protected]