Preventing churn like a bandit

Gerben Oostra
November 30, 2019

Losing customers, also referred to as churn, is something every company wants to prevent. But not by predicting churn, by assuming correlation is causation, or by acting on prescribed actions. Let me show how to combine techniques from uplift modelling, causal inference and reinforcement learning into one contextual bandit system that balances exploitation and exploration and deals with biases.
This talk was presented at PyData Eindhoven 2019 (Netherlands): https://pydata.org/eindhoven2019/schedule/presentation/16/preventing-churn-like-a-bandit/
The first section, why to predict uplift to prevent churn, is explained further in the following blog post: https://medium.com/bigdatarepublic/for-effective-treatment-of-churn-dont-predict-churn-58328967ec4f

Transcript

  1. Churn is a revenue drain

    Companies lose between 2% and 16%* of revenue every month due to churn.
    * 25% and 75% quantiles; the rate depends on business size. Source: https://www.profitwell.com/blog/average-revenue-churn-rate-benchmarks
  2. 3 issues within churn prevention

    1. Models predicting churn
    2. Assuming correlation is causation
    3. Select the best predicted treatment
  3. Running example: Telco (internet)

    Telco Inc. provides subscriptions with TV, Internet and Mobile, and has a lot of diverse customers.
  4. Issue 1: Models predict churn

    Diagram: Historical data → Train → Churn Classification Model. Features → Churn Classification Model → Predicted churn propensity → Policy, which chooses between: Do nothing, Direct mail, Telemarketing, Door 2 door.
  5. Issue 1: It can’t be that bad, right?

    Low scores → Do nothing. When the Churn Classification Model is retrained, the churners get high scores and the non-churners get low scores.
    We learn from non-actions.
  6. Issue 1: It can’t be that bad, right?

    High scores → Telemarketing. When the Churn Classification Model is retrained, the persuadable (retained after the call) get low scores and the lost causes (churned anyway) get high scores.
    We learn to keep calling lost causes.
    We learn to stop calling the persuadable.
  7. Solution 1a: Predict uplift

    Diagram: Features → Classification Model → Predictions → Policy. Actions: Do nothing, Direct mail, Telemarketing, Door 2 door.
    Action       P(churn | A)   P(retention | A)   Uplift
    Do nothing   0.6            0.4                0
    @            0.65           0.35               -0.05
    TM           0.4            0.6                0.2
    D2D          0.35           0.65               0.25
  8. Solution 1a: Predict uplift

    Same diagram and table as slide 7, with "Do nothing" marked as the default.
  9. Solution 1a: Predict uplift

    Uplift(A, x) = P(retention | A, x) – P(retention | do nothing, x), where "do nothing" is the default.
  10. Solution 1a: Predict uplift

    Transformed Outcome Trick: the classification model becomes an Uplift Regression Model, a regression model with labels (0, -2, 2).
    Athey, S., & Imbens, G. W. (2015). Machine learning methods for estimating heterogeneous causal effects. stat, 1050(5).
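
The uplift column can be recomputed from the retention probabilities. A minimal sketch, assuming the unlabeled first table row is the "do nothing" default; the function name `uplift` is hypothetical:

```python
# Retention probabilities per action, copied from the slide's table.
# Assumption: the first (unlabeled) row is the "do nothing" default.
retention = {"do nothing": 0.40, "@": 0.35, "TM": 0.60, "D2D": 0.65}


def uplift(action, probs=retention, default="do nothing"):
    """Uplift(A) = P(retention | A) - P(retention | default)."""
    return probs[action] - probs[default]


for action in retention:
    print(f"{action:>10}: {uplift(action):+.2f}")
```
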
  11. Solution 1b: Base policy on economic result

    Result = Uplift × CLV − Cost, with CLV = € 300. The model predicts the uplift; the policy acts on the result.
    Action       Uplift   Cost     Result
    Do nothing   0        € 0.-    € 0.-
    @            -0.05    € 0.25   € -15.25
    TM           0.20     € 5.-    € 55.-
    D2D          0.25     € 20.-   € 55.-
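
The economic result per action is plain arithmetic on the slide's numbers. A minimal sketch; the function name `economic_result` and the dictionary layout are illustrative, not taken from the talk:

```python
CLV = 300.0  # customer lifetime value in euros, from the slide

# action -> (predicted uplift, cost of the action in euros), from the slide
ACTIONS = {
    "do nothing": (0.0, 0.0),
    "@": (-0.05, 0.25),
    "TM": (0.20, 5.0),
    "D2D": (0.25, 20.0),
}


def economic_result(uplift, cost, clv=CLV):
    """Expected value of an action: Uplift x CLV - Cost."""
    return uplift * clv - cost


results = {name: economic_result(u, c) for name, (u, c) in ACTIONS.items()}
for name, value in results.items():
    print(f"{name:>10}: {value:8.2f}")
```

Note that TM and D2D tie on economic result even though D2D has the higher uplift, which is why the policy is based on value rather than on uplift alone.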
  12. How to predict treatment effect (uplift)

    1. Transformed outcome
    2. Dealing with counterfactuals
  13. Predicting uplift: The Transformed Outcome

    $Y^* = Y \cdot \frac{W - p}{p \, (1 - p)}$
    Y = retained? (1 or 0), W = treated? (1 or 0), p = P(W == 1), the treatment policy.
    When p = 0.5:
    Retained? (Y)   Treated (W == 1)   Control (W == 0)
    Yes             2                  -2
    No              0                  0
    Athey, S., & Imbens, G. W. (2015). Machine learning methods for estimating heterogeneous causal effects. stat, 1050(5).
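
The label table follows directly from the transformed outcome formula of Athey & Imbens. A minimal sketch that reproduces the four cells for p = 0.5; the function name is hypothetical:

```python
def transformed_outcome(y, w, p=0.5):
    """Y* = Y * (W - p) / (p * (1 - p)).

    y: 1 if the customer was retained, else 0
    w: 1 if the customer was treated, 0 if in the control group
    p: probability of being treated, P(W == 1)
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return y * (w - p) / (p * (1.0 - p))


for y in (1, 0):
    for w in (1, 0):
        print(f"retained={y} treated={w} -> label {transformed_outcome(y, w):+.0f}")
```
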
  14. Minimizing RMSE results in Uplift

    Labels: retained and treated 2, retained and control -2, not retained 0. The prediction that minimizes RMSE is the mean label of a group.
    Lost causes (not retained either way): labels 0 (treated) and 0 (control). Uplift: 0
  15. Minimizing RMSE results in Uplift

    Persuadable (retained only when treated): labels 2 (treated) and 0 (control). Uplift: 1
  16. Minimizing RMSE results in Uplift

    Sleeping dogs (retained only when not treated): labels 0 (treated) and -2 (control). Uplift: -1
  17. Minimizing RMSE results in Uplift

    Sure causes (retained either way): labels 2 (treated) and -2 (control). Uplift: 0
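
Why the mean label equals the uplift can be checked per customer type. A minimal sketch, assuming each type is fully described by its retention outcome with and without treatment; the names `SEGMENTS` and `expected_label` are hypothetical:

```python
def transformed_outcome(y, w, p=0.5):
    """Y* = Y * (W - p) / (p * (1 - p))."""
    return y * (w - p) / (p * (1.0 - p))


# customer type -> (retained if treated, retained if not treated)
SEGMENTS = {
    "lost causes": (0, 0),
    "persuadable": (1, 0),
    "sleeping dogs": (0, 1),
    "sure causes": (1, 1),
}


def expected_label(y_treated, y_control, p=0.5):
    """Mean transformed outcome when a fraction p of the group is treated.

    This mean is the prediction that minimizes squared error for the group.
    """
    return (p * transformed_outcome(y_treated, 1, p)
            + (1.0 - p) * transformed_outcome(y_control, 0, p))


for name, (y1, y0) in SEGMENTS.items():
    print(f"{name:>13}: mean label {expected_label(y1, y0):+.1f}, true uplift {y1 - y0:+d}")
```

The same identity holds for other treatment probabilities, since the p and (1 - p) factors cancel against the denominator of the transformed outcome.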
  18. Transformed outcome for multiple treatments

    Do nothing = control; @, TM, D2D = treated.
    Retained?   Control   Treated
    Yes         -2        2
    No          0         0
    Churn prevention data (retained? under the applied action; - = action not applied):
    Features   Do nothing   @   TM   D2D
    1          1            -   -    -
    2          0            -   -    -
    3          -            -   -    1
    4          -            0   -    -
    5          -            -   1    -
    Transformed outcome:
    Features   @    TM   D2D
    1          -2   -2   -2
    2          0    0    0
    3          -    -    2
    4          0    -    -
    5          -    2    -
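
The step from logged actions to per-treatment labels can be sketched as follows. This assumes p = 0.5 (labels of ±2), treats "do nothing" as the control for every treatment, and uses `None` for the "-" cells; the function name and log format are hypothetical:

```python
TREATMENTS = ["@", "TM", "D2D"]
CONTROL = "do nothing"


def transformed_labels(action, retained):
    """Return one label per treatment for a single logged customer.

    A control customer is a counter-example for every treatment (-2 if retained).
    A treated customer only yields a label for the treatment received (+2 if retained).
    """
    if action == CONTROL:
        return [-2 * retained for _ in TREATMENTS]
    return [2 * retained if t == action else None for t in TREATMENTS]


# (applied action, retained?) for the five example customers on the slide
log = [(CONTROL, 1), (CONTROL, 0), ("D2D", 1), ("@", 0), ("TM", 1)]
labels = [transformed_labels(action, retained) for action, retained in log]
for row in labels:
    print(row)
```
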
  19. Model setup for Uplift

    Diagram: Churn prevention data → Features + labels → (Re)Train → Regression Model. Features → Regression Model → Predictions.
    Labels:
    Features   @    TM   D2D
    1          -2   -2   -2
    2          0    0    0
    3          -    -    2
    4          0    -    -
    5          -    2    -
    Predictions:
    Action   Uplift
    @        -0.15
    TM       0.2
    D2D      0.26
  20. Handling counterfactuals with masked feedback

    Network: Features → outputs (@, TM, D2D) ✕ mask inputs (@?, TM?, D2D?) → Loss: RMSE against the labels.
    Labels (unobserved entries set to 0):
    Features   @    TM   D2D
    1          -2   -2   -2
    2          0    0    0
    3          0    0    2
    4          0    0    0
    5          0    2    0
  21. Handling counterfactuals with masked feedback

    Same network and labels as slide 20, plus the mask:
    Features   @   TM   D2D
    1          1   1    1
    2          1   1    1
    3          0   0    1
    4          1   0    0
    5          0   1    0
  23. Handling counterfactuals with masked feedback

    Network, labels and mask as on slide 21. The network outputs are the predicted uplifts:
    Action   Uplift
    @        -0.15
    TM       0.2
    D2D      0.25
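
The effect of the mask on the loss can be sketched without a deep learning framework. The version below multiplies each squared error by its mask entry, so counterfactual outputs contribute nothing. Averaging over the observed entries only is an assumption; the slides do not say how the loss is normalized, and the function name is hypothetical:

```python
import math

# Labels and mask for the five example customers (columns: @, TM, D2D).
LABELS = [[-2, -2, -2], [0, 0, 0], [0, 0, 2], [0, 0, 0], [0, 2, 0]]
MASK = [[1, 1, 1], [1, 1, 1], [0, 0, 1], [1, 0, 0], [0, 1, 0]]


def masked_rmse(predictions, labels, mask):
    """RMSE over observed (mask == 1) entries only."""
    squared_error = 0.0
    observed = 0
    for p_row, l_row, m_row in zip(predictions, labels, mask):
        for p, l, m in zip(p_row, l_row, m_row):
            squared_error += m * (p - l) ** 2
            observed += m
    return math.sqrt(squared_error / observed) if observed else 0.0


zeros = [[0.0, 0.0, 0.0] for _ in LABELS]
print(masked_rmse(zeros, LABELS, MASK))
```

Because masked entries are multiplied by zero, whatever the network predicts for an action that was never applied to a customer produces no error and, in a gradient-based model, no gradient.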
  24. Inference with core network

    Diagram: Features → core network with outputs (@, TM, D2D) → Predictions.
    Action   Uplift
    @        -0.15
    TM       0.2
    D2D      0.25
  25. Issue 1: Predicting churn

    Diagram: Historical data → Features + Labels → (Re)Train → Churn Classification Model. Features → Churn Classification Model → Predicted churn propensity → Policy → Choose Action.
  26. Solution 1: Contextual bandit maximizing revenue

    Diagram: Historical data → Features + Transformed outcome Labels → (Re)Train → Uplift regression Model. Features → Uplift regression Model → Predicted uplift → x CLV – Cost → Action based on value → Feedback (churned/retained).
  27. Solution 1: Contextual bandit maximizing revenue

    Predict: Features → network outputs (@, TM, D2D) → Predicted uplift → x CLV – Cost → Action based on value → Feedback (churned/retained).
    (Re)Train: Features, Transformed Outcome Labels and Masks (@?, TM?, D2D?) → network outputs ✕ masks.
  28. Issue 2: Assuming correlation is causation

    Causal graph: Features, Treatment (T), Retention (R).
    • Correlation has predictive power
    • We need causation for prescriptive power
    • We only observe correlations
    Here correlation is causation.
  29. Issue 2: Assuming correlation is causation

    Causal graph: Features, Treatment (T), Retention (R). The treatment depends on the features: historically through past retention campaigns, in the future through our model.
    • Correlation has predictive power
    • We need causation for prescriptive power
    • We only observe correlations
    Correlation is not causation.
  30. Issue 2: Removing `feature – treatment` bias

    Causal graph: Features, Treatment (T), Retention (R); the feature – treatment link is to be removed.
    Options:
    1. Random trials
    2. Change features
       • Feature selection
       • Encoding
    3. Causal inference
    Trade-offs:
    + Simple & straightforward
    - Not using historical data
    - Expensive experiments
    - No feedback loop
    - Only possible once: created model recreates bias
    - Restricting predictive power
    - Harder to model
    - Models actual situation
  31. Solution 2: Causal inference (Inverse propensity weighting)

    Causal graph: Features, Treatment (T), Retained? (R).
    A propensity model learns the correlation between features and treatment: Age / location / .. → Propensity Model → P(T | x) per action (@, TM, ..).
    $w = \frac{1}{P(T \mid x)}$ → Weight samples inverse to propensity.
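
Inverse propensity weighting can be sketched with a frequency table standing in for the propensity model. This stand-in is an assumption made for brevity: the talk uses a learned model of P(T | x), and the segment names and function names below are hypothetical:

```python
from collections import Counter


def fit_propensity(rows):
    """Estimate P(T | x) as the share of each treatment within a feature segment.

    rows: list of (segment, treatment) pairs from historical campaigns.
    """
    per_segment = Counter(segment for segment, _ in rows)
    per_pair = Counter(rows)
    return {pair: count / per_segment[pair[0]] for pair, count in per_pair.items()}


def ipw_weights(rows, propensity):
    """Weight each sample by w = 1 / P(T | x)."""
    return [1.0 / propensity[row] for row in rows]


# Historical bias: young customers were mostly called (TM), rarely mailed (@).
rows = [("young", "TM")] * 3 + [("young", "@")] + [("old", "@"), ("old", "TM")]
propensity = fit_propensity(rows)
weights = ipw_weights(rows, propensity)
for row, weight in zip(rows, weights):
    print(row, round(weight, 2))
```

After weighting, each treatment within a segment carries the same total weight (here 4 for both TM and @ among young customers), which is what removes the historical feature – treatment correlation from the training data.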
  32. How to do propensity weighting

    1. Calibrate propensities
    2. Propensity clipping
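  33. Propensity weighting: clipping

    $w = \frac{1}{P(T \mid x)}$, so $\lim_{P(T \mid x) \to 0} w = \infty$
    1. Clip value: $P'(T \mid x) = \min(0.95, \max(0.05, P(T \mid x)))$
    2. Trim dataset: $w' = \begin{cases} P(T \mid x)^{-1}, & 0.05 \le P(T \mid x) \le 0.95 \\ 0, & P(T \mid x) > 0.95 \\ 0, & P(T \mid x) < 0.05 \end{cases}$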
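
Both options fit in a few lines. A minimal sketch using the 0.05 and 0.95 bounds from the slide; the function names are hypothetical:

```python
def clipped_weight(p, low=0.05, high=0.95):
    """Option 1: clip the propensity into [low, high], then invert it."""
    clipped = min(high, max(low, p))
    return 1.0 / clipped


def trimmed_weight(p, low=0.05, high=0.95):
    """Option 2: give samples with extreme propensities a weight of zero."""
    return 1.0 / p if low <= p <= high else 0.0


for p in (0.01, 0.05, 0.5, 0.99):
    print(p, clipped_weight(p), trimmed_weight(p))
```

Clipping keeps every sample but caps its influence (here at a weight of 20), while trimming drops the sample from training altogether.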
  34. Solution 2: Causal model with inverse propensity weighting

    Predict: Features → network outputs (@, TM, D2D) → Predicted uplift → x CLV – Cost → Best Action → Feedback (churned/retained).
    (Re)Train: Features, Transformed Outcome Labels, Masks (@?, TM?, D2D?) and Weights → network outputs ✕ masks.
    Weights: Features → Propensity Model → $w = \frac{1}{P(T \mid x)}$.
  35. Issue 3: Select the best predicted treatment

    Greedy Policy:
    Action       Result
    Do nothing   € 0.-
    @            € -15.25
    TM           € 55.-
    D2D          € 55.-
    Feedback becomes (future) training data. 100% exploit limits feedback:
    • We learn if selection worked
    • We never learn alternatives
  36. Solution 3: Balance exploration & exploitation

    1. Determine optimal treatment for each customer
    2. Determine which customers to treat
  37. Solution 3a: Thompson sampling for treatment assignment

    Bayesian modelling: each action has an underlying distribution of uplift, P(Uplift | T). The most likely uplift per action:
    Action   Uplift
    @        -0.05
    TM       0.2
    D2D      0.25
    Use a sample from each distribution as the prediction. Two example draws:
    Action   Uplift   Result
    @        0.5      € 149.75
    TM       0.1      € 25
    D2D      0.2      € 40
    Action   Uplift   Result
    @        -0.5     € -150.25
    TM       0.3      € 85
    D2D      0.15     € 25
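
Thompson sampling for one customer can be sketched as: draw one uplift per action, convert it to an economic result, and pick the best. The Gaussian posteriors and their standard deviation of 0.10 are assumptions for illustration; the talk only states that the uplift distributions come from Bayesian modelling. The function name is hypothetical:

```python
import random

CLV = 300.0  # customer lifetime value in euros, from the slides
COSTS = {"@": 0.25, "TM": 5.0, "D2D": 20.0}  # cost per action, from the slides


def thompson_choose(posteriors, rng, clv=CLV, costs=COSTS):
    """Sample an uplift per action and return the action with the best result.

    posteriors: action -> (mean, std) of an assumed Gaussian over the uplift.
    """
    values = {"do nothing": 0.0}  # the default has zero uplift and zero cost
    for action, (mean, std) in posteriors.items():
        sampled_uplift = rng.gauss(mean, std)
        values[action] = sampled_uplift * clv - costs[action]
    best = max(values, key=values.get)
    return best, values


# Means from the slide; the spread of 0.10 is a made-up illustration.
posteriors = {"@": (-0.05, 0.10), "TM": (0.20, 0.10), "D2D": (0.25, 0.10)}
rng = random.Random(0)
choices = [thompson_choose(posteriors, rng)[0] for _ in range(1000)]
print({action: choices.count(action) for action in sorted(set(choices))})
```

Actions with a high expected result are chosen most often, but uncertain alternatives still get picked now and then, which produces the feedback a greedy policy never collects.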
  38. Solution 3b: Perturbed Results for customer selection

    Chart: # customers against Result, sorted descending; the customers that fit within the Budget are Selected.
    Change x% of the results to a random value drawn from all results, so every bin has a random selection of the full range.
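
The perturbation step can be sketched as follows, assuming the budget is expressed as a number of customers and x as a fraction; the function name and the example results are hypothetical:

```python
import random


def select_customers(results, budget, x=0.10, rng=None):
    """Return the indices of the customers to treat.

    A fraction x of the customers gets its result replaced by a value drawn
    at random from all results; the top `budget` customers are then selected.
    """
    rng = rng or random.Random()
    n = len(results)
    perturbed = list(results)
    for i in rng.sample(range(n), round(x * n)):
        perturbed[i] = rng.choice(results)
    ranked = sorted(range(n), key=lambda i: perturbed[i], reverse=True)
    return ranked[:budget]


results = [55.0, 10.0, -15.25, 85.0, 25.0, 40.0]
print(select_customers(results, budget=2, x=0.0))
print(select_customers(results, budget=2, x=0.5, rng=random.Random(0)))
```

With x = 0 this reduces to the greedy selection; a larger x moves more customers with a low predicted result into the treated group, so their feedback also reaches the training data.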
  39. Solution 3: Balance exploration and exploitation

    Predict: Features → network outputs (@, TM, D2D) → Predicted uplift → x CLV – Cost → Best Action → Feedback (churned/retained).
    (Re)Train: Features, Transformed Outcome Labels, Masks (@?, TM?, D2D?) and Weights → network outputs ✕ masks.
    Weights: Features → Propensity Model → $w = \frac{1}{P(T \mid x)}$.
  40. Solution 3: Balance exploration and exploitation

    Predict: Features → network outputs (@, TM, D2D) → Predicted uplift (@ -0.05, TM 0.2, D2D 0.25) → Thompson Sampling (sampled prediction) → x CLV – Cost → Perturbate Results → Best Action → Feedback (churned/retained).
    (Re)Train: Features, Transformed Outcome Labels, Masks (@?, TM?, D2D?) and Weights → network outputs ✕ masks.
    Weights: Features → Propensity Model → $w = \frac{1}{P(T \mid x)}$.
  41. Contextual bandit with Uplift modelling, causal inference & Thompson sampling

    Predict: Features → network outputs (@, TM, D2D) → Predicted uplift (@ -0.05, TM 0.2, D2D 0.25) → Thompson Sampling (sampled prediction) → x CLV – Cost → Perturbate Results → Best Action → Feedback (churned/retained).
    (Re)Train: Features, Transformed Outcome Labels, Masks (@?, TM?, D2D?) and Weights → network outputs ✕ masks.
    Weights: Features → Propensity Model → $w = \frac{1}{P(T \mid x)}$.
  42. Questions?

    Phone: +31 (0)168 479294
    Email: [email protected]
    Address: Coltbaan 4C, 3439 NG Nieuwegein, The Netherlands