Seminar #45 - Cohort Analysis Part 4 - Analyzing What Makes Churn with Prediction Models

Analyzing which customers churn and why looks like a typical machine learning problem. But because the probability of churn depends heavily on how long a customer has been a customer, typical machine learning models are not a good fit.

As we have seen in the Cohort Analysis seminar series, the survival curve gives us a good picture of how customer churn changes over time. It turns out that there are statistical and machine learning models that we can use to predict the survival curve. By using these models, we can analyze what makes customers churn and predict who is likely to churn.

In this seminar, Kan introduces the Cox Regression model and the Random Survival Forest model for analyzing customer churn, with a live demo.

Subscribe ↓
https://www.youtube.com/channel/UCOVfLaSQBvMRwZCyiccq4Iw

Twitter ↓
https://twitter.com/ExploratoryData

UI Tool: Exploratory(https://exploratory.io/)
Exploratory Online Seminar: https://exploratory.io/online-seminar

Kan Nishida

May 19, 2021

Transcript

  1. Speaker: Kan Nishida (@KanAugust), CEO/co-founder, Exploratory. Summary: In Spring 2016, launched Exploratory, Inc. to democratize Data Science. Prior to Exploratory, Kan was a director of product development at Oracle, leading teams to build various Data Science products in areas including Machine Learning, BI, Data Visualization, Mobile Analytics, Big Data, etc. While at Oracle, Kan also provided training and consulting services to help organizations transform with data.
  2. The Third Wave: Data Science is not just for Engineers and Statisticians. Exploratory makes it possible for Everyone to do Data Science.
  3. Exploratory, Modern & Simple UI: Questions, Communication (Dashboard, Note, Slides), Data Access, Data Wrangling, Visualization, Analytics (Statistics / Machine Learning).
  4. Agenda
    • Survival Model - Cox Regression
    • Survival Model - Random Survival Forest
    • Difference between Cox Regression and Random Survival Forest
    • Prediction
  5. SaaS - Software as a Service: A business model where you charge a software license fee on a subscription basis for the value you provide through the software.
  6. For SaaS businesses, the initial payment from a customer tends to be smaller, but it accumulates over time.
  7. Non-Subscription Model: You sell a product to one user and collect the money right away. (Chart: Jan, $100)
  8. This means you can spend $80 to acquire this customer and still make a profit. (Chart: Jan, $100; Expense, Revenue, Profit)
  9. Subscription Model: With a monthly subscription, you collect only the monthly subscription amount in the first month. (Chart: Jan, $10)
  10. If you spend $80 to acquire the customer, you will lose money. (Chart: Jan, $10 subscription; expense $80)
  11. But you'll accumulate revenue from the same customer over time. (Chart: $10 each month, Jan through Dec; total $120)
  12. So if the customer stays long enough, you'll be able to pay off the initial cost of acquisition. (Chart: expense $80; $10 each month, Jan through Dec; total $120)
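
The payback arithmetic on these slides can be sketched in a few lines of Python. This is only an illustration using the slide's numbers ($10 per month, $80 acquisition cost); the function name `payback_month` is hypothetical and not part of any tool shown in the seminar.

```python
def payback_month(monthly_fee, acquisition_cost, months=12):
    """Return the first month (1-based) in which cumulative revenue
    covers the acquisition cost, or None if it never does."""
    cumulative = 0
    for month in range(1, months + 1):
        cumulative += monthly_fee
        if cumulative >= acquisition_cost:
            return month
    return None


# Numbers from the slides: $10/month subscription, $80 to acquire the customer.
month = payback_month(10, 80)
total_revenue = 10 * 12
print(f"paid back in month {month}; 12-month revenue ${total_revenue}, "
      f"profit ${total_revenue - 80}")
```

With these numbers the acquisition cost is recovered in the 8th month, so a customer who churns earlier than that is a net loss.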
  13. Customer Lifetime Value - CLTV: The longer your customers stay, the more money you'll make. (Chart: monthly revenue and total, Jan through Dec over two years)
  14. This means not only can you spend more money on acquiring customers, but you can also invest more in your products and people. (Chart: profit, Jan through Dec over two years)
  15. It's important to divide the MRR into cohorts by customer lifetime. (Chart: Jan through Apr, each customer labeled with months subscribed, from 1 mon. to 4 mon.; callout: "This customer has paid 4 months of subscription by this time.")
  16. By dividing the MRR by the cohort of when the users converted, you can see how efficiently your business is growing and can grow further. (Chart legend: cohorts from 9 Months Ago to 1 Month Ago)
  17. Customers churn, so each cohort's MRR tends to go down. (Chart: cohorts converted in Jan through Sep, labeled 9 Months Ago to 1 Month Ago)
  18. If customers don't churn, then the share of longer-standing customers in the MRR is bigger. If the older cohorts stay longer, you can accelerate your growth as you acquire new customers. If many customers churn, then the share of newer customers in the MRR is bigger. (Chart legend: cohorts from 9 Months Ago to 1 Month Ago)
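
As a rough sketch of what "dividing the MRR by cohort" means, the snippet below computes each cohort's share of total MRR. The subscription list is made-up illustration data and `mrr_share_by_cohort` is a hypothetical helper, not a feature of the tool.

```python
from collections import defaultdict

# Hypothetical active subscriptions: (months since conversion, monthly fee in $).
subscriptions = [(9, 10), (9, 10), (6, 10), (3, 10),
                 (1, 10), (1, 10), (1, 10), (1, 10)]


def mrr_share_by_cohort(subs):
    """Return {cohort age in months: share of total MRR}."""
    mrr = defaultdict(float)
    for age, fee in subs:
        mrr[age] += fee
    total = sum(mrr.values())
    return {age: amount / total for age, amount in sorted(mrr.items())}


shares = mrr_share_by_cohort(subscriptions)
for age, share in shares.items():
    print(f"converted {age} month(s) ago: {share:.1%} of MRR")
```

In this made-up data half of the MRR comes from customers who converted one month ago, which is the pattern the slide associates with heavy churn in older cohorts.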
  19. Before answering the question, let's recognize that Customer Churn is a somewhat trickier idea than Customer Conversion.
  20. Lead Customers: 60% of the lead customers converted. (Chart: Converted 60%, Not Converted 40%)
  21. Visualizing the 60% customer conversion. (Bar chart, 0 to 1: Converted 60%, Not Converted 40%)
  22. Here's a survival curve that shows the survival rate (or retention rate) through each period. (Chart: x-axis Month; labeled survival rates 100%, 60%, 48%, 40%, 35%, 32%)
  23. 32

  24. 48% of customers would be retained through the 2nd month. Each retention rate on this chart indicates the share of customers who would be retained through a given period. (Chart: x-axis Month; labeled survival rates 100%, 60%, 48%, 40%, 35%, 32%)
  25. The customer churn rate depends on how long the customer has been a customer. (Chart labels: Month; 100%, 60%, 40%; 52%)
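
To see why the churn rate depends on customer lifetime, the survival rates on the chart can be turned into per-period churn rates. This is a small sketch: the month-by-month ordering of the labels is an assumption (the labels are simply read in decreasing order), and `period_churn_rates` is a hypothetical helper.

```python
# Survival (retention) rates read off the slide's chart, from month 0 onward.
# Assumption: the labels belong to consecutive months in decreasing order.
survival = [1.00, 0.60, 0.48, 0.40, 0.35, 0.32]


def period_churn_rates(curve):
    """Churn rate within each period, among customers who started that period."""
    return [1 - later / earlier for earlier, later in zip(curve, curve[1:])]


rates = period_churn_rates(survival)
for month, rate in enumerate(rates, start=1):
    print(f"month {month}: {rate:.1%} of the remaining customers churn")
```

Under that assumption the churn rate falls from 40% in the first month to under 10% in the fifth, so a single average churn rate would hide most of the picture.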
  26. Now, let's think about another question: 'What makes customers convert?' If we have a hypothesis like 'Mac users might convert more than Windows users', then we can compare the conversion rates between the two OS groups.
  27. Mac users converted more (60%) than Windows users (40%), so the OS makes a difference. (Bar chart, 0 to 1: Mac, 60% Converted and 40% Not Converted; Windows, 40% Converted and 60% Not Converted)
  28. But if the question is 'What makes customers churn?', it becomes a bit tricky, because the answer depends on which period we are talking about!
  29. So, instead of comparing the two numbers (average churn rates), why don't we compare the two survival curves?
  30. Overall, Windows users tend to churn more than Mac users, especially in the first 4 months after they converted. (Chart: survival curves with confidence intervals)
  31. We can use the survival curve to see what makes customers churn more or less. The steeper the survival curve, the higher the chance of churning.
  32. Now, instead of generating a survival curve for every potential variable of interest, we can build survival models to see which variables have more influence on the slope of the curve.
  33. • Each row represents one user. • It indicates which product features each user used in the first week after conversion.
  34. We want to know which product features help customers stay longer. Maybe using the 'Add Birthday' feature in the first week after conversion helps customers stay longer. Let's find out.
  35. Assign the 'AddBirthday' column to Color to see if the usage of this feature makes any difference to the survival curve.
  36. Just as we did when creating the survival curve, we assign the start time, the end time, and the event status columns.
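
For readers who want to see what the start time, end time, and event status columns feed into, here is a minimal sketch of the Kaplan-Meier estimator, a standard way to build a survival curve from such data. Whether the tool uses exactly this estimator is an assumption; the rows and the function name `kaplan_meier` are made up for illustration.

```python
from collections import Counter

# Hypothetical rows: (start month, end month, churned?).
# churned=False means the customer was still active at the end month (censored).
rows = [(0, 1, True), (0, 2, True), (0, 3, False),
        (1, 3, True), (1, 5, False), (2, 5, True)]

durations = [end - start for start, end, _ in rows]
events = [churned for _, _, churned in rows]


def kaplan_meier(durations, events):
    """Return [(t, survival rate)] at each duration where a churn was observed."""
    churns = Counter(d for d, e in zip(durations, events) if e)
    exits = Counter(durations)  # churned or censored at this duration
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if churns[t]:
            survival *= 1 - churns[t] / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"through month {t}: {s:.1%} retained")
```

Customers who are still active contribute to the "at risk" count up to their last observed month, which is why the event status column is needed in addition to the two time columns.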
  37. Then, this time, click the Predictor Variable button to select the variables of interest.
  38. Once you click the Run button, the Cox Regression model is created.
  39. Open the Importance tab to see which variables are more influential on the survival curves. The variables shown in gray are considered 'not significant.'
  40. Open the Survival Curves tab to see how each of the variables makes a difference to the survival curves.
  41. Customers who used the 'Clear Activity Log' feature tend to churn more than those who didn't use the feature.
  42. On the other hand, customers who received friend requests tend to churn less than those who didn't receive any.
  43. Open the Prediction tab to see what difference each variable would make to the churn rate (cancel rate).
  44. The churn rate is about 64% for those who used the 'Clear Activity Log' feature, which is higher than the rate for those who didn't use the feature.
  45. These are churn rates at the 3rd month. By default, it uses the predicted values at the mean period.
  46. Survival Rate (Retention Rate) under the Survival Curve tab vs. Cancel Rate (Churn Rate) under the Prediction tab. (Chart label: 3)
  47. Blue indicates that churn is more likely. The users who cleared their activity logs are more likely to churn.
  48. Red indicates that churn is less likely. The users who liked posts on their timeline or received friend requests are less likely to churn.
  49. A hazard ratio of 1 means that the variable makes no difference to the chance of churn. Any variable whose confidence interval crosses 1 is not considered to make a significant difference in either direction (more churn or less churn).
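
The reading rule for confidence intervals can be written down as a tiny check. This is a sketch with made-up interval bounds; `classify_hazard_ratio` is a hypothetical helper and not part of the tool.

```python
def classify_hazard_ratio(ci_low, ci_high):
    """Interpret a hazard ratio's confidence interval relative to 1."""
    if ci_low > 1:
        return "more churn"
    if ci_high < 1:
        return "less churn"
    return "not significant"  # the interval crosses 1


# Hypothetical confidence intervals for three predictor variables.
intervals = {
    "variable A": (1.2, 1.9),
    "variable B": (0.5, 0.8),
    "variable C": (0.9, 1.3),
}
for name, (low, high) in intervals.items():
    print(name, "->", classify_hazard_ratio(low, high))
```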
  50. Hazard: The chance of observing an event (e.g., a cancellation) by the end of a given period (1 day, 1 month, etc.).
  51. Let's say we have 2 survival curves for 2 cohorts - Mac users and Windows users.
  52. The hazard ratio of Windows users compared to Mac users is 1.1. 0.055/0.05 = 1.1 (Chart: Windows 5.5%, Mac 5%)
  53. Hazard Ratio for Windows Users: 1.1. If the hazard ratio is greater than 1, churn is more likely. (Chart: Windows 5.5%, Mac 5%)
  54. The hazard ratio of Mac users compared to Windows users is 0.91. 0.05/0.055 = 0.91 (Chart: Windows 5.5%, Mac 5%)
  55. Hazard Ratio for Mac Users: 0.91. If the hazard ratio is less than 1, churn is less likely. (Chart: Windows 5.5%, Mac 5%)
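
The two hazard ratios on these slides are reciprocals of each other, which a short calculation makes explicit. The hazards (5.5% and 5%) come from the slides; the function name `hazard_ratio` is just for illustration.

```python
def hazard_ratio(hazard, reference_hazard):
    """Hazard of one cohort relative to a reference cohort."""
    return hazard / reference_hazard


windows, mac = 0.055, 0.05  # per-period hazards from the slides

windows_vs_mac = hazard_ratio(windows, mac)
mac_vs_windows = hazard_ratio(mac, windows)

print(round(windows_vs_mac, 2))  # 1.1  -> Windows users churn more
print(round(mac_vs_windows, 2))  # 0.91 -> Mac users churn less
```

Which number you see depends only on which cohort is treated as the reference (base) level.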
  56. Cox Regression: Assuming the hazard ratio is constant, it finds the optimal hazard ratios for a given set of predictor variables to fit the actual survival curves.

    y = exp(-H(t) * exp(a * predictor))

    y : survival curve (a series of survival rates)
    t : survival period
    H(t) : baseline cumulative hazard up to period t
    exp(a * predictor) : hazard ratio
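
To make the formula concrete, the sketch below evaluates it for a baseline group and for a group with a hazard ratio of 1.1. The values of H(t) are invented for illustration (0.05 accumulated per month), and `cox_survival` is a hypothetical helper; a real model would estimate both H(t) and the coefficient from data.

```python
import math


def cox_survival(cumulative_hazard, a, predictor):
    """Evaluate y = exp(-H(t) * exp(a * predictor)) for each period t."""
    return [math.exp(-h * math.exp(a * predictor)) for h in cumulative_hazard]


# Hypothetical H(t): 0.05 accumulated per month, for months 0 through 6.
H = [0.05 * t for t in range(7)]
a = math.log(1.1)  # coefficient whose hazard ratio exp(a) is 1.1

mac = cox_survival(H, a, predictor=0)      # base level, hazard ratio 1
windows = cox_survival(H, a, predictor=1)  # hazard ratio 1.1

for t, (m, w) in enumerate(zip(mac, windows)):
    print(f"month {t}: Mac {m:.3f}  Windows {w:.3f}")
```

Because the hazard ratio is the same at every period, the Windows curve equals the Mac curve raised to the power 1.1 everywhere, so the two curves never cross.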
  57. This hazard ratio is constant throughout all the periods… Hazard Ratio for Windows Users: 1.1 (Chart: Windows 5.5%, Mac 5%)
  58. If the hazard ratio of Windows users compared to Mac users is 1.1 at any given period, the survival curves would look like this. (Chart: Windows 5.5%, Mac 5%)
  59. Now that we know that users who 'like posts on their timeline' or 'receive friend requests' are less likely to churn, what does the actual data look like?
  60. The actual data also shows that the users who received friend requests are indeed less likely to churn.
  61. The actual data also shows that the users who liked posts on their timeline are less likely to churn.
  62. Now that we know that users who 'like posts on their timeline' or 'receive friend requests' are less likely to churn, how about the users who do both activities?
  63. We can create a column that combines the two columns

    and create the survival curve with this cohort.
  64. Select the two columns with Control or Command key, and

    select ‘Unite Multiple Columns’ from the column header menu.
  65. Move the Pin button to the latest step so that

    we can use the data to create the survival curve.
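
Outside the UI, combining two columns into a single cohort column can be sketched as follows. The column names, the `_` separator, and the `unite` helper are assumptions made for illustration; the tool's 'Unite Multiple Columns' step may name and format the result differently.

```python
from collections import Counter

# Hypothetical per-user flags for the two activities.
users = [
    {"LikedPost": True, "ReceivedFriendRequest": True},
    {"LikedPost": True, "ReceivedFriendRequest": False},
    {"LikedPost": False, "ReceivedFriendRequest": True},
    {"LikedPost": False, "ReceivedFriendRequest": False},
    {"LikedPost": True, "ReceivedFriendRequest": True},
]


def unite(row, columns, sep="_"):
    """Join the values of several columns into one cohort label."""
    return sep.join(str(row[column]) for column in columns)


for user in users:
    user["Liked_Received"] = unite(user, ["LikedPost", "ReceivedFriendRequest"])

cohort_sizes = Counter(user["Liked_Received"] for user in users)
print(cohort_sizes)
```

Each distinct label (for example `True_True` for users who did both) then acts as one cohort when drawing survival curves.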
  66. Decision Tree: A decision tree is a series of forks with conditional questions. (Diagram: Data, Decision Tree, Result)
  67. Decision Tree: There is a way to predict the survival curve (a series of survival rates over time) with a Decision Tree. (Diagram: Data, Decision Tree, Result)
  68. Random Forest: It creates a set of trees and has each of them predict. The prediction results can differ between the trees; in such cases it takes the majority vote or the mean. (Diagram: Data, three Samples, three Results, Average)
  69. Survival Random Forest: It uses the Random Forest algorithm to predict the survival curves. Each tree predicts a survival curve. Random Forest takes the mean of the survival curves. (Diagram: Data, three Samples, three Results, Average)
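
The averaging step described here can be illustrated on its own. The sketch below covers only the aggregation, not tree building or sampling; the three per-tree curves are made-up numbers and `mean_survival_curve` is a hypothetical helper.

```python
def mean_survival_curve(tree_curves):
    """Pointwise mean of the survival curves predicted by individual trees."""
    return [sum(values) / len(values) for values in zip(*tree_curves)]


# Hypothetical survival curves (months 0 to 3) predicted by three trees.
tree_curves = [
    [1.0, 0.80, 0.60, 0.50],
    [1.0, 0.70, 0.55, 0.45],
    [1.0, 0.75, 0.65, 0.40],
]

forest_curve = mean_survival_curve(tree_curves)
print([round(rate, 2) for rate in forest_curve])
```

Since each tree is free to predict a differently shaped curve, the averaged curve is not tied to a constant hazard ratio.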
  70. We can see a result similar to the one we saw with the Cox Regression model.
  71. Here, I've created a Cox Regression model with an 'OS' variable that has 'Windows', 'Mac', and 'Linux' values.
  72. The Cox Regression model has a constraint that the hazard ratio is constant; hence the order of the curves and the way the curves decline are consistent over time.
  73. Here, I've created a Survival Forest model with an 'OS' variable that has 'Windows', 'Mac', and 'Linux' values.
  74. Survival Forest doesn't have a constraint like the one that Cox Regression has. This makes it possible for the order of the cohorts to differ from time to time, and the shape of the curves is much more flexible.
  75. Prediction with Cox Regression: Cox Regression has a constraint that the hazard ratio is constant; hence it is not good at capturing 'non-linear' patterns. (Charts: Prediction with Cox Regression vs. Actual Data, each with Female and Male curves)
  76. Prediction with Survival Forest: Machine Learning models like Survival Forest tend to capture the patterns in the actual data better because they don't have the constraint that holds for Statistical models like Cox Regression. (Charts: Prediction with Survival Forest vs. Actual Data, each with Female and Male curves)
  77. Statistical Models vs. Machine Learning Models
    • For prediction, machine learning models like Random Survival Forest tend to perform better because they can capture even non-linear patterns, while statistical learning models like Cox Regression can only capture linear patterns.
    • However, machine learning models can't be used to evaluate whether a given relationship is significant or to evaluate causal relationships, both of which are common tasks for statistical learning models like Cox Regression.
  78. Cox Regression (Stats Learning) vs. Random Forest (Machine Learning)
    • Significance Check (Statistical Test): Cox Regression: Yes. Random Forest: No.
    • Non-Linear Relationship: Cox Regression: Can't capture. Random Forest: Can capture.
    • Correlation & Causation: Cox Regression: Can be used for evaluating a Causal Hypothesis. Random Forest: Only Correlation.
    • Categorical Predictors: Cox Regression: Internally converts them to multiple variables; interpret the hazard ratio for each variable as a ratio against the base level. Random Forest: Used as is.
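
The conversion of a categorical predictor into multiple variables can be sketched as dummy coding against a base level. The choice of 'Mac' as the base level, the `OS_` prefix, and the `dummy_code` helper are assumptions for illustration; the tool performs its own conversion internally.

```python
def dummy_code(values, base_level, prefix="OS"):
    """Turn one categorical column into 0/1 columns, one per non-base level."""
    levels = sorted(set(values) - {base_level})
    return [
        {f"{prefix}_{level}": int(value == level) for level in levels}
        for value in values
    ]


os_values = ["Windows", "Mac", "Linux", "Mac"]
encoded = dummy_code(os_values, base_level="Mac")
for value, row in zip(os_values, encoded):
    print(value, row)
```

A Mac user is all zeros, so the hazard ratio estimated for `OS_Windows` or `OS_Linux` reads as "compared to Mac".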
  79. You can predict the probability of survival for each individual by using the model (Survival Random Forest or Cox Regression) you have created under the Analytics view.
  80. Step 2: Open another data frame. This is the data you want to predict for.
  81. Step 3: Select the model you created before and set the Survival Time for Prediction.
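
For the Cox Regression case, predicting each individual's survival probability at a chosen survival time can be sketched as below. Everything numeric here is invented for illustration: the baseline survival rate at the chosen time, the coefficients, the column names, and the helper `predict_survival` are assumptions, not values from the demo.

```python
import math

# Hypothetical fitted quantities for a chosen survival time (e.g. month 3).
baseline_survival_at_t = 0.60  # survival rate when all predictors are 0
coefficients = {
    "ClearActivityLog": math.log(1.5),       # hazard ratio 1.5 (more churn)
    "ReceivedFriendRequest": math.log(0.7),  # hazard ratio 0.7 (less churn)
}


def predict_survival(row):
    """Survival probability at the chosen time: S0(t) ** exp(sum(a_i * x_i))."""
    risk = math.exp(sum(coef * row[name] for name, coef in coefficients.items()))
    return baseline_survival_at_t ** risk


# Hypothetical new customers to predict for.
customers = [
    {"ClearActivityLog": 0, "ReceivedFriendRequest": 0},
    {"ClearActivityLog": 1, "ReceivedFriendRequest": 0},
    {"ClearActivityLog": 0, "ReceivedFriendRequest": 1},
]
predictions = [predict_survival(customer) for customer in customers]
for customer, p in zip(customers, predictions):
    print(customer, f"-> {p:.1%} chance of still being a customer")
```

The churn probability at the same time is one minus the predicted survival probability.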