
Seminar #45 - Cohort Analysis Part 4 - Analyzing What Makes Churn with Prediction Models

Analyzing which customers churn and why seems like a typical machine learning problem. But because the probability of customer churn is greatly influenced by the customer lifetime, typical machine learning models are not a good fit.

As we have seen in the series of Cohort Analysis seminars, the survival curve gives us a good picture of how customer churn changes over time. It turns out that there are statistical and machine learning models that we can use to predict the survival curve. By using these models, we can analyze what makes customers churn and predict who is likely to churn.

In this seminar, Kan introduces the Cox Regression model and the Random Survival Forest model to analyze customer churn with a live demo.

Subscribe ↓
https://www.youtube.com/channel/UCOVfLaSQBvMRwZCyiccq4Iw

Twitter ↓
https://twitter.com/ExploratoryData

UI Tool: Exploratory (https://exploratory.io/)
Exploratory Online Seminar: https://exploratory.io/online-seminar

Kan Nishida

May 19, 2021
Transcript

  1. EXPLORATORY Online Seminar #45: Cohort Analysis Part 4 - Finding What Makes Churn with Survival Model
  2. Speaker: Kan Nishida (@KanAugust), CEO/co-founder, Exploratory. In Spring 2016, Kan launched Exploratory, Inc. to democratize Data Science. Prior to Exploratory, Kan was a director of product development at Oracle, leading teams to build various Data Science products in areas including Machine Learning, BI, Data Visualization, Mobile Analytics, Big Data, etc. While at Oracle, Kan also provided training and consulting services to help organizations transform with data.
  3. The Third Wave: Data Science is not just for Engineers and Statisticians. Exploratory makes it possible for Everyone to do Data Science.
  4. Data Science Workflow: Questions, Communication, Data Access, Data Wrangling, Visualization, Analytics (Statistics / Machine Learning)
  5. Exploratory - Modern & Simple UI: Questions, Communication (Dashboard, Note, Slides), Data Access, Data Wrangling, Visualization, Analytics (Statistics / Machine Learning)
  6. EXPLORATORY Online Seminar #45: Cohort Analysis Part 4 - Finding What Makes Churn with Survival Model
  7. Agenda • Survival Model - Cox Regression • Survival Model - Random Survival Forest • Difference between Cox Regression and Random Survival Forest • Prediction
  8. SaaS - Software as a Service: A business model where you charge a software license fee on a subscription basis for the value you provide through the software.
  9. For SaaS businesses, the initial payment from a customer tends to be smaller, but it accumulates over time.
  10. Non-Subscription Model: You sell a product to one user and collect the money right away. (Jan: $100)
  11. This means you can spend $80 to acquire this customer and still make a profit. (Chart labels: Jan, $100, Expense, Revenue, Profit)
  12. Subscription Model: With a monthly subscription, you collect only the monthly subscription amount in the first month. (Jan: $10)
  13. If you spend $80 to acquire the customer, you will lose money. (Jan: Subscription $10, Expense $80)
  14. But you’ll accumulate the revenue from the same customer over time. (Jan $10, Feb $10, Mar $10, … Dec $10; Total $120)
  15. So if the customer is retained long enough, you’ll be able to pay off the initial cost of acquisition. (Expense $80; Jan $10, Feb $10, Mar $10, … Dec $10; Total $120)
  16. Customer Life Time Value (CLTV): The longer your customers are retained, the more money you’ll make. (Chart: monthly revenue, Jan through Dec, over two years, with Total)
  17. This means that not only can you spend more money on acquiring customers, but you can also invest more in your products and people. (Chart: monthly revenue, Jan through Dec, over two years, with Total and Profit)
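The subscription arithmetic on the slides above can be sketched in a few lines of Python. This is a minimal illustration that assumes a flat $10 monthly fee and a one-time $80 acquisition cost, as on the slides; the function names are hypothetical and no churn is modeled.

```python
import math


def months_to_payback(monthly_fee: float, acquisition_cost: float) -> int:
    """Number of monthly payments needed to cover the acquisition cost."""
    return math.ceil(acquisition_cost / monthly_fee)


def cumulative_profit(monthly_fee: float, acquisition_cost: float, months: int) -> list[float]:
    """Cumulative revenue minus the one-time acquisition cost, month by month."""
    return [monthly_fee * m - acquisition_cost for m in range(1, months + 1)]


payback = months_to_payback(10, 80)
profit_by_month = cumulative_profit(10, 80, 12)
print(payback)              # months until the $80 is recovered
print(profit_by_month[0])   # profit after the first month
print(profit_by_month[-1])  # profit after twelve months
```

The first month is a loss, the customer breaks even once the accumulated fees reach the acquisition cost, and every month retained after that adds to the profit.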
  18. It’s important to divide the MRR into cohorts by customer lifetime. (Chart: Jan through Apr, with each customer labeled by months paid so far, from 1 mon. to 4 mon.; a ‘4 mon.’ label means this customer has paid 4 months of subscription by this time.)
  19. By dividing the MRR by the cohort of when the users converted, you can see how efficiently your business is growing and how much further it can grow. (Cohorts: 9 Months Ago through 1 Month Ago)
  20. Customers churn, so each cohort’s MRR tends to go down. (Cohorts: 9 Months Ago through 1 Month Ago, converted in Jan through Sep)
  21. If customers don’t churn, the share of long-standing customers in MRR is bigger; if the older cohorts are retained longer, you can accelerate your growth as you acquire new customers. If many customers churn, the share of newer customers in MRR is bigger. (Cohorts: 9 Months Ago through 1 Month Ago)
  22. Understanding customer churn (or retention) is critically important for any SaaS business.
  23. What makes customers churn?
  24. Before answering the question, let’s recognize that Customer Churn is a trickier idea than Customer Conversion.
  25. What is the probability of Customer Conversion for our business?
  26. 60% of the lead customers converted. (Lead Customers: Converted 60%, Not Converted 40%)
  27. Visualizing the 60% customer conversion. (Bar chart, scale 0 to 1: Converted 60%, Not Converted 40%)
  28. What is the probability of Customer Churn for our business?
  29. It depends… The longer the customers’ lifetimes are, the higher the probability of their churn would be.
  30. Here’s a survival curve that shows the survival rate (or retention rate) through each period. (Chart labels, by Month: 100%, 60%, 48%, 40%, 35%, 32%)
  31. Check out our past ‘Survival Curve’ seminar for more details.
  32. None
  33. 48% of customers would be retained through the 2nd month. Each retention rate on this chart indicates the share of customers who would be retained through a given period. (Chart labels, by Month: 100%, 60%, 48%, 40%, 35%, 32%)
  34. The customer churn rate depends on how long the customer has been a customer. (Chart labels: 52%, Month, 100%, 60%, 40%)
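To make the link between the survival curve and the churn rate concrete, here is a minimal Python sketch. It assumes the retention rates on the slide run 100%, 60%, 48%, 40%, 35%, 32% from month 0 to month 5 (the order along the month axis is an assumption, since the transcript lists the labels out of order); the variable names are hypothetical.

```python
# Retention rates read off the slide's survival curve, month 0 through 5.
# Their order along the month axis is an assumption.
retention = [1.00, 0.60, 0.48, 0.40, 0.35, 0.32]

# Share of the original cohort that has churned by the end of each month.
cumulative_churn = [round(1 - s, 4) for s in retention]

# Share of customers still active at the start of a month who churn during it.
monthly_churn = [
    round(1 - current / previous, 4)
    for previous, current in zip(retention, retention[1:])
]

for month, rate in enumerate(monthly_churn, start=1):
    print(f"month {month}: {rate:.1%} of remaining customers churned")
```

Under this reading there is no single churn rate: both the cumulative figure and the month-by-month figure change with the period you look at, which is why the seminar compares whole curves.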
  35. Now, let’s think about another question: ‘What makes customers convert?’ If we have a hypothesis like ‘Mac users might convert more than Windows users’, then we can compare the conversion ratios between the two OS groups.
  36. Mac users converted more (60%) than Windows users, so the OS makes a difference. (Bar chart, scale 0 to 1: Mac vs. Windows, Converted vs. Not Converted; labels 60%, 40%, 40%, 60%)
  37. But if the question is ‘What makes customers churn?’, it becomes a bit tricky, because that depends on which period we are talking about!
  38. So, instead of comparing two numbers (average churn rates), why don’t we compare the two survival curves?
  39. Here are two survival curves, one for Mac customers and another for Windows customers.
  40. Overall, Windows users tend to churn more than Mac users, especially in the first 4 months after they converted. (Chart: With Confidence Interval)
  41. We can use the survival curve to see what makes customers churn more or less. The steeper the survival curve is, the higher the chance of churning.
  42. Now, instead of generating a survival curve for every potential variable of interest, we can build survival models to see which variables have more influence on the slope of the curve.
  43. Let’s try!
  44. We’ll use sample data from the Data Catalog.
  45. None
  46. None
  47. • Each row represents one user. • It indicates which product features each user used in the first week after the conversion.
  48. Columns: Product Features, Period, Status
  49. We want to know which product features help retain customers longer. Maybe using the ‘Add Birthday’ feature in the first week after the conversion helps retain customers longer. Let’s find out.
  50. First, let’s create the Survival Curve.
  51. Assign the period and the status information and click the Run button.
  52. Here’s the survival curve that shows the retention rates through each period.
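One common way to compute such a curve from a period column and a status column is the Kaplan-Meier estimator. The sketch below is a plain-Python illustration on made-up toy data; whether Exploratory computes its curve exactly this way is an assumption, and the function and variable names are hypothetical.

```python
def kaplan_meier(periods, churned):
    """Return [(period, survival_rate)] at each period where a churn occurred.

    periods: how long each customer has been observed.
    churned: True if the customer churned at that period,
             False if the customer is still active (censored).
    """
    survival = 1.0
    curve = []
    event_periods = sorted({p for p, c in zip(periods, churned) if c})
    for t in event_periods:
        # Customers observed at least until period t are still at risk.
        at_risk = sum(1 for p in periods if p >= t)
        events = sum(1 for p, c in zip(periods, churned) if p == t and c)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up toy data: five customers, two of them still active.
toy_periods = [1, 2, 2, 3, 4]
toy_churned = [True, True, False, True, False]
toy_curve = kaplan_meier(toy_periods, toy_churned)
for period, rate in toy_curve:
    print(period, round(rate, 3))
```

Customers who are still active are not counted as churned, but they do stay in the at-risk count until their last observed period, which is what lets the curve use incomplete lifetimes.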
  53. Assign the ‘AddBirthday’ column to Color to see if the usage of this feature makes any difference to the survival curve.
  54. It looks like customers who used this feature tend to churn more.
  55. Now, instead of trying other variables one by one, let’s create a Cox Regression model.
  56. Create a new Analytics and select ‘Cox Regression’.
  57. Just like we did when creating the survival curve, we assign the start time, the end time, and the event status columns.
  58. Then, this time, click the Predictor Variable button to select the variables of interest.
  59. Select the variables.
  60. Once you click the Run button, the Cox Regression model is created.
  61. Open the Importance tab to see which variables are more influential on the survival curves. The variables shown in gray are considered ‘not significant.’
  62. Open the Survival Curves tab to see how each of the variables would make a difference to the survival curves.
  63. Customers who used the ‘Clear Activity Log’ feature tend to churn more than those who didn’t use the feature.
  64. On the other hand, customers who received friend requests tend to churn less than those who didn’t.
  65. Open the Prediction tab to see what difference each variable would make to the churn rate (cancel rate).
  66. The churn rate is about 64% for those who used the ‘Clear Activity Log’ feature, which is higher than the rate for those who didn’t use the feature.
  67. These are churn rates at the 3rd month. By default, it uses the predicted values at the mean period.
  68. The Survival Curve tab shows the Survival Rate (Retention Rate); the Prediction tab shows the Cancel Rate (Churn Rate). (Chart label: 3)
  69. You can change the survival time period for the Prediction tab.
  70. Open the Coefficient tab to see the ‘hazard ratio’ for each variable.
  71. Blue indicates that churn is more likely. The users who cleared their activity logs are more likely to churn.
  72. Red indicates that churn is less likely. The users who liked posts on their timeline or received friend requests are less likely to churn.
  73. A hazard ratio of 1 means that the variable makes no difference to the chance of churn. Any variable whose confidence interval crosses 1 is not considered to make a significant difference in either direction (more churn or less churn).
  74. Hazard: The chance of observing an event (e.g. cancel) by the end of a given time period (1 day, 1 month, etc.), among those who have survived up to the start of that period.
  75. Let’s say we have 2 survival curves for 2 cohorts - Mac users and Windows users.
  76. The hazard for Mac users for the 3rd month is 5%.
  77. The hazard for Windows users for the 3rd month is 5.5%.
  78. The hazard ratio of Windows users compared to Mac users is 1.1. (0.055 / 0.05 = 1.1)
  79. Hazard Ratio for Windows Users: 1.1. If the hazard ratio is greater than 1, churn is more likely.
  80. The hazard ratio of Mac users compared to Windows users is 0.91. (0.05 / 0.055 = 0.91)
  81. Hazard Ratio for Mac Users: 0.91. If the hazard ratio is less than 1, churn is less likely.
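The hazard ratio arithmetic from these slides, written out as a small Python sketch. The 5% and 5.5% hazards come from the slides; the function name is hypothetical.

```python
def hazard_ratio(hazard_group: float, hazard_reference: float) -> float:
    """Hazard of one group divided by the hazard of the reference group."""
    return hazard_group / hazard_reference


mac_hazard = 0.05       # 5% hazard in the 3rd month, from the slides
windows_hazard = 0.055  # 5.5% hazard in the 3rd month, from the slides

windows_vs_mac = hazard_ratio(windows_hazard, mac_hazard)
mac_vs_windows = hazard_ratio(mac_hazard, windows_hazard)
print(round(windows_vs_mac, 2), round(mac_vs_windows, 2))
```

The two ratios are reciprocals of each other, so which group counts as ‘more likely to churn’ depends only on which group is used as the reference.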
  82. Cox Regression: Assuming the hazard ratio is constant, it finds the optimal hazard ratios for a given set of predictor variables to fit the actual survival curves. y = exp(-H(t) * exp(a * predictor)), where y is the survival curve (a series of survival rates), t is the survival period, H(t) is the baseline cumulative hazard, and exp(a) is the hazard ratio.
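As a rough illustration of what ‘finding the optimal hazard ratio’ can mean, the sketch below estimates the coefficient a for a single 0/1 predictor by maximizing the Cox partial likelihood with Newton’s method. It is a teaching sketch, not what Exploratory runs internally: the data is made up, the function name is hypothetical, and tied periods are handled with the simple Breslow approximation.

```python
import math


def fit_cox_one_predictor(periods, churned, x, iterations=25):
    """Estimate the coefficient a for a single predictor by Newton's method.

    Maximizes the Cox partial likelihood (Breslow handling of tied periods).
    """
    a = 0.0
    for _ in range(iterations):
        gradient, information = 0.0, 0.0
        for t_i, c_i, x_i in zip(periods, churned, x):
            if not c_i:
                continue  # censored customers only appear in risk sets
            # Customers still at risk at period t_i, weighted by exp(a * x).
            weights = [(x_j, math.exp(a * x_j))
                       for t_j, x_j in zip(periods, x) if t_j >= t_i]
            s0 = sum(w for _, w in weights)
            s1 = sum(x_j * w for x_j, w in weights)
            s2 = sum(x_j * x_j * w for x_j, w in weights)
            gradient += x_i - s1 / s0
            information += s2 / s0 - (s1 / s0) ** 2
        if information == 0:
            break
        a += gradient / information
    return a


# Made-up toy data: the first six customers used a feature, the last six did not.
toy_periods = [1, 2, 2, 3, 5, 8, 2, 4, 6, 7, 9, 10]
toy_churned = [1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0]
toy_used_feature = [1] * 6 + [0] * 6

toy_a = fit_cox_one_predictor(toy_periods, toy_churned, toy_used_feature)
toy_hazard_ratio = math.exp(toy_a)
print(round(toy_a, 2), round(toy_hazard_ratio, 2))
```

In this toy data the customers who used the feature tend to churn earlier, so the fitted coefficient is positive and the hazard ratio is above 1. A production fit would also report standard errors and confidence intervals, which this sketch omits.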
  83. What do you mean by ‘Hazard Ratio is constant’?
  84. Hazard Ratio for Windows Users: 1.1. This hazard ratio is constant throughout all the periods… (Hazards: Windows 5.5%, Mac 5%)
  85. If the hazard ratio of Windows users compared to Mac users is 1.1 at any given period, the survival curves would look like this.
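To see what a constant hazard ratio implies for the two curves, the sketch below plugs numbers into the formula y = exp(-H(t) * exp(a * predictor)). The baseline is an assumption made for illustration (Mac users with a flat 5% hazard per month over 12 months); only the 1.1 hazard ratio comes from the slides, and the function name is hypothetical.

```python
import math


def survival_curve(baseline_cumulative_hazard, coefficient, predictor):
    """y = exp(-H(t) * exp(a * predictor)) for each period t."""
    ratio = math.exp(coefficient * predictor)
    return [math.exp(-h * ratio) for h in baseline_cumulative_hazard]


# Assumed baseline: Mac users with a constant 5% hazard per month, months 0..12.
baseline = [0.05 * month for month in range(13)]
a = math.log(1.1)  # hazard ratio of 1.1 for Windows users, from the slides

mac_curve = survival_curve(baseline, a, predictor=0)
windows_curve = survival_curve(baseline, a, predictor=1)

for month in (0, 3, 6, 12):
    print(month, round(mac_curve[month], 3), round(windows_curve[month], 3))
```

Because the ratio is the same at every period, the Windows curve is simply the Mac curve raised to the power 1.1: it stays below the Mac curve throughout and the two never cross.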
  86. None
  87. Now that we know that users who ‘like posts on their timeline’ or ‘receive friend requests’ are less likely to churn, what does the actual data look like?
  88. The actual data also shows that the users who received friend requests are indeed less likely to churn.
  89. With the confidence interval, the difference becomes significant after around the fifth month.
  90. The actual data also shows that the users who liked posts on their timeline are less likely to churn.
  91. Now that we know that users who ‘like posts on their timeline’ or ‘receive friend requests’ are less likely to churn, how about the users who do both activities?
  92. We can create a column that combines the two columns and create the survival curve with this cohort.
  93. Select the two columns with the Control or Command key, and select ‘Unite Multiple Columns’ from the column header menu.
  94. Click the Run button.
  95. A new column with the values from the two columns combined is created.
  96. To make the values more intuitive, we can replace the values.
  97. Assign new values for those original 4 values.
  98. The values are all replaced with the new values.
  99. Move the Pin button to the latest step so that we can use the data to create the survival curve.
  100. Assign the newly combined column to Color.
  101. Assign the newly combined column to Color.
  102. We can see that the customers who performed both activities are much less likely to churn.
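The ‘unite two columns, then relabel the four combinations’ steps above can be sketched outside the UI as follows. The column names, the underscore separator, and the replacement labels are all hypothetical; the seminar does not show the exact values used.

```python
# Hypothetical column names and labels; the real data frame may differ.
LABELS = {
    "TRUE_TRUE": "Both",
    "TRUE_FALSE": "Liked Post Only",
    "FALSE_TRUE": "Friend Request Only",
    "FALSE_FALSE": "Neither",
}


def unite(row, first, second, sep="_"):
    """Join two logical columns into one text value such as 'TRUE_FALSE'."""
    return sep.join(str(bool(row[col])).upper() for col in (first, second))


def add_cohort(rows, first="liked_post", second="received_friend_request"):
    """Return copies of the rows with a relabeled 'cohort' column added."""
    return [{**row, "cohort": LABELS[unite(row, first, second)]} for row in rows]


users = [
    {"liked_post": True, "received_friend_request": True},
    {"liked_post": True, "received_friend_request": False},
    {"liked_post": False, "received_friend_request": False},
]
cohorts = [row["cohort"] for row in add_cohort(users)]
print(cohorts)
```

The resulting cohort column plays the role of the newly combined column that is assigned to Color, giving one survival curve per combination.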
  103. Survival Random Forest
  104. Decision Tree: A decision tree is a series of forks with conditional questions. (Diagram: Data, Result)
  105. Decision Tree: There is a way to predict the survival curve (a series of survival rates over time) with a Decision Tree. (Diagram: Data, Result)
  106. Random Forest: It creates a set of trees and has each of them make a prediction. The prediction results can differ between the trees; in such cases it takes the majority vote or the mean. (Diagram: Data, three Samples, three Results, Average)
  107. Survival Random Forest: It uses the Random Forest algorithm to predict the survival curves. Each tree predicts a survival curve, and Random Forest takes the mean of the survival curves. (Diagram: Data, three Samples, three Results, Average)
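The averaging step described above is simple enough to show directly. The sketch below only illustrates how per-tree survival curves are combined; it does not build the trees, the three curves are made-up numbers, and the function name is hypothetical.

```python
def average_curves(tree_curves):
    """Average the survival rate across trees at each period."""
    return [sum(rates) / len(rates) for rates in zip(*tree_curves)]


# Made-up survival curves from three trees, one value per period.
tree_curves = [
    [1.0, 0.8, 0.6, 0.5],
    [1.0, 0.7, 0.6, 0.4],
    [1.0, 0.9, 0.6, 0.3],
]
forest_curve = average_curves(tree_curves)
print([round(rate, 3) for rate in forest_curve])
```

Since each period is averaged independently, nothing forces the combined curves of two cohorts to keep a fixed ratio over time.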
  108. Let’s do it!
  109. Duplicate the analytics with Cox Regression.
  110. Switch the Analytics Type to Survival Forest.
  111. A survival forest model is created.
  112. We can see a similar result to the one we saw with the Cox Regression model.
  113. The survival curves are slightly different from the ones with Cox Regression.
  114. Cox Regression vs. Random Survival Forest
  115. Here, I’ve created a Cox Regression model with the ‘OS’ variable, which has ‘Windows’, ‘Mac’, and ‘Linux’ values.
  116. The Cox Regression model has a constraint that the hazard ratio is constant, hence the order of the curves and the way the curves decline are consistent over time.
  117. Here, I’ve created a Survival Forest model with the ‘OS’ variable, which has ‘Windows’, ‘Mac’, and ‘Linux’ values.
  118. Survival Forest doesn’t have a constraint like the one that Cox Regression has. This allows the order of the cohorts to change from time to time and makes the shape of the curves much more flexible.
  119. Cox Regression has the constraint that the hazard ratio is constant, hence it is not good at capturing ‘non-linear’ patterns. (Charts: Prediction with Cox Regression vs. Actual Data, each with Female and Male curves)
  120. Machine Learning models like Survival Forest tend to capture the pattern in the actual data better because they don’t have the constraint that Statistical models like Cox Regression have. (Charts: Prediction with Survival Forest vs. Actual Data, each with Female and Male curves)
  121. Statistical Models vs. Machine Learning Models • For prediction, machine learning models like Random Survival Forest tend to perform better because they can capture even non-linear patterns, while statistical learning models like Cox Regression can only capture linear patterns. • However, machine learning models can’t be used to evaluate whether a given relationship is significant or to evaluate causal relationships, both of which are common tasks for statistical learning models like Cox Regression.
  122. Cox Regression (Stats Learning) vs. Random Forest (Machine Learning) • Significance Check (Statistical Test): Yes vs. No • Non-Linear Relationship: Can’t capture vs. Can capture • Correlation & Causation: Can be used for evaluating a causal hypothesis vs. Only correlation • Categorical Predictors: Internally converted to multiple variables, with the hazard ratio for each variable interpreted as a ratio against the base level, vs. Used as is
  123. One more thing…

  124. Prediction with Survival Models: Cox Regression / Survival Random Forest
  125. You can predict the probability of survival for each individual by using the model - Survival Random Forest or Cox Regression - that you have created under the Analytics view.
  126. Step 1: Build a Prediction Model with Survival Algorithms
  127. Step 2: Open another data frame. This is the data you want to predict for.
  128. Step 3: Select ‘Predict with Model (Analytics View)’ from the Step menu
  129. Step 4: Select the model you created before and set the Survival Time for Prediction
  130. You’ll see new columns added with the predicted probability of survival for each customer.
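For the Cox Regression case, a per-customer prediction at a chosen survival time can be sketched as below. The coefficients and the baseline survival at 3 months are hypothetical numbers picked for illustration (a real model would supply them), and the column names are made up; a Survival Forest would produce its predictions differently.

```python
import math

# Hypothetical fitted values; a real model would supply these.
COEFFICIENTS = {"cleared_activity_log": 0.4, "received_friend_request": -0.5}
BASELINE_SURVIVAL_AT_3_MONTHS = 0.6  # survival when every predictor is 0


def predict_survival(customer):
    """Cox-style prediction: baseline survival raised to exp(sum of a * x)."""
    score = sum(COEFFICIENTS[name] * customer[name] for name in COEFFICIENTS)
    return BASELINE_SURVIVAL_AT_3_MONTHS ** math.exp(score)


customers = [
    {"id": "A", "cleared_activity_log": 1, "received_friend_request": 0},
    {"id": "B", "cleared_activity_log": 0, "received_friend_request": 1},
    {"id": "C", "cleared_activity_log": 0, "received_friend_request": 0},
]
predictions = {c["id"]: predict_survival(c) for c in customers}
for customer_id, probability in predictions.items():
    print(customer_id, round(probability, 3))
```

With these assumed coefficients, the customer who cleared the activity log gets a lower predicted survival than the baseline, and the customer who received a friend request gets a higher one.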
  131. Next Seminar

  132. EXPLORATORY Online Seminar #46: 5/26/2021 (Wed) 11AM PT - RFM Analysis for Sales Data
  133. None
  134. Information • Email: kan@exploratory.io • Website: https://exploratory.io • Twitter: @ExploratoryData • Seminar: https://exploratory.io/online-seminar
  135. Q & A
  136. EXPLORATORY