Slide 1

Slide 1 text

EXPLORATORY Online Seminar #45 Cohort Analysis Part 4 Finding What Makes Churn with Survival Model

Slide 2

Slide 2 text

Speaker: Kan Nishida, CEO/co-founder, Exploratory (@KanAugust)
Summary: In Spring 2016, launched Exploratory, Inc. to democratize Data Science. Prior to Exploratory, Kan was a director of product development at Oracle, leading teams to build various Data Science products in areas including Machine Learning, BI, Data Visualization, Mobile Analytics, and Big Data. While at Oracle, Kan also provided training and consulting services to help organizations transform with data.

Slide 3

Slide 3 text

The Third Wave: Data Science is not just for Engineers and Statisticians. Exploratory makes it possible for Everyone to do Data Science.

Slide 4

Slide 4 text

Data Science Workflow: Questions, Communication, Data Access, Data Wrangling, Visualization, Analytics (Statistics / Machine Learning)

Slide 5

Slide 5 text

Exploratory - Modern & Simple UI: Questions, Communication (Dashboard, Note, Slides), Data Access, Data Wrangling, Visualization, Analytics (Statistics / Machine Learning)

Slide 7

Slide 7 text

Agenda
• Survival Model - Cox Regression
• Survival Model - Random Survival Forest
• Difference between Cox Regression and Random Survival Forest
• Prediction

Slide 8

Slide 8 text

SaaS - Software as a Service: A business model where you charge a software license fee on a subscription basis for the value you provide through the software.

Slide 9

Slide 9 text

For SaaS businesses, the initial payment from a customer tends to be smaller, but it accumulates over time.

Slide 10

Slide 10 text

Non-Subscription Model: You sell a product to one user and collect the money right away. (Jan: $100)

Slide 11

Slide 11 text

This means you can spend $80 to acquire this customer and still make a profit. (Chart: Jan, $100; labels: Expense, Revenue, Profit)

Slide 12

Slide 12 text

Subscription Model: With a monthly subscription, you collect only the monthly subscription amount in the first month. (Jan: $10)

Slide 13

Slide 13 text

If you spend $80 to acquire the customer, you will lose money. (Jan: subscription $10, expense $80)

Slide 14

Slide 14 text

But you’ll accumulate the revenue from the same customer over time. (Jan $10, Feb $10, Mar $10, … Dec $10; Total $120)

Slide 15

Slide 15 text

So if the customer is retained for long enough, you’ll be able to pay off the initial cost of acquisition. (Expense $80; Jan $10, Feb $10, Mar $10, … Dec $10; Total $120)
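The payback arithmetic can be sketched in a few lines of Python. This is only an illustration using the numbers from the slides ($80 acquisition cost, $10 per month); the function name `months_to_payback` is our own.

```python
def months_to_payback(acquisition_cost, monthly_fee):
    """Return the first month in which cumulative revenue covers the acquisition cost."""
    if monthly_fee <= 0:
        raise ValueError("monthly_fee must be positive")
    month = 0
    cumulative_revenue = 0
    while cumulative_revenue < acquisition_cost:
        month += 1
        cumulative_revenue += monthly_fee
    return month

# Numbers from the slides: $80 to acquire the customer, $10 per month.
payback_month = months_to_payback(acquisition_cost=80, monthly_fee=10)
first_year_profit = 12 * 10 - 80  # $120 total revenue minus the $80 expense

print(payback_month, first_year_profit)  # prints: 8 40
```

With these numbers the customer has to stay for 8 months before the acquisition cost is paid off; every month after that is profit.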

Slide 16

Slide 16 text

Customer Lifetime Value (CLTV): The longer your customers are retained, the more money you’ll make. (Chart: monthly revenue from Jan through Dec over two years, plus Total)

Slide 17

Slide 17 text

This means you can not only spend more money on acquiring customers but also invest more in your products and people. (Chart: monthly revenue from Jan through Dec over two years, plus Total, with Profit labeled)

Slide 18

Slide 18 text

It’s important to divide the MRR (Monthly Recurring Revenue) into cohorts by customer lifetime. (Chart: Jan through Apr, with each customer labeled by months paid, from 1 mon. to 4 mon. Callout: ‘This customer has paid 4 months of subscription by this time.’)

Slide 19

Slide 19 text

By dividing the MRR by the cohort of when the users converted, you can see how efficiently your business is growing and can grow further. (Chart legend: cohorts from 9 Months Ago to 1 Month Ago)

Slide 20

Slide 20 text

Customers churn, so each cohort’s MRR tends to go down. (Chart legend: cohorts from 9 Months Ago to 1 Month Ago; converted in Jan through Sep)

Slide 21

Slide 21 text

If customers don’t churn, long-standing customers make up a bigger share of the MRR. If the older cohorts are retained longer, you can accelerate your growth as you acquire new customers. If many customers churn, newer customers make up a bigger share of the MRR. (Chart legend: cohorts from 9 Months Ago to 1 Month Ago)

Slide 22

Slide 22 text

Understanding customer churn (or retention) is critically important for any SaaS business.

Slide 23

Slide 23 text

What makes customers churn?

Slide 24

Slide 24 text

Before answering the question, let’s recognize that Customer Churn is a trickier idea than Customer Conversion.

Slide 25

Slide 25 text

What is the probability of Customer Conversion for our business?

Slide 26

Slide 26 text

60% of the lead customers converted. (Lead Customers: Converted 60%, Not Converted 40%)

Slide 27

Slide 27 text

Visualizing the 60% customer conversion. (Bar chart: Converted 60%, Not Converted 40%)

Slide 28

Slide 28 text

What is the probability of Customer Churn for our business?

Slide 29

Slide 29 text

It depends… The longer a customer’s lifetime, the higher the probability that they have churned.

Slide 30

Slide 30 text

Here’s a survival curve that shows the survival rate (or retention rate) through each period. (Chart: retention rate by month; values shown: 100%, 60%, 48%, 40%, 35%, 32%)

Slide 31

Slide 31 text

Check out our past ‘Survival Curve’ seminar for more details.

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

Each retention rate on this chart indicates the share of customers who would be retained through a given period. For example, 48% of customers would be retained through the 2nd month. (Chart: retention rate by month; values shown: 100%, 60%, 48%, 40%, 35%, 32%)

Slide 34

Slide 34 text

The customer churn rate depends on how long the customer has been a customer. (Chart: retention rate by month with a 52% churn rate highlighted; values shown: 100%, 60%, 40%)
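To make the link between the survival curve and the churn rate concrete, here is a small Python sketch. It assumes the retention values on the chart (100%, 60%, 48%, 40%, 35%, 32%) correspond to months 0 through 5; that mapping, and the function names, are our assumptions.

```python
# Retention (survival) rates by month, assumed to be months 0..5 on the chart.
retention = [1.00, 0.60, 0.48, 0.40, 0.35, 0.32]

def cumulative_churn(retention_rates):
    """Share of the original customers who have churned by each month."""
    return [round(1 - rate, 4) for rate in retention_rates]

def period_churn(retention_rates):
    """Share of customers active at the start of each month who leave during it."""
    pairs = zip(retention_rates, retention_rates[1:])
    return [round(1 - later / earlier, 4) for earlier, later in pairs]

print(cumulative_churn(retention)[2])  # churned by the 2nd month: 0.52
print(period_churn(retention)[:2])     # month 1 and month 2: [0.4, 0.2]
```

The same data gives a different churn rate depending on the period we ask about, which is why a single ‘churn probability’ is hard to state.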

Slide 35

Slide 35 text

Now, let’s think about another question: ‘What makes customers convert?’ If we have a hypothesis like ‘Mac users might convert more than Windows users’, then we can compare the conversion ratios between the two OS groups.

Slide 36

Slide 36 text

Mac users converted more (60%) than Windows users (40%), so the OS makes a difference. (Bar chart: Mac 60% Converted / 40% Not Converted; Windows 40% Converted / 60% Not Converted)

Slide 37

Slide 37 text

But if the question is ‘What makes customers churn?’, it becomes a bit tricky, because the answer depends on which period we are talking about!

Slide 38

Slide 38 text

So, instead of comparing the two numbers (average churn rates), why don’t we compare the two survival curves?

Slide 39

Slide 39 text

Here are two survival curves, one for Mac customers and another for Windows customers.

Slide 40

Slide 40 text

Overall, Windows users tend to churn more than Mac users, especially in the first 4 months after they converted. (Chart: survival curves with confidence intervals)

Slide 41

Slide 41 text

We can use the survival curve to see what makes customers churn more or less. The steeper the survival curve, the higher the chance of churning.

Slide 42

Slide 42 text

Now, instead of generating a survival curve for every variable of interest, we can build survival models to see which variables have more influence on the slope of the curve.

Slide 43

Slide 43 text

Let’s try!

Slide 44

Slide 44 text

We’ll use sample data from the Data Catalog.

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

No content

Slide 47

Slide 47 text

• Each row represents one user.
• It indicates which product features each user used in the first week after conversion.

Slide 48

Slide 48 text

(Screenshot labels: Product Features, Period, Status)

Slide 49

Slide 49 text

We want to know which product features help customers retain longer. Maybe using the ‘Add Birthday’ feature in the first week after conversion helps customers retain longer. Let’s find out.

Slide 50

Slide 50 text

First, let’s create the Survival Curve.

Slide 51

Slide 51 text

Assign the period and the status information and click the Run button.

Slide 52

Slide 52 text

Here’s the survival curve that shows the retention rates through each period.
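For readers who want to see what goes into such a curve, here is a minimal pure-Python sketch of the Kaplan-Meier estimator, a common way to build a survival curve from a period column and a status column. We are assuming Kaplan-Meier here; the slides don’t say which estimator the tool uses, and the sample data is made up.

```python
def kaplan_meier(periods, churned):
    """Return [(period, survival_rate)] estimated with the Kaplan-Meier method.

    periods: how long each customer was observed (e.g. months)
    churned: True if the customer churned at that period, False if still active
    """
    event_times = sorted({p for p, c in zip(periods, churned) if c})
    survival = 1.0
    curve = [(0, 1.0)]
    for t in event_times:
        at_risk = sum(1 for p in periods if p >= t)  # still a customer at t
        events = sum(1 for p, c in zip(periods, churned) if p == t and c)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical data: 8 customers, period in months, status = churned or not.
periods = [1, 2, 2, 3, 4, 4, 5, 6]
churned = [True, True, False, True, False, True, False, False]

for period, rate in kaplan_meier(periods, churned):
    print(period, round(rate, 3))
```

Customers who have not churned yet still count as ‘at risk’ for the periods they were observed, which is what lets the curve use incomplete lifetimes.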

Slide 53

Slide 53 text

Assign the ‘AddBirthday’ column to Color to see if usage of this feature makes any difference to the survival curve.

Slide 54

Slide 54 text

It looks like customers who used this feature tend to churn more.

Slide 55

Slide 55 text

Now, instead of trying other variables one by one, let’s create a Cox Regression model.

Slide 56

Slide 56 text

Create a new Analytics and select ‘Cox Regression’.

Slide 57

Slide 57 text

Just like we did for creating the survival curve, we assign the start time, the end time, and the event status columns.

Slide 58

Slide 58 text

This time, click the Predictor Variable button to select the variables of interest.

Slide 59

Slide 59 text

Select the variables.

Slide 60

Slide 60 text

Once you click the Run button, the Cox Regression model is created.

Slide 61

Slide 61 text

Open the Importance tab to see which variables are more influential on the survival curves. The variables with gray color are considered ‘not significant.’

Slide 62

Slide 62 text

Open the Survival Curves tab to see how each variable changes the survival curves.

Slide 63

Slide 63 text

Customers who used the ‘Clear Activity Log’ feature tend to churn more than those who didn’t use it.

Slide 64

Slide 64 text

On the other hand, customers who received a friend request tend to churn less than those who didn’t receive one.

Slide 65

Slide 65 text

Open the Prediction tab to see what difference each variable would make on the churn rate (cancel rate).

Slide 66

Slide 66 text

The churn rate is about 64% for those who used the ‘Clear Activity Log’ feature, which is higher than the rate for those who didn’t use it.

Slide 67

Slide 67 text

These are churn rates at the 3rd month. By default, it uses the predicted values at the mean period.

Slide 68

Slide 68 text

At period 3: Survival Rate (Retention Rate) under the Survival Curve tab; Cancel Rate (Churn Rate) under the Prediction tab.

Slide 69

Slide 69 text

You can change the survival time period for the Prediction tab.

Slide 70

Slide 70 text

Open the Coefficient tab to see the ‘hazard ratio’ for each variable.

Slide 71

Slide 71 text

Blue indicates that churn is more likely. Users who cleared their activity logs are more likely to churn.

Slide 72

Slide 72 text

Red indicates that churn is less likely. Users who liked posts on their timeline or received friend requests are less likely to churn.

Slide 73

Slide 73 text

A hazard ratio of 1 means the variable makes no difference to the chance of churn. Any variable whose confidence interval crosses 1 is not considered to make a significant difference in either direction (more churn or less churn).

Slide 74

Slide 74 text

Hazard: The chance of observing an event (e.g. a cancellation) during a given time period (1 day, 1 month, etc.) among customers who have survived up to that period.

Slide 75

Slide 75 text

Let’s say we have 2 survival curves for 2 cohorts - Mac users and Windows users.

Slide 76

Slide 76 text

The hazard for Mac users for the 3rd month is 5%.

Slide 77

Slide 77 text

The hazard for Windows users for the 3rd month is 5.5%.

Slide 78

Slide 78 text

The hazard ratio of Windows users compared to Mac users is 1.1: 0.055 / 0.05 = 1.1. (Windows: 5.5%, Mac: 5%)

Slide 79

Slide 79 text

Hazard Ratio for Windows Users: 1.1. If the hazard ratio is greater than 1, churn is more likely. (Windows: 5.5%, Mac: 5%)

Slide 80

Slide 80 text

The hazard ratio of Mac users compared to Windows users is 0.91: 0.05 / 0.055 = 0.91. (Windows: 5.5%, Mac: 5%)

Slide 81

Slide 81 text

Hazard Ratio for Mac Users: 0.91. If the hazard ratio is less than 1, churn is less likely. (Windows: 5.5%, Mac: 5%)
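The hazard ratio arithmetic from these slides, as a tiny Python sketch. The function name `hazard_ratio` is ours; the 5% and 5.5% hazards are the values from the Mac and Windows example.

```python
def hazard_ratio(hazard_group, hazard_reference):
    """Hazard of one group divided by the hazard of the reference group."""
    return hazard_group / hazard_reference

mac_hazard = 0.05       # 3rd-month hazard for Mac users
windows_hazard = 0.055  # 3rd-month hazard for Windows users

windows_vs_mac = hazard_ratio(windows_hazard, mac_hazard)
mac_vs_windows = hazard_ratio(mac_hazard, windows_hazard)

print(round(windows_vs_mac, 2), round(mac_vs_windows, 2))  # prints: 1.1 0.91
```

The two ratios are reciprocals of each other, so which group we call the reference only flips the direction of the comparison.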

Slide 82

Slide 82 text

Cox Regression
Assuming the hazard ratio is constant, it finds the optimal hazard ratios for a given set of predictor variables to fit the actual survival curves.
y = exp(-H(t) * exp(a * predictor))
y : survival curve (a series of survival rates)
t : survival period
H(t) : baseline cumulative hazard
exp(a * predictor) : hazard ratio (relative to the baseline)

Slide 83

Slide 83 text

What do you mean by ‘Hazard Ratio is constant’?

Slide 84

Slide 84 text

Hazard Ratio for Windows Users: 1.1. This hazard ratio is constant throughout all the periods… (Windows: 5.5%, Mac: 5%)

Slide 85

Slide 85 text

If the hazard ratio of Windows users compared to Mac users is 1.1 at any given period, the survival curves would look like this. (Windows: 5.5%, Mac: 5%)
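Here is a rough Python sketch of what a constant hazard ratio implies for the two curves, using the Cox formula y = exp(-H(t) * exp(a * predictor)). To keep it simple we assume the Mac (baseline) hazard stays at 5% every month; that is our simplification for illustration, since the Cox model only requires the ratio between groups to be constant, not the baseline hazard itself.

```python
import math

def cox_survival(cumulative_baseline_hazard, coefficient, predictor):
    """Survival rate at one period: exp(-H(t) * exp(a * predictor))."""
    return math.exp(-cumulative_baseline_hazard * math.exp(coefficient * predictor))

months = range(0, 7)
monthly_hazard_mac = 0.05  # assumed constant for every month (illustration only)

# Cumulative baseline hazard H(t) for the Mac group, so that S(t) = 0.95 ** t.
baseline_H = [-math.log(1 - monthly_hazard_mac) * t for t in months]

a = math.log(1.1)  # coefficient chosen so that exp(a) is the hazard ratio 1.1

mac_curve = [cox_survival(H, a, predictor=0) for H in baseline_H]
windows_curve = [cox_survival(H, a, predictor=1) for H in baseline_H]

for t, mac, win in zip(months, mac_curve, windows_curve):
    print(t, round(mac, 3), round(win, 3))
```

Because the ratio never changes, the Windows curve is always the Mac curve raised to the power 1.1, so the two curves keep the same order and never cross.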

Slide 86

Slide 86 text

No content

Slide 87

Slide 87 text

Now that we know that users who ‘like posts on their timeline’ or ‘receive friend requests’ are less likely to churn, what does the actual data look like?

Slide 88

Slide 88 text

The actual data also shows that the users who received friend requests are indeed less likely to churn.

Slide 89

Slide 89 text

With confidence interval, the difference becomes significant after around the fifth month.

Slide 90

Slide 90 text

The actual data shows that the users who liked posts on their timeline are also less likely to churn.

Slide 91

Slide 91 text

Now that we know that users who ‘like posts on their timeline’ or ‘receive friend requests’ are less likely to churn, how about the users who do both activities?

Slide 92

Slide 92 text

We can create a column that combines the two columns and create the survival curve with this cohort.

Slide 93

Slide 93 text

Select the two columns while holding the Control or Command key, and select ‘Unite Multiple Columns’ from the column header menu.

Slide 94

Slide 94 text

Click the Run button.

Slide 95

Slide 95 text

A new column with the values from the two columns combined is created.

Slide 96

Slide 96 text

To make the values more intuitive, we can replace the values.

Slide 97

Slide 97 text

Assign new values for those original 4 values.

Slide 98

Slide 98 text

The values are all replaced with the new values.
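The unite-and-replace steps can be sketched in plain Python as below. The column names (`liked_post`, `received_friend_request`), the `_` separator, and the new labels are all hypothetical; the slides don’t show the actual values used.

```python
# Hypothetical rows: one per user, with two TRUE/FALSE activity columns.
rows = [
    {"liked_post": True, "received_friend_request": True},
    {"liked_post": True, "received_friend_request": False},
    {"liked_post": False, "received_friend_request": True},
    {"liked_post": False, "received_friend_request": False},
]

# The 4 united values and the more intuitive labels we replace them with.
LABELS = {
    "TRUE_TRUE": "Both",
    "TRUE_FALSE": "Liked Post Only",
    "FALSE_TRUE": "Friend Request Only",
    "FALSE_FALSE": "Neither",
}

def unite(row, columns, sep="_"):
    """Combine the values of several columns into one string."""
    return sep.join(str(row[column]).upper() for column in columns)

for row in rows:
    united = unite(row, ["liked_post", "received_friend_request"])
    row["activity"] = LABELS[united]

print([row["activity"] for row in rows])
```

The new `activity` column then plays the same role as any other cohort column when it is assigned to Color.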

Slide 99

Slide 99 text

Move the Pin button to the latest step so that we can use the data to create the survival curve.

Slide 100

Slide 100 text

Assign the newly combined column to Color.

Slide 101

Slide 101 text

Assign the newly combined column to Color.

Slide 102

Slide 102 text

We can see that the customers who performed both activities are much less likely to churn.

Slide 103

Slide 103 text

Survival Random Forest

Slide 104

Slide 104 text

Decision Tree: A decision tree is a series of forks with conditional questions. (Diagram: Data, Result)

Slide 105

Slide 105 text

Decision Tree: There is a way to predict the survival curve (a series of survival rates over time) with a Decision Tree. (Diagram: Data, Result)

Slide 106

Slide 106 text

Random Forest: It creates a set of trees and has each of them predict. The prediction results can differ between the trees; in such cases it takes the majority vote or the mean. (Diagram: Data is split into Samples, each Sample produces a Result, and the Results are averaged.)

Slide 107

Slide 107 text

Survival Random Forest: It uses the Random Forest algorithm to predict the survival curves. Each tree predicts a survival curve, and Random Forest takes the mean of the survival curves. (Diagram: Data is split into Samples, each Sample produces a Result, and the Results are averaged.)
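The averaging step can be illustrated with a short Python sketch. The three per-tree curves below are made-up numbers, and `average_curves` is our own helper; building the trees themselves is out of scope here.

```python
def average_curves(tree_curves):
    """Take the period-by-period mean of the survival curves predicted by each tree."""
    n_trees = len(tree_curves)
    return [sum(values) / n_trees for values in zip(*tree_curves)]

# Hypothetical survival curves (months 0..3) predicted by three trees.
tree_curves = [
    [1.0, 0.80, 0.60, 0.50],
    [1.0, 0.70, 0.55, 0.40],
    [1.0, 0.75, 0.65, 0.45],
]

forest_curve = average_curves(tree_curves)
print([round(rate, 2) for rate in forest_curve])
```

Since each tree can produce a differently shaped curve, the averaged curve is not tied to any single functional form.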

Slide 108

Slide 108 text

Let’s do it!

Slide 109

Slide 109 text

Duplicate the analytics with Cox Regression.

Slide 110

Slide 110 text

Switch the Analytics Type to Survival Forest.

Slide 111

Slide 111 text

A survival forest model is created.

Slide 112

Slide 112 text

We can see a result similar to what we saw with the Cox Regression model.

Slide 113

Slide 113 text

The survival curves are slightly different from the ones with Cox Regression.

Slide 114

Slide 114 text

Cox Regression vs. Random Survival Forest

Slide 115

Slide 115 text

Here, I’ve created a Cox Regression model using the ‘OS’ variable, which has the values ‘Windows’, ‘Mac’, and ‘Linux’.

Slide 116

Slide 116 text

The Cox Regression model has a constraint that the hazard ratio is constant, so the order of the curves and the way they decline stay consistent over time.

Slide 117

Slide 117 text

Here, I’ve created a Survival Forest model using the ‘OS’ variable, which has the values ‘Windows’, ‘Mac’, and ‘Linux’.

Slide 118

Slide 118 text

Survival Forest doesn’t have a constraint like the one Cox Regression has. This means the order of the cohorts can change from time to time, and the shape of the curves is much more flexible.

Slide 119

Slide 119 text

Cox Regression is constrained to a constant hazard ratio, so it is not good at capturing ‘non-linear’ patterns. (Charts: Prediction with Cox Regression vs. Actual Data, each with Female and Male curves)

Slide 120

Slide 120 text

Machine Learning models like Survival Forest tend to capture the pattern in the actual data better because they don’t have the constraint that Statistical models like Cox Regression have. (Charts: Prediction with Survival Forest vs. Actual Data, each with Female and Male curves)

Slide 121

Slide 121 text

Statistical Models vs. Machine Learning Models
• For prediction, machine learning models like Random Survival Forest tend to perform better because they can capture even non-linear patterns, while statistical learning models like Cox Regression can only capture linear patterns.
• However, machine learning models can’t be used to evaluate whether a given relationship is significant or to evaluate causal relationships, both of which are common tasks for statistical learning models like Cox Regression.

Slide 122

Slide 122 text

Criterion | Cox Regression (Stats Learning) | Random Forest (Machine Learning)
Significance Check (Statistical Test) | Yes | No
Non Linear Relationship | Can’t capture | Can capture
Correlation & Causation | Can be used for evaluating Causal Hypothesis | Only Correlation
Categorical Predictors | Internally converts it to multiple variables. Interpret the hazard ratio for each variable as a ratio against the base level. | As is

Slide 123

Slide 123 text

One more thing…

Slide 124

Slide 124 text

Prediction with Survival Models: Cox Regression / Survival Random Forest

Slide 125

Slide 125 text

You can predict the probability of survival for each individual by using the model (Survival Random Forest or Cox Regression) you have created under the Analytics view.

Slide 126

Slide 126 text

Step 1: Build a Prediction Model with Survival Algorithms

Slide 127

Slide 127 text

Step 2: Open another data frame. This is the data you want to predict for.

Slide 128

Slide 128 text

Step 3: Select ‘Predict with Model (Analytics View)’ from the Step menu

Slide 129

Slide 129 text

Step 4: Select the model you created before and set the Survival Time for Prediction

Slide 130

Slide 130 text

You’ll see new columns added with the predicted probability of survival for each customer.
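As a rough idea of what such a per-customer prediction involves in the Cox Regression case, here is a Python sketch. It uses the relationship S(t | x) = S0(t) ** exp(a1 * x1 + a2 * x2 + …), which follows from the Cox formula. The baseline survival rate, the coefficients, and the column names below are all hypothetical, not values from the seminar’s model.

```python
import math

def predict_survival(baseline_survival, coefficients, customer):
    """Predicted survival probability for one customer at the chosen survival time."""
    linear_score = sum(
        coefficient * customer.get(name, 0)
        for name, coefficient in coefficients.items()
    )
    return baseline_survival ** math.exp(linear_score)

# Hypothetical model: baseline survival at month 3 and two coefficients.
baseline_survival_month_3 = 0.60
coefficients = {
    "clear_activity_log": 0.5,        # positive: more likely to churn
    "received_friend_request": -0.4,  # negative: less likely to churn
}

customers = [
    {"id": 1, "clear_activity_log": 0, "received_friend_request": 0},
    {"id": 2, "clear_activity_log": 1, "received_friend_request": 0},
    {"id": 3, "clear_activity_log": 0, "received_friend_request": 1},
]

for customer in customers:
    customer["predicted_survival"] = predict_survival(
        baseline_survival_month_3, coefficients, customer
    )
    print(customer["id"], round(customer["predicted_survival"], 3))
```

Changing the survival time only changes the baseline survival rate that is plugged in; the coefficients stay the same.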

Slide 131

Slide 131 text

Next Seminar

Slide 132

Slide 132 text

EXPLORATORY Online Seminar #46: RFM Analysis for Sales Data. 5/26/2021 (Wed) 11AM PT

Slide 133

Slide 133 text

No content

Slide 134

Slide 134 text

Information
Email: kan@exploratory.io
Website: https://exploratory.io
Twitter: @ExploratoryData
Seminar: https://exploratory.io/online-seminar

Slide 135

Slide 135 text

Q & A

Slide 136

Slide 136 text

EXPLORATORY