Slide 1

Slide 1 text

EXPLORATORY Online Seminar #46 RFM Analysis for Sales Data

Slide 2

Slide 2 text

Kan Nishida CEO/co-founder Exploratory Summary In Spring 2016, launched Exploratory, Inc. to democratize Data Science. Prior to Exploratory, Kan was a director of product development at Oracle leading teams to build various Data Science products in areas including Machine Learning, BI, Data Visualization, Mobile Analytics, Big Data, etc. While at Oracle, Kan also provided training and consulting services to help organizations transform with data. @KanAugust Speaker

Slide 3

Slide 3 text

3 Data Science is not just for Engineers and Statisticians. Exploratory makes it possible for Everyone to do Data Science. The Third Wave

Slide 4

Slide 4 text

4 Questions Communication Data Access Data Wrangling Visualization Analytics (Statistics / Machine Learning) Data Science Workflow

Slide 5

Slide 5 text

5 Questions Communication (Dashboard, Note, Slides) Data Access Data Wrangling Visualization Analytics (Statistics / Machine Learning) ExploratoryɹModern & Simple UI

Slide 6

Slide 6 text

EXPLORATORY Online Seminar #46 RFM Analysis for Sales Data

Slide 7

Slide 7 text

7 RFM RFM Analysis helps you understand your customers better based on the following three metrics. • Recency - how recent they have purchased • Frequency - how often they have purchased • Monetary - how much they have spent

Slide 8

Slide 8 text

8 Who are the most loyal customers?

Slide 9

Slide 9 text

How much have they spent in the past? 9 $1000 $100 $5,000 $4000 $3000 Customers who have spent the most amount of money!

Slide 10

Slide 10 text

10 But…

Slide 11

Slide 11 text

How often have they purchased in the last 12 months? 11 15 1 60 45 30 Some customers might not have spent a lot, but come and shop very often!

Slide 12

Slide 12 text

The way you approach to each customer group would be different. 12 Frequency $6000 $4000 $2000 Sales 15 1 45 30 60

Slide 13

Slide 13 text

13 But, are they still purchasing?

Slide 14

Slide 14 text

14 Oct Aug Apr Mar Feb Sep Nov Dec Jan 2020 2021 Last Time Purchase How recent have they purchased? (When was their last purchases?) We should also take it into account on how recent they have purchased.

Slide 15

Slide 15 text

15 RFM Analysis helps you understand your customers better based on the following three metrics. • Recency - how recent they have purchased • Frequency - how often they have purchased • Monetary - how much they have spent You can use these metrics to score the customers and segment them so that you can take more optimized actions to improve your business.

Slide 16

Slide 16 text

16 We have positioned each customer based on their monetary and the frequency scores. Monetary $1,500 $1,000 $500 30 10 20 Frequency

Slide 17

Slide 17 text

17 These are the ideal customers who spend the most amount and most often. Monetary $1,500 $1,000 $500 30 10 20 Frequency

Slide 18

Slide 18 text

18 These are the least ideal customers who spend the least amount and rarely come back. Monetary $1,500 $1,000 $500 30 10 20 Frequency

Slide 19

Slide 19 text

19 These are the customers who spend the most amount but rarely come back. Maybe we can send promotional emails to bring them back. Monetary $1,500 $1,000 $500 30 10 20 Frequency

Slide 20

Slide 20 text

20 These are the customers who spend the least amount but purchases the most often. Maybe we can introduce other products that matches with their purchasing items. Monetary $1,500 $1,000 $500 10 20 Frequency 30

Slide 21

Slide 21 text

21 Monetary $1,500 $1,000 $500 10 20 Frequency 30 We can create multiple groups based on the metrics (e.g. Monetary, Frequency) and give different colors based on the number of customers.

Slide 22

Slide 22 text

We can use a Heatmap chart to visualize how many customers are in each segment. 22 Monetary Frequency

Slide 23

Slide 23 text

23 Data for RFM Analysis

Slide 24

Slide 24 text

24 Each customer is presented in each row.

Slide 25

Slide 25 text

25 There are the three metrics - Monetary, Frequency, and Recency.

Slide 26

Slide 26 text

26 Once we have the data prepared, then we can assign the columns to Heatmap chart and visualize it.

Slide 27

Slide 27 text

27 Typically, you get your sales data in a format like this. Each row represents each product order, not each customer.

Slide 28

Slide 28 text

28 Each customer has multiple rows for all the purchases he/she has made.

Slide 29

Slide 29 text

29 Data Wrangling Time! Sales Data RFM Data

Slide 30

Slide 30 text

30 Let’s do it!

Slide 31

Slide 31 text

We’ll use a sample data from the Data Catalog.

Slide 32

Slide 32 text

32 Click the Import button for ‘Sample Data for RFM Analysis’.

Slide 33

Slide 33 text

Once the data is imported it is automatically visualized under the Summary view.

Slide 34

Slide 34 text

Each row represents each product order by a given customer.

Slide 35

Slide 35 text

1. Recency - How many days have passed since the last purchase? 2. Frequency - Total number of purchases per day 3. Monetary - Total amount of purchases per day 35 Metrics Definition

Slide 36

Slide 36 text

36 Sales / Duration The duration varies among the customers. The longer the duration is the sales tend to be more. In order to compare among the customers, we can divide the sales amount by the duration for each customer, and we’ll use this as the Monetary. Monetary - Total Sales Amount per Day

Slide 37

Slide 37 text

37 Number_of_Orders / Duration The same thing for the Frequency. The longer the duration is the number of orders tend to be more. So, we can divide the number of orders by the duration for each customer, and we’ll use this as the Frequency. Frequency - Number of Orders per Day

Slide 38

Slide 38 text

38 Duration Last_Order_Date - First_Order_Date We need to calculate the duration first before calculating the Monetary and the Frequency.

Slide 39

Slide 39 text

1. First Date of Purchase 2. Last Date of Purchase 3. Total Sales 4. Number of Purchases 39 We need to calculate the following items for each customer.

Slide 40

Slide 40 text

40 Summarize

Slide 41

Slide 41 text

41 Select ‘Summarize’ from the column header menu.

Slide 42

Slide 42 text

42 Select the ‘Market’ and the ‘Customer ID’ columns for Group By so that we can summarize the data by the customer in each market.

Slide 43

Slide 43 text

43 Select the ‘Order Date’ with ‘Min’ to return the first purchasing date for each customer.

Slide 44

Slide 44 text

44 Select the ‘Order Date’ with ‘Max’ to return the last purchasing date for each customer.

Slide 45

Slide 45 text

45 Select the ‘Sales’ with ‘SUM’ to calculate the total sales amount for each customer.

Slide 46

Slide 46 text

46 There is a bit of challenge for calculating the number of purchases…

Slide 47

Slide 47 text

47 Each row is not necessarily an each order. If a customer ordered 4 products then we have 4 rows each of which is for a particular product order. It’s kind of like a line item in the invoice.

Slide 48

Slide 48 text

48 Instead of counting the number of rows, we can calculate the unique number of the Order Date values to have the number of orders.

Slide 49

Slide 49 text

49 Select the ‘Order Date’ with ‘Unique’ to calculate the unique number of order dates for each customer.

Slide 50

Slide 50 text

50 Click the Run button to execute.

Slide 51

Slide 51 text

51 Let’s fix the column names.

Slide 52

Slide 52 text

52 Select ‘Rename’ from the column header menu.

Slide 53

Slide 53 text

No content

Slide 54

Slide 54 text

54 Filter

Slide 55

Slide 55 text

55 There are customers who have purchased only once. We want to remove these customers for this analysis.

Slide 56

Slide 56 text

56 Select ‘Filter’ and ‘Greater than’ from the column header menu.

Slide 57

Slide 57 text

57

Slide 58

Slide 58 text

58

Slide 59

Slide 59 text

We can see that the minimal number of orders is now 2.

Slide 60

Slide 60 text

60 Create Calculations

Slide 61

Slide 61 text

61 Now, it’s time to calculate Monetary, Frequency, and Recency. • Monetary - Sales / Duration • Frequency - Number of Orders / Duration • Recency - Last Date of All - Last Date of a given Customer

Slide 62

Slide 62 text

62 Duration Last_Order_Date - First_Order_Date We need to calculate the duration first before calculating the Monetary and the Frequency.

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

65 A new ‘Duration’ column is created but the data type is ‘difftime’. We want to convert it to ‘numeric’ so that we can use it for further calculations.

Slide 66

Slide 66 text

66

Slide 67

Slide 67 text

67

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

Let’s calculate the Monetary.

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

We’ve got Monetary!

Slide 72

Slide 72 text

Now let’s calculate the Frequency.

Slide 73

Slide 73 text

No content

Slide 74

Slide 74 text

We’ve got Frequency!

Slide 75

Slide 75 text

75 Recency max(Last_Order_Date) - Last_Order_Date We can calculate the recency as the difference between the last date of order for each customer and the last date of the data (everybody).

Slide 76

Slide 76 text

76 today() - Last_Order_Date If you have the data up to today you can use the ‘today’ or ‘now’ function. now() - Last_Order_Date Data Type: Date Data Type: POSIXct

Slide 77

Slide 77 text

We’ll create the Recency.

Slide 78

Slide 78 text

No content

Slide 79

Slide 79 text

No content

Slide 80

Slide 80 text

We’ve got Recency!

Slide 81

Slide 81 text

We’ve got all three of RFM metrics!

Slide 82

Slide 82 text

82 We can create 5 buckets for each of the metrics - R (Recency) F (Frequency) M (Monetary) - and visualize with Heatmap chart.

Slide 83

Slide 83 text

How can we bucket (or bin) the numerical values?

Slide 84

Slide 84 text

84 Equal Width method 100 200 300 400 500 600 700 800 900 1000 Recency

Slide 85

Slide 85 text

85 Each group will have the same number of rows (customers). 100 200 300 400 500 600 700 800 900 1000 Recency Equal Frequency method

Slide 86

Slide 86 text

We’ll use the ‘Equal Frequency’ method this time.

Slide 87

Slide 87 text

Assign the Recency column to X-Axis.

Slide 88

Slide 88 text

Assign the Monetary column to X-Axis.

Slide 89

Slide 89 text

Change the ‘Category’ creation method to ‘Equal Frequency’.

Slide 90

Slide 90 text

No content

Slide 91

Slide 91 text

91 Very New, New, Neutral, Old, Very Old Group names for Recency

Slide 92

Slide 92 text

We can update the X-Axis label names.

Slide 93

Slide 93 text

93 Group names for Monetary Very Low, Low, Neutral, High, Very High

Slide 94

Slide 94 text

We can update the Y-Axis label names.

Slide 95

Slide 95 text

95 By the way, there is something weird about the X-Axis…

Slide 96

Slide 96 text

96 The recency values are time sensitive information. And we, human being, are so used to think the time goes from the left to the right.

Slide 97

Slide 97 text

97 Let’s change the order so that we can intuitively understand it.

Slide 98

Slide 98 text

No content

Slide 99

Slide 99 text

Now, the Recency values are more in the natural order.

Slide 100

Slide 100 text

Click the cell with the most customers. These are who have purchased the most but haven’t come back for a long time.

Slide 101

Slide 101 text

Click on the Show Detail button.

Slide 102

Slide 102 text

to see a list of the customers.

Slide 103

Slide 103 text

Appendix

Slide 104

Slide 104 text

104 If we bucket with the ‘Equal Frequency’ method then shouldn’t the each box of the heatmap have the same number of customers?

Slide 105

Slide 105 text

105

Slide 106

Slide 106 text

106

Slide 107

Slide 107 text

107

Slide 108

Slide 108 text

108

Slide 109

Slide 109 text

109

Slide 110

Slide 110 text

110

Slide 111

Slide 111 text

One more thing…

Slide 112

Slide 112 text

We have a ‘Market’ column.

Slide 113

Slide 113 text

Can we show this heatmap for each market?

Slide 114

Slide 114 text

No content

Slide 115

Slide 115 text

We can change the number of the columns to layout the charts nicely.

Slide 116

Slide 116 text

No content

Slide 117

Slide 117 text

No content

Slide 118

Slide 118 text

No content

Slide 119

Slide 119 text

Summary

Slide 120

Slide 120 text

120 RFM Analysis helps you understand your customers better based on the following three metrics. • Recency - how recent they have purchased • Frequency - how often they have purchased • Monetary - how much they have spent You can use these metrics to score the customers and segment them so that you can take more optimized actions to improve your business.

Slide 121

Slide 121 text

121 Data Wrangling! Sales Data RFM Data

Slide 122

Slide 122 text

122 Once we have the data prepared, then we can assign the columns to Heatmap chart and visualize it.

Slide 123

Slide 123 text

That’s it for today!

Slide 124

Slide 124 text

Next Seminar

Slide 125

Slide 125 text

EXPLORATORY Online Seminar #47 6/2/2021 (Wed) 11AM PT Survey Data Analysis

Slide 126

Slide 126 text

No content

Slide 127

Slide 127 text

Information Email [email protected] Website https://exploratory.io Twitter @ExploratoryData Seminar https://exploratory.io/online-seminar

Slide 128

Slide 128 text

Q & A 128

Slide 129

Slide 129 text

EXPLORATORY 129