Predict Customer Personality to Boost Marketing Campaign by Using Machine Learning

Agustina Sri Wardani

February 15, 2023

Transcript

  1. Predict Customer Personality to Boost Marketing Campaign by Using Machine Learning

    Created by: Agustina Sri Wardani https://www.linkedin.com/in/agustinaswd/ Hi, nice to meet you. I’m Tina, a newbie in the data world. The world is changing so fast. I believe data is the gateway for me to face and win against today’s disruption. This is my 3rd mini project in Bootcamp Data Science Rakamin Academy. Feel free to review or give feedback on my project. You can also check out my other project here if you are interested.
  2. Overview

    “A company can develop rapidly when it understands its customers' personality and behavior, so it can provide better services and benefits to customers who have the potential to become loyal. By processing historical marketing campaign data, the company can improve campaign performance and target the right customers so that they transact on its platform. Based on this data, our focus is to build a cluster prediction model that makes it easier for the company to make decisions.”
  3. Data Info

    • 2240 rows and 30 columns in total.
    • The Income column has 24 rows with missing values.
    • The Dt_Customers column needs its data type changed to datetime.
    You can check here for the source code
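The checks on this slide can be sketched with pandas as follows. This is a minimal illustration, not the author's notebook: the column names come from the slide, while the tiny DataFrame, its values, and the date format are assumptions standing in for the real data.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the campaign data. Column names follow the slide;
# the values and the date format are made up for illustration.
df = pd.DataFrame({
    "Income": [58138000.0, np.nan, 71613000.0, 26646000.0],
    "Dt_Customers": ["2012-09-04", "2014-03-08", "2013-08-21", "2014-02-10"],
})

# Number of rows and columns
n_rows, n_cols = df.shape

# Count rows where Income is missing
missing_income = int(df["Income"].isna().sum())

# Convert the date column from string to datetime
df["Dt_Customers"] = pd.to_datetime(df["Dt_Customers"], format="%Y-%m-%d")

print(n_rows, n_cols, missing_income)
print(df.dtypes)
```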
  4. Exploratory Data Analysis: Multivariate Analysis

    • TotalKids has a high correlation with Teenhome and IsParents
    • NumCatalogPurchases has a high correlation with MntMeatProducts
    • TotalSpending has a high correlation with MntCoke, MntMeatProducts, and NumCatalogPurchases
    • NumWebPurchases has a high correlation with TotalTransaction
    • AcceptedCmp5 has a high correlation with TotalAccCmpg
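One way to surface highly correlated pairs like those listed above is to scan the correlation matrix for entries above a cut-off. The sketch below assumes a toy DataFrame, a hypothetical definition of TotalKids (Kidhome + Teenhome), and an arbitrary threshold of 0.7; none of these are taken from the deck.

```python
import pandas as pd

# Toy data; values are illustrative only.
df = pd.DataFrame({
    "Kidhome": [0, 0, 0, 1, 0, 0],
    "Teenhome": [0, 1, 1, 0, 2, 0],
    "Income": [80, 40, 55, 30, 35, 90],
})
# Assumed definition of the engineered feature
df["TotalKids"] = df["Kidhome"] + df["Teenhome"]

corr = df.corr()   # Pearson correlation between all numeric columns
THRESHOLD = 0.7    # arbitrary cut-off for "high" correlation

# Walk the upper triangle so each pair is reported once
high_pairs = [
    (a, b, round(float(corr.loc[a, b]), 2))
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if abs(corr.loc[a, b]) >= THRESHOLD
]

for a, b, r in high_pairs:
    print(f"{a} ~ {b}: r = {r}")
```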
  5. Exploratory Data Analysis

    The graphs on this slide show that:
    • Income and Total Spending are positively correlated.
    • Income and Age are not correlated.
    • Age and Total Spending are not correlated.
  6. Exploratory Data Analysis

    The graphs on this slide show that:
    • Age and Conversion Rate are not correlated.
    • Conversion Rate and Total Spending are positively correlated.
    • Conversion Rate and Income are positively correlated.
  7. Insight & Recommendation

    Insight
    • The greater the income, the greater the customer's conversion rate.
    • The greater the income, the more the customer spends in total on our platform.
    • Total spending and conversion rate are also linearly related: the greater the total spending, the greater the conversion rate.
    • No specific age group stands out as having a higher conversion rate.
    Recommendation
    Because the conversion rate rises with income, we can target customers with higher income (60.000.000) and give them more specific campaigns.
  8. Data Cleaning

    • Missing Value: the Income column has 24 missing values; we will drop those rows.
    • Duplicated Value: there are no duplicated rows.
    • Drop Missing Value: 2216 rows remain.
    • Drop Columns: drop the columns that are not needed.
    You can check here for the source code
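The cleaning steps on this slide map onto a few pandas calls. The following is a rough sketch under assumptions: the DataFrame is a toy stand-in, and `UnusedColumn` is a hypothetical name for a column that is not needed (the deck does not list which columns were dropped).

```python
import numpy as np
import pandas as pd

# Toy data; "UnusedColumn" is a hypothetical column that is not needed.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Income": [58138000.0, np.nan, 71613000.0, 26646000.0],
    "UnusedColumn": [3, 3, 3, 3],
})

n_missing = int(df["Income"].isna().sum())   # rows with missing Income
n_duplicates = int(df.duplicated().sum())    # fully duplicated rows

clean = (
    df.dropna(subset=["Income"])             # drop rows with missing Income
      .drop(columns=["UnusedColumn"])        # drop columns that are not needed
      .reset_index(drop=True)
)

print(n_missing, n_duplicates, len(clean))
```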
  9. Data Preprocessing

    Feature Encoding
    • Label encoding: Education
    • One-hot encoding: MaritalStatus, AgeGroup
    Feature Standardisation
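A minimal sketch of this preprocessing, assuming pandas and scikit-learn. The encoded column names come from the slide; the category values, the ordinal order used for Education, and the choice of Income as the column to standardise are assumptions made for illustration.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy data; category values are illustrative only.
df = pd.DataFrame({
    "Education": ["Basic", "Graduation", "Master", "Graduation"],
    "MaritalStatus": ["Married", "Single", "Married", "Divorced"],
    "AgeGroup": ["Middle Adults", "Senior Adults", "Middle Adults", "Young Adults"],
    "Income": [30.0, 60.0, 90.0, 45.0],
})

# Label encoding: map each education level to an integer (assumed order)
education_order = {"Basic": 0, "Graduation": 1, "Master": 2}
df["Education"] = df["Education"].map(education_order)

# One-hot encoding: one 0/1 column per category
df = pd.get_dummies(df, columns=["MaritalStatus", "AgeGroup"], dtype=int)

# Standardisation: rescale numeric features to mean 0 and unit variance
numeric_cols = ["Income"]
df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])

print(df.head())
```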
  10. Data Modeling

    Use RFM analysis for feature selection to find the best clustering:
    • Recency (Recency): the last time the customer made a transaction.
    • Frequency (Conversion Rate): the number of transactions per visit made by the customer.
    • Monetary (Income): the customer's income.
    Elbow Method: we choose 3 clusters for this model.
    You can check here for the source code
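The elbow method mentioned here can be sketched as fitting k-means for several values of k and comparing the inertia (within-cluster sum of squares). The example below uses scikit-learn on synthetic data with three features standing in for recency, conversion rate, and income; the data, the range of k, and the random seeds are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for (recency, conversion rate, income): three blobs.
rng = np.random.default_rng(0)
centers = np.array([[10, 0.2, 30], [50, 0.5, 60], [90, 0.8, 90]], dtype=float)
X = np.vstack([
    c + rng.normal(scale=[3, 0.03, 3], size=(50, 3)) for c in centers
])
X_scaled = StandardScaler().fit_transform(X)

# Fit k-means for k = 1..7 and record the inertia for each k
inertias = {}
for k in range(1, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_scaled)
    inertias[k] = model.inertia_

# The "elbow" is the k after which inertia stops dropping sharply
for k, value in inertias.items():
    print(k, round(value, 2))
```

Plotting `inertias` against k (for example with matplotlib) gives the usual elbow chart; the bend is read off by eye, so the choice of k remains a judgment call.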
  11. Data Modeling: Silhouette Score

    Clustering with 4 or 6 clusters gives a slightly higher silhouette score than clustering with 3 clusters. However, 3 clusters produce groups that give the business/marketing team clearer insights for improving the company's business, so we still choose 3 clusters.
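The silhouette comparison can be sketched with scikit-learn as below. The data is synthetic (three well-separated blobs from `make_blobs`), so the scores will not match the deck's; as the slide notes, the highest score is only one input and the final k can still be chosen for interpretability.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data: three well-separated blobs in three dimensions.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0, 0], [6, 6, 0], [0, 6, 6]],
    cluster_std=0.6,
    random_state=0,
)

# Silhouette score for each candidate k (defined only for k >= 2)
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(k, round(s, 3))
print("highest score at k =", best_k)
```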
  12. Cluster Summary

    Clusters: 2: High Spender, 1: Low Spender, 0: Risk of Churn
    Cluster 2 (High Spender)
    • Has 873 customers
    • Senior Adults (>55 years old) are the dominant age group (226 customers)
    • Highest income (IDR 81 million per year) and total spending (IDR 1.3 million per year)
    • Second-highest recency
    • Highest conversion rate
    • Lowest number of website visits
    • Lowest number of deal/promo purchases
    • These are the Customer Champions
  13. Cluster Summary

    Cluster 1 (Low Spender)
    • Has 877 customers
    • Middle Adults (36-55 years old) are the dominant age group (497 customers)
    • Lowest income (IDR 44 million per year) and total spending (IDR 400K per year)
    • Lowest recency
    • Lowest conversion rate
    • Highest number of website visits
    • Second-highest number of deal/promo purchases
    • These are the Customers Needing Attention
    Cluster 0 (Risk of Churn)
    • Has 466 customers
    • Middle Adults (>55 years old) are the dominant age group (454 customers)
    • Second-highest income (IDR 45 million per year) and total spending (IDR 427K per year)
    • Highest recency
    • Second-highest conversion rate
    • Second-highest number of website visits
    • Highest number of deal/promo purchases
    • These are the Customers At Risk
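Per-cluster summaries like the ones above are typically produced with a group-by over the cluster labels. The sketch below is an assumption about how such a table could be built: the column names are hypothetical and the numbers are toy values, not the deck's results.

```python
import pandas as pd

# Toy labelled data; values are illustrative, not the deck's results.
df = pd.DataFrame({
    "Cluster": [2, 2, 1, 1, 1, 0],
    "Income": [80.0, 82.0, 43.0, 44.0, 45.0, 45.0],
    "TotalSpending": [1300.0, 1250.0, 380.0, 400.0, 420.0, 427.0],
    "AgeGroup": ["Senior Adults", "Senior Adults", "Middle Adults",
                 "Middle Adults", "Young Adults", "Middle Adults"],
})

# Size and mean income/spending per cluster
summary = df.groupby("Cluster").agg(
    customers=("Income", "size"),
    mean_income=("Income", "mean"),
    mean_spending=("TotalSpending", "mean"),
)

# Most frequent age group per cluster
dominant_age = df.groupby("Cluster")["AgeGroup"].agg(lambda s: s.mode().iloc[0])

print(summary)
print(dominant_age)
```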
  14. Recommendation

    • We must retain our High Spender customers. We can offer a special promo to customers in this cluster so they don't churn.
    • The Low Spender and Risk of Churn clusters often visit our website but rarely buy. This may be because the promos offered do not suit them. We need more analysis to find out why this happens.