Ivan Kozyev, Crazy Panda

wnconf
August 01, 2018

Machine Learning Algorithms in Marketing Analytics

(White Nights Conference St.Petersburg 2018)
The official conference website — http://wnconf.com

Transcript

  1. Machine Learning is…
    Wikipedia: … a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to “learn” (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.
    My interpretation: … a number of algorithms that can be used to automate and optimize mundane tasks, make predictions, and find deviations in big data. These algorithms can “learn” and make their own “decisions”.
  2. Problems best solved by Machine Learning
    Classification:
    • Payer or non-payer
    • Type of player, Bartle-wise
    • Share of fraud in traffic
    Prediction:
    • LTV prediction
    • About-to-churn players
    • Timing for special offers
  3. Main issues with Machine Learning
    • Preparing datasets
    • Getting enough data for building models
    • Going for global challenges right away
  4. Practical cases from Crazy Panda
    Case 1: Proxy-event for payers
    Case 2: Payer classification
    Case 3: LTV prediction
  5. Case 1: Proxy-event for payers
    Goal: Detect users likely to be payers as early as possible
    Can be useful for:
    • Event optimization for advertising networks:
        • Reducing the time it takes for an event to appear
        • Increasing the number of events
        • Maintaining high value for these events
    • Estimating traffic value early
    • Possible fraud detection
  6. Case 1: Proxy-event for payers
    Hints:
    1. Precision vs Recall
        • Precision: What proportion of positive identifications was actually correct?
        • Recall: What proportion of actual positives was identified correctly?
    2. Be prepared to manage missing data
  7. Case 1: Proxy-event for payers
    Results and achievements:
    • Increased number of events on Day 1 after registration from 1% to 11%
    • Substantial reduction of the optimization window for advertising campaigns (from 7 days to 1 day)
    • Lower budget needed for channel value estimation
  8. Case 2: Payer classification
    Goal: Identify what type of payer a user is as early as possible
    Can be useful for:
    • LTV prediction
    • Estimating traffic value early on
    • Possible fraud detection
    • Increasing number of events for early optimization
  9. Case 2: Payer classification
    [Charts: Payer distribution by sum of payments (normalized)]
  10. Case 2: Payer classification
    Hints:
    1. Two models are better than just one
    2. More data != better results
    3. Managing errors: some are dangerous, others are not
  11. Case 2: Payer classification
    Results and achievements:
    • Designed an optimization event for high-value users
    • Built a base for an LTV model
    • Can easily be used to detect fraud
  12. Case 3: LTV prediction
    Goal: Estimate LTV as early as possible and increase its accuracy with each day of life for the cohort
    Can be useful for:
    • I have discovered so many uses for this that this margin is too narrow to contain them
  13. Case 3: LTV prediction
    Hints:
    1. Payers make payments, and we already know how to classify them early.
    2. If we know how many payers of each type currently make up the cohort, we can use the historical distribution, or even build another machine learning model, to predict how many more payers will convert and of what type.
    3. If we don’t have enough data to fuel an LTV prediction for 100 days, we can predict for 30 days and then use a classic mathematical model to extrapolate the effect (for example).
  14. Conclusions
    If you want to start using Machine Learning:
    • Start with easy objectives
    • Measure your errors; some can, in fact, be positive
    • Test your models not only on typical data, but also on abnormal data
    • Know your limits (mostly having enough data)
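
The "Precision vs Recall" hint from Case 1 and the advice to measure your errors can be made concrete with a small sketch. The code below is not from the talk: the function name `precision_recall` and the toy labels are hypothetical, chosen only to show how the two questions on the slide translate into counts of true positives, false positives, and false negatives.

```python
def precision_recall(actual, predicted):
    """Return (precision, recall) for two equal-length lists of booleans.

    actual[i]    -- True if user i really became a payer
    predicted[i] -- True if the model flagged user i as a likely payer
    """
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)

    # Precision: what proportion of positive identifications was correct?
    flagged = true_pos + false_pos
    precision = true_pos / flagged if flagged else 0.0

    # Recall: what proportion of actual positives was identified?
    positives = true_pos + false_neg
    recall = true_pos / positives if positives else 0.0
    return precision, recall


# Hypothetical toy labels, invented for illustration only.
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, True, False, False, False]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Which of the two matters more depends on how the proxy event is used: a false positive sends a misleading signal to the advertising network, while a false negative only loses one event, so the two error types need not be weighted equally.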
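
The third LTV hint in Case 3 (predict 30 days, then extrapolate with a classic mathematical model) can be sketched as follows. The talk does not say which model was used; the logarithmic curve `a + b * ln(day)` here is an assumption picked for illustration, and the function names and the synthetic input data are hypothetical.

```python
import math


def fit_log_curve(days, ltv):
    """Least-squares fit of ltv ~ a + b * ln(day); returns (a, b).

    The logarithmic shape is an assumed stand-in for the unnamed
    "classic mathematical model" mentioned in the talk.
    """
    xs = [math.log(d) for d in days]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ltv) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ltv))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def extrapolate(a, b, day):
    """Evaluate the fitted curve at a later day of cohort life."""
    return a + b * math.log(day)


# Synthetic 30-day cumulative LTV generated from a known curve, so the
# fit can be checked. Real input would be observed or model-predicted.
days = list(range(1, 31))
observed = [0.5 + 0.3 * math.log(d) for d in days]

a, b = fit_log_curve(days, observed)
ltv_100 = extrapolate(a, b, 100)
print(f"a={a:.3f} b={b:.3f} LTV(100)={ltv_100:.3f}")
```

A different curve family (for example a power law) may fit a given game better; whichever is chosen, it is worth checking the extrapolation against older cohorts that already have 100 days of history.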