Agent Based Methods. Modelling Consumer Choice

Adi Andrei, Data Scientist at Technosophics. Talk at Data Science London @ds_ldn 29/05/13

Data Science London

May 31, 2013

Transcript

  1. Data Science and ABM

    Typical Data Science: Story-Telling using Data
    - Search for patterns
    - Propose a story (model) of how the system works
    - Validate the story (macro)
    - Tell the story
    - Story is used to predict the outcome of decisions

    Agent Based Modeling: Dramatic re-enactment using Data
    - Search for patterns
    - Propose a story (model) of how the elements of the system work
    - Validate the story (micro & macro)
    - Re-enact the story (simulation)
    - Story is used to observe impact propagation and predict the outcome of decisions
  2. Case Study - FMCG company

    Model customer choice using ABMs for:
    - Recommendation system
      - given a product category, which actual SKU from that category should we recommend during the checkout process?
    - Product design
      - predicting performance in the market, i.e. will customers choose it over other products in the same category?
      - optimization: what needs to be changed for it to perform well?
  3. A Fistful of Data

    Product category   Number of unique SKUs   Number of transactions
    Fresh Juice                           79                   17,000
    Mayo                                  96                  200,000
    Body Wash                            175                   50,000
    Peanut Butter                        133                  185,000
    Salad Dressing                       450                  200,000

    One year of data, from one large supermarket.
  4. Factors that influence choice

    - Product attributes
      - sweetness
      - consistency
      - pack size
    - Pricing and Promotions
      - Price
      - Promotions
    - Social Network effects / interaction
      - location
      - gender
      - ethnicity
      - household income

    Ideal Point Distance: extracted using half of the data; needs to be updated periodically.
  5. Maximum Utility Model

    For each choice i, the utility:

    $$u_i = w_1 \cdot \text{IdealPointDistance}_i + w_2 \cdot \text{price}_i + w_3 \cdot \text{promotions}_i + w_4 \cdot \text{social\_net}_i$$

    $w_5$ = tremblingHand (probability of a random choice)

    Constraint:

    $$w_1 + w_2 + w_3 + w_4 + w_5 = 1$$

    $$\text{Predicted Choice} = \begin{cases} \arg\max_i u_i, & \text{if rational choice} \\ \text{Random}(i), & \text{if trembling hand} \end{cases}$$
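The utility and trembling-hand rule above can be sketched in a few lines of Python. This is only an illustration, not the speaker's implementation: the names `utility`, `predict_choice`, and `FACTORS`, the SKU ids, and all numbers are hypothetical, and it assumes each factor has already been scaled to a score in [0, 1] where higher means more attractive (the slides do not say how the factors are scaled).

```python
import random

# Hypothetical factor names, one per weighted term w1..w4 in the utility.
FACTORS = ("ideal_point_distance", "price", "promotions", "social_net")

def utility(features, weights):
    """Weighted sum u_i of the four factor scores for one SKU."""
    return sum(weights[name] * features[name] for name in FACTORS)

def predict_choice(skus, weights, trembling_hand, rng):
    """Return a SKU id: random with probability `trembling_hand`, else argmax of u_i."""
    sku_ids = sorted(skus)  # fixed order keeps runs reproducible
    if rng.random() < trembling_hand:
        return rng.choice(sku_ids)  # trembling hand: uniform random pick
    return max(sku_ids, key=lambda sku: utility(skus[sku], weights))

# Illustrative numbers only (assumption): w1..w4 plus trembling_hand sum to 1.
weights = {"ideal_point_distance": 0.40, "price": 0.30,
           "promotions": 0.15, "social_net": 0.05}
skus = {
    "sku_a": {"ideal_point_distance": 0.9, "price": 0.2,
              "promotions": 0.0, "social_net": 0.1},
    "sku_b": {"ideal_point_distance": 0.4, "price": 0.8,
              "promotions": 1.0, "social_net": 0.3},
}
choice = predict_choice(skus, weights, trembling_hand=0.10, rng=random.Random(42))
```

Passing the random generator in explicitly keeps a simulation run repeatable, which matters when the same configuration has to be scored more than once.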
  6. The Challenge: Parameter estimation

    - Hybrid PSO/GA algorithm
      - Particle Swarm Optimization
      - Genetic Algorithms
    - Dimensions: w_1, w_2, …, w_5
    - Evaluation = run a complete simulation for each individual configuration
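To make the search over w_1 … w_5 concrete, here is a minimal plain particle swarm sketch. It is not the hybrid PSO/GA used in the talk: the genetic-algorithm part is omitted, the inertia and attraction constants are arbitrary, and `toy_fitness` is a stand-in for the expensive step of running a complete simulation per configuration. The function names and the `normalise` projection onto weights that sum to 1 are assumptions made for illustration.

```python
import random

def normalise(w):
    """Clip to non-negative values and rescale so the weights sum to 1."""
    clipped = [max(x, 0.0) for x in w]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(w)] * len(w)
    return [x / total for x in clipped]

def pso(fitness, dims=5, particles=20, iterations=50, seed=0):
    """Maximise `fitness` over weight vectors of length `dims` that sum to 1."""
    rng = random.Random(seed)
    pos = [normalise([rng.random() for _ in range(dims)]) for _ in range(particles)]
    vel = [[0.0] * dims for _ in range(particles)]
    best_pos = [p[:] for p in pos]            # each particle's best position so far
    best_fit = [fitness(p) for p in pos]
    g = max(range(particles), key=lambda k: best_fit[k])
    global_pos, global_fit = best_pos[g][:], best_fit[g]
    for _ in range(iterations):
        for k in range(particles):
            for d in range(dims):
                # Inertia plus pulls toward the personal and global bests.
                vel[k][d] = (0.7 * vel[k][d]
                             + 1.5 * rng.random() * (best_pos[k][d] - pos[k][d])
                             + 1.5 * rng.random() * (global_pos[d] - pos[k][d]))
            pos[k] = normalise([pos[k][d] + vel[k][d] for d in range(dims)])
            fit = fitness(pos[k])             # in the talk: one full simulation run
            if fit > best_fit[k]:
                best_pos[k], best_fit[k] = pos[k][:], fit
                if fit > global_fit:
                    global_pos, global_fit = pos[k][:], fit
    return global_pos, global_fit

def toy_fitness(w):
    """Stand-in objective (assumption): closeness to an arbitrary target vector."""
    target = [0.4, 0.3, 0.15, 0.05, 0.1]
    return -sum((a - b) ** 2 for a, b in zip(w, target))

best_weights, best_score = pso(toy_fitness)
```

Because every fitness call would be a full simulation in practice, the number of particles and iterations sets the total cost directly: this sketch makes 20 + 20 × 50 evaluations.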
  7. Results and Observations

    - Mayo:
      - 96 SKUs
      - 200,000 transactions
    - Social Net has the smallest weight, but eliminating it degrades the model significantly
    - w_trembling_hand = 0

    MaxUtil 35%, Random 2%
  8. Answer: Not so good

    [Bar chart, axis 0.00 to 1.00]
    Juice            0.18
    MayoNew          0.33
    BodyWash         0.08
    PeanutButter     0.20
    SaladDressing    0.04

    Category         SKUs   Transactions
    Fresh Juice        79         17,000
    Mayo               96        200,000
    Body Wash         175         50,000
    Peanut Butter      80        185,000
    Salad Dressing    450        200,000

    Random: 2% - 4%
  9. Habitual Behavior

    - Choosing groceries is a highly repetitive, habitual action. This should be accounted for somehow.
    - Adding is_last_choice and frequency does NOT improve the existing model.

    $$u_i = w_1 \cdot \text{IdealPointDistance}_i + w_2 \cdot \text{price}_i + w_3 \cdot \text{promotions}_i + w_4 \cdot \text{social\_net}_i + w_5 \cdot \text{is\_last\_choice} + w_6 \cdot \text{frequency}$$
  10. Habitual Model

    - Reinforced Memory Value: the preference for each choice spikes when it is made, then decreases over time unless it is reinforced by the customer making the same choice again.

    $$u_i = w_1 \cdot \text{IdealPointDistance}_i + w_2 \cdot \text{price}_i + w_3 \cdot \text{promotions}_i + w_4 \cdot \text{social\_net}_i + w_5 \cdot \text{memory\_value}$$

    $$\text{Predicted Choice} = \begin{cases} \arg\max_i u_i, & \text{if rational choice} \\ \text{Random}(i), & \text{if trembling hand} \end{cases}$$
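One simple way to get the "spike, then fade unless reinforced" behaviour is an exponentially decaying memory with a fixed bump on each purchase. The slides do not give the actual update rule, so the form below, the function name `update_memory`, and the `decay` and `spike` values are all assumptions made for illustration.

```python
def update_memory(memory, chosen_sku, decay=0.9, spike=1.0):
    """Return per-SKU memory values after one purchase occasion.

    Every stored value fades by `decay`; the SKU just chosen gets `spike` added,
    so repeated choices reinforce it while unchosen SKUs drift toward zero.
    """
    updated = {sku: value * decay for sku, value in memory.items()}
    updated[chosen_sku] = updated.get(chosen_sku, 0.0) + spike
    return updated

# One customer's purchase history (hypothetical SKU ids).
memory = {}
for purchase in ["sku_a", "sku_b", "sku_a"]:
    memory = update_memory(memory, purchase)
```

After these three purchases `sku_a` holds about 1.81 and `sku_b` 0.9, so the repeated choice dominates. The update only needs the previous state and the latest purchase, which fits the on-line use described in the observations.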
  11. Results

    [Bar chart, axis 0.00 to 1.00]
    Category         Max Util   Habitual
    Juice                0.18       0.40
    MayoNew              0.33       0.49
    BodyWash             0.08       0.30
    PeanutButter         0.20       0.44
    SaladDressing        0.04       0.25
  12. Observations

    - Consistently good performance across categories
    - Performs 50%-600% better than the original model, especially for categories with lots of choices
    - Makes Ideal Points redundant, resulting in a completely on-line model (no need for re-training)
    - Reduces the influence of price and promotions

    Category         SKUs   Transactions
    Fresh Juice        79         17,000
    Mayo               96        200,000
    Body Wash         175         50,000
    Peanut Butter      80        185,000
    Salad Dressing    450        200,000

    [Bar chart, axis 0.00 to 1.00]
    Category         Max Util   Habitual
    Juice                0.18       0.40
    MayoNew              0.33       0.49
    BodyWash             0.08       0.30
    PeanutButter         0.20       0.44
    SaladDressing        0.04       0.25
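The improvement range can be checked against the per-category figures on the slide. The sketch below assumes "better" means relative gain, (Habitual - Max Util) / Max Util; the slide does not state which measure was used, and the names `scores` and `relative_gain_percent` are hypothetical.

```python
# (Max Util, Habitual) scores per category, copied from the slide.
scores = {
    "Juice": (0.18, 0.40),
    "MayoNew": (0.33, 0.49),
    "BodyWash": (0.08, 0.30),
    "PeanutButter": (0.20, 0.44),
    "SaladDressing": (0.04, 0.25),
}

def relative_gain_percent(old, new):
    """Relative improvement of `new` over `old`, as a whole-number percentage."""
    return round(100 * (new - old) / old)

gains = {name: relative_gain_percent(old, new) for name, (old, new) in scores.items()}
```

Under that assumption the gains run from roughly 48% (MayoNew) to 525% (SaladDressing), with the largest gains in the categories whose Max Util scores were lowest.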
  13. Micro-level performance

    [Chart: customers vs. % switches per customer]

    - 30% of the customers behaved according to the model more than 90% of the time
    - 30% of the customers did not behave according to the model more than 90% of the time
    - Not all customers made the same number of transactions
    - The model is not perfect, but it seems to accurately capture the behavior of a significant part of the customers
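A per-customer view like the one above can be computed by grouping predictions by customer and measuring how often the model matched the actual purchase. This is a sketch under assumptions: the record layout `(customer_id, predicted_sku, actual_sku)`, the function names, and the sample data are hypothetical, not taken from the talk.

```python
from collections import defaultdict

def hit_rate_per_customer(records):
    """Map each customer id to the fraction of transactions the model predicted."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for customer, predicted, actual in records:
        totals[customer] += 1
        hits[customer] += int(predicted == actual)
    return {customer: hits[customer] / totals[customer] for customer in totals}

def share_above(rates, threshold=0.9):
    """Fraction of customers whose hit rate is strictly above `threshold`."""
    return sum(1 for rate in rates.values() if rate > threshold) / len(rates)

# Tiny made-up sample: c1 always matches the model, c2 matches half the time.
records = (
    [("c1", "sku_a", "sku_a")] * 10
    + [("c2", "sku_a", "sku_a")] * 2
    + [("c2", "sku_a", "sku_b")] * 2
)
rates = hit_rate_per_customer(records)
well_modelled = share_above(rates)
```

Because customers differ in how many transactions they made, a rate based on a handful of purchases is much noisier than one based on hundreds, so a minimum transaction count per customer may be worth applying before reading too much into the shares.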
  14. Epilogue

    “The purpose of a storyteller is not to tell you how to think, but to give you questions to think upon.”
    - Brandon Sanderson
  15. Remaining Questions

    - We captured the behavior of 30% of the customers very well, but only partially the behavior of the others
    - What makes these 30% different from the others?
    - How is their story different?
    - Can it be figured out with the data that we have?
    - What are the triggers and attributes that make people behave in a non-habitual manner?
  16. Client Feedback

    - Real-world testing showed a significant improvement in recommendations being translated into sales for these products.
    - (no numbers: confidential, etc.)