Slide 1

Slide 1 text

Agent Based Modeling in Data Science
Modeling Consumer Choice
Adi Andrei, May 2013

Slide 2

Slide 2 text

Data Science and ABM

Typical Data Science
- Story-telling using data
- Search for patterns
- Propose a story (model) of how the system works
- Validate the story (macro)
- Tell the story
- Story is used to predict outcome of decisions

Agent Based Modeling
- Dramatic re-enactment using data
- Search for patterns
- Propose a story (model) of how the elements of the system work
- Validate the story (micro & macro)
- Re-enact the story (simulation)
- Story is used to observe impact propagation and predict outcome of decisions

Slide 3

Slide 3 text

Introduction

Slide 4

Slide 4 text

Case Study - FMCG company

Model customer choice using ABMs for:
- Recommendation system
  - given a product category, which actual SKU from that category should we recommend during the checkout process?
- Product design
  - predicting performance in the market, i.e. will customers choose it over other products in the same category?
  - optimization: what needs to be changed in order to perform well?

Slide 5

Slide 5 text

A Fistful of Data

Product category    Unique SKUs    Transactions
Fresh Juice                  79          17,000
Mayo                         96         200,000
Body Wash                   175          50,000
Peanut Butter               133         185,000
Salad Dressing              450         200,000

One year of data, from one large supermarket

Slide 6

Slide 6 text

Act 1 The Econometrics Story

Slide 7

Slide 7 text

There’s something about… Mayo
- 96 individual SKUs
- Which one to recommend?

Slide 8

Slide 8 text

Factors that influence choice
- Product attributes
  - sweetness
  - consistency
  - pack size
- Pricing and Promotions
  - Price
  - Promotions
- Social Network effects / interaction
  - location
  - gender
  - ethnicity
  - household income

Ideal Point Distance
- extracted using half of the data
- needs to be updated periodically

Slide 9

Slide 9 text

Maximum Utility Model

For each choice i, the utility is:

\[ u_i = w_1 \cdot \mathrm{IdealPointDistance}_i + w_2 \cdot \mathrm{price}_i + w_3 \cdot \mathrm{promotions}_i + w_4 \cdot \mathrm{social\_net}_i \]

\( w_5 \) = tremblingHand (probability of random choice)

Constraint: \( w_1 + w_2 + w_3 + w_4 + w_5 = 1 \)

\[ \text{Predicted Choice} = \begin{cases} \arg\max_i \, u_i & \text{if rational choice} \\ \mathrm{Random}(i) & \text{if trembling hand} \end{cases} \]
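As a minimal sketch of the choice rule on this slide, the function below picks the utility-maximizing SKU unless the trembling hand fires, in which case it picks at random. The function name, the feature layout, and the assumption that every feature is pre-scaled so that a higher value means a more attractive option are illustrative and do not come from the talk.

```python
import random

def predict_choice(features, weights, trembling_hand, rng=random):
    """Pick a SKU by maximum utility, or at random with probability trembling_hand.

    features: dict mapping SKU -> (ideal_point_term, price_term,
              promotion_term, social_net_term). Hypothetical layout; each
              term is assumed to be scaled so that higher is better.
    weights:  (w1, w2, w3, w4); together with trembling_hand (w5) they
              are expected to sum to 1.
    """
    if rng.random() < trembling_hand:
        # Trembling hand: ignore utilities and choose uniformly at random.
        return rng.choice(sorted(features))

    def utility(sku):
        return sum(w * x for w, x in zip(weights, features[sku]))

    # Rational choice: the SKU with the highest weighted utility.
    return max(features, key=utility)


skus = {
    "A": (0.9, 0.5, 0.0, 0.2),
    "B": (0.2, 0.8, 0.0, 0.1),
}
print(predict_choice(skus, (0.4, 0.3, 0.2, 0.1), trembling_hand=0.0))  # prints "A"
```

With `trembling_hand=0.0` the rule is fully deterministic, which matches the fitted value reported for Mayo later in the deck.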

Slide 10

Slide 10 text

The Challenge: Parameter estimation
- Hybrid PSO/GA algorithm
  - Particle Swarm Optimization
  - Genetic Algorithms
- Dimensions: w_1, w_2, ..., w_5
- Evaluation = run a complete simulation for each individual configuration
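The talk does not describe how PSO and GA were combined, so the following is only one possible sketch: a plain particle swarm update with a GA-style random mutation, searching over five weights that are renormalized to sum to 1. The names `pso_ga_search` and `normalize`, the coefficients 0.7 and 1.5, and the mutation rate are all assumptions. The `fitness` callback stands in for "run a complete simulation and score it".

```python
import random

def normalize(w):
    """Clamp weights to a tiny positive value and rescale them to sum to 1."""
    w = [max(x, 1e-9) for x in w]
    total = sum(w)
    return [x / total for x in w]

def pso_ga_search(fitness, dims=5, particles=20, iterations=50,
                  mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    pos = [normalize([rng.random() for _ in range(dims)]) for _ in range(particles)]
    vel = [[0.0] * dims for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p) for p in pos]
    g = max(range(particles), key=lambda k: best_fit[k])
    g_pos, g_fit = best_pos[g][:], best_fit[g]

    for _ in range(iterations):
        for k in range(particles):
            for d in range(dims):
                # PSO step: inertia plus pulls toward personal and global bests.
                vel[k][d] = (0.7 * vel[k][d]
                             + 1.5 * rng.random() * (best_pos[k][d] - pos[k][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
                # GA-style mutation: occasionally resample one coordinate.
                if rng.random() < mutation_rate:
                    pos[k][d] = rng.random()
            pos[k] = normalize(pos[k])
            fit = fitness(pos[k])  # in the talk: one complete simulation run
            if fit > best_fit[k]:
                best_pos[k], best_fit[k] = pos[k][:], fit
                if fit > g_fit:
                    g_pos, g_fit = pos[k][:], fit
    return g_pos, g_fit


# Toy stand-in for the simulation: closeness to a made-up target weight vector.
target = [0.4, 0.3, 0.2, 0.1, 0.0]
toy_fitness = lambda w: -sum((a - b) ** 2 for a, b in zip(w, target))
weights, score = pso_ga_search(toy_fitness)
```

Because each fitness evaluation is a full simulation in the real setting, the particle count and iteration budget would dominate the running time.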

Slide 11

Slide 11 text

Results and Observations
- Mayo:
  - 96 SKUs
  - 200,000 transactions
- Social Net has the smallest weight, but eliminating it degrades the model significantly
- w_trembling_hand = 0

Chart values: MaxUtil 35%, Random 2%

Slide 12

Slide 12 text

And they lived happily ever after…
… or did they?

Slide 13

Slide 13 text

Act 2 The Neuroscience Story

Slide 14

Slide 14 text

The Big Question How does the model perform when applied to other product categories?

Slide 15

Slide 15 text

Answer: Not so good

Chart values (Max Util model by category, axis from 0.00 to 1.00):

Category         Max Util
Juice                0.18
MayoNew              0.33
BodyWash             0.08
PeanutButter         0.20
SaladDressing        0.04

Category          SKUs    Transactions
Fresh Juice         79          17,000
Mayo                96         200,000
Body Wash          175          50,000
Peanut Butter       80         185,000
Salad Dressing     450         200,000

Random: 2% - 4%

Slide 16

Slide 16 text

Help! We need a better story!

Slide 17

Slide 17 text

Habitual Behavior
- Choosing groceries is a highly repetitive, habitual action. The model should account for this somehow.
- Adding is_last_choice and frequency terms does NOT improve the existing model.

\[ u_i = w_1 \cdot \mathrm{IdealPointDistance}_i + w_2 \cdot \mathrm{price}_i + w_3 \cdot \mathrm{promotions}_i + w_4 \cdot \mathrm{social\_net}_i + w_5 \cdot \mathrm{is\_last\_choice} + w_6 \cdot \mathrm{frequency} \]

Slide 19

Slide 19 text

Habitual Model
- Reinforced Memory Value
  - the preference for each choice spikes when it is made, then decreases over time unless it is reinforced by the customer making the same choice again

\[ u_i = w_1 \cdot \mathrm{IdealPointDistance}_i + w_2 \cdot \mathrm{price}_i + w_3 \cdot \mathrm{promotions}_i + w_4 \cdot \mathrm{social\_net}_i + w_5 \cdot \mathrm{memory\_value} \]

\[ \text{Predicted Choice} = \begin{cases} \arg\max_i \, u_i & \text{if rational choice} \\ \mathrm{Random}(i) & \text{if trembling hand} \end{cases} \]
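The slide gives no formula for the memory value, so the sketch below is one plausible reading: each purchase adds a spike (capped at 1), and between purchases the value decays exponentially. The class name, the exponential form, the cap, and the `decay_rate` parameter are all assumptions made for illustration.

```python
import math

class MemoryValue:
    """Per-SKU preference that spikes on purchase and decays afterwards."""

    def __init__(self, decay_rate=0.1):
        self.decay_rate = decay_rate  # assumed exponential decay constant
        self.value = {}      # SKU -> memory value right after its last purchase
        self.last_time = {}  # SKU -> time of its last purchase

    def get(self, sku, now):
        """Current memory value of a SKU; 0.0 if it was never bought."""
        if sku not in self.value:
            return 0.0
        elapsed = now - self.last_time[sku]
        return self.value[sku] * math.exp(-self.decay_rate * elapsed)

    def reinforce(self, sku, now, spike=1.0):
        """Record a purchase: add a spike to the decayed value, capped at 1."""
        self.value[sku] = min(1.0, self.get(sku, now) + spike)
        self.last_time[sku] = now


memory = MemoryValue(decay_rate=0.5)
memory.reinforce("A", now=0)
fresh = memory.get("A", now=0)   # value right after the purchase
faded = memory.get("A", now=2)   # value two time units later, no repurchase
```

Because the value depends only on the customer's own purchase history, it can be updated transaction by transaction without any separate training pass.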

Slide 20

Slide 20 text

Results

Chart values (axis from 0.00 to 1.00):

Category         Max Util    Habitual
Juice                0.18        0.40
MayoNew              0.33        0.49
BodyWash             0.08        0.30
PeanutButter         0.20        0.44
SaladDressing        0.04        0.25

Slide 21

Slide 21 text

Observations
- Consistently good performance across categories
- Performs 50%-600% better than the original model, especially for categories with lots of choices
- Makes Ideal Points redundant, resulting in a completely on-line model (no need for re-training)
- Reduces the influence of price and promotions
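The relative gains can be checked directly from the Max Util and Habitual values reported on the Results slide. The short calculation below uses only those figures; the variable names are illustrative.

```python
# (Max Util, Habitual) values per category, as reported on the Results slide.
results = {
    "Juice": (0.18, 0.40),
    "MayoNew": (0.33, 0.49),
    "BodyWash": (0.08, 0.30),
    "PeanutButter": (0.20, 0.44),
    "SaladDressing": (0.04, 0.25),
}

# Relative improvement of the Habitual model over Max Util, in percent.
improvement = {
    category: (habitual - max_util) / max_util * 100
    for category, (max_util, habitual) in results.items()
}

for category, pct in improvement.items():
    print(f"{category}: {pct:.0f}%")
```

The gains run from about 48% (MayoNew) to about 525% (SaladDressing), and the two categories with the most SKUs, BodyWash and SaladDressing, gain the most.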

Slide 22

Slide 22 text

Micro-level performance

[Chart: customers vs. % switches per customer]

- 30% of the customers behaved according to the model more than 90% of the time
- 30% of the customers did not behave according to the model more than 90% of the time
- Not all customers made the same number of transactions
- The model is not perfect, but it seems to accurately capture the behavior of a significant share of the customers
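A micro-level check of this kind can be sketched as a per-customer match rate: the fraction of each customer's transactions where the model's predicted SKU equals the SKU actually bought. The record layout, the function names, and the exact-match criterion are assumptions; the talk's chart is expressed in switches per customer, which may be defined differently.

```python
from collections import defaultdict

def per_customer_match_rate(records):
    """records: iterable of (customer_id, predicted_sku, actual_sku) tuples."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for customer, predicted, actual in records:
        total[customer] += 1
        hits[customer] += predicted == actual  # True counts as 1
    return {customer: hits[customer] / total[customer] for customer in total}

def share_above(rates, threshold=0.9):
    """Fraction of customers whose match rate exceeds the threshold."""
    return sum(rate > threshold for rate in rates.values()) / len(rates)


# Made-up records, for illustration only.
records = [
    ("c1", "A", "A"),
    ("c1", "B", "B"),
    ("c2", "A", "C"),
]
rates = per_customer_match_rate(records)
print(share_above(rates))  # prints 0.5
```

Since customers differ in how many transactions they made, a rate computed from one or two purchases is far noisier than one computed from dozens, so a minimum transaction count may be worth applying before comparing customers.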

Slide 23

Slide 23 text

Epilogue “The purpose of a storyteller is not to tell you how to think, but to give you questions to think upon.” ― Brandon Sanderson

Slide 24

Slide 24 text

Remaining Questions
- We captured the behavior of 30% of the customers very well, but only partially the behavior of the others
- What makes these 30% different from the others?
- How is their story different?
- Can it be figured out with the data that we have?
- What are the triggers and attributes that make people behave in a non-habitual manner?

Slide 25

Slide 25 text

Client Feedback
- Real-world testing showed a significant improvement in recommendations translating into sales for these products.
- (No numbers: confidential, etc.)

Slide 26

Slide 26 text

Thank you

Neuroscience Nature-inspired Machine Learning Agent Based Simulations Econometrics Programming

[email protected]