Agent Based Methods. Modelling Consumer Choice

Agent Based Modeling in Data Science: Modeling Consumer Choice
Adi Andrei – May 2013

Data Science and ABM

Typical Data Science: story-telling using data
• Search for patterns
• Propose a story (model) of how the system works
• Validate the story (macro)
• Tell the story
• Story is used to predict outcome of decisions

Agent Based Modeling: dramatic re-enactment using data
• Search for patterns
• Propose a story (model) of how the elements of the system work
• Validate the story (micro & macro)
• Re-enact the story (simulation)
• Story is used to observe impact propagation and predict outcome of decisions

Introduction

Case Study - FMCG company
Model customer choice using ABMs for:
• Recommendation system
  • given a product category, which actual SKU from that category should we recommend during the checkout process
• Product design
  • predicting performance in the market, i.e. will customers choose it over other products in the same category
  • optimization: what needs to be changed in order to perform well

A fist full of Data

Product category   Number of unique SKUs   Number of transactions
Fresh Juice        79                      17,000
Mayo               96                      200,000
Body Wash          175                     50,000
Peanut Butter      133                     185,000
Salad Dressing     450                     200,000

One year of data, from one large supermarket

Act 1 The Econometrics Story

There’s something about… Mayo
• 96 individual SKUs
• Which one to recommend?

Factors that influence choice
• Product attributes
  • sweetness
  • consistency
  • pack size
• Pricing and Promotions
  • Price
  • Promotions
• Social Network effects / interaction
  • location
  • gender
  • ethnicity
  • household income

Ideal Point Distance
• extracted using half of the data
• needs to be updated periodically

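Maximum Utility Model
For each choice i, the utility:

$$u_i = w_1 \cdot \mathrm{IdealPointDistance}_i + w_2 \cdot \mathrm{price}_i + w_3 \cdot \mathrm{promotions}_i + w_4 \cdot \mathrm{social\_net}_i$$

$$w_5 = \mathrm{tremblingHand} \quad \text{(probability of random choice)}$$

Constraint:

$$w_1 + w_2 + w_3 + w_4 + w_5 = 1$$

$$\text{Predicted Choice} = \begin{cases} \arg\max_i \, u_i, & \text{if rational choice} \\ \mathrm{Random}(i), & \text{if trembling hand} \end{cases}$$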
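The choice rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the feature values, the weights, and the assumption that every feature is pre-scaled so that a higher score makes an option more attractive are all hypothetical.

```python
import random

def utilities(options, weights):
    """Weighted sum of the four factors for each option (u_i on the slide)."""
    w1, w2, w3, w4 = weights
    return [
        w1 * o["ideal_point_distance"] + w2 * o["price"]
        + w3 * o["promotions"] + w4 * o["social_net"]
        for o in options
    ]

def predict_choice(options, weights, trembling_hand, rng):
    """Return the index of the predicted option.

    With probability `trembling_hand` (w_5) the choice is random;
    otherwise it is the option with the highest utility.
    """
    if rng.random() < trembling_hand:
        return rng.randrange(len(options))
    u = utilities(options, weights)
    return max(range(len(options)), key=u.__getitem__)

# Hypothetical, pre-scaled feature scores for three SKUs (assumption: higher is better).
options = [
    {"ideal_point_distance": 0.9, "price": 0.2, "promotions": 0.0, "social_net": 0.5},
    {"ideal_point_distance": 0.4, "price": 0.8, "promotions": 1.0, "social_net": 0.1},
    {"ideal_point_distance": 0.6, "price": 0.5, "promotions": 0.0, "social_net": 0.9},
]
weights = (0.4, 0.3, 0.2, 0.1)   # w_1..w_4, illustrative values only
trembling_hand = 0.0             # w_5; w_1 + ... + w_5 = 1

chosen = predict_choice(options, weights, trembling_hand, random.Random(0))
print(chosen)
```

With a trembling-hand weight of zero the prediction is fully deterministic; raising it mixes in uniformly random picks.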

The Challenge: Parameter estimation
• Hybrid PSO/GA algorithm
  • Particle Swarm Optimization
  • Genetic Algorithms
• Dimensions: w_1, w_2, …, w_5
• Evaluation = run a complete simulation for each individual configuration
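The slides do not say how PSO and GA were combined, so the following is only one plausible sketch: a standard PSO velocity update, followed in each iteration by a GA-style step that replaces the worse half of the swarm with mutated crossovers of the better half. The function names, the inertia and attraction coefficients, and the toy fitness function are assumptions; in the case study each fitness evaluation would be a complete simulation run.

```python
import random

def normalize(w):
    """Clip to non-negative and rescale so the weights sum to 1."""
    w = [max(x, 0.0) for x in w]
    total = sum(w)
    return [x / total for x in w] if total > 0 else [1.0 / len(w)] * len(w)

def hybrid_pso_ga(fitness, dims=5, particles=20, iterations=50, seed=0):
    """Maximize `fitness` over weight vectors on the simplex (hypothetical hybrid)."""
    rng = random.Random(seed)
    pos = [normalize([rng.random() for _ in range(dims)]) for _ in range(particles)]
    vel = [[0.0] * dims for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p) for p in pos]
    g = max(range(particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g][:], best_fit[g]

    for _ in range(iterations):
        # PSO step: pull each particle toward its own best and the global best.
        for i in range(particles):
            for d in range(dims):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
            pos[i] = normalize([p + v for p, v in zip(pos[i], vel[i])])
        fits = [fitness(p) for p in pos]

        # GA step: replace the worse half with mutated crossovers of the better half.
        order = sorted(range(particles), key=fits.__getitem__, reverse=True)
        parents, losers = order[: particles // 2], order[particles // 2:]
        for i in losers:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) + rng.gauss(0.0, 0.05)
                     for pair in zip(pos[a], pos[b])]
            pos[i] = normalize(child)
            fits[i] = fitness(pos[i])

        # Update personal and global bests.
        for i in range(particles):
            if fits[i] > best_fit[i]:
                best_fit[i], best_pos[i] = fits[i], pos[i][:]
            if fits[i] > g_fit:
                g_fit, g_pos = fits[i], pos[i][:]
    return g_pos, g_fit

# Toy stand-in for "run a complete simulation": closeness to a made-up target.
target = [0.3, 0.3, 0.2, 0.2, 0.0]

def toy_fitness(w):
    return -sum((a - b) ** 2 for a, b in zip(w, target))

best_w, best_fit = hybrid_pso_ga(toy_fitness)
print([round(x, 3) for x in best_w], best_fit)
```

Normalizing after every move keeps each candidate on the constraint w_1 + … + w_5 = 1, so no separate penalty term is needed.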

Results and Observations
• Mayo:
  • 96 SKUs
  • 200,000 transactions
• Social Net has the smallest weight, but eliminating it degrades the model significantly
• w_trembling_hand = 0
• MaxUtil 35%, Random 2%

And they lived happily ever after…. … or did they?

Act 2 The Neuroscience Story

The Big Question
How does the model perform when applied to other product categories?

Answer: Not so good

[Bar chart, scale 0.00 to 1.00] Juice 0.18, MayoNew 0.33, BodyWash 0.08, PeanutButter 0.20, SaladDressing 0.04

Category         SKUs   Transactions
Fresh Juice      79     17,000
Mayo             96     200,000
Body Wash        175    50,000
Peanut Butter    80     185,000
Salad Dressing   450    200,000

Random: 2% - 4%

Help! We need a better story!

Habitual Behavior
• Choosing groceries is a highly repetitive, habitual action. This should be accounted for somehow.
• Adding is_last_choice and frequency does NOT improve the existing model.

$$u_i = w_1 \cdot \mathrm{IdealPointDistance}_i + w_2 \cdot \mathrm{price}_i + w_3 \cdot \mathrm{promotions}_i + w_4 \cdot \mathrm{social\_net}_i + w_5 \cdot \mathrm{is\_last\_choice} + w_6 \cdot \mathrm{frequency}$$

Habitual Model
• Reinforced Memory Value
  • the preference for each choice spikes when it is made, then it decreases in time unless it is being reinforced by the customer making the same choice again

$$u_i = w_1 \cdot \mathrm{IdealPointDistance}_i + w_2 \cdot \mathrm{price}_i + w_3 \cdot \mathrm{promotions}_i + w_4 \cdot \mathrm{social\_net}_i + w_5 \cdot \mathrm{memory\_value}$$

$$\text{Predicted Choice} = \begin{cases} \arg\max_i \, u_i, & \text{if rational choice} \\ \mathrm{Random}(i), & \text{if trembling hand} \end{cases}$$
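The slide describes the memory value only qualitatively, so here is a small sketch of one way it could behave, assuming exponential decay per purchase occasion and a fixed spike on each choice. The class name, the `decay` and `spike` parameters, and their values are hypothetical.

```python
class MemoryValue:
    """Per-customer memory trace for each SKU (illustrative assumption)."""

    def __init__(self, decay=0.9, spike=1.0):
        self.decay = decay    # fraction of the trace kept after each occasion
        self.spike = spike    # amount added when a SKU is chosen
        self.values = {}

    def step(self, chosen_sku):
        """Advance one purchase occasion: decay every trace, then reinforce the choice."""
        for sku in self.values:
            self.values[sku] *= self.decay
        self.values[chosen_sku] = self.values.get(chosen_sku, 0.0) + self.spike

    def get(self, sku):
        """Current memory value; SKUs never chosen have value 0."""
        return self.values.get(sku, 0.0)

memory = MemoryValue(decay=0.5)
for sku in ["A", "A", "B"]:
    memory.step(sku)

print(memory.get("A"), memory.get("B"), memory.get("C"))  # → 0.75 1.0 0.0
```

Repeating "A" pushes its value above a single spike, and the later switch to "B" lets the trace for "A" fade, which matches the spike-then-decay behavior described above.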

Results

[Bar chart, scale 0.00 to 1.00]

Category        Max Util   Habitual
Juice           0.18       0.40
MayoNew         0.33       0.49
BodyWash        0.08       0.30
PeanutButter    0.20       0.44
SaladDressing   0.04       0.25

Observations
• Consistently good performance across categories
• Performs 50%-600% better than the original model, especially for categories with lots of choices
• Makes Ideal Points redundant, resulting in a completely on-line model (no need for re-training)
• Reduces the influence of price and promotions
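As a quick check of the "50%-600%" claim, the per-category gains can be recomputed from the Max Util and Habitual scores reported in the results. Reading the claim as relative improvement, (Habitual - Max Util) / Max Util, is an assumption.

```python
# Scores as reported on the results slide.
max_util = {"Juice": 0.18, "MayoNew": 0.33, "BodyWash": 0.08,
            "PeanutButter": 0.20, "SaladDressing": 0.04}
habitual = {"Juice": 0.40, "MayoNew": 0.49, "BodyWash": 0.30,
            "PeanutButter": 0.44, "SaladDressing": 0.25}

# Relative improvement in percent, rounded to the nearest integer.
improvement = {
    category: round(100 * (habitual[category] - max_util[category]) / max_util[category])
    for category in max_util
}
print(improvement)
# → {'Juice': 122, 'MayoNew': 48, 'BodyWash': 275, 'PeanutButter': 120, 'SaladDressing': 525}
```

The smallest gain is for Mayo, the category the original model was tuned on, and the largest is for Salad Dressing, the category with the most SKUs.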

Micro-level performance
[Chart: customers vs. % switches per customer]
• 30% of the customers behaved according to the model more than 90% of the time
• 30% of the customers did not behave according to the model more than 90% of the time
• Not all customers have made the same number of transactions.
• The model is not perfect but seems to accurately capture the behavior of a significant part of the customers
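A micro-level check of this kind can be sketched as a per-customer agreement rate between predicted and actual choices. The record layout, function names, and sample data below are hypothetical; the slides do not show how the figures were computed.

```python
from collections import defaultdict

def per_customer_agreement(records):
    """Share of each customer's transactions where the prediction matched.

    `records` is an iterable of (customer_id, predicted_sku, actual_sku).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for customer, predicted, actual in records:
        totals[customer] += 1
        hits[customer] += predicted == actual
    return {c: hits[c] / totals[c] for c in totals}

def share_above(rates, threshold):
    """Fraction of customers whose agreement rate exceeds `threshold`."""
    return sum(r > threshold for r in rates.values()) / len(rates)

# Made-up sample: three customers with different transaction counts.
records = [
    ("c1", "A", "A"), ("c1", "A", "A"),
    ("c2", "A", "B"), ("c2", "B", "B"),
    ("c3", "C", "A"),
]
rates = per_customer_agreement(records)
print(rates, share_above(rates, 0.9))
```

Because customers differ in how many transactions they made, a rate based on one or two purchases is much noisier than one based on dozens, which is worth keeping in mind when reading the 30% figures.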

Epilogue
“The purpose of a storyteller is not to tell you how to think, but to give you questions to think upon.” - Brandon Sanderson

Remaining Questions
• We captured very well the behavior of 30% of the customers, but only partially the behavior of the others
• What makes these 30% different than the others?
• How is their story different?
• Can it be figured out with the data that we have?
• What are the triggers and attributes that make people behave in a non-habitual manner?

Client Feedback
• Real-world testing showed a significant improvement in recommendations being translated into sales for these products.
• (no numbers, confidential, etc.)

Thank you
Neuroscience Nature-inspired Machine Learning Agent Based Simulations Econometrics Programming
