mxie
February 20, 2017
Insight Project
Insight project: High-value customer profile extraction
Transcript
High Value Customer Profile Extraction. Miao Xie. Consulting Project
: Help business owners discover locations (map labels: Frozen Food, Discovered location)
: Where do valuable customers come from? Extract valuable customer profiles
Data
• Customer: 3.5 million customers (transaction, geolocation)
• Store: 300+ stores (annual sales, geolocation)
• Neighborhoods: 50K neighborhoods (demographics, geolocation)
Valuable customers
Sales
What metric identifies valuable customers? No metric exists!
The Metric: Customer Attraction Score
How attracted customers are to a store
Customer Attraction Score = Store Sales / Nearby Population
Example (Frozen Food store): the nearby neighborhoods have population sizes 4, 4, 1, 3, 5 and 1, so Nearby Population = 18
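A minimal sketch of the metric, reading the score as store sales divided by the total population of the nearby neighborhoods (the division is inferred from the slide layout). The function name `attraction_score` and the sales figure are assumptions for illustration; only the six population sizes come from the slide.

```python
def attraction_score(store_sales, neighborhood_populations):
    """Store sales divided by the summed population of nearby neighborhoods."""
    nearby_population = sum(neighborhood_populations)
    if nearby_population <= 0:
        raise ValueError("nearby population must be positive")
    return store_sales / nearby_population


# Population sizes from the slide's example; they sum to 18.
populations = [4, 4, 1, 3, 5, 1]

# Hypothetical sales figure, chosen only to make the arithmetic easy to follow.
example_sales = 36.0

score = attraction_score(example_sales, populations)
print(score)  # 2.0
```

How "nearby" is defined (a radius, a drive time, or something else) is not stated in the deck, so the sketch takes the list of neighborhoods as given.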
Work Flow
• Data loaded in PostgreSQL
• Data cleaning: missing values, outliers
• Customer attraction score (AS)
• KMeans cluster: super stores, average stores
• Features
• Model: logistic regression, random forest classification (Scikit-learn)
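The split into super stores and average stores could be sketched as a two-cluster KMeans on the attraction score. This is an illustration under assumptions, not the deck's code: the scores are made up, and clustering on the score alone (rather than on additional columns) is a guess at how the workflow step was applied.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical attraction scores for eight stores (not from the deck).
scores = np.array([0.8, 1.1, 0.9, 1.0, 4.8, 5.2, 1.2, 5.0])

# KMeans expects a 2-D array: one row per store, one column per feature.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(scores.reshape(-1, 1))

# Cluster ids are arbitrary, so label the cluster with the higher center "super".
super_cluster = int(np.argmax(kmeans.cluster_centers_.ravel()))
is_super_store = cluster_ids == super_cluster

print(is_super_store.astype(int).tolist())  # [0, 0, 0, 0, 1, 1, 0, 1]
```

The resulting boolean label can then serve as the target for the logistic regression and random forest classifiers.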
Who are the valuable customers?
People without a degree like the store
People with large spending in vegetables are less likely to visit the store
Seniors like the store
[Chart: feature importance from random forest model, axis 0 to 0.4; features: # No degree, $ Vegetable, Age > 65 yr]
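A minimal sketch of reading feature importances from a scikit-learn random forest. The data is synthetic and the feature names are hypothetical stand-ins for the three features on the chart; none of the numbers it produces correspond to the deck's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_stores = 400

# Hypothetical names standing in for the chart's three features.
feature_names = ["no_degree_count", "vegetable_spend", "age_over_65_share"]

# Synthetic, standardized neighborhood features (not the deck's data).
X = rng.normal(size=(n_stores, 3))

# Synthetic label: 1 = super store, 0 = average store.
signal = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2]
y = (signal + rng.normal(scale=0.5, size=n_stores) > 0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, y)

# Impurity-based importances: non-negative and normalized to sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.2f}")
```

These importances rank features but carry no sign, so a directional statement such as "less likely to visit" needs a separate check, for example comparing the feature's mean between the two classes or inspecting the logistic regression coefficients.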
Validation: model to predict high-value store locations (random forest classification)
Predicted vs. current store locations: complete overlap
Predicted vs. current store locations: new locations
Validation: AUC = 0.92 and accuracy = 87%
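For readers unfamiliar with the two validation metrics, the sketch below computes them from their definitions in plain Python. The labels and scores are invented, and the deck does not say how its validation set was built, so this shows only what the numbers measure.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, y_score):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = high-value location) and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

print(roc_auc(y_true, y_score))  # 0.9375
print(accuracy(y_true, y_pred))  # 0.75
```

AUC depends only on how the scores rank the examples, while accuracy also depends on the decision threshold (0.5 here), which is why the two can differ.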
Miao Xie: PhD in Physical Chemistry at UCLA (characterizing superhard materials); product engineer in Silicon Valley
Backup