"Haute Couture" and "Prêt-à-Porter" Data Science
Christophe Bourguignat
April 15, 2016
Talk given at Telecom ParisTech in April 2016
Transcript
Christophe Bourguignat
zelros.com / @zelrosHQ
Agenda
- Models interpretation
- Models production
- A short history of Kaggle
MODELS INTERPRETATION
WHY?
Model opacity is a major cause of rejection by users.
Unfortunately, the most powerful predictive models are usually the least interpretable.
FEATURE IMPORTANCE
AEROSOLVE (Airbnb)
Prior = general belief, before looking at the data.
Inform the model of our prior beliefs by adding them to a text configuration file during training.
Scikit Learn: March 2014, April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
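The treeinterpreter module linked above breaks a tree prediction down into a bias term plus one contribution per feature. Below is a minimal sketch of that idea in plain Python, not the library's own code: the toy tree, its values, the feature names and the `decompose` helper are all hypothetical and chosen only for illustration.

```python
# Toy regression tree: each node stores the mean target of the training
# samples that reached it ("value"). Internal nodes also store a split.
# The tree, its numbers and the feature names are made up for illustration.
TREE = {
    "value": 22.0, "feature": "rooms", "threshold": 6.5,
    "left": {
        "value": 18.0, "feature": "crime", "threshold": 5.0,
        "left": {"value": 20.0},
        "right": {"value": 13.0},
    },
    "right": {"value": 35.0},
}


def decompose(tree, sample):
    """Return (prediction, bias, contributions) for one sample.

    Walking from root to leaf, the change in node value at each split is
    credited to the feature used for that split.
    """
    bias = tree["value"]
    contributions = {}
    node = tree
    while "feature" in node:
        feature = node["feature"]
        if sample[feature] <= node["threshold"]:
            child = node["left"]
        else:
            child = node["right"]
        delta = child["value"] - node["value"]
        contributions[feature] = contributions.get(feature, 0.0) + delta
        node = child
    return node["value"], bias, contributions


prediction, bias, contributions = decompose(TREE, {"rooms": 5.8, "crime": 7.2})
# The decomposition is exact: prediction == bias + sum of contributions.
assert abs(prediction - (bias + sum(contributions.values()))) < 1e-9
```

For a forest, the same decomposition can be averaged over all trees, which gives a per-prediction explanation rather than a global feature ranking.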
EXAMPLE ON BOSTON DATASET
Prediction Intervals for Random Forests
http://blog.datadive.net/prediction-intervals-for-random-forests/
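One way to get such intervals, sketched here under assumptions, is to look at the spread of the individual trees' predictions for a sample and keep the central percentiles. The per-tree numbers below are invented, and `percentile` and `prediction_interval` are hypothetical helper names.

```python
def percentile(values, q):
    """Linearly interpolated percentile, with q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values")
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def prediction_interval(tree_predictions, coverage=90):
    """Interval spanned by the central `coverage` percent of per-tree predictions."""
    tail = (100 - coverage) / 2.0
    low = percentile(tree_predictions, tail)
    high = percentile(tree_predictions, 100 - tail)
    return low, high


# Hypothetical predictions of 11 trees for a single sample.
per_tree = [18.0, 19.5, 20.0, 20.5, 21.0, 21.0, 21.5, 22.0, 23.0, 24.5, 30.0]
low, high = prediction_interval(per_tree, coverage=80)
```

With scikit-learn, the per-tree predictions would typically be collected from each fitted estimator in the forest's `estimators_` list. Whether the resulting interval is well calibrated depends on how the trees were grown, so it is worth checking coverage on held-out data.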
PRODUCTION
TRADITIONAL B.I. DEPARTMENT: DATA ANALYSTS, ETL ENGINEER, DBAs
“INFINITE LOOP OF SADNESS”: DATA SCIENTISTS, IT / DATA ENGINEERS, SOFTWARE ENGINEERS, BUSINESS
http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT
- Underutilized features
- Undeclared consumers
- Pipeline jungles: preparing data in an ML-friendly format
http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS
- Unseen category
- Unreproducible feature engineering workflow (PMML)
- Leakage in database fields (churn)
- Monitoring
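The "unseen category" failure can be illustrated with a small encoder that reserves a code for values never met during training, instead of raising at prediction time. This is only a sketch of one possible mitigation: the `SafeCategoryEncoder` class and the sample values are hypothetical.

```python
class SafeCategoryEncoder:
    """Map categories to integer codes, reserving code 0 for unseen values."""

    UNKNOWN = 0

    def fit(self, values):
        # Codes start at 1 so that 0 stays free for categories
        # that never appeared in the training data.
        categories = sorted(set(values))
        self.codes_ = {value: code for code, value in enumerate(categories, start=1)}
        return self

    def transform(self, values):
        # dict.get falls back to UNKNOWN instead of raising KeyError.
        return [self.codes_.get(value, self.UNKNOWN) for value in values]


encoder = SafeCategoryEncoder().fit(["gold", "silver", "bronze", "silver"])
encoded = encoder.transform(["silver", "platinum"])  # "platinum" was never seen
```

Counting how often the unknown code is emitted in production is also a cheap monitoring signal that the input data has drifted away from the training data.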
A BRIEF HISTORY OF KAGGLE
June 2013, Sept 2013, Nov 2014, Apr 2015, Mar 2016
Refinements:
- hashing function
- adaptive learning rate (different flavours)
- Vowpal Wabbit
- Dropout
- PyPy
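Two of the refinements listed above, the hashing function and the adaptive learning rate, can be sketched together in a few lines of plain Python. The sketch assumes online logistic regression on categorical click data as the setting, which the slides do not state. The constants `D` and `ALPHA`, the helper names and the toy stream are all assumptions, and the learning-rate rule shown is only one of the possible flavours.

```python
import math
import zlib

D = 2 ** 16  # size of the hashed weight vector (assumed value)
ALPHA = 0.1  # base learning rate (assumed value)


def hash_features(row):
    """Hashing trick: map each "field=value" string to an index in [0, D)."""
    indices = []
    for field, value in row.items():
        token = f"{field}={value}".encode("utf-8")
        indices.append(zlib.crc32(token) % D)
    return indices


def predict(indices, weights):
    """Logistic regression probability for a row given as hashed indices."""
    score = sum(weights[i] for i in indices)
    score = max(min(score, 35.0), -35.0)  # clip to keep exp() in range
    return 1.0 / (1.0 + math.exp(-score))


def update(indices, weights, counts, p, y):
    """One SGD step on log loss with a per-coordinate learning rate.

    The step for weight i shrinks as ALPHA / (sqrt(n_i) + 1), where n_i is
    the number of times feature i has been seen so far.
    """
    for i in indices:
        weights[i] -= (p - y) * ALPHA / (math.sqrt(counts[i]) + 1.0)
        counts[i] += 1.0


weights = [0.0] * D
counts = [0.0] * D
# Made-up click log: (categorical features, clicked or not).
stream = [
    ({"site": "news", "device": "mobile"}, 1),
    ({"site": "shop", "device": "desktop"}, 0),
] * 50

# Single pass over the stream: predict first, then learn from the label.
for row, y in stream:
    idx = hash_features(row)
    update(idx, weights, counts, predict(idx, weights), y)

p_news = predict(hash_features({"site": "news", "device": "mobile"}), weights)
p_shop = predict(hash_features({"site": "shop", "device": "desktop"}), weights)
```

Memory stays fixed at `D` weights however many distinct categories appear, at the cost of occasional hash collisions. Because the loop uses only the standard library, it can also run unchanged under PyPy.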
QUESTIONS?
zelros.com / @zelrosHQ