Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
"Haute Couture" and "Prêt-à-Porter" Data Science
Search
Christophe Bourguignat
April 15, 2016
Technology
0
430
"Haute Couture" and "Prêt-à-Porter" Data Science
Talk given @ Telecom ParisTech on April 2016
Christophe Bourguignat
April 15, 2016
Tweet
Share
More Decks by Christophe Bourguignat
See All by Christophe Bourguignat
Adding Neurons to your Assistants
kriss
1
340
Software Engineers, the New Data Scientists
kriss
1
140
Machine Learning for Chief Future Officers
kriss
1
130
Whitening The Blackbox : Why And How To Explain Machine Learning Predictions ?
kriss
1
1.1k
Building a Data Science Team
kriss
2
400
Lean Machine Learning
kriss
5
740
Kaggle Criteo Challenge and Online Learning
kriss
1
260
The #FrenchData landscape
kriss
0
460
Other Decks in Technology
See All in Technology
Amazon VPC Lattice 最新アップデート紹介 - PrivateLink も似たようなアップデートあったけど違いとは
bigmuramura
0
190
株式会社ログラス − エンジニア向け会社説明資料 / Loglass Comapany Deck for Engineer
loglass2019
3
32k
Turing × atmaCup #18 - 1st Place Solution
hakubishin3
0
480
開発生産性向上! 育成を「改善」と捉えるエンジニア育成戦略
shoota
2
320
LINEヤフーのフロントエンド組織・体制の紹介【24年12月】
lycorp_recruit_jp
0
530
Qiita埋め込み用スライド
naoki_0531
0
3.7k
コンテナセキュリティのためのLandlock入門
nullpo_head
2
320
re:Invent 2024 Innovation Talks(NET201)で語られた大切なこと
shotashiratori
0
310
私なりのAIのご紹介 [2024年版]
qt_luigi
1
120
オプトインカメラ:UWB測位を応用したオプトイン型のカメラ計測
matthewlujp
0
170
Amazon Kendra GenAI Index 登場でどう変わる? 評価から学ぶ最適なRAG構成
naoki_0531
0
100
KnowledgeBaseDocuments APIでベクトルインデックス管理を自動化する
iidaxs
1
260
Featured
See All Featured
How to Think Like a Performance Engineer
csswizardry
22
1.2k
Done Done
chrislema
181
16k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
44
6.9k
Building Applications with DynamoDB
mza
91
6.1k
Building an army of robots
kneath
302
44k
Code Reviewing Like a Champion
maltzj
520
39k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
95
17k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
28
900
GitHub's CSS Performance
jonrohan
1030
460k
jQuery: Nuts, Bolts and Bling
dougneiner
61
7.5k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
280
13k
Imperfection Machines: The Place of Print at Facebook
scottboms
266
13k
Transcript
Christophe Bourguignat zelros.com /
[email protected]
/ @zelrosHQ
None
Agenda Models interpretation Models production A short history of Kaggle
MODELS INTERPRETATION
WHY ? Models opacity is a major reject cause by
users Unfortunately, predictive models that are the most powerful are usually the least interpretable
None
None
None
FEATURE IMPORTANCE
None
None
None
AEROSOLVE (AirBnb) Prior = general belief, before looking at the
data Inform the model of our prior beliefs by adding them to a text configuration file during training
None
None
None
Scikit Learn
Scikit Learn March 2014
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
EXEMPLE ON BOSTON DATASET
None
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
None
None
PRODUCTION
None
None
TRADITIONAL B.I. DEPARTMENT DATA ANALYSTS ETL ENGINEER DBAs
“INFINITE LOOP OF SADNESS” DATA SCIENTISTS IT / DATA ENGINEERS
SOFTWARE ENGINEERS BUSINESS http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT Underutilized features Undeclared consumers Pipeline Jungles
- preparing data in a ML-friendly format http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS Unseen category Unreproductible feat eng workflow (PMML) Leakage
in DataBase fields (churn) Monitoring
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
None
None
None
None
None
None
None
Refinements : - hashing function - adaptive learning rate (different
flavours) - Vowpal Wabbit - Dropout - PyPy
None
None
None
None
None
None
None
None
QUESTIONS ? zelros.com /
[email protected]
/ @zelrosHQ