Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
"Haute Couture" and "Prêt-à-Porter" Data Science
Search
Christophe Bourguignat
April 15, 2016
Technology
0
430
"Haute Couture" and "Prêt-à-Porter" Data Science
Talk given @ Telecom ParisTech on April 2016
Christophe Bourguignat
April 15, 2016
Tweet
Share
More Decks by Christophe Bourguignat
See All by Christophe Bourguignat
Adding Neurons to your Assistants
kriss
1
340
Software Engineers, the New Data Scientists
kriss
1
140
Machine Learning for Chief Future Officers
kriss
1
130
Whitening The Blackbox : Why And How To Explain Machine Learning Predictions ?
kriss
1
1.1k
Building a Data Science Team
kriss
2
400
Lean Machine Learning
kriss
5
750
Kaggle Criteo Challenge and Online Learning
kriss
1
260
The #FrenchData landscape
kriss
0
470
Other Decks in Technology
See All in Technology
「人物ごとのアルバム」の精度改善の軌跡/Improving accuracy of albums by person
mixi_engineers
PRO
1
140
東京Ruby会議12 Ruby と Rust と私 / Tokyo RubyKaigi 12 Ruby, Rust and me
eagletmt
3
960
Amazon Q Developerで.NET Frameworkプロジェクトをモダナイズしてみた
kenichirokimura
1
200
コロプラのオンボーディングを採用から語りたい
colopl
5
1.4k
AWS Community Builderのススメ - みんなもCommunity Builderに応募しよう! -
smt7174
0
200
Goで実践するBFP
hiroyaterui
1
120
デジタルアイデンティティ技術 認可・ID連携・認証 応用 / 20250114-OIDF-J-EduWG-TechSWG
oidfj
2
710
comilioとCloudflare、そして未来へと向けて
oliver_diary
6
460
メンバーがオーナーシップを発揮しやすいチームづくり
ham0215
2
260
dbtを中心にして組織のアジリティとガバナンスのトレードオンを考えてみた
gappy50
0
330
RubyでKubernetesプログラミング
sat
PRO
4
160
新卒1年目、はじめてのアプリケーションサーバー【IBM WebSphere Liberty】
ktgrryt
0
150
Featured
See All Featured
Rails Girls Zürich Keynote
gr2m
94
13k
Code Reviewing Like a Champion
maltzj
521
39k
Typedesign – Prime Four
hannesfritz
40
2.5k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
280
13k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
132
33k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
248
1.3M
Fashionably flexible responsive web design (full day workshop)
malarkey
406
66k
Adopting Sorbet at Scale
ufuk
74
9.2k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
356
29k
A Philosophy of Restraint
colly
203
16k
Optimizing for Happiness
mojombo
376
70k
The Invisible Side of Design
smashingmag
299
50k
Transcript
Christophe Bourguignat zelros.com /
[email protected]
/ @zelrosHQ
None
Agenda Models interpretation Models production A short history of Kaggle
MODELS INTERPRETATION
WHY ? Models opacity is a major reject cause by
users Unfortunately, predictive models that are the most powerful are usually the least interpretable
None
None
None
FEATURE IMPORTANCE
None
None
None
AEROSOLVE (AirBnb) Prior = general belief, before looking at the
data Inform the model of our prior beliefs by adding them to a text configuration file during training
None
None
None
Scikit Learn
Scikit Learn March 2014
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
EXEMPLE ON BOSTON DATASET
None
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
None
None
PRODUCTION
None
None
TRADITIONAL B.I. DEPARTMENT DATA ANALYSTS ETL ENGINEER DBAs
“INFINITE LOOP OF SADNESS” DATA SCIENTISTS IT / DATA ENGINEERS
SOFTWARE ENGINEERS BUSINESS http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT Underutilized features Undeclared consumers Pipeline Jungles
- preparing data in a ML-friendly format http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS Unseen category Unreproductible feat eng workflow (PMML) Leakage
in DataBase fields (churn) Monitoring
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
None
None
None
None
None
None
None
Refinements : - hashing function - adaptive learning rate (different
flavours) - Vowpal Wabbit - Dropout - PyPy
None
None
None
None
None
None
None
None
QUESTIONS ? zelros.com /
[email protected]
/ @zelrosHQ