Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
"Haute Couture" and "Prêt-à-Porter" Data Science
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
Christophe Bourguignat
April 15, 2016
Technology
0
480
"Haute Couture" and "Prêt-à-Porter" Data Science
Talk given @ Telecom ParisTech on April 2016
Christophe Bourguignat
April 15, 2016
Tweet
Share
More Decks by Christophe Bourguignat
See All by Christophe Bourguignat
Adding Neurons to your Assistants
kriss
1
360
Software Engineers, the New Data Scientists
kriss
1
150
Machine Learning for Chief Future Officers
kriss
1
140
Whitening The Blackbox : Why And How To Explain Machine Learning Predictions ?
kriss
1
1.2k
Building a Data Science Team
kriss
2
410
Lean Machine Learning
kriss
5
780
Kaggle Criteo Challenge and Online Learning
kriss
1
290
The #FrenchData landscape
kriss
0
490
Other Decks in Technology
See All in Technology
AWSをCLIで理解したい! / I want to understand AWS using the CLI
mel_27
2
130
Devinを導入したら予想外の人たちに好評だった
tomuro
0
930
ITインフラの液体冷却
recruitengineers
PRO
2
110
作りっぱなしで終わらせない! 価値を出し続ける AI エージェントのための「信頼性」設計 / Designing Reliability for AI Agents that Deliver Continuous Value
aoto
PRO
1
180
8万デプロイ
iwamot
PRO
2
160
Agentic Software Modernization - Back to the Roots (Zürich Agentic Coding and Architectures, März 2026)
feststelltaste
1
200
DevOpsエージェントで実現する!! AWS Well-Architected(W-A) を実現するシステム設計 / 20260307 Masaki Okuda
shift_evolve
PRO
3
210
Kaggleで鍛えたスキルの実務での活かし方 競技とプロダクト開発のリアル
recruitengineers
PRO
1
180
型を書かないRuby開発への挑戦
riseshia
0
190
問い合わせ自動化の技術的挑戦
recruitengineers
PRO
2
170
Exadata Database Service on Dedicated Infrastructure(ExaDB-D) UI スクリーン・キャプチャ集
oracle4engineer
PRO
7
7.1k
Windows ネットワークを再確認する
murachiakira
PRO
0
290
Featured
See All Featured
The browser strikes back
jonoalderson
0
760
svc-hook: hooking system calls on ARM64 by binary rewriting
retrage
2
150
Designing Experiences People Love
moore
143
24k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
133
19k
From Legacy to Launchpad: Building Startup-Ready Communities
dugsong
0
170
Abbi's Birthday
coloredviolet
2
5.1k
Leveraging LLMs for student feedback in introductory data science courses - posit::conf(2025)
minecr
1
190
Paper Plane
katiecoart
PRO
0
47k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
14k
Designing for humans not robots
tammielis
254
26k
The Anti-SEO Checklist Checklist. Pubcon Cyber Week
ryanjones
0
87
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
37
6.3k
Transcript
Christophe Bourguignat zelros.com /
[email protected]
/ @zelrosHQ
None
Agenda Models interpretation Models production A short history of Kaggle
MODELS INTERPRETATION
WHY ? Models opacity is a major reject cause by
users Unfortunately, predictive models that are the most powerful are usually the least interpretable
None
None
None
FEATURE IMPORTANCE
None
None
None
AEROSOLVE (AirBnb) Prior = general belief, before looking at the
data Inform the model of our prior beliefs by adding them to a text configuration file during training
None
None
None
Scikit Learn
Scikit Learn March 2014
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
EXEMPLE ON BOSTON DATASET
None
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
None
None
PRODUCTION
None
None
TRADITIONAL B.I. DEPARTMENT DATA ANALYSTS ETL ENGINEER DBAs
“INFINITE LOOP OF SADNESS” DATA SCIENTISTS IT / DATA ENGINEERS
SOFTWARE ENGINEERS BUSINESS http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT Underutilized features Undeclared consumers Pipeline Jungles
- preparing data in a ML-friendly format http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS Unseen category Unreproductible feat eng workflow (PMML) Leakage
in DataBase fields (churn) Monitoring
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
None
None
None
None
None
None
None
Refinements : - hashing function - adaptive learning rate (different
flavours) - Vowpal Wabbit - Dropout - PyPy
None
None
None
None
None
None
None
None
QUESTIONS ? zelros.com /
[email protected]
/ @zelrosHQ