Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
"Haute Couture" and "Prêt-à-Porter" Data Science
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Christophe Bourguignat
April 15, 2016
Technology
0
480
"Haute Couture" and "Prêt-à-Porter" Data Science
Talk given @ Telecom ParisTech on April 2016
Christophe Bourguignat
April 15, 2016
Tweet
Share
More Decks by Christophe Bourguignat
See All by Christophe Bourguignat
Adding Neurons to your Assistants
kriss
1
370
Software Engineers, the New Data Scientists
kriss
1
150
Machine Learning for Chief Future Officers
kriss
1
140
Whitening The Blackbox : Why And How To Explain Machine Learning Predictions ?
kriss
1
1.2k
Building a Data Science Team
kriss
2
420
Lean Machine Learning
kriss
5
780
Kaggle Criteo Challenge and Online Learning
kriss
1
300
The #FrenchData landscape
kriss
0
500
Other Decks in Technology
See All in Technology
TUNA Camp 2026 京都Stage ヒューリスティックアルゴリズム入門
terryu16
0
600
【社内勉強会】新年度からコーディングエージェントを使いこなす - 構造と制約で引き出すClaude Codeの実践知
nwiizo
28
14k
20260326_AIDD事例紹介_ULSC.pdf
findy_eventslides
0
160
OpenClawでPM業務を自動化
knishioka
1
320
出版記念イベントin大阪「書籍紹介&私がよく使うMCPサーバー3選と社内で安全に活用する方法」
kintotechdev
0
110
「活動」は激変する。「ベース」は変わらない ~ 4つの軸で捉える_AI時代ソフトウェア開発マネジメント
sentokun
0
130
契約書からの情報抽出を行うLLMのスループットを、バッチ処理を用いて最大40%改善した話
sansantech
PRO
3
320
AIにより大幅に強化された AWS Transform Customを触ってみる
0air
0
170
Amazon Qはアマコネで頑張っています〜 Amazon Q in Connectについて〜
yama3133
1
150
イベントで大活躍する電子ペーパー名札を作る(その2) 〜 M5PaperとM5PaperS3 〜 / IoTLT @ JLCPCB オープンハードカンファレンス
you
PRO
0
210
AWS Systems Managerのハイブリッドアクティベーションを使用したガバメントクラウド環境の統合管理
toru_kubota
1
190
Oracle AI Database@Google Cloud:サービス概要のご紹介
oracle4engineer
PRO
5
1.2k
Featured
See All Featured
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
659
61k
GraphQLの誤解/rethinking-graphql
sonatard
75
12k
Typedesign – Prime Four
hannesfritz
42
3k
What’s in a name? Adding method to the madness
productmarketing
PRO
24
4k
Data-driven link building: lessons from a $708K investment (BrightonSEO talk)
szymonslowik
1
990
A Soul's Torment
seathinner
5
2.5k
Become a Pro
speakerdeck
PRO
31
5.9k
The agentic SEO stack - context over prompts
schlessera
0
720
Ethics towards AI in product and experience design
skipperchong
2
240
KATA
mclloyd
PRO
35
15k
Skip the Path - Find Your Career Trail
mkilby
1
91
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
Transcript
Christophe Bourguignat zelros.com /
[email protected]
/ @zelrosHQ
None
Agenda Models interpretation Models production A short history of Kaggle
MODELS INTERPRETATION
WHY ? Models opacity is a major reject cause by
users Unfortunately, predictive models that are the most powerful are usually the least interpretable
None
None
None
FEATURE IMPORTANCE
None
None
None
AEROSOLVE (AirBnb) Prior = general belief, before looking at the
data Inform the model of our prior beliefs by adding them to a text configuration file during training
None
None
None
Scikit Learn
Scikit Learn March 2014
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn March 2014 April 2015
Scikit Learn https://github.com/andosa/treeinterpreter/blob/master/treeinterpreter/treeinterpreter.py
EXEMPLE ON BOSTON DATASET
None
http://blog.datadive.net/prediction-intervals-for-random-forests/ Prediction Intervals for Random Forests
None
None
PRODUCTION
None
None
TRADITIONAL B.I. DEPARTMENT DATA ANALYSTS ETL ENGINEER DBAs
“INFINITE LOOP OF SADNESS” DATA SCIENTISTS IT / DATA ENGINEERS
SOFTWARE ENGINEERS BUSINESS http://multithreaded.stitchfix.com/blog/2016/03/16/engineers-shouldnt-write-etl/
CODE http://treycausey.com/software_dev_skills.html
COMPLEXITY AND TECHNICAL DEBT Underutilized features Undeclared consumers Pipeline Jungles
- preparing data in a ML-friendly format http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/43146.pdf
PRODUCTION FAILS Unseen category Unreproductible feat eng workflow (PMML) Leakage
in DataBase fields (churn) Monitoring
A BRIEF HISTORY OF KAGGLE
June 2013 Sept 2013 Nov 2014 Apr 2015 Mar 2016
None
None
None
None
None
None
None
Refinements : - hashing function - adaptive learning rate (different
flavours) - Vowpal Wabbit - Dropout - PyPy
None
None
None
None
None
None
None
None
QUESTIONS ? zelros.com /
[email protected]
/ @zelrosHQ