Scaling Machine Learning at Holiday Extras (Big Data LDN 2019)
Rebecca Vickery
November 13, 2019
Technology
Transcript
Scaling Machine Learning at Holiday Extras. Rebecca Vickery, Data Scientist, @vickdata
Travel planning is time consuming: airport parking, airport hotels, airport lounges, travel insurance, holiday money, port products, car hire, airport transfers. Travel planning: 582 minutes over 46 days*. *Source: Facebook-commissioned research by consumer research company GfK
Optimising consumer decision making: airport parking, airport hotels, airport lounges, travel insurance, holiday money, port products, car hire, airport transfers. Less Hassle. More Holiday. Trip recommendations
Lots of other processes to optimise
1. Ad spend: automated bidding, ad targeting, channel optimisation
2. Commercial: automated pricing, allocation, revenue optimisation
3. Customer Experience: automated call handling, personalised experiences
4. Marketing: intelligent messaging, optimise send frequency
How to scale: Use Cases and Buy in, Input, Team, Deployment
“Ideas are worth nothing unless executed”, Derek Sivers
Deploying machine learning is hard. Scaling is even harder.
Tools - Data Scientists: open source; lack software development expertise; mainly Python. © Flaticon
Tools - Software Engineers: different tools; lack ML/data expertise; mainly JavaScript
Data science process: the wrong kind of independence
People: small data science team; science + software experts are rare
Two types of deployment
Bespoke Solutions
© Daniel Moyo
Unused Models: many models never make it to production
Time to model deployment: model development = days to weeks; model deployment = weeks to never!
The technology
Model Package: init.py, task.py, setup.py, model.py
Repeatable, Reusable Process. Model Package: init.py, task.py, setup.py, model.py
Data transformations: Scikit-learn pipelines + custom transformers; transformation occurs in the model
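The slide does not show code, so the following is only a minimal sketch of what "transformation occurs in the model" could look like with a scikit-learn pipeline and a custom transformer. The transformer name `Log1pTransformer`, the helper `build_pipeline`, the choice of steps, and the toy data are all assumptions made for illustration, not taken from the talk.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class Log1pTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical custom transformer: applies log(1 + x) to every feature."""

    def fit(self, X, y=None):
        # Stateless transformer: there is nothing to learn from the data.
        return self

    def transform(self, X):
        return np.log1p(np.asarray(X, dtype=float))


def build_pipeline():
    """Bundle preprocessing and the estimator into one fitted object."""
    return Pipeline(steps=[
        ("log", Log1pTransformer()),
        ("scale", StandardScaler()),
        ("clf", LogisticRegression()),
    ])


# Toy data, invented for illustration only.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [100.0, 1000.0], [200.0, 2000.0]])
y_train = np.array([0, 0, 1, 1])

pipeline = build_pipeline()
pipeline.fit(X_train, y_train)

# Raw, untransformed rows go straight in; the pipeline transforms them itself.
predictions = pipeline.predict(np.array([[1.5, 15.0], [150.0, 1500.0]]))
```

Because the transformations live inside the fitted pipeline, whatever serves the model only needs to pass raw feature rows, and the same preprocessing is applied at training and prediction time.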
Solution for other libraries too: add preprocess file to the package. Image taken from Google Cloud documentation
Further customisation: custom scoring; custom prediction routines
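As a rough illustration of a custom prediction routine, the sketch below follows the predictor shape described in the Google Cloud AI Platform documentation as best understood here: a class with a `predict(instances, **kwargs)` method and a `from_path(model_dir)` class method. The class name `ModelPredictor` and the file name `model.pkl` are assumptions for this example; the talk does not show the actual implementation used at Holiday Extras.

```python
import os
import pickle


class ModelPredictor:
    """Hypothetical predictor wrapping any model that exposes .predict()."""

    def __init__(self, model):
        self._model = model

    def predict(self, instances, **kwargs):
        """Return predictions for a list of instances as plain Python values."""
        outputs = self._model.predict(instances)
        # NumPy arrays are not JSON serialisable, so convert them to lists.
        if hasattr(outputs, "tolist"):
            outputs = outputs.tolist()
        return list(outputs)

    @classmethod
    def from_path(cls, model_dir):
        """Load the pickled model from the directory holding the artefacts."""
        # "model.pkl" is an assumed file name, not one given in the talk.
        model_path = os.path.join(model_dir, "model.pkl")
        with open(model_path, "rb") as f:
            model = pickle.load(f)
        return cls(model)
```

Any extra logic, such as post-processing the raw model output, would go inside `predict` before the values are returned.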
Faster time to production: fully managed service
Not Quite!
Collaborative Project
ML Proxy (bespoke ML microservice)
Model Versioning
Monitoring - Model Performance
Monitoring - AI Platform Performance
Time to model deployment: model development = days to weeks; model deployment = hours to days
How to scale: Use Cases and Buy in, Input, Team, Deployment
The right kind of independence: data scientists have full ownership over models
The right kind of independence: data scientists work closely together
The right kind of independence: but they also work closely with other teams
Use cases and buy in: focus on problems to solve
Use cases and buy in: don’t start in the highest value area
Use cases and buy in: deploy a first version (not the best) as fast as possible
Use cases and buy in: test and learn. Photo by Alex Kondratiev on Unsplash
Thank you @vickdata