
Shipping ML at scale


According to TechRepublic, 85% of Machine Learning projects fail. Among the reasons behind this scary statistic, the most prominent are lack of leadership support, strategy or engineering skills. In this talk I examine the main pain points and explain best practices for overcoming these challenges, bringing to the table real-world examples, personal experiences and actual insights that allowed me and the teams I worked with to successfully deploy ML at scale and drive real business impact. More than 15% of the time.

Massimo Belloni

July 14, 2021

Transcript

  1. Shipping ML at scale (for real)
     Massimo Belloni, Senior Data Scientist @ Bumble
     [email protected] | LinkedIn: in/massibelloni/ | Medium: @massibelloni
     Lessons learnt and best practices
  2. Massimo Belloni, Senior Data Scientist @ Bumble
     • MLOps, NLP and miscellaneous
     • Interests in philosophy of science, consciousness, Strong vs Weak AI
       ◦ Coding consciousness
       ◦ Interpretability and trust
     • MSc Computer Science and Engineering (Politecnico di Milano)
  3. A lot of numbers, one outcome
     • Nearly 50% of CIOs have plans to develop AI solutions [1]
     • 87% of Data Science projects never make it to production [2]
     • 85% of AI solutions will provide erroneous results [1]
     [1] Gartner [2] VentureBeat
  4. Everyone wants to do machine learning, but doing it successfully is complex.
     Machine learning deployments require engineering skills, an experimentation
     mindset and very clear, measurable goals.
  5. The recipe for failure
     • Unclear goals and objectives: are we actually improving an existing business
       process in a measurable and quantifiable way?
     • Lack of engineering skills: how far in the ML lifecycle are we able to go with
       internal resources? Are we able to communicate with other systems?
     • Poor experiments' management: are all the experiments tracked and replicable?
       Can we trace back the history (datasets, code, parameters) of all the artifacts
       we are generating?
  6. Process understanding and model's design
     Deployment infrastructure and first PoC
     Model improvement and experimentation strategy
  7. Process understanding and model's design
     What are we optimising?
     • Every Machine Learning project should aim to improve an existing business process.
     • The business impact and the teams involved have to be clear from the early steps.
     • Where does the ROI stand? Which metrics are relevant?
     Tracking and observability
     • Know the baseline performance of your process.
     • Make sure to have monitoring pipelines in place before designing the model.
     • ML metrics (precision, recall, etc.) have to be associated with their business
       counterparts; see the sketch after this slide.
     Data quality and availability
     • Make sure to have reliable historical data about the process you want to
       optimise/automate.
     • Is the same data available in production? Is the same data available in real time?
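
One way to read the "business counterparts" bullet above: compute the ML metrics on a labelled sample of production traffic and translate them into quantities the business already tracks. A minimal sketch, assuming a binary classifier and scikit-learn; the function name, the daily-volume extrapolation and the specific business metrics are illustrative, not from the talk.

# Hypothetical monitoring helper: ML metrics plus their business counterparts.
from sklearn.metrics import precision_score, recall_score

def monitoring_report(y_true, y_pred, daily_volume):
    # y_true / y_pred: 0/1 labels and predictions for a labelled production sample.
    precision = precision_score(y_true, y_pred)
    recall = recall_score(y_true, y_pred)
    predicted_positive_rate = sum(y_pred) / len(y_pred)
    actual_positive_rate = sum(y_true) / len(y_true)
    return {
        "precision": precision,
        "recall": recall,
        # Illustrative business counterparts, extrapolated from the sample rates:
        # items wrongly actioned per day, and positive items missed per day.
        "false_actions_per_day": daily_volume * predicted_positive_rate * (1 - precision),
        "missed_items_per_day": daily_volume * actual_positive_rate * (1 - recall),
    }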
  8. A crucial step for successful ML deployments is becoming an expert in the
     processes we want to optimise. Preliminary exploratory analysis is an
     underestimated key step for success and expectations management.
  9. Deployment infrastructure and first PoC (or: where everyone thinks the failure happens)
     • It is easier than it looks! Deploying Machine Learning models isn't inherently
       different from delivering any classical software engineering product. The vast
       majority of models can be deployed as a Python function inside a Flask application,
       without any need for GPUs! (train != inference) See the sketch after this slide.
     • A philosophical topic: engineering skills in DS. Having engineering skills and
       infrastructural knowledge is also important for designing the correct prediction
       pipeline and for ensuring that all the necessary features are available in production.
  10. CPU inferencing vs GPU inferencing
      CPU inferencing (single sample, multi-core)
      • A service (Flask, FastAPI, Sanic) wraps the inference model.
      • Requests arrive from multiple clients, usually one sample per request.
      • No batching applied on CPU; multiple inference engines (1 per CPU core).
      • Optimise thread usage (1 per core). Reference: Optimise BERT inference on CPU.
      GPU inferencing (optimise for big batches)
      • A service (TF Serving, TorchServe) wraps the inference model.
      • The service applies dynamic batching to optimise GPU usage.
      • Batch size is GPU (hardware) and time constrained.
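
A sketch of the CPU-side setup above, assuming a PyTorch model: each worker process runs a single-threaded inference engine, and parallelism comes from the application server running one worker per CPU core (for example gunicorn with one worker per core). The TorchScript artifact and function names are illustrative.

# Hypothetical per-worker setup for CPU inferencing: one engine per process, one thread each.
import torch

torch.set_num_threads(1)          # one intra-op thread per worker process
torch.set_num_interop_threads(1)  # avoid oversubscribing the cores

model = torch.jit.load("model.pt")  # illustrative TorchScript model
model.eval()

@torch.no_grad()
def infer(features: torch.Tensor) -> torch.Tensor:
    # One sample per request, no batching on CPU (as on the slide).
    return model(features.unsqueeze(0)).squeeze(0)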
  11. Experiment tracking (diagram)
      • Data sources (Data Warehouse, external sources, photo storage, ...) plus query
        code produce a versioned dataset, identified by a dataset_id.
      • An Experiment pairs a dataset with a model (model_id: architecture,
        hyperparameters, validation strategy, ...): experiment_id = dataset_id + model_id.
      • Each experiment produces artifacts: model, features, metadata, performances.
      • Experiments have to be easy and handy to compare.
      • Version datasets, not queries!
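
A minimal sketch of the identifiers on the slide: hash the materialised dataset (not the query) and the model specification, then compose the experiment id from the two. The hashing scheme and function names are illustrative assumptions.

# Hypothetical id scheme: dataset_id from data content, model_id from the model spec,
# experiment_id = dataset_id + model_id.
import hashlib
import json

def dataset_id(dataset_path: str) -> str:
    # Hash the materialised dataset, so the same query run on drifted data yields a
    # different id ("version datasets, not queries!").
    with open(dataset_path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()[:12]

def model_id(architecture: str, hyperparameters: dict, validation_strategy: str) -> str:
    spec = json.dumps(
        {"arch": architecture, "hparams": hyperparameters, "val": validation_strategy},
        sort_keys=True,
    )
    return hashlib.sha256(spec.encode()).hexdigest()[:12]

def experiment_id(ds_id: str, m_id: str) -> str:
    return f"{ds_id}-{m_id}"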