Slide 1

Slide 1 text

Practical DevOps for the busy Data Scientist
Tania Allard, PhD (@ixek)
Developer Advocate @Microsoft
Google Developer Expert, ML (TensorFlow)
OSCON 2019
http://bit.ly/OSCON-MLOps

Slide 2

Slide 2 text

Story time… Down the rabbit hole

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

A common story...
● Model / application to be productised
● R&D: develop, iterate fast, usually local or cloud
● Magic
● Is it live??

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

Replacing the magic
● Model/app ready to productise
● R&D: develop, iterate fast, usually local or cloud
● MLOps, automation, controlled deployment
● Worry-free deployment! Wait and relax

Slide 7

Slide 7 text

How skills are perceived

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

What is DevOps anyway?
DevOps is the union of people, process, and products to enable continuous delivery of value into production.

Slide 11

Slide 11 text

How does it fit for DS?
It is, more or less, DevOps applied to data-intensive applications. It requires close collaboration between engineers, data scientists, architects, data engineers and Ops.

Slide 12

Slide 12 text

Story time… The advice: getting started with MLOps

Slide 13

Slide 13 text

MLOps aims to reduce the end-to-end cycle time of data analytics/science, from the origin of ideas to the creation of data artifacts.

Slide 14

Slide 14 text

Getting started
● What to automate?
● Establish checkpoints
● Find the low-hanging fruit
● How stable and robust are my processes?
● Devise a long-term strategy
● What can I readily improve?
● Where am I? Can I count?

Slide 15

Slide 15 text

It’s all madness

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Practical steps

Slide 19

Slide 19 text

Keep everything in source control (data, code, infrastructure), but allow for experimentation

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

Standardize and define your environments in code (conda, Pipfiles, Docker)
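
One way to make "environments in code" checkable is to compare the running interpreter against a pinned list. The sketch below is only an illustration, not something shown in the talk: the `pins` mapping and the `check_environment` name are hypothetical, and in practice the pins would come from whatever conda, Pipfile or Docker definition the project already uses.

```python
from importlib import metadata


def check_environment(pins, get_version=metadata.version):
    """Return human-readable mismatches between pinned and installed versions.

    `pins` maps a package name to the exact version string expected
    (a hypothetical format chosen for this sketch).
    """
    problems = []
    for name, wanted in sorted(pins.items()):
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (wanted {wanted})")
            continue
        if found != wanted:
            problems.append(f"{name}: found {found}, wanted {wanted}")
    return problems
```

An empty list means the live environment matches the pins; anything else is a reason to rebuild the environment before trusting a result.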

Slide 23

Slide 23 text

Use canonical data sources: always know what data you are using (where it comes from and where it goes)
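
A minimal sketch of "knowing what data you are using", assuming a single file on disk: record its origin and a SHA-256 digest in a small manifest, then verify the file against that manifest before each run. The function names and the manifest layout are assumptions made for this example, not part of the talk.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(data_path, source, manifest_path):
    """Record where a dataset came from and what its bytes hash to."""
    data_path = Path(data_path)
    entry = {
        "file": data_path.name,
        "source": source,  # free-text origin, e.g. a URL or table name
        "bytes": data_path.stat().st_size,
        "sha256": fingerprint(data_path),
    }
    Path(manifest_path).write_text(json.dumps(entry, indent=2))
    return entry


def verify(data_path, manifest_path):
    """Return True if the file still matches the recorded digest."""
    entry = json.loads(Path(manifest_path).read_text())
    return fingerprint(data_path) == entry["sha256"]
```

Committing the manifest alongside the code keeps the link between an experiment and the exact bytes it was run on, even when the data itself is too large for the repository.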

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

Automate wisely

Slide 26

Slide 26 text

What and when to automate?
● What should we automate?
● Define success and failure metrics
● Go from simple to complex tasks
● Evaluate and monitor

Slide 27

Slide 27 text

https://xkcd.com/1205/
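
The comic linked on this slide (xkcd 1205, "Is It Worth the Time?") tabulates how long you can spend automating a routine task before the effort exceeds the time saved over five years. A rough sketch of the same arithmetic, with hypothetical function names and the five-year horizon as the default assumption:

```python
def time_budget_hours(seconds_saved_per_run, runs_per_day, horizon_days=5 * 365):
    """Upper bound, in hours, on time worth investing in automating a task."""
    return seconds_saved_per_run * runs_per_day * horizon_days / 3600


def worth_automating(hours_to_automate, seconds_saved_per_run, runs_per_day,
                     horizon_days=5 * 365):
    """True if the automation effort is smaller than the time it would save."""
    budget = time_budget_hours(seconds_saved_per_run, runs_per_day, horizon_days)
    return hours_to_automate < budget
```

This only counts time; it says nothing about fewer manual errors or better repeatability, which are often the stronger arguments for automating.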

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

Use pipelines for repeatability and explainability
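
A toy illustration of why pipelines help with repeatability and explainability: each step is a named function, and the runner logs a digest of every intermediate result so two runs can be compared step by step. This is a sketch under simple assumptions (JSON-serialisable data, hypothetical step names), not a stand-in for any particular pipeline tool.

```python
import hashlib
import json


def run_pipeline(steps, data):
    """Run (name, function) steps in order and return (result, log)."""
    log = []
    for name, func in steps:
        data = func(data)
        # Digest of the intermediate result, so runs are comparable per step.
        encoded = json.dumps(data, sort_keys=True).encode()
        log.append({"step": name, "sha256": hashlib.sha256(encoded).hexdigest()[:12]})
    return data, log


def drop_missing(rows):
    """Remove rows whose target value is missing."""
    return [row for row in rows if row["y"] is not None]


def scale(rows):
    """Scale the feature x by its maximum value."""
    top = max(row["x"] for row in rows)
    return [{"x": row["x"] / top, "y": row["y"]} for row in rows]


STEPS = [("drop_missing", drop_missing), ("scale", scale)]
```

Running `run_pipeline(STEPS, rows)` twice on the same input should give identical logs; a differing digest points at the first step where the two runs diverged.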

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

Deploy portable models
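
The slide does not name a format, so the following is only a sketch of the idea: a model is portable when it can be loaded and used without the code that trained it. Here a linear model is written out as plain JSON and rebuilt as a predict function; the JSON layout and function names are hypothetical, and real deployments often rely on an interchange format such as ONNX instead.

```python
import json


def export_model(weights, bias, feature_names):
    """Serialise a linear model to a JSON string, independent of training code."""
    if len(weights) != len(feature_names):
        raise ValueError("one weight per feature is required")
    return json.dumps({
        "kind": "linear",
        "features": list(feature_names),
        "weights": list(weights),
        "bias": bias,
    })


def load_predictor(payload):
    """Rebuild a predict(row) function from the output of export_model."""
    spec = json.loads(payload)
    if spec["kind"] != "linear":
        raise ValueError(f"unsupported model kind: {spec['kind']}")

    def predict(row):
        # row is a dict keyed by feature name
        return spec["bias"] + sum(
            weight * row[name]
            for weight, name in zip(spec["weights"], spec["features"])
        )

    return predict
```

Because the payload carries the feature names, the serving side can fail loudly on a missing column rather than silently misaligning inputs.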

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

Test continuously and monitor production: push left
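
Two small checks that could sit on either side of a deployment, offered as a sketch rather than the approach used in the talk: a smoke test that compares predictions against known examples before release, and a crude drift flag that compares live inputs with a reference sample afterwards. The names, the tolerance and the three-standard-deviation threshold are all assumptions.

```python
from statistics import mean, stdev


def smoke_test(predict, examples, tolerance=1e-6):
    """Return (inputs, expected, actual) for every example that fails."""
    failures = []
    for inputs, expected in examples:
        actual = predict(inputs)
        if abs(actual - expected) > tolerance:
            failures.append((inputs, expected, actual))
    return failures


def drifted(reference, live, max_shift=3.0):
    """Flag drift when the live mean moves too far from the reference mean.

    The shift is measured in reference standard deviations, so `reference`
    needs at least two values.
    """
    spread = stdev(reference)
    if spread == 0:
        return mean(live) != mean(reference)
    return abs(mean(live) - mean(reference)) / spread > max_shift
```

Running the smoke test in CI catches regressions before release, while the drift check is the kind of signal worth alerting on once the model is live.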

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

Summary
1. DataOps helps create value and improve end-to-end ML
2. Start by identifying the low-hanging fruit and defining automation success
3. Choose the right tooling and processes
4. Leverage people and processes
5. Implement wisely

Slide 37

Slide 37 text

Thank you
@ixek
http://bit.ly/OSCON-MLOps