Practical DevOps for the busy data scientist

Tania Allard

June 27, 2019

Transcript

  1. Tania Allard, PhD (@ixek), Developer Advocate @Microsoft. Practical DevOps for the busy Data Scientist. http://bit.ly/MancML-trallard
  2. About us: a bit of background never hurt anyone

  5. Top top view… Stable model/application ready to be productised; R&D: develop, iterate fast, usually local or cloud; Magic; Is it live??
  6. How I would like everything to work… It works… now send it over to production; R&D: develop, iterate fast, usually local or cloud; Push code, tag, tag data*; Worry-free deployment!; Wait and relax
  9. DevOps / DataOps / MLOps

  10. What is DevOps anyway? DevOps is the union of people, process, and products to enable continuous delivery of value into production.
  11. How does it fit for DS? Sort of DevOps applied to data-intensive applications. Requires close collaboration between engineers, data scientists, architects, data engineers and Ops.
  12. Aims to reduce the end-to-end cycle time of data analytics/science, from the origin of ideas to the creation of data artifacts.
  15. 7 steps to DS

  16. Keep everything in source control - but allow for experimentation
  18. Standardize and define your environments in code (conda, Pipfiles, Docker)
  19. Use canonical data sources - always know what data you are using (where it comes from and where it goes)
  21. Automate wisely

  22. https://xkcd.com/1205/

  24. Use pipelines for repeatability and explainability

  25. Deploy portable models

  27. Test continuously and monitor production: shift left

  29. Thank you! @ixek http://bit.ly/MancML-trallard