
Practical DevOps for the busy data scientist


Tania Allard

October 09, 2019


  1. Practical DevOps for the busy data scientist

  2. Slides

  3. What you’ll learn: 01 Why MLOps/DevOps? 02 Who is responsible? 03 Getting started 04 Getting from A to B
  4. About Me

  5. Software engineering Algorithm Data Answers @ixek

  6. Machine learning Answers Data Algorithm @ixek

  7. Machine learning Answers Data Model @ixek

  8. Machine learning Answers Data Model Answers Predictions @ixek

  9. The data cycle Magic? R&D Generation @ixek

  10. Anyone? @ixek

  11. A common scenario @ixek

  12. @ixek

  13. If you had one wish? @ixek

  14. Replacing the magic ML Ops and robust pipelines R&D Generation

  15. How skills are perceived @ixek

  16. Better @ixek

  17. How they really are @ixek

  18. "DevOps is the union of people, process, and products to enable continuous delivery of value into production" - Donovan Brown. What is DevOps @ixek
  19. MLOps aims to reduce the end-to-end cycle time and friction of data analytics/science, from the origin of ideas to the creation of data artifacts. What is DevOps @ixek
  20. But I do not work in a big company with many ML engineers @ixek
  21. Build your own MLOps Platform @ixek

  24. Practical steps @ixek

  25. We have the notebooks in source control @ixek

  26. Your saviour: Source control @ixek
    • Code and comments only (not Jupyter output)
    • Plus every part of the pipeline
    • And infrastructure and dependencies
    • And maybe a subset of data
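
One way to keep "code and comments only (not Jupyter output)" in source control is to strip outputs before committing. The following is a minimal sketch, assuming notebooks stored in the nbformat v4 JSON layout; the function name `strip_outputs` and the in-memory example notebook are hypothetical and do not come from the slides. Dedicated tools such as nbstripout cover the same job in practice.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat v4 notebook with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results
            cell["execution_count"] = None  # drop the run counter as well
    return cleaned


# Hypothetical notebook content, used for illustration only.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
}

stripped = strip_outputs(example)
```

The source and markdown cells survive untouched, so diffs in version control show changes to code and comments rather than to rendered output.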
  27. Your saviour: Everything should be in source control!! Except your training data, which should be a known, shared data source. Do not touch the raw data! Not even with a stick. @ixek
  28. Deterministic environments @ixek

  29. Whatever that environment is @ixek

  30. Your laptop is not a production environment… so ensure reproducibility
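
A deterministic environment usually starts with pinned dependencies. The sketch below assumes pins written as `name==version` lines and compares them with what is installed; the helper names (`parse_pins`, `installed_version`, `find_mismatches`) are hypothetical, and the slides do not prescribe this particular check.

```python
from importlib import metadata


def parse_pins(lines):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, version = line.partition("==")
        if sep:  # ignore lines that are not exact pins
            pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (pinned, found)} for every package that differs."""
    mismatches = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches
```

Running `find_mismatches(parse_pins(open("requirements.txt")))` at the start of a job (the file name is an assumption) gives an early signal that the environment has drifted from what was tested.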

  31. @ixek

  32. Use pipelines for repeatability and reproducibility @ixek
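
The idea of a pipeline for repeatability can be sketched in a few lines: named steps run in a fixed order, each taking the previous step's output. This is an illustration only; `run_pipeline` and the toy steps are hypothetical, and real orchestrators add scheduling, caching, and logging on top of the same shape.

```python
from typing import Any, Callable, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]


def run_pipeline(steps: List[Step], data: Any) -> Tuple[Any, List[str]]:
    """Run steps in order, returning the final result and the executed step names."""
    executed = []
    for name, func in steps:
        data = func(data)
        executed.append(name)
    return data, executed


# Toy steps (hypothetical): each returns a new value and never mutates its input.
def clean(rows):
    return [r for r in rows if r is not None]


def scale(rows):
    top = max(rows)
    return [r / top for r in rows]


def summarise(rows):
    return {"n": len(rows), "mean": sum(rows) / len(rows)}


steps = [("clean", clean), ("scale", scale), ("summarise", summarise)]
raw = [2, None, 4, 8]  # stands in for the raw data, which is left untouched
result, executed = run_pipeline(steps, raw)
```

Because every step is a pure function of its input, running the same steps on the same raw data gives the same result each time.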


  34. @ixek

  35. @ixek

  36. Automate wisely @ixek

  37. Adopt automation @ixek
    • Orchestration for Continuous Integration and Continuous Delivery
    • Gates, tasks, and processes for quality
    • Integration with other services
    • Triggers on code and non-code events
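
A quality gate can be as small as a function that compares model metrics with minimum thresholds and reports what failed. The sketch below is a hypothetical example; the metric names, threshold values, and the function `evaluate_gate` are assumptions rather than anything shown in the slides.

```python
def evaluate_gate(metrics, thresholds):
    """Return a list of failure messages; an empty list means the gate passes."""
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        # A missing metric counts as a failure, the same as a low one.
        if value is None or value < minimum:
            failures.append(f"{name}: got {value}, need >= {minimum}")
    return failures


# Hypothetical metrics from a training run and the thresholds a release must meet.
thresholds = {"accuracy": 0.90, "recall": 0.75}
passing = evaluate_gate({"accuracy": 0.91, "recall": 0.80}, thresholds)
failing = evaluate_gate({"accuracy": 0.91, "recall": 0.70}, thresholds)
```

In a CI job, a script could print the failure messages and exit with a non-zero status whenever the list is not empty, which stops the release stage from running.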
  38. Complete pipeline @ixek

  39. Kubeflow example @ixek

  40. Build pipeline- n-us/services/devops/

  41. Code event trigger @ixek

  42. Release / deploy @ixek

  43. In brief @ixek
    • Source control (done right): Code, infrastructure, everything!
    • Deterministic environments: Ensure production readiness
    • Use pipelines: For repeatable workflows
    • Continuous integration and delivery: Detect errors early and seamless deployments
  44. Want to learn more? • • •

    vice/concept-ml-pipelines @ixek
  45. Come talk to us! @ixek