
Practical DevOps for the busy data scientist



Tania Allard

October 09, 2019



  1. Practical DevOps for the busy data scientist

  2. Slides: bit.ly/PyConDE-mlops

  3. What you’ll learn: 01 Why MLOps/DevOps? 02 Who is responsible? 03 Getting started 04 Getting from A to B

  4. About Me

  5. Software engineering Algorithm Data Answers @ixek bit.ly/PyConDE-mlops

  6. Machine learning Answers Data Algorithm

  7. Machine learning Answers Data Model

  8. Machine learning Answers Data Model Answers Predictions
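
One way to read the contrast on slides 5 to 8 is sketched below: in software engineering a hand-written algorithm plus data gives answers, while in machine learning data plus known answers gives a model, which then gives predictions. The temperature example, the function names, and the threshold "model" are illustrative assumptions, not taken from the talk.

```python
# Software engineering: a hand-written algorithm plus data gives answers.
def is_hot_rule(temperature_c: float) -> bool:
    """Hand-coded rule: the programmer chooses the threshold (hypothetical value)."""
    return temperature_c >= 25.0


# Machine learning: data plus known answers gives a model.
def fit_threshold(temperatures: list[float], labels: list[bool]) -> float:
    """Pick the candidate threshold that classifies the most examples correctly."""
    best_threshold, best_correct = temperatures[0], -1
    for candidate in sorted(temperatures):
        correct = sum((t >= candidate) == y for t, y in zip(temperatures, labels))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


def predict(threshold: float, temperature_c: float) -> bool:
    """Apply the learned model to new data to get a prediction."""
    return temperature_c >= threshold


# Toy training set: measurements (data) and labels (answers).
temperatures = [12.0, 18.0, 21.0, 26.0, 30.0, 33.0]
labels = [False, False, False, True, True, True]
learned_threshold = fit_threshold(temperatures, labels)
```

Here `predict(learned_threshold, 28.0)` plays the role of the "Predictions" box: the threshold was learned from the labelled examples instead of being written by hand.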

  9. The data cycle Magic? R&D Generation

  10. Anyone?

  11. A common scenario

  12. @ixek bit.ly/PyConDE-mlops

  13. If you had one wish?

  14. Replacing the magic: MLOps and robust pipelines R&D Generation

  15. How skills are perceived

  16. Better

  17. How they really are

  18. What is DevOps: “DevOps is the union of people, process, and products to enable continuous delivery of value into production” - Donovan Brown

  19. What is DevOps: MLOps aims to reduce the end-to-end cycle time and friction of data analytics/science from the origin of ideas to the creation of data artifacts.

  20. But I do not work in a big company with many ML engineers
  21. Build your own MLOps Platform

  22. None
  23. None
  24. Practical steps

  25. We have the notebooks in source control

  26. Your saviour: source control
    • Code and comments only (not Jupyter output)
    • Plus every part of the pipeline
    • And infrastructure and dependencies
    • And maybe a subset of data
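
As a minimal sketch of "code and comments only (not Jupyter output)", the function below clears outputs from a notebook before it is committed. It assumes the nbformat 4 JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`); the example notebook is made up for illustration. Dedicated tools such as nbstripout handle this more thoroughly.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat-4 style notebook dict with outputs cleared."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results
            cell["execution_count"] = None  # drop run counters that cause noisy diffs
    return cleaned


# Hypothetical notebook content, shaped like a saved .ipynb file.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print(1 + 1)"],
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}],
        },
    ],
}
cleaned = strip_outputs(example)
```

The source of every cell is left untouched, so the committed file still contains all code and comments.
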
  27. Your saviour: everything should be in source control! Except your training data, which should be a known, shared data source. Do not touch the raw data! Not even with a stick.
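
One possible way to enforce "do not touch the raw data" is to record a checksum of each raw file and verify it before every run. The sketch below uses only the standard library; the file name `raw.csv` and its contents are hypothetical, and the talk does not prescribe this particular check.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def raw_data_unchanged(path: Path, expected_digest: str) -> bool:
    """Compare the current file hash with the digest recorded earlier."""
    return sha256_of(path) == expected_digest


with tempfile.TemporaryDirectory() as tmp:
    raw_file = Path(tmp) / "raw.csv"  # hypothetical raw data file
    raw_file.write_bytes(b"id,value\n1,3.5\n")
    recorded = sha256_of(raw_file)    # the digest is what goes into source control
    untouched = raw_data_unchanged(raw_file, recorded)

    raw_file.write_bytes(b"id,value\n1,9.9\n")  # simulate an accidental edit
    tampered_detected = not raw_data_unchanged(raw_file, recorded)
```

Only the small digest needs to live in the repository, while the data itself stays in the shared data source.
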
  28. Deterministic environments

  29. Whatever that environment is

  30. Your laptop is not a production environment… so ensure reproducibility
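
A small sketch of what pinning an environment can look like: the snippet records the interpreter version and the exact version of every installed distribution using the standard library's `importlib.metadata`. The snapshot layout is an assumption made for illustration; in practice a lock file from your package manager or a container image serves the same purpose.

```python
import sys
from importlib.metadata import distributions


def pinned_requirements() -> list[str]:
    """List installed distributions as 'name==version' pins, sorted by name."""
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


def environment_snapshot() -> dict:
    """Capture the interpreter version together with the package pins."""
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "packages": pinned_requirements(),
    }


snapshot = environment_snapshot()
```

Committing such a snapshot next to the code makes it possible to compare the laptop environment with the one used in production.
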
  31. @ixek bit.ly/PyConDE-mlops

  32. Use pipelines for repeatability and reproducibility
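
To make the idea concrete, here is a minimal sketch of a pipeline as an ordered list of steps that pass a state dictionary along, with all randomness driven by an explicit seed. The step names and the state layout are hypothetical; a managed service or orchestrator would replace `run_pipeline` in a real project.

```python
import random
from typing import Callable

Step = Callable[[dict], dict]


def generate_data(state: dict) -> dict:
    """Draw synthetic samples from a generator seeded by the pipeline state."""
    rng = random.Random(state["seed"])
    return {**state, "data": [rng.gauss(0.0, 1.0) for _ in range(100)]}


def compute_mean(state: dict) -> dict:
    """Summarise the data produced by the previous step."""
    return {**state, "mean": sum(state["data"]) / len(state["data"])}


def run_pipeline(steps: list[Step], state: dict) -> dict:
    """Run every step in order, feeding each one the previous step's output."""
    for step in steps:
        state = step(state)
    return state


PIPELINE = [generate_data, compute_mean]
first_run = run_pipeline(PIPELINE, {"seed": 42})
second_run = run_pipeline(PIPELINE, {"seed": 42})
```

Because the steps and the seed are explicit, running the same pipeline twice yields identical results, which is the repeatability the slide asks for.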

  33. ml.azure.com

  34. @ixek bit.ly/PyConDE-mlops

  35. @ixek bit.ly/PyConDE-mlops

  36. Automate wisely

  37. Adopt automation
    • Orchestration for Continuous Integration and Continuous Delivery
    • Gates, tasks, and processes for quality
    • Integration with other services
    • Triggers on code and non-code events
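
A quality gate from the list above could be as simple as the following sketch, which compares a model's metrics against minimum values and reports every failure. The metric names and thresholds are hypothetical; a CI job would typically fail the build when the returned list is not empty.

```python
def quality_gate(metrics: dict[str, float], minimums: dict[str, float]) -> list[str]:
    """Return a human-readable message for every metric that fails its minimum."""
    failures = []
    for name, minimum in minimums.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: metric missing")
        elif value < minimum:
            failures.append(f"{name}: {value:.3f} is below required {minimum:.3f}")
    return failures


# Hypothetical evaluation results and release criteria.
candidate_metrics = {"accuracy": 0.91, "recall": 0.70}
release_minimums = {"accuracy": 0.90, "recall": 0.80}
gate_failures = quality_gate(candidate_metrics, release_minimums)
gate_passed = not gate_failures
```

Returning all failures at once, instead of stopping at the first, gives a more useful report in the pipeline log.
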
  38. Complete pipeline

  39. Kubeflow example: https://www.kubeflow.org/docs/azure/azureendtoend/

  40. Build pipeline: https://azure.microsoft.com/en-us/services/devops/

  41. Code event trigger
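
A code event trigger can be thought of as a rule that maps a set of changed files to a decision about running the pipeline. The sketch below is an illustration only: the path patterns are hypothetical, and hosted CI services express the same idea through their own trigger configuration.

```python
from fnmatch import fnmatch

# Hypothetical paths whose changes should start the pipeline.
TRIGGER_PATTERNS = ["src/*", "pipelines/*", "environment.yml"]


def should_trigger(changed_files: list[str], patterns: list[str] = TRIGGER_PATTERNS) -> bool:
    """Return True when any changed file matches any trigger pattern."""
    return any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


code_change = should_trigger(["src/models/train.py"])
docs_change = should_trigger(["README.md", "docs/intro.md"])
```

Note that `fnmatch` lets `*` match path separators, so `src/*` also covers files in nested folders.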

  42. Release / deploy

  43. In brief
    • Source control (done right): code, infrastructure, everything!
    • Deterministic environments: ensure production readiness
    • Use pipelines: for repeatable workflows
    • Continuous integration and delivery: detect errors early and deploy seamlessly
  44. Want to learn more?
    • ml.azure.com
    • https://azure.microsoft.com/en-us/services/devops/
    • https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines
  45. Come talk to us! @ixek trallard@bitsandchips.me