Getting Machine Learning Models Ready For Production Using Python by Adit Mehta

Pycon ZA
October 11, 2019

As a scientist, it’s incredibly satisfying to be given the freedom to experiment, apply new research and prototype rapidly. That satisfaction is easy to sustain in a lab environment but can diminish quickly in a corporate one, because science in a business setting is driven by commercial value: if it doesn’t add business value for employees or customers, there’s no place for it! Business value, however, goes beyond a nifty experiment that merely shows potential. In the context of Machine Learning, the only models that deliver business value are models in Production!

In this talk, I will take the audience through the steps involved in moving from experiments in Jupyter Notebooks to automated model training, serving and deployment in Production, using tools from the Python ecosystem such as NumPy, pandas and scikit-learn, together with Docker.

The intended audience includes Data Scientists, Software Engineers and any other data practitioners who have been through, or want to go through, the journey of gaining real-time value from Machine Learning models in Production. The talk shares lessons learnt in moving from Jupyter experiments to production-ready Python code, and introduces Python tools, frameworks and libraries that can accelerate that transition.
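To make the notebook-to-script step concrete, here is a minimal sketch of what an automated `train.py` (a file name taken from the slides) might look like. Everything else is an assumption for illustration: the `MeanBaseline` stand-in model, the `--data`, `--model-out` and `--target` flags, and the CSV layout are hypothetical, and a real project would more likely fit a scikit-learn estimator. Only the standard library is used so the sketch stays self-contained.

```python
import argparse
import csv
import pickle
import statistics
from pathlib import Path


class MeanBaseline:
    """Hypothetical stand-in model: always predicts the training-target mean."""

    def fit(self, rows, target):
        self.mean_ = statistics.fmean(float(row[target]) for row in rows)
        return self

    def predict(self, rows):
        return [self.mean_ for _ in rows]


def load_rows(csv_path):
    """Read a CSV file into a list of dicts, one per row."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))


def train(data_path, model_path, target="y"):
    """Fit the model on the CSV at data_path and pickle it to model_path."""
    rows = load_rows(data_path)
    if not rows:
        raise ValueError(f"no training rows found in {data_path}")
    model = MeanBaseline().fit(rows, target)
    Path(model_path).parent.mkdir(parents=True, exist_ok=True)
    with open(model_path, "wb") as handle:
        pickle.dump(model, handle)
    return model


def main(argv=None):
    """Command-line entry point, so a scheduler can run training unattended."""
    parser = argparse.ArgumentParser(description="Train and persist a model.")
    parser.add_argument("--data", required=True, help="path to training CSV")
    parser.add_argument("--model-out", required=True, help="where to save the model")
    parser.add_argument("--target", default="y", help="name of the target column")
    args = parser.parse_args(argv)
    train(args.data, args.model_out, args.target)
```

Saved as a script, this would typically end with a call to `main()` under an `if __name__ == "__main__":` guard. The point of the structure is that the logic that lived in notebook cells becomes importable functions with explicit inputs and outputs, which is what makes it possible to schedule and test.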


Transcript

  1. GETTING MACHINE LEARNING MODELS READY FOR PRODUCTION: FROM JUPYTER NOTEBOOKS TO PRODUCTION. ADIT MEHTA, DATA SCIENTIST: ABSA. PyConZA 2019, 11-10-2019
  2. TOOLS AND TECHNOLOGIES; A WAY TO APPROACH SOLVING DATA SCIENCE PROJECTS; LESSONS LEARNED; SOMETHING FOR EVERYONE
  3. WHAT TYPICALLY HAPPENS …

  4. None
  5. None
  6. None
  7. None
  8. WOULDN’T THIS BE LOVELY?

        def convert_notebook_to_production_ready_code(jupyter_notebook):
            prod_ready_code = magic(jupyter_notebook)
            return prod_ready_code

        def magic(notebook):
            """ Magic Happens Here ... """
            return prod_code
  9. THE 8 WEEK AGILE PRODUCTION CYCLE PLAN: FROM PROBLEM TO PROD IN 8 WEEKS
  10. SPRINT 1: RESEARCH/LITERATURE REVIEW

  11. SPRINT 2: EXPLORATORY DATA ANALYSIS (EDA)
    SPRINT 3-4: FEATURE ENGINEERING + MODEL SELECTION
  12. SPRINT 5: MODEL DEVELOPMENT !!! THE PIVOT !!!
  13. SPRINT 5: MODEL DEVELOPMENT !!! THE PIVOT !!!
    - Automation of train.py
    - Managing Environments and Package Dependencies/versions
    - Making use of Docker on your local
    - Automation of predict.py
    - Strict version control – code and data!
    - Git + DVC
    - Airflow
  14. SPRINT 6: MODEL OPTIMISATION + REFACTORING
    - Standard Model Serving Template
    - Standardised Feature Creation Methods
    - Unit Testing Template
    - Standard predict_wrapper() template for making predictions. This is what gets called in production!
    - Don’t rush to C++
  15. SPRINT 7-8: END-TO-END TESTING AND DEPLOYING TO PRODUCTION
    YOUR WORK (AS A DATA SCIENTIST) IS NOW DONE!
  16. WOULDN’T THIS BE LOVELY?

        def convert_notebook_to_production_ready_code(jupyter_notebook):
            prod_ready_code = magic(jupyter_notebook)
            return prod_ready_code

        def magic(notebook):
            """ Magic Happens Here ... """
            return prod_code
  17. THE MAGIC …

        class Magic:
            def __init__(self, problem):
                self.problem = problem

            def rightTooling(self, frameworks, libraries, technologies):
                return doing_the_right_things

            def discipline(self, principles):
                return avoid_technical_debt

            def hardWork(self, commited_team_members):
                return efficiency
  18. GETTING MACHINE LEARNING MODELS READY FOR PRODUCTION: FROM JUPYTER NOTEBOOKS TO PRODUCTION. ADIT MEHTA, DATA SCIENTIST: ABSA
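
The Sprint 6 slide names three templates: standardised feature creation, unit testing, and a `predict_wrapper()` that is what gets called in production. The slides do not show their contents, so the following is only a rough sketch of how the three might fit together. The field names (`amount`, `n_items`), the `ThresholdModel` stand-in, the derived features and the response format are all hypothetical; only the name `predict_wrapper` comes from the talk.

```python
import math
import unittest

REQUIRED_FIELDS = ("amount", "n_items")  # hypothetical raw input fields


def create_features(record):
    """One feature-creation routine, shared by training and prediction."""
    amount = float(record["amount"])
    n_items = int(record["n_items"])
    if amount < 0 or n_items <= 0:
        raise ValueError("amount must be >= 0 and n_items must be > 0")
    # Hypothetical features: log-scaled amount and amount per item.
    return [math.log1p(amount), amount / n_items]


class ThresholdModel:
    """Stand-in for a trained model that exposes a predict() method."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, feature_rows):
        return [int(row[1] > self.threshold) for row in feature_rows]


def predict_wrapper(model, record):
    """Validate a raw record, build features, return a JSON-friendly result."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        return {"ok": False, "error": f"missing fields: {missing}"}
    try:
        features = create_features(record)
    except (TypeError, ValueError) as exc:
        return {"ok": False, "error": str(exc)}
    prediction = model.predict([features])[0]
    return {"ok": True, "prediction": prediction}


class PredictWrapperTest(unittest.TestCase):
    """Unit-test template: one happy path plus the main failure modes."""

    def setUp(self):
        self.model = ThresholdModel(threshold=20.0)

    def test_valid_record(self):
        result = predict_wrapper(self.model, {"amount": "100", "n_items": 4})
        self.assertEqual(result, {"ok": True, "prediction": 1})

    def test_missing_field(self):
        result = predict_wrapper(self.model, {"amount": "100"})
        self.assertFalse(result["ok"])

    def test_invalid_value(self):
        result = predict_wrapper(self.model, {"amount": "-1", "n_items": 4})
        self.assertFalse(result["ok"])
```

Routing every prediction through a single wrapper means input validation and feature creation happen in one place, so the serving layer never calls the model directly with features built differently from those used in training. The tests can be run with `python -m unittest` once the code lives in a module.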