Pycon ZA
October 11, 2019

Getting Machine Learning Models Ready For Production Using Python by Adit Mehta

As a scientist, it is incredibly satisfying to be given the freedom to experiment, applying new research and prototyping rapidly. That satisfaction is easy to sustain in a lab environment but can diminish quickly in a corporate one, because science in a business setting is driven by commercial value: if it doesn’t add business value for employees or customers, there’s no place for it! Business value, however, goes beyond a nifty experiment that merely shows potential value to employees or customers. In the context of Machine Learning models, the only models with business value are models in Production!

In this talk, I will take the audience through the steps involved in moving from experiments in Jupyter Notebooks to automated model training, serving and deployment for Production, using an array of tools from the Python ecosystem such as NumPy, pandas and scikit-learn, together with Docker.

The intended audience for this talk includes Data Scientists, Software Engineers and any other Data practitioners who have gone through, or want to go through, the journey of gaining real-time value from Machine Learning models in Production. The talk shares lessons learnt in moving from Jupyter experiments to writing production-ready Python code, and introduces important Python tools, frameworks and libraries that can accelerate such a transition.

Transcript

  1. GETTING MACHINE LEARNING MODELS READY FOR PRODUCTION: FROM JUPYTER NOTEBOOKS TO PRODUCTION
     ADIT MEHTA, DATA SCIENTIST: ABSA
     PyConZA 2019, 11-10-2019
  2. Agenda:
     - TOOLS AND TECHNOLOGIES
     - A WAY TO APPROACH SOLVING DATA SCIENCE PROJECTS
     - LESSONS LEARNED
     - SOMETHING FOR EVERYONE
  3. SPRINT 5: MODEL DEVELOPMENT !!! THE PIVOT !!!
     - Automation of train.py
     - Managing Environments and Package Dependencies/versions
     - Making use of Docker on your local
     - Automation of predict.py
     - Strict version control – code and data! Git + DVC
     - Airflow
  4. SPRINT 6: MODEL OPTIMISATION + REFACTORING
     - Standard Model Serving Template
     - Standardised Feature Creation Methods
     - Unit Testing Template
     - Standard predict_wrapper() template for making predictions
     - This is what gets called in production!
     - Don’t rush to C++
  5. SPRINT 7-8: END-TO-END TESTING AND DEPLOYING TO PRODUCTION
     YOUR WORK (AS A DATA SCIENTIST) IS NOW DONE!
  6. THE MAGIC …

         class Magic:
             def __init__(self, problem):
                 self.problem = problem

             def rightTooling(self, frameworks, libraries, technologies):
                 return doing_the_right_things

             def discipline(self, principles):
                 return avoid_technical_debt

             def hardWork(self, commited_team_members):
                 return efficiency
  7. GETTING MACHINE LEARNING MODELS READY FOR PRODUCTION: FROM JUPYTER NOTEBOOKS TO PRODUCTION
     ADIT MEHTA, DATA SCIENTIST: ABSA
     PyConZA 2019, 11-10-2019
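
The Sprint 5 and Sprint 6 slides mention automating train.py, versioning code and data strictly, sharing standardised feature-creation methods, and exposing a standard predict_wrapper() for production. The slides do not show the speaker's actual templates, so the following is only a minimal sketch of how those pieces might fit together, using the Python standard library alone. Everything except the name predict_wrapper is an assumption made for illustration: the feature names, the JSON artifact layout, the helper functions, and the trivial mean-of-targets baseline that stands in for a real model (for example a scikit-learn estimator).

```python
import hashlib
import json
import statistics
from pathlib import Path

# Hypothetical feature names; a real project would define its own.
FEATURES = ["age", "balance"]


def build_features(record):
    """One shared feature-creation method, used by both training and prediction."""
    return [float(record[name]) for name in FEATURES]


def train(records, targets, artifact_path):
    """Sketch of an automated train.py step: fit, then save the model with metadata."""
    # Building the features up front fails fast on malformed training rows.
    rows = [build_features(record) for record in records]
    # Fingerprint the training data so the artifact records which data produced it.
    serialised = json.dumps(records, sort_keys=True).encode("utf-8")
    artifact = {
        "mean_target": statistics.fmean(targets),  # stand-in for a fitted model
        "features": FEATURES,
        "n_rows": len(rows),
        "data_sha256": hashlib.sha256(serialised).hexdigest(),
    }
    Path(artifact_path).write_text(json.dumps(artifact))
    return artifact


def predict_wrapper(payload, artifact_path):
    """Sketch of the single entry point that production would call."""
    artifact = json.loads(Path(artifact_path).read_text())
    missing = [name for name in artifact["features"] if name not in payload]
    if missing:
        return {"ok": False, "error": "missing features: " + ", ".join(missing)}
    # The baseline ignores the feature values; a real model would consume them here.
    build_features(payload)
    return {
        "ok": True,
        "prediction": artifact["mean_target"],
        "data_sha256": artifact["data_sha256"],
    }
```

Because train() and predict_wrapper() are plain functions with explicit inputs and outputs, a unit testing template can exercise them against a temporary file, and a scheduler such as Airflow could call train() on a schedule. Storing the data hash alongside the model is one lightweight way to keep predictions traceable to the data that produced them; a tool such as DVC would manage the data files themselves.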