PyCon & PyData. Applying deployment oriented mindset for building Machine Learning models

Developing a complicated ensemble model with hundreds of features fetched from a bunch of different sources? Give me two! Showing great metrics to the stakeholders and already discussing how it will hit a home run in production? Why not! And then getting stuck for months trying to deploy the model while fighting data inconsistency and bugs? Sounds familiar? This talk provides guidelines on how to organize your model development process with the later deployment phase in mind.

Marianna Diachuk

October 09, 2019

Transcript

  1. AGENDA
    ➔ Why am I telling you this?
    ➔ What do I mean by deployment?
    ➔ Deployment problems
    ➔ Model development process
    ➔ What a deployment oriented mindset gives you
  2. ABOUT ME
    ✍ I’m a data scientist from Kyiv (Ukraine)
    ✍ I developed and deployed multiple scoring and antifraud ensemble models
    ✍ I led a small but proud team of 2 data scientists and 1 data engineer
    ✍ I’m a mother of 3 dragons ducks
  3. DEPLOYMENT PROBLEMS (SOME OF THEM)
    ➢ Model response inconsistency
    (diagram labels: research environment, development environment)
  4. DEPLOYMENT PROBLEMS (SOME OF THEM)
    ➢ Model response inconsistency
    ➢ Feature calculations that are impossible to implement
    ➢ Feature inconsistency
  5. DEPLOYMENT PROBLEMS (SOME OF THEM)
    ➢ Model response inconsistency
    ➢ Feature calculations that are impossible to implement
    ➢ Feature inconsistency
    ➢ Model is not scalable
    and so on and so on…
  7. BUSINESS UNDERSTANDING STAGE
    How fast should our model respond?
    Are there any limitations for deployment?
    Can we fetch the data from DB quickly?
    Can developers help us with deployment?
    Should we worry about features calculation time?
  8. BUSINESS UNDERSTANDING STAGE
    Pay attention to:
    ! Model response time
    ! Feature calculation time
    ! Database response
    ! Human resources availability
  9. DATA UNDERSTANDING STAGE
    You can:
    ! Limit data sources
    ! Work closely with colleagues
  10. WORK CLOSELY WITH YOUR COLLEAGUES.
    "We removed one field from the application form."
    "Ouch… It was one of the top features :("
  11. REFACTOR YOUR CODE
    1. Write your feature calculation
    2. Test the feature
    3. Improve your code readability
  12. REFACTOR YOUR CODE
    1. Write your feature calculation
    2. Test the feature
    3. Improve your code readability
    4. Improve your code efficiency
  14. WHAT DEPLOYMENT ORIENTED MINDSET GIVES YOU.
    ✍ Fewer risks in the late phases -> fewer postponed deadlines
  15. WHAT DEPLOYMENT ORIENTED MINDSET GIVES YOU.
    ✍ Easier to estimate tasks
    ✍ Fewer risks in the late phases -> fewer postponed deadlines
  16. WHAT DEPLOYMENT ORIENTED MINDSET GIVES YOU.
    ✍ Easier to estimate tasks
    ✍ Fewer risks in the late phases -> fewer postponed deadlines
    ✍ Better understanding of how responsibilities are distributed
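
The four REFACTOR YOUR CODE steps from the transcript can be illustrated with a small sketch. Nothing here comes from the slides themselves: the feature (how many earlier applications used the same phone number), the function names, and the sample data are all hypothetical, chosen only because the talk mentions antifraud models. The idea is to write a first version, pin it down with a test, and only then make it more readable and efficient while the same test keeps passing.

```python
import time
from collections import defaultdict


def prior_applications_naive(phones):
    """Step 1: first working version of the (hypothetical) feature.

    For each application, count earlier applications with the same phone.
    Quadratic in the number of applications.
    """
    result = []
    for i, phone in enumerate(phones):
        count = 0
        for earlier in phones[:i]:
            if earlier == phone:
                count += 1
        result.append(count)
    return result


def prior_applications(phones):
    """Steps 3-4: same output, one pass with a running counter per phone."""
    seen = defaultdict(int)
    result = []
    for phone in phones:
        result.append(seen[phone])  # applications seen so far for this phone
        seen[phone] += 1
    return result


def check_feature(func):
    """Step 2: test the feature against hand-computed values."""
    assert func([]) == []
    assert func(["a", "b", "a", "a", "b"]) == [0, 0, 1, 2, 1]


# The same test guards both versions, so the refactoring cannot
# silently change the feature values.
check_feature(prior_applications_naive)
check_feature(prior_applications)

# Rough timing on made-up data, to see whether calculation time matters.
phones = [f"phone-{i % 50}" for i in range(2000)]

start = time.perf_counter()
expected = prior_applications_naive(phones)
naive_seconds = time.perf_counter() - start

start = time.perf_counter()
actual = prior_applications(phones)
fast_seconds = time.perf_counter() - start

assert actual == expected
print(f"naive: {naive_seconds:.4f}s, single pass: {fast_seconds:.4f}s")
```

Keeping the slow version around as a reference makes it easy to confirm that the research code and the code that will be deployed return the same feature values, which relates to the feature inconsistency problem listed earlier in the transcript.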