Deploying Models to production with Azure ML | Scottish Summit

Deploying models is an important part of building end-to-end ML applications. I first show how models can be registered with Azure ML so that they are accessible and can be loaded for deployment. I then show how to build configurations for deploying those models. Azure ML makes it easy to deploy models for the low-latency, real-time inference that many applications require, so I focus mainly on this and also show how to consume the deployed models. I further show how to build batch inference pipelines. If time permits, I will also show demos of all of this.

Rishit Dagli

May 11, 2021

Transcript

  1. #ScottishSummit2021 Rishit Dagli | Deploying Models to Production with Azure ML | Methven, Sat 17:30 @rishit_dagli Rishit-dagli www.rishit.tech

  2. Our Sponsors

  3. “Most models don’t get deployed.”

  4. 90% of models don’t get deployed

  5. Source: Laurence Moroney

  6. TEDx, TED-Ed Speaker | High School | Rishit Dagli @rishit_dagli Rishit-dagli www.rishit.tech

  7. $whoami • High School Student • TEDx and TED-Ed Speaker • ♡ Hackathons and competitions • ♡ Research • My coordinates: www.rishit.tech

  8. Acknowledgements • Henk Boelman (Microsoft) • Dawood Iddris (AI MVP)

  9. Ideal Audience • Devs who have worked on creating machine learning models • Devs looking for ways to put their models into production

  10. Why care about ML deployments? Source: memegenerator.net

  11. None
  12. What things to take care of? • Package the model

  13. What things to take care of? • Package the model • Post the model on a server

  14. What things to take care of? • Package the model • Post the model on a server • Maintain the server

  15. What things to take care of? • Package the model • Post the model on a server • Maintain the server o Auto-scale

  16. What things to take care of? • Package the model • Post the model on a server • Maintain the server o Auto-scale

  17. What things to take care of? • Package the model • Post the model on a server • Maintain the server o Auto-scale o Global Availability

  18. What things to take care of? • Package the model • Post the model on a server • Maintain the server o Auto-scale o Global Availability o Latency

  19. What things to take care of? • Package the model • Post the model on a server • Maintain the server • API

  20. What things to take care of? • Package the model • Post the model on a server • Maintain the server • API • Model Versioning

  21. What things to take care of? • Package the model • Post the model on a server • Maintain the server • API • Model Versioning • Batch Predictions

  22. Simple Deployments: Why are they inefficient?

  23. None
  24. Simple Deployments: Why are they inefficient? • No consistent API • No model versioning • No mini-batching • Inefficient for large models (Source: Hannes Hapke)

  25. Deploying a Model: A Walkthrough

  26. What do we need? • Register Your Model • Load the Model • Perform Inference • Deploy the model

  27. What do we need? • Register Your Model • Load the Model • Perform Inference • Do it at Scale

  28. Register a Model

  29. Register a Model TensorFlow 2 Saved Model

  30. Register a Model .onnx .pkl .pt

  31. Register a Model With the run object

  32. Register a Model With the run object .onnx .pkl .pt
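
The registration slides are shown as screenshots in the deck; a minimal sketch of the same flow with the Azure ML SDK v1, where the workspace config, model name, and file path are placeholder assumptions:

```python
from azureml.core import Workspace
from azureml.core.model import Model

# Connect to the workspace (assumes a config.json downloaded from the portal)
ws = Workspace.from_config()

# Register a serialized model artifact (.pkl, .onnx, .pt, or a TF 2 SavedModel dir)
model = Model.register(
    workspace=ws,
    model_name="my-model",           # hypothetical name
    model_path="outputs/model.pkl",  # local path to the trained artifact
)
```

Registering from within a training run works the same way via the run object, e.g. `run.register_model(model_name="my-model", model_path="outputs/model.pkl")`.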

  33. Creating an Inference Service • Load the Model

  34. Creating an Inference Service • Load the Model • Inference from the Model

  35. Creating an Inference Service • Load the Model • Inference from the Model • Environment

  36. Creating an Inference Service

  37. Creating an Inference Service • Load the registered model • Really do the inference

  38. Load a model

  39. Load a model

  40. Let’s inference from the model

  41. Let’s inference from the model

  42. Let’s inference from the model

  43. Let’s inference from the model

  44. And that’s it
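
The load/inference slides are code screenshots; the pattern they walk through is Azure ML's entry ("scoring") script, with `init()` loading the model once per container and `run()` handling each request. A minimal sketch, assuming the hypothetical scikit-learn model registered above:

```python
# score.py
import json

import joblib
import numpy as np
from azureml.core.model import Model

model = None

def init():
    # Called once when the container starts: load the registered model
    global model
    model_path = Model.get_model_path("my-model")  # hypothetical model name
    model = joblib.load(model_path)

def run(raw_data):
    # Called per request: really do the inference
    data = np.array(json.loads(raw_data)["data"])
    predictions = model.predict(data)
    return predictions.tolist()  # serialized to JSON for the caller
```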

  45. Set up an environment: Customizable • Can use a Docker image directly • Can manage the dependencies yourself too • Can specify a custom interpreter • Customizable Spark settings

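A sketch of such an environment with SDK v1; the environment name and package list are assumptions:

```python
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

env = Environment(name="inference-env")  # hypothetical name
env.python.conda_dependencies = CondaDependencies.create(
    pip_packages=["scikit-learn", "joblib", "azureml-defaults"]
)

# Or bring your own Docker image instead of letting Azure ML build one:
# env.docker.base_image = "<your-registry>/<your-image>:<tag>"
```
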
  46. Let’s deploy it!

  47. Let’s deploy it!

  48. Let’s deploy it!

  49. Let’s deploy it!

  50. Deployment Configuration
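
A sketch of an AKS deployment configuration; the resource sizes here are placeholder values:

```python
from azureml.core.webservice import AksWebservice

# Resource sizing and autoscaling for the scoring containers
deployment_config = AksWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
    autoscale_enabled=True,
)
```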

  51. And Deploy to AKS
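
Putting it together, reusing the names from the earlier sketches (the cluster and service names are hypothetical):

```python
from azureml.core.compute import AksCompute
from azureml.core.model import InferenceConfig, Model

# Entry script plus environment define how the model is served
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# An AKS cluster already attached to the workspace
aks_target = AksCompute(ws, "my-aks-cluster")

service = Model.deploy(
    workspace=ws,
    name="my-service",
    models=[model],
    inference_config=inference_config,
    deployment_config=deployment_config,
    deployment_target=aks_target,
)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)
```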

  52. Inference with REST • JSON response • Can specify a particular version

  53. Inference with REST
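
A minimal REST call against the deployed service; the URI and input shape are placeholders:

```python
import json

import requests

scoring_uri = "http://<your-service>/score"  # from service.scoring_uri
headers = {"Content-Type": "application/json"}
# headers["Authorization"] = "Bearer " + api_key  # if key auth is enabled

payload = json.dumps({"data": [[5.1, 3.5, 1.4, 0.2]]})  # shape depends on your model
response = requests.post(scoring_uri, data=payload, headers=headers)
print(response.json())  # JSON response from run()
```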

  54. Inference with gRPC • Better connections • Data converted to protocol buffers • Request types have a designated type • Payload converted to base64 • Use gRPC stubs

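The deck does not show which serving stack backs this slide; a common example of the pattern it describes (protocol-buffer payloads, typed requests, gRPC stubs) is TensorFlow Serving, sketched here with placeholder host, model, and input names:

```python
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

data = [[5.1, 3.5, 1.4, 0.2]]  # placeholder input

# Open a channel to the serving endpoint and create a stub
channel = grpc.insecure_channel("<host>:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Build a typed request; the payload is converted to a protocol buffer
request = predict_pb2.PredictRequest()
request.model_spec.name = "my-model"                   # hypothetical model name
request.model_spec.signature_name = "serving_default"
request.inputs["input"].CopyFrom(                      # input name depends on the signature
    tf.make_tensor_proto(data, dtype=tf.float32)
)

result = stub.Predict(request, 10.0)  # 10-second timeout
```
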
  55. Batch Inferences

  56. Batch Inferences • Use hardware efficiently • Save costs and compute resources • Take multiple requests and process them together • Super cool 😎 for large models

  57. Batch Inferences • Update the run() function • Runs on each batch of data

  58. Batch Inferences
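
A sketch of that updated entry script, assuming a FileDataset of CSV files and the hypothetical model name from earlier:

```python
# batch_score.py
import joblib
import pandas as pd
from azureml.core.model import Model

model = None

def init():
    global model
    model = joblib.load(Model.get_model_path("my-model"))  # hypothetical name

def run(mini_batch):
    # For a FileDataset input, mini_batch is a list of file paths
    results = []
    for file_path in mini_batch:
        df = pd.read_csv(file_path)
        preds = model.predict(df.values)
        results.append(f"{file_path}: {preds.tolist()}")
    return results  # one output row per input file
```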

  59. Configure the ParallelRun
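
A sketch of the configuration; the environment, compute cluster, dataset, and output objects are assumed to exist from earlier setup:

```python
from azureml.pipeline.steps import ParallelRunConfig, ParallelRunStep

parallel_run_config = ParallelRunConfig(
    source_directory=".",
    entry_script="batch_score.py",
    mini_batch_size="5",            # files per run() call for a FileDataset
    error_threshold=10,             # tolerated failures before aborting
    output_action="append_row",     # collect run() outputs into one file
    environment=env,
    compute_target=compute_target,  # an AmlCompute cluster (assumed)
    node_count=2,
)

batch_step = ParallelRunStep(
    name="batch-inference",
    parallel_run_config=parallel_run_config,
    inputs=[input_dataset.as_named_input("input_data")],  # registered dataset (assumed)
    output=output_dir,  # a PipelineData / OutputFileDatasetConfig (assumed)
)
```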

  60. Create the pipeline
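
A sketch of building and submitting it, with a hypothetical experiment name:

```python
from azureml.core import Experiment
from azureml.pipeline.core import Pipeline

pipeline = Pipeline(workspace=ws, steps=[batch_step])
run = Experiment(ws, "batch-inference-demo").submit(pipeline)
run.wait_for_completion(show_output=True)
```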

  61. Publish the pipeline
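
Publishing gives the pipeline a REST endpoint that can be triggered on demand or on a schedule; the name and description here are placeholders:

```python
published = pipeline.publish(
    name="batch-inference-pipeline",
    description="Batch scoring pipeline for the registered model",
)
print(published.endpoint)  # REST endpoint for triggering runs
```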

  62. Demos!

  63. #ScottishSummit2021 Thank You @rishit_dagli Rishit-dagli www.rishit.tech