
Deploying models to production with TensorFlow Model Server

How to serve TensorFlow models over HTTP and HTTPS. This talk walks through the main steps of putting a model into production: packaging it and making it ready for deployment, uploading it to the cloud, exposing it through an API, and, most importantly, updating the model with no downtime and efficient version numbering. We cover each of these steps and show how TensorFlow simplifies them for a developer. We also show how applications can access the model through web or cloud calls, and how to make the deployment auto-scale using GCP Cloud Functions and/or Kubernetes.


Rishit Dagli

May 30, 2020

Transcript

  1. Event link: https://www.meetup.com/GDG-Ahmedabad/events/270477738/ Rishit Dagli, 10th-grade student, past TEDx and TED-Ed speaker. Deploying models to production with TensorFlow Model Server
  2. Rishit Dagli RESEARCH RESEARCH RESEARCH RESEARCH RESEARCH

  3. Rishit Dagli RESEARCH RESEARCH RESEARCH RESEARCH RESEARCH

  4. Event link: https://www.meetup.com/GDG-Ahmedabad/events/270477738/ Rishit Dagli, 10th-grade student, past TEDx and TED-Ed speaker. Deploying models to production with TensorFlow Model Server
  5. Ideal Audience • Devs who have worked on Deep Learning models (Keras) • Devs looking for ways to make their models production-ready
  6. Agenda 01 Motivation behind a process for deployment 02 What things to take care of? 03 What is TF Model Server? 04 What can it do? • Versioning • IaaS • CI/CD 05 Auto Scaling 06 QnA
  7. Motivation behind a process for deployment Source: memegenerator.net

  8. None
  9. What things to take care of?

  10. What things to take care of? • Package the model

  11. What things to take care of? • Package the model

    • Post the model on Cloud Hosted Server
  12. What things to take care of? • Package the model

    • Post the model on Cloud Hosted Server • Maintain the server
  13. What things to take care of? • Package the model

    • Post the model on Cloud Hosted Server • Maintain the server ◦ Auto-scale
  14. What things to take care of? • Package the model

    • Post the model on Cloud Hosted Server • Maintain the server ◦ Auto-scale
  15. What things to take care of? • Package the model

    • Post the model on Cloud Hosted Server • Maintain the server ◦ Auto-scale ◦ Global availability
  16. What things to take care of? • Package the model

    • Post the model on Cloud Hosted Server • Maintain the server ◦ Auto-scale ◦ Global availability ◦ And many more ...
  17. What things to take care of? • Package the model

    • Post the model on Cloud Hosted Server • Maintain the server ◦ Auto-scale ◦ Global availability ◦ And many more ... • API
  18. What things to take care of? • Package the model

    • Post the model on Cloud Hosted Server • Maintain the server ◦ Auto-scale ◦ Global availability ◦ And many more ... • API • Model Versioning
  19. What is TF Model Server?

  20. Serving

  21. Serving

  22. Credits: @lmoroney

  23. Credits: @lmoroney

  24. Credits: @lmoroney

  25. What can it do?

  26. None
  27. None
  28. None
  29. None
  30. None
  31. None
  32. None
  33. Serving

  34. Installing https://www.tensorflow.org/tfx/serving/setup

  35. The Process

  36. Converting the model tf.saved_model.simple_save(keras.backend.get_session(), directory_path, inputs={'input_image': model.input}, outputs={i.name: i for i in model.outputs})
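
The slide uses the TF 1.x tf.saved_model.simple_save API. As a rough, minimal sketch of the same export step in TF 2.x (the model architecture, the /tmp/test path, and the version number below are illustrative assumptions, not from the deck), you can save a Keras model into a numbered version directory that TF Model Server can pick up:

    import tensorflow as tf

    # Illustrative model; replace with your own trained Keras model.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(10, activation="relu", input_shape=(3,)),
        tf.keras.layers.Dense(1),
    ])

    # TF Model Server expects numbered version subdirectories under the
    # model base path (e.g. /tmp/test/1, /tmp/test/2, ...); by default it
    # serves the highest version it finds, which is what enables
    # zero-downtime model updates.
    MODEL_DIR = "/tmp/test"   # assumed base path
    version = 1
    export_path = f"{MODEL_DIR}/{version}"

    # Writes a SavedModel (saved_model.pb + variables/) at export_path.
    tf.saved_model.save(model, export_path)
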
  37. None
  38. Starting the Model Server os.environ["MODEL_DIR"] = MODEL_DIR

  39. Starting the model server os.environ["MODEL_DIR"] = MODEL_DIR %%bash --bg nohup tensorflow_model_server \ --rest_api_port=8501 \ --model_name=test \ --model_base_path="${MODEL_DIR}" >server.log 2>&1

  40. Starting the model server os.environ["MODEL_DIR"] = MODEL_DIR %%bash --bg nohup tensorflow_model_server \ --rest_api_port=8501 \ --model_name=test \ --model_base_path="${MODEL_DIR}" >server.log 2>&1

  41. Starting the model server os.environ["MODEL_DIR"] = MODEL_DIR %%bash --bg nohup tensorflow_model_server \ --rest_api_port=8501 \ --model_name=test \ --model_base_path="${MODEL_DIR}" >server.log 2>&1

  42. Starting the model server os.environ["MODEL_DIR"] = MODEL_DIR %%bash --bg nohup tensorflow_model_server \ --rest_api_port=8501 \ --model_name=test \ --model_base_path="${MODEL_DIR}" >server.log 2>&1

  43. Starting the model server os.environ["MODEL_DIR"] = MODEL_DIR %%bash --bg nohup tensorflow_model_server \ --rest_api_port=8501 \ --model_name=test \ --model_base_path="${MODEL_DIR}" >server.log 2>&1
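
Outside a notebook (where the %%bash --bg magic is not available), the same server can be launched from Python. A minimal sketch, assuming tensorflow_model_server is installed and on the PATH and that the model was exported under /tmp/test as in the earlier sketch:

    import os
    import subprocess

    MODEL_DIR = "/tmp/test"  # assumed export directory containing version subfolders
    os.environ["MODEL_DIR"] = MODEL_DIR

    # Start the server in the background and capture its output in
    # server.log, mirroring the nohup invocation on the slides.
    log = open("server.log", "w")
    server = subprocess.Popen(
        [
            "tensorflow_model_server",
            "--rest_api_port=8501",
            "--model_name=test",
            f"--model_base_path={MODEL_DIR}",
        ],
        stdout=log,
        stderr=subprocess.STDOUT,
    )
    print("Model server started with PID", server.pid)
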
  44. Doing Inference!

  45. Keep In Mind • Send data not as plain lists but as lists of lists
  46. Data as lists of lists xs = np.array([[case_1], [case_2] ...

    [case_n]])
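
To make the batching rule concrete, here is a small illustrative sketch (the values and shapes are made up, not from the deck): every example, even a single one, has to be wrapped in an outer batch dimension so that the payload is a list of instances.

    import numpy as np

    # Two examples, each with three features; the outer dimension is the batch.
    xs = np.array([[0.1, 0.2, 0.3],
                   [0.4, 0.5, 0.6]])

    print(xs.tolist())
    # [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  -> goes into the "instances" field

    # A single example still needs the outer list: [[0.1, 0.2, 0.3]], not [0.1, 0.2, 0.3].
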
  47. Making calls xs = np.array([[case_1], [case_2] ... [case_n]]) data =

    json.dumps({"signature_name": " ", "instances": xs.tolist()})
  48. Doing Inference xs = np.array([[case_1], [case_2] ... [case_n]]) data =

    json.dumps({"signature_name": " ", "instances": xs.tolist()}) json_response = requests.post( 'http://localhost:8501/v1/models/test:predict', data = data, headers = headers)
  49. Doing Inference xs = np.array([[case_1], [case_2] ... [case_n]]) data =

    json.dumps({"signature_name": " ", "instances": xs.tolist()}) json_response = requests.post( 'http://localhost:8501/v1/models/test:predict', data = data, headers = headers)
  50. Doing Inference xs = np.array([[case_1], [case_2] ... [case_n]]) data =

    json.dumps({"signature_name": " ", "instances": xs.tolist()}) json_response = requests.post( 'http://localhost:8501/v1/models/test:predict', data = data, headers = headers)
  51. Doing Inference xs = np.array([[case_1], [case_2] ... [case_n]]) data =

    json.dumps({"signature_name": " ", "instances": xs.tolist()}) json_response = requests.post( 'http://localhost:8501/v1/models/test:predict', data = data, headers = headers)
  52. • High availability

  53. • High availability • No downtime

  54. • High availability • No downtime • Focus on real

    code
  55. • High availability • No downtime • Focus on real

    code • Build better apps
  56. Serve the model on Cloud

  57. Why Care?

  58. Why Care?

  59. Creating a Kubernetes cluster gcloud container clusters create resnet-serving-cluster --num-nodes 5
  60. Pushing the docker image docker tag $USER/resnet_serving gcr.io/tensorflow-serving/resnet docker push gcr.io/tensorflow-serving/resnet
  61. Creating the deployment kubectl create -f [yaml]

  62. • Use the external IP Inference
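
Once the service is up, inference works exactly as before, just against the cluster's external IP instead of localhost. A rough sketch, where EXTERNAL_IP is a placeholder for whatever kubectl reports for the serving service, and the model name "resnet" is assumed from the image pushed earlier:

    import json
    import requests

    EXTERNAL_IP = "1.2.3.4"  # placeholder: the external IP of the serving service
    url = f"http://{EXTERNAL_IP}:8501/v1/models/resnet:predict"  # assumes REST on port 8501

    payload = json.dumps({"instances": [[0.0, 0.0, 0.0]]})  # dummy instance; shape depends on the model
    response = requests.post(url, data=payload,
                             headers={"content-type": "application/json"})
    print(response.text)
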

  63. gdg-ahm.rishit.tech Code Repo

  64. Demos

  65. Key Takeaways • Why a process for deployment • What

    it takes to deploy models • Serving a model with TF Model server • Why TF Model server? • What can TF Model server do? • Deploying on Cloud
  66. Rishit Dagli Rishit-dagli rishit_dagli rishit.tech hello@rishit.tech @rishit.dagli About Me

  67. Questions

  68. Thank You