Keynote: Scaling Deep Learning Models in Production Using Kubernetes

KubeCon + CloudNativeCon 2018 - Copenhagen

YouTube video link: https://www.youtube.com/watch?v=gcij93d9st8

Keynote: Scaling Deep Learning Models in Production Using Kubernetes - Sahil Dua, Software Developer, Booking.com

While many machine learning frameworks and libraries are available, putting models into production at large scale is still a challenge. Sahil talks about how they took on the challenge of deploying deep learning models in production: how they chose their tools and developed their internal deep learning infrastructure using Kubernetes. He covers how they do model training in Docker containers, distributed TensorFlow training in a cluster of containers, automated re-training of models and, finally, the deployment of models to serve predictions. At the scale at which they operate, nothing comes easy. He also talks about how they optimize their model prediction infrastructure for latency or throughput, depending on the use case.

About Sahil
Sahil is a software developer at Booking.com. He has been involved in leveraging container infrastructure to help Booking.com’s internal teams take advantage of deep learning techniques at scale. An open source software enthusiast, Sahil is a core contributor and community leader for DuckDuckGo's open source organization. He has also contributed to the Git project, pandas (an open source data analysis library), GitHub's Linguist project and Google's go-github project. Over the past year, Sahil has spoken at conferences including FOSDEM, EuroPython and SIGNAL 2017.

Sahil Dua

May 04, 2018

Transcript

  1. whoami (@sahildua2305)
     ➔ Software Developer @ Booking.com
     ➔ Previously - Deep Learning Infrastructure
     ➔ Open Source Contributor (Git, Pandas, Kinto, go-github, Hound, etc.)
     ➔ Tech Speaker
  2. Image Tagging
     Sea view: 6.38
     Balcony/Terrace: 4.82
     Photo of the whole room: 4.21
     Bed: 3.47
     Decorative details: 3.15
     Seating area: 2.70
  3. Image Tagging
     Using the image tag information in the right context
     Swimming pool, Breakfast Buffet, etc.
  4. Training with Kubernetes
     ➔ Base images with ML frameworks
        ◆ TensorFlow, Torch, VowpalWabbit, etc.
     ➔ Training code is installed at start time
     ➔ Data access - Hadoop (or PVs)
  5. Serving Predictions
     ➔ Stateless app with common code
     ➔ Containerized
     ➔ No model in image
     ➔ REST API for predictions
  6. Deploying a new model
     ➔ Create a new Deployment
     ➔ Create a new Service
     ➔ Wait for liveness/readiness probe
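
The last two slides (serving predictions and deploying a new model) can be illustrated with a minimal sketch. It is not the speaker's actual tooling: it only builds a Kubernetes Deployment and Service manifest as plain Python dictionaries, using standard manifest fields. The image name, the `MODEL_URI` environment variable, the `/healthz` probe path, port 8080 and the `predict-` name prefix are all hypothetical, chosen for illustration. Because the model location is passed in at start time, the same generic serving image can be reused for every model, which matches the "no model in image" point.

```python
import json


def build_deployment(model_name, image, model_uri, replicas=2, port=8080):
    """Build a Deployment manifest for a generic, stateless prediction server.

    The model is NOT baked into the image; its location is passed through
    the MODEL_URI environment variable (a hypothetical name).
    """
    labels = {"app": "predictor", "model": model_name}
    # Hypothetical health endpoint exposed by the serving app.
    probe = {"httpGet": {"path": "/healthz", "port": port}}
    container = {
        "name": "predictor",
        "image": image,
        "env": [{"name": "MODEL_URI", "value": model_uri}],
        "ports": [{"containerPort": port}],
        # Traffic is only routed to a pod once its readiness probe succeeds.
        "readinessProbe": dict(probe),
        # The container is restarted if the liveness probe keeps failing.
        "livenessProbe": dict(probe),
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"predict-{model_name}", "labels": dict(labels)},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [container]},
            },
        },
    }


def build_service(model_name, port=8080):
    """Build a Service manifest that fronts the pods of one model."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": f"predict-{model_name}"},
        "spec": {
            "selector": {"app": "predictor", "model": model_name},
            "ports": [{"port": 80, "targetPort": port}],
        },
    }


if __name__ == "__main__":
    # Hypothetical model name, image and model location.
    manifests = [
        build_deployment(
            "image-tagging",
            "registry.example.com/predictor:1.0",
            "hdfs:///models/image-tagging/v1",
        ),
        build_service("image-tagging"),
    ]
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f` accepts JSON as well as YAML, each printed manifest could be saved to a file and applied as-is, and `kubectl rollout status deployment/predict-image-tagging` would then block until the new pods pass their readiness probes.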