Serverless Machine Learning with TensorFlow.js

This talk will show how to use TensorFlow with serverless platforms to bring the benefits of serverless runtimes (elastic scalability, low pricing and no charge for idle) to the task of real-time machine learning in the cloud.

Attendees will learn why serverless platforms are great for machine learning in the cloud, understand the different approaches for deploying pre-trained models and learn how to architect scalable serverless functions when using TensorFlow.

Key issues covered will include loading TensorFlow libraries in serverless runtimes, tips on improving performance on cold and warm starts and how model scoring without a GPU affects throughput. Different methods for running TF models with serverless runtimes, including TensorFlow JS, will be compared and contrasted.

Attendees do not need any prior experience with machine learning or serverless cloud platforms. The talk is relevant to all serverless developers interested in machine learning, rather than being restricted to a single platform or vendor.

James Thomas

October 01, 2018

Transcript

  1. TWITTER SEARCH (ARCHITECTURE): Web Client, API Gateway, Auth0, Twitter API, TensorFlow.js; serverless functions: Register Search, Twitter Search, Check Search Results, Face Recognition. ~550 LOC JavaScript. @THOMASJ

  2. TWITTER SEARCH (same diagram): messages {"query": "serverless"}, {"job_id": 12345}, {"query": "serverless"}

  3. TWITTER SEARCH (same diagram): messages {"job_id": 12345}, {"url": "…"}

  4. TWITTER SEARCH (same diagram): message {"job_id": 12345}

  5. TWITTER SEARCH (ARCHITECTURE, recap of slide 1).
  6. IS THIS THE END?
     $ sls create -t openwhisk-nodejs
     $ npm install @tensorflow/tfjs-node
     $ sls deploy

  7. Not so fast…
     $ sls create -t openwhisk-nodejs
     $ npm install @tensorflow/tfjs-node
     $ sls deploy
     ERROR: DEPLOYMENT PACKAGE TOO LARGE.
  8. WORK AROUND?
     let cold_start = true
     const library = 'libtensorflow.so'
     if (cold_start) {
       const data = from_object_store(library)
       write_to_fs(library, data)
       cold_start = false
     }
     // rest of function code…
  9. CUSTOM RUNTIMES
     Dockerfile:
     FROM openwhisk/action-nodejs-v8:latest
     RUN npm install @tensorflow/tfjs-node

     $ docker build -t username/tfjs .
     $ docker push username/tfjs
  10. MODEL LOADING
      // Pre-trained image recognition model.
      const mobilenet = require('@tensorflow-models/mobilenet')
      // Load model from Google over HTTP.
      const model = await mobilenet.load()
  11. // Pre-trained image recognition model.
      const mobilenet = require('@tensorflow-models/mobilenet')
      // Load model from Google over HTTP.
      const model = await mobilenet.load()

      ReferenceError: fetch is not defined
  12. // Pre-trained image recognition model.
      const mobilenet = require('@tensorflow-models/mobilenet')
      // Make HTTP client available in runtime.
      global.fetch = require('node-fetch')
      // Load model from Google over HTTP.
      const model = await mobilenet.load()

      (no more "ReferenceError: fetch is not defined")
  13. MODEL PREDICTIONS
      const img = document.getElementById('img');
      // Classify the image.
      const predictions = await model.classify(img);
  14. const img = document.getElementById('img');
      // Classify the image.
      const predictions = await model.classify(img);

      ReferenceError: document is not defined
  15. TRANSLATE TO RGB VALUES
      pixel 1 = #f3fa0a
      pixel 2 = #ea0f16
      pixel 3 = #120f0a
      pixel 4 = #644fab
      …
  16. RGB VALUES
      const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const jpegData = jpeg.decode(buffer, true)
      console.log(jpegData)
      /* {
           width: 320,
           height: 180,
           data: { '0': 91, '1': 64, ... }  // typed array
         } */
  17. const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      // remove alpha channel from RGBA values
      const rgb = image.data.filter((el, i) => (++i % 4))
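The filter trick keeps every R, G and B byte and drops the fourth (alpha) byte: `++i % 4` is falsy exactly at the zero-based indices 3, 7, 11, … A stand-alone check on a tiny fake pixel buffer:

```javascript
// Two RGBA pixels: [R, G, B, A, R, G, B, A]
const rgba = [10, 20, 30, 255, 40, 50, 60, 255]

// (++i % 4) evaluates to 0 (falsy) for every alpha byte,
// so only the R, G and B values survive the filter.
const rgb = rgba.filter((el, i) => (++i % 4))

console.log(rgb) // [ 10, 20, 30, 40, 50, 60 ]
```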
  18. const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      const rgb = image.data.filter((el, i) => (++i % 4))
      // convert rgb values to typed array
      const values = new Int32Array(rgb)
      // shape of tensor is image dimensions
      const shape = [image.height, image.width, 3]
      // input tensor from rgb values & image dimensions
      const input = tf.tensor3d(values, shape, 'int32')
      const predictions = await model.classify(input)
  19. $ sls invoke -f classifier -p input data
      {"result": …}
      $ sls invoke -f classifier -p input data
      {"result": …}

  20. $ sls invoke -f classifier -p input data
      {"result": …}
      $ sls invoke -f classifier -p input data
      {"result": …}
      $ sls invoke -f classifier -p input data
      {"result": …}

  21. $ sls invoke -f classifier -p input data
      {"result": …}
      $ sls invoke -f classifier -p input data
      {"result": …}
      $ sls invoke -f classifier -p input data
      {"result": …}
      $ sls invoke -f classifier -p input data
      {"error": "OUT OF MEMORY"}
  22. const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      const rgb = image.data.filter((el, i) => (++i % 4))
      const values = new Int32Array(rgb)
      const shape = [image.height, image.width, 3]
      const input = tf.tensor3d(values, shape, 'int32')  // allocates native memory, NOT garbage collected
      const predictions = await model.classify(input)
  23. const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      const rgb = image.data.filter((el, i) => (++i % 4))
      const values = new Int32Array(rgb)
      const shape = [image.height, image.width, 3]
      const input = tf.tensor3d(values, shape, 'int32')
      const predictions = await model.classify(input)
      input.dispose()
  24. How can we improve this?
      ❄ COLD START ❄ (~8 seconds)
      WARM START (~5 seconds)
  25.                  ❄ COLD START ❄    WARM START
      INITIALISATION   1200 ms           0 ms
      MODEL LOADING    3200 ms           2000 ms
      IMAGE LOADING    500 ms x 2        500 ms x 2
      FACE DETECTION   700 - 900 ms x 2  700 - 900 ms x 2
      EVERYTHING ELSE  1000 ms           500 ms
      TOTAL DURATION   ~8 seconds        ~5 seconds
  26. // load weight files from filesystem
      global.fetch = async (file) => {
        return {
          json: () => JSON.parse(fs.readFileSync(file, 'utf8')),
          arrayBuffer: () => fs.readFileSync(file)
        }
      }
  27.                  ❄ COLD START ❄    WARM START
      INITIALISATION   1200 ms           0 ms
      MODEL LOADING    2700 ms           1500 ms
      IMAGE LOADING    500 ms x 2        500 ms x 2
      FACE DETECTION   700 - 900 ms x 2  700 - 900 ms x 2
      EVERYTHING ELSE  1000 ms           500 ms
      TOTAL DURATION   ~7.5 seconds      ~4.5 seconds
  28. MODEL CACHING
      const faceapi = require('face-api.js')
      let LOADED = false
      exports.load = async location => {
        if (!LOADED) {
          await faceapi.loadFaceDetectionModel(location)
          await faceapi.loadFaceRecognitionModel(location)
          await faceapi.loadFaceLandmarkModel(location)
          LOADED = true
        }
        return faceapi
      }
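The same load-once pattern works for any expensive initialisation. A stand-alone sketch with a fake loader (`loadModels` here is a stand-in for the real face-api.js calls, which are slow) makes the caching visible:

```javascript
let cachedModel = null
let loads = 0

// Stand-in for the real (slow) face-api.js model loading calls.
async function loadModels (location) {
  loads += 1
  return { location, allFaces: () => [] }
}

// Load on the first invocation only; every warm invocation in the
// same container reuses the cached instance.
async function getModels (location) {
  if (!cachedModel) {
    cachedModel = await loadModels(location)
  }
  return cachedModel
}
```

Since serverless containers keep module state between warm invocations, the cache survives for the lifetime of the container, which is what takes warm-start model loading from ~1500 ms to 0 ms in the table below.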
  29.                  ❄ COLD START ❄    WARM START
      INITIALISATION   1200 ms           0 ms
      MODEL LOADING    2700 ms           0 ms
      IMAGE LOADING    500 ms x 2        500 ms x 2
      FACE DETECTION   700 - 900 ms x 2  700 - 900 ms x 2
      EVERYTHING ELSE  1000 ms           500 ms
      TOTAL DURATION   ~7.5 seconds      ~3 seconds
  30. const faces = await models.allFaces(input, min_score)
      const face = faces[0]
      console.log(face.descriptor)
      // face descriptor
      > Float32Array (length: 128) [-0.1298077404499054, 0.08730170130729675,
        0.03973294794559479, 0.03567018359899521, -0.09620543569326401,
        0.03842385113239288, ...

      Cache values in Redis
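The descriptor cache can be sketched without a Redis dependency; a `Map` keyed by image URL stands in for the Redis client, and `computeDescriptor` is a placeholder for the `models.allFaces` call above (the descriptor values are truncated illustrative numbers, not real output):

```javascript
const cache = new Map()
let detections = 0

// Placeholder for the expensive face-detection step
// (models.allFaces in the slides).
async function computeDescriptor (imageUrl) {
  detections += 1
  // Truncated example values, shaped like the 128-float descriptor above.
  return Float32Array.from([-0.1298, 0.0873, 0.0397])
}

// Check the cache before running detection; repeated searches that
// return the same image skip face detection entirely.
async function getDescriptor (imageUrl) {
  if (!cache.has(imageUrl)) {
    cache.set(imageUrl, await computeDescriptor(imageUrl))
  }
  return cache.get(imageUrl)
}
```

An external store like Redis plays the same role across containers: unlike the in-process `Map`, cached descriptors then survive cold starts and are shared between concurrent function instances.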
  31.                  COLD START ❄      ❄ + CACHE     WARM START
      INITIALISATION   1200 ms           1200 ms       0 ms
      MODEL LOADING    2700 ms           2700 ms       0 ms
      IMAGE LOADING    500 ms x 2        500 ms        500 ms
      FACE DETECTION   700 - 900 ms x 2  700 - 900 ms  700 - 900 ms
      EVERYTHING ELSE  1000 ms           1000 ms       500 ms
      TOTAL DURATION   ~7.5 seconds      ~6 seconds    ~2.5 seconds
  32. ❄ COLD START ❄ (~8 seconds) -> (~7.5 - 6 seconds)
      WARM START (~5 seconds) -> (~2.5 seconds)
      cold start savings = ~6% - 25%
      warm start savings = ~50%