
Serverless Machine Learning with TensorFlow.js

This talk will show how to use TensorFlow with serverless platforms to bring the benefits of serverless runtimes (elastic scalability, low pricing and no charge for idle) to the task of real-time machine learning in the cloud.

Attendees will learn why serverless platforms are a great fit for machine learning in the cloud, understand the different approaches for deploying pre-trained models, and see how to architect scalable serverless functions that use TensorFlow.

Key issues covered will include loading TensorFlow libraries in serverless runtimes, tips on improving performance on cold and warm starts, and how model scoring without a GPU affects throughput. Different methods for running TF models with serverless runtimes, including TensorFlow.js, will be compared and contrasted.

Developers do not need any prior experience with machine learning or serverless cloud platforms. The talk is relevant to all serverless developers interested in machine learning, rather than being restricted to a single platform or vendor.


James Thomas

October 01, 2018

Transcript

  1. James Thomas. Developer Advocate @ IBM ☁ @THOMASJ Serverless Machine Learning with TensorFlow.js
  2. @THOMASJ

  3. @THOMASJ

  4. “CAN I WRITE SOME CODE TO DO THIS?” @THOMASJ

  5. @THOMASJ

  6. TWITTER SEARCH ARCHITECTURE: Check Search Results, Register Search, Twitter Search, Face Recognition, API Gateway, Web Client, Auth0, Twitter API, TensorFlow.js. ~550 LOC JavaScript @THOMASJ

  7. TWITTER SEARCH: Check Search Results, Register Search, Twitter Search, Face Recognition, API Gateway, Web Client, Auth0, Twitter API, TensorFlow.js. ~550 LOC JavaScript @THOMASJ {"query": "serverless"} {"job_id": 12345} {"query": "serverless"}

  8. TWITTER SEARCH: Check Search Results, Register Search, Twitter Search, Face Recognition, API Gateway, Web Client, Auth0, Twitter API, TensorFlow.js. ~550 LOC JavaScript @THOMASJ {"job_id": 12345} {"url": "…"}

  9. TWITTER SEARCH: Check Search Results, Register Search, Twitter Search, Face Recognition, API Gateway, Web Client, Auth0, Twitter API, TensorFlow.js. ~550 LOC JavaScript @THOMASJ {"job_id": 12345}

  10. TWITTER SEARCH ARCHITECTURE: Check Search Results, Register Search, Twitter Search, Face Recognition, API Gateway, Web Client, Auth0, Twitter API, TensorFlow.js. ~550 LOC JavaScript @THOMASJ
  11. @THOMASJ

  12. @THOMASJ

  13. @THOMASJ

  14. IS THIS THE END?
      $ sls create -t openwhisk-nodejs
      $ npm install @tensorflow/tfjs-node
      $ sls deploy
      @THOMASJ
  15. ✋ NOT SO FAAS-T! @THOMASJ

  16. MAKE IT WORK MAKE IT RIGHT MAKE IT FAST @THOMASJ

  17. Not so fast…
      $ sls create -t openwhisk-nodejs
      $ npm install @tensorflow/tfjs-node
      $ sls deploy
      ERROR: DEPLOYMENT PACKAGE TOO LARGE. @THOMASJ
  18. $ du -hd0 node_modules 243M node_modules @THOMASJ Not so fast…

  19. libtensorflow.so = 142M @THOMASJ Not so fast…

  20. WORK AROUND? @THOMASJ
      let cold_start = true
      const library = 'libtensorflow.so'
      if (cold_start) {
        const data = from_object_store(library)
        write_to_fs(library, data)
        cold_start = false
      }
      // rest of function code…
  21. PRO-TIP: BUILD CUSTOM TF-JS RUNTIME @THOMASJ

  22. None
  23. CUSTOM RUNTIMES @THOMASJ
      ^^ Dockerfile
      FROM openwhisk/action-nodejs-v8:latest
      RUN npm install @tensorflow/tfjs-node
      $ docker build -t username/tfjs .
      $ docker push username/tfjs
  24. SERVERLESS.YAML @THOMASJ
      service: machine-learning
      provider:
        name: openwhisk
      functions:
        classification:
          handler: lib.handler
          image: "username/tfjs"
  25. SUCCESS!
      $ sls create -t openwhisk-nodejs
      $ npm install @tensorflow/tfjs-node
      $ sls deploy
      @THOMASJ
  26. LOADING TENSORFLOW.JS MODELS: SINGLE CONFIGURATION FILE (JSON). MULTIPLE WEIGHT SHARDS (BINARY). @THOMASJ
  27. MODEL LOADING @THOMASJ
      // Pre-trained image recognition model.
      const mobilenet = require('@tensorflow-models/mobilenet')
      // Load model from Google over HTTP.
      const model = await mobilenet.load()
  28. ReferenceError: fetch is not defined @THOMASJ
      // Pre-trained image recognition model.
      const mobilenet = require('@tensorflow-models/mobilenet')
      // Load model from Google over HTTP.
      const model = await mobilenet.load()
  29. // Pre-trained image recognition model.
      const mobilenet = require('@tensorflow-models/mobilenet')
      // Make HTTP client available in runtime.
      global.fetch = require('node-fetch')
      // Load model from Google over HTTP.
      const model = await mobilenet.load()
      ReferenceError: fetch is not defined @THOMASJ
  30. MODEL PREDICTIONS @THOMASJ
      const img = document.getElementById('img');
      // Classify the image.
      const predictions = await model.classify(img);
  31. ReferenceError: document is not defined @THOMASJ
      const img = document.getElementById('img');
      // Classify the image.
      const predictions = await model.classify(img);
  32. @THOMASJ

  33. TRANSLATE TO RGB VALUES @THOMASJ
      pixel 1 = #f3fa0a
      pixel 2 = #ea0f16
      pixel 3 = #120f0a
      pixel 4 = #644fab
      …
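The translation sketched on the slide above can be done in a couple of lines. `hexToRgb` is a hypothetical helper for illustration: it unpacks a hex colour string into its red, green and blue channel values.

```javascript
// Unpack a CSS-style hex colour (e.g. "#f3fa0a") into [r, g, b] bytes.
const hexToRgb = (hex) => {
  const n = parseInt(hex.slice(1), 16)
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff]
}

console.log(hexToRgb('#f3fa0a')) // [ 243, 250, 10 ]
```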
  34. RGB values @THOMASJ
      const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const jpegData = jpeg.decode(buffer, true)
      console.log(jpegData)
      /* { width: 320, height: 180, data: { '0': 91, '1': 64, ... } } // typed array */
  35. PRO-TIP: IGNORE THE ALPHA CHANNEL @THOMASJ

  36. const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      // remove alpha channel from RGBA values
      const rgb = image.data.filter((el, i) => (++i % 4))
      @THOMASJ
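The filter trick on the slide above keeps every value whose 1-based index is not a multiple of 4, i.e. it drops every alpha byte. A minimal sketch on a toy two-pixel buffer (`stripAlpha` is a hypothetical wrapper name):

```javascript
// Drop every fourth (alpha) byte from an RGBA buffer, as on the slide.
const stripAlpha = (rgba) => rgba.filter((el, i) => (++i % 4))

// Two RGBA pixels in, two RGB pixels out.
console.log(stripAlpha([91, 64, 50, 255, 120, 130, 140, 255]))
// [ 91, 64, 50, 120, 130, 140 ]
```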
  37. const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      const rgb = image.data.filter((el, i) => (++i % 4))
      // convert rgb values to typed array
      const values = new Int32Array(rgb)
      // shape of tensor is image dimensions
      const shape = [image.height, image.width, 3]
      // input tensor from rgb values & image dimensions
      const input = tf.tensor3d(values, shape, 'int32')
      const predictions = await model.classify(input)
      @THOMASJ
  38. MAKE IT WORK MAKE IT RIGHT MAKE IT FAST @THOMASJ

  39. $ sls invoke -f classifier -p input data {"result": …} @THOMASJ

  40. $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"result": …} @THOMASJ

  41. $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"result": …} @THOMASJ

  42. $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"error": "OUT OF MEMORY"} @THOMASJ
  43. Allocates native memory, NOT garbage collected @THOMASJ
      const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      const rgb = image.data.filter((el, i) => (++i % 4))
      const values = new Int32Array(rgb)
      const shape = [image.height, image.width, 3]
      const input = tf.tensor3d(values, shape, 'int32')
      const predictions = await model.classify(input)
  44. PRO-TIP: USE DISPOSE() TO RELEASE MEMORY @THOMASJ

  45. const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      const rgb = image.data.filter((el, i) => (++i % 4))
      const values = new Int32Array(rgb)
      const shape = [image.height, image.width, 3]
      const input = tf.tensor3d(values, shape, 'int32')
      const predictions = await model.classify(input)
      input.dispose()
      @THOMASJ
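The dispose pattern above generalises to a try/finally so native memory is released even when classification throws. `withDisposal` is a hypothetical helper, not a tfjs API; anything exposing a `dispose()` method (like a `tf.Tensor`) fits.

```javascript
// Run `fn` with a resource and always release it afterwards, even on error.
const withDisposal = async (resource, fn) => {
  try {
    return await fn(resource)
  } finally {
    resource.dispose()
  }
}
```

For synchronous tensor code, TensorFlow.js also provides `tf.tidy()`, which disposes intermediate tensors automatically.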
  46. MAKE IT WORK MAKE IT RIGHT MAKE IT FAST @THOMASJ

  47. How can we improve this? ❄ COLD START ❄ (~8 seconds), WARM START (~5 seconds) @THOMASJ
  48. ❄ COLD START ❄ / WARM START
      INITIALISATION: 1200 ms / 0 ms
      MODEL LOADING: 3200 ms / 2000 ms
      IMAGE LOADING: 500 ms x 2 / 500 ms x 2
      FACE DETECTION: 700 - 900 ms x 2 / 700 - 900 ms x 2
      EVERYTHING ELSE: 1000 ms / 500 ms
      TOTAL DURATION: ~ 8 seconds / ~ 5 seconds
  49. ADDING WEIGHTS
      FROM openwhisk/action-nodejs-v8:latest
      RUN npm install @tensorflow/tfjs-node
      COPY weights weights
  50. // load weight files from filesystem
      global.fetch = async (file) => {
        return {
          json: () => JSON.parse(fs.readFileSync(file, 'utf8')),
          arrayBuffer: () => fs.readFileSync(file)
        }
      }
  51. ❄ COLD START ❄ / WARM START
      INITIALISATION: 1200 ms / 0 ms
      MODEL LOADING: 2700 ms / 1500 ms
      IMAGE LOADING: 500 ms x 2 / 500 ms x 2
      FACE DETECTION: 700 - 900 ms x 2 / 700 - 900 ms x 2
      EVERYTHING ELSE: 1000 ms / 500 ms
      TOTAL DURATION: ~ 7.5 seconds / ~ 4.5 seconds
  52. @THOMASJ

  53. MODEL CACHING @THOMASJ
      const faceapi = require('face-api.js')
      let LOADED = false
      exports.load = async location => {
        if (!LOADED) {
          await faceapi.loadFaceDetectionModel(location)
          await faceapi.loadFaceRecognitionModel(location)
          await faceapi.loadFaceLandmarkModel(location)
          LOADED = true
        }
        return faceapi
      }
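The caching pattern above can be generalised into a memoised loader. This is a sketch, not talk code: `cachedLoader` is a hypothetical helper, and caching the promise (rather than a boolean flag) also prevents duplicate loads when concurrent requests hit the same cold container.

```javascript
// Memoise an async loader at module scope so warm invocations skip it.
const cachedLoader = (loadFn) => {
  let pending = null
  return (location) => {
    if (!pending) {
      pending = loadFn(location)
    }
    return pending
  }
}
```

Usage mirrors the slide: `exports.load = cachedLoader(loadAllModels)`, where `loadAllModels` performs the three face-api.js load calls.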
  54. ❄ COLD START ❄ / WARM START
      INITIALISATION: 1200 ms / 0 ms
      MODEL LOADING: 2700 ms / 0 ms
      IMAGE LOADING: 500 ms x 2 / 500 ms x 2
      FACE DETECTION: 700 - 900 ms x 2 / 700 - 900 ms x 2
      EVERYTHING ELSE: 1000 ms / 500 ms
      TOTAL DURATION: ~ 7.5 seconds / ~ 3 seconds
  55. Cache Values in Redis @THOMASJ
      const faces = await models.allFaces(input, min_score)
      const face = faces[0]
      console.log(face.descriptor)
      // face descriptor
      > Float32Array (length: 128) [-0.1298077404499054, 0.08730170130729675, 0.03973294794559479, 0.03567018359899521, -0.09620543569326401, 0.03842385113239288, ...
  56. COLD START / ❄ + CACHE / WARM START
      INITIALISATION: 1200 ms / 1200 ms / 0 ms
      MODEL LOADING: 2700 ms / 2700 ms / 0 ms
      IMAGE LOADING: 500 ms x 2 / 500 ms / 500 ms
      FACE DETECTION: 700 - 900 ms x 2 / 700 - 900 ms / 700 - 900 ms
      EVERYTHING ELSE: 1000 ms / 1000 ms / 500 ms
      TOTAL DURATION: ~ 7.5 seconds / ~ 6 seconds / ~ 2.5 seconds
  57. ❄ COLD START ❄ (~8 seconds) → (~7.5 - 6 seconds)
      WARM START (~5 seconds) → (~2.5 seconds)
      cold start savings = ~6% - 25%
      warm start savings = ~50% @THOMASJ
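The percentages quoted above follow from (before - after) / before, which is worth checking against the timings on the earlier slides:

```javascript
// Percentage saving from a before/after duration pair, rounded to whole %.
const saving = (before, after) => Math.round(100 * (before - after) / before)

console.log(saving(8, 7.5)) // 6   (cold start, lower bound)
console.log(saving(8, 6))   // 25  (cold start, with cache)
console.log(saving(5, 2.5)) // 50  (warm start)
```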
  58. MAKE IT WORK MAKE IT RIGHT MAKE IT FAST @THOMASJ

  59. in the @THOMASJ

  60. @THOMASJ

  61. PRE-TRAINED MODELS @THOMASJ

  62. PRE-TRAINED MODELS COST @THOMASJ

  63. PRE-TRAINED MODELS COST PERFORMANCE @THOMASJ

  64. PRE-TRAINED MODELS COST PERFORMANCE EASE OF USE @THOMASJ

  65. PRE-TRAINED MODELS COST PERFORMANCE EASE OF USE TRAINING @THOMASJ
  66. @THOMASJ

  67. PRE-TRAINED MODELS COST PERFORMANCE EASE OF USE TRAINING @THOMASJ
  68. CONCLUSION: TF.JS + SERVERLESS = SCALABLE ML IN THE CLOUD @THOMASJ
  69. @THOMASJ

  70. QUESTIONS? @THOMASJ jamesthom.as jthomas openwhisk.org bluemix.net/openwhisk