
Serverless Machine Learning with TensorFlow.js

This talk will show how to use TensorFlow with serverless platforms to bring the benefits of serverless runtimes (elastic scalability, low pricing and no charge for idle) to the task of real-time machine learning in the cloud.

Attendees will learn why serverless platforms are a great fit for machine learning in the cloud, understand the different approaches for deploying pre-trained models, and see how to architect scalable serverless functions that use TensorFlow.

Key issues covered will include loading TensorFlow libraries in serverless runtimes, tips on improving performance on cold and warm starts, and how model scoring without a GPU affects throughput. Different methods for running TF models with serverless runtimes, including TensorFlow.js, will be compared and contrasted.

Developers do not need any prior experience with machine learning or serverless cloud platforms. The talk is relevant to all serverless developers interested in machine learning, rather than being restricted to a single platform or vendor.


James Thomas

October 01, 2018

Transcript

  1. James Thomas. Developer Advocate @ IBM ☁ @THOMASJ Serverless Machine Learning with TensorFlow.js
  2. @THOMASJ

  3. @THOMASJ

  4. “CAN I WRITE SOME CODE TO DO THIS?” @THOMASJ

  5. @THOMASJ

  6. TWITTER SEARCH ARCHITECTURE: Check Search Results, Register Search, Twitter Search, Face Recognition, API Gateway, Web Client, Auth0, Twitter API, TensorFlow.js. ~550 LOC JavaScript @THOMASJ

  7. TWITTER SEARCH: Check Search Results, Register Search, Twitter Search, Face Recognition, API Gateway, Web Client, Auth0, Twitter API, TensorFlow.js. ~550 LOC JavaScript @THOMASJ {"query": "serverless"} {"job_id": 12345} {"query": "serverless"}

  8. TWITTER SEARCH: Check Search Results, Register Search, Twitter Search, Face Recognition, API Gateway, Web Client, Auth0, Twitter API, TensorFlow.js. ~550 LOC JavaScript @THOMASJ {"job_id": 12345} {"url": "…"}

  9. TWITTER SEARCH: Check Search Results, Register Search, Twitter Search, Face Recognition, API Gateway, Web Client, Auth0, Twitter API, TensorFlow.js. ~550 LOC JavaScript @THOMASJ {"job_id": 12345}

  10. TWITTER SEARCH ARCHITECTURE: Check Search Results, Register Search, Twitter Search, Face Recognition, API Gateway, Web Client, Auth0, Twitter API, TensorFlow.js. ~550 LOC JavaScript @THOMASJ
  11. @THOMASJ

  12. @THOMASJ

  13. @THOMASJ

  14. IS THIS THE END?
      $ sls create -t openwhisk-nodejs
      $ npm install @tensorflow/tfjs-node
      $ sls deploy
      @THOMASJ
  15. ✋ NOT SO FAAS-T! @THOMASJ

  16. MAKE IT WORK MAKE IT RIGHT MAKE IT FAST @THOMASJ

  17. Not so fast…
      $ sls create -t openwhisk-nodejs
      $ npm install @tensorflow/tfjs-node
      $ sls deploy
      ERROR: DEPLOYMENT PACKAGE TOO LARGE. @THOMASJ
  18. $ du -hd0 node_modules 243M node_modules @THOMASJ Not so fast…

  19. libtensorflow.so = 142M @THOMASJ Not so fast…

  20. WORK AROUND? @THOMASJ
      let cold_start = true
      const library = 'libtensorflow.so'
      if (cold_start) {
        const data = from_object_store(library)
        write_to_fs(library, data)
        cold_start = false
      }
      // rest of function code…
  21. PRO-TIP: BUILD CUSTOM TF-JS RUNTIME @THOMASJ

  22. None
  23. CUSTOM RUNTIMES @THOMASJ
      ^^ Dockerfile
      FROM openwhisk/action-nodejs-v8:latest
      RUN npm install @tensorflow/tfjs-node
      $ docker build -t username/tfjs .
      $ docker push username/tfjs
  24. SERVERLESS.YAML @THOMASJ
      service: machine-learning
      provider:
        name: openwhisk
      functions:
        classification:
          handler: lib.handler
          image: "username/tfjs"
  25. SUCCESS!
      $ sls create -t openwhisk-nodejs
      $ npm install @tensorflow/tfjs-node
      $ sls deploy
      @THOMASJ
  26. LOADING TENSORFLOW.JS MODELS: SINGLE CONFIGURATION FILE (JSON). MULTIPLE WEIGHT SHARDS (BINARY). @THOMASJ
  27. MODEL LOADING @THOMASJ
      // Pre-trained image recognition model.
      const mobilenet = require('@tensorflow-models/mobilenet')
      // Load model from Google over HTTP.
      const model = await mobilenet.load()
  28. ReferenceError: fetch is not defined @THOMASJ
      // Pre-trained image recognition model.
      const mobilenet = require('@tensorflow-models/mobilenet')
      // Load model from Google over HTTP.
      const model = await mobilenet.load()
  29. // Pre-trained image recognition model.
      const mobilenet = require('@tensorflow-models/mobilenet')
      // Make HTTP client available in runtime.
      global.fetch = require('node-fetch')
      // Load model from Google over HTTP.
      const model = await mobilenet.load()
      ReferenceError: fetch is not defined @THOMASJ
  30. MODEL PREDICTIONS @THOMASJ
      const img = document.getElementById('img');
      // Classify the image.
      const predictions = await model.classify(img);
  31. ReferenceError: document is not defined @THOMASJ
      const img = document.getElementById('img');
      // Classify the image.
      const predictions = await model.classify(img);
  32. @THOMASJ

  33. TRANSLATE TO RGB VALUES @THOMASJ
      pixel 1 = #f3fa0a
      pixel 2 = #ea0f16
      pixel 3 = #120f0a
      pixel 4 = #644fab
      …
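The translation sketched on the slide above can be done in a couple of lines. `hexToRgb` is a hypothetical helper for illustration: it unpacks a hex colour string into its red, green and blue channel values.

```javascript
// Unpack a CSS-style hex colour (e.g. "#f3fa0a") into [r, g, b] bytes.
const hexToRgb = (hex) => {
  const n = parseInt(hex.slice(1), 16)
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff]
}

console.log(hexToRgb('#f3fa0a')) // [ 243, 250, 10 ]
```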
  34. RGB values @THOMASJ
      const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const jpegData = jpeg.decode(buffer, true)
      console.log(jpegData)
      /* { width: 320, height: 180, data: { '0': 91, '1': 64, ... } } // typed array */
  35. PRO-TIP: IGNORE THE ALPHA CHANNEL @THOMASJ

  36. const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      // remove alpha channel from RGBA values
      const rgb = image.data.filter((el, i) => (++i % 4))
      @THOMASJ
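The filter trick on the slide above keeps every value whose 1-based index is not a multiple of 4, i.e. it drops every alpha byte. A minimal sketch on a toy two-pixel buffer (`stripAlpha` is a hypothetical wrapper name):

```javascript
// Drop every fourth (alpha) byte from an RGBA buffer, as on the slide.
const stripAlpha = (rgba) => rgba.filter((el, i) => (++i % 4))

// Two RGBA pixels in, two RGB pixels out.
console.log(stripAlpha([91, 64, 50, 255, 120, 130, 140, 255]))
// [ 91, 64, 50, 120, 130, 140 ]
```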
  37. const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      const rgb = image.data.filter((el, i) => (++i % 4))
      // convert rgb values to typed array
      const values = new Int32Array(rgb)
      // shape of tensor is image dimensions
      const shape = [image.height, image.width, 3]
      // input tensor from rgb values & image dimensions
      const input = tf.tensor3d(values, shape, 'int32')
      const predictions = await model.classify(input)
      @THOMASJ
  38. MAKE IT WORK MAKE IT RIGHT MAKE IT FAST @THOMASJ

  39. $ sls invoke -f classifier -p input data {"result": …} @THOMASJ

  40. $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"result": …} @THOMASJ

  41. $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"result": …} @THOMASJ

  42. $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"result": …}
      $ sls invoke -f classifier -p input data {"error": "OUT OF MEMORY"} @THOMASJ
  43. Allocates native memory, NOT garbage collected @THOMASJ
      const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      const rgb = image.data.filter((el, i) => (++i % 4))
      const values = new Int32Array(rgb)
      const shape = [image.height, image.width, 3]
      const input = tf.tensor3d(values, shape, 'int32')
      const predictions = await model.classify(input)
  44. PRO-TIP: USE DISPOSE() TO RELEASE MEMORY @THOMASJ

  45. const jpeg = require('jpeg-js')
      const buffer = fs.readFileSync('pug.jpg')
      const image = jpeg.decode(buffer, true)
      const rgb = image.data.filter((el, i) => (++i % 4))
      const values = new Int32Array(rgb)
      const shape = [image.height, image.width, 3]
      const input = tf.tensor3d(values, shape, 'int32')
      const predictions = await model.classify(input)
      input.dispose()
      @THOMASJ
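The dispose pattern above generalises to a try/finally so native memory is released even when classification throws. `withDisposal` is a hypothetical helper, not a tfjs API; anything exposing a `dispose()` method (like a `tf.Tensor`) fits.

```javascript
// Run `fn` with a resource and always release it afterwards, even on error.
const withDisposal = async (resource, fn) => {
  try {
    return await fn(resource)
  } finally {
    resource.dispose()
  }
}
```

For synchronous tensor code, TensorFlow.js also provides `tf.tidy()`, which disposes intermediate tensors automatically.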
  46. MAKE IT WORK MAKE IT RIGHT MAKE IT FAST @THOMASJ

  47. How can we improve this? ❄ COLD START ❄ (~8 seconds), WARM START (~5 seconds) @THOMASJ
  48. ❄ COLD START ❄ / WARM START
      INITIALISATION: 1200 ms / 0 ms
      MODEL LOADING: 3200 ms / 2000 ms
      IMAGE LOADING: 500 ms x 2 / 500 ms x 2
      FACE DETECTION: 700 - 900 ms x 2 / 700 - 900 ms x 2
      EVERYTHING ELSE: 1000 ms / 500 ms
      TOTAL DURATION: ~ 8 seconds / ~ 5 seconds
  49. ADDING WEIGHTS
      FROM openwhisk/action-nodejs-v8:latest
      RUN npm install @tensorflow/tfjs-node
      COPY weights weights
  50. // load weight files from filesystem
      global.fetch = async (file) => {
        return {
          json: () => JSON.parse(fs.readFileSync(file, 'utf8')),
          arrayBuffer: () => fs.readFileSync(file)
        }
      }
  51. ❄ COLD START ❄ / WARM START
      INITIALISATION: 1200 ms / 0 ms
      MODEL LOADING: 2700 ms / 1500 ms
      IMAGE LOADING: 500 ms x 2 / 500 ms x 2
      FACE DETECTION: 700 - 900 ms x 2 / 700 - 900 ms x 2
      EVERYTHING ELSE: 1000 ms / 500 ms
      TOTAL DURATION: ~ 7.5 seconds / ~ 4.5 seconds
  52. @THOMASJ

  53. MODEL CACHING @THOMASJ
      const faceapi = require('face-api.js')
      let LOADED = false
      exports.load = async location => {
        if (!LOADED) {
          await faceapi.loadFaceDetectionModel(location)
          await faceapi.loadFaceRecognitionModel(location)
          await faceapi.loadFaceLandmarkModel(location)
          LOADED = true
        }
        return faceapi
      }
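The caching pattern above can be generalised into a memoised loader. This is a sketch, not talk code: `cachedLoader` is a hypothetical helper, and caching the promise (rather than a boolean flag) also prevents duplicate loads when concurrent requests hit the same cold container.

```javascript
// Memoise an async loader at module scope so warm invocations skip it.
const cachedLoader = (loadFn) => {
  let pending = null
  return (location) => {
    if (!pending) {
      pending = loadFn(location)
    }
    return pending
  }
}
```

Usage mirrors the slide: `exports.load = cachedLoader(loadAllModels)`, where `loadAllModels` performs the three face-api.js load calls.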
  54. ❄ COLD START ❄ / WARM START
      INITIALISATION: 1200 ms / 0 ms
      MODEL LOADING: 2700 ms / 0 ms
      IMAGE LOADING: 500 ms x 2 / 500 ms x 2
      FACE DETECTION: 700 - 900 ms x 2 / 700 - 900 ms x 2
      EVERYTHING ELSE: 1000 ms / 500 ms
      TOTAL DURATION: ~ 7.5 seconds / ~ 3 seconds
  55. Cache Values in Redis @THOMASJ
      const faces = await models.allFaces(input, min_score)
      const face = faces[0]
      console.log(face.descriptor)
      // face descriptor
      > Float32Array (length: 128) [-0.1298077404499054, 0.08730170130729675, 0.03973294794559479, 0.03567018359899521, -0.09620543569326401, 0.03842385113239288, ...
  56. COLD START / ❄ + CACHE / WARM START
      INITIALISATION: 1200 ms / 1200 ms / 0 ms
      MODEL LOADING: 2700 ms / 2700 ms / 0 ms
      IMAGE LOADING: 500 ms x 2 / 500 ms / 500 ms
      FACE DETECTION: 700 - 900 ms x 2 / 700 - 900 ms / 700 - 900 ms
      EVERYTHING ELSE: 1000 ms / 1000 ms / 500 ms
      TOTAL DURATION: ~ 7.5 seconds / ~ 6 seconds / ~ 2.5 seconds
  57. ❄ COLD START ❄ (~8 seconds) → (~7.5 - 6 seconds)
      WARM START (~5 seconds) → (~2.5 seconds)
      cold start savings = ~6% - 25%
      warm start savings = ~50% @THOMASJ
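The percentages quoted above follow from (before - after) / before, which is worth checking against the timings on the earlier slides:

```javascript
// Percentage saving from a before/after duration pair, rounded to whole %.
const saving = (before, after) => Math.round(100 * (before - after) / before)

console.log(saving(8, 7.5)) // 6   (cold start, lower bound)
console.log(saving(8, 6))   // 25  (cold start, with cache)
console.log(saving(5, 2.5)) // 50  (warm start)
```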
  58. MAKE IT WORK MAKE IT RIGHT MAKE IT FAST @THOMASJ

  59. in the @THOMASJ

  60. @THOMASJ

  61. PRE-TRAINED MODELS @THOMASJ

  62. PRE-TRAINED MODELS COST @THOMASJ

  63. PRE-TRAINED MODELS COST PERFORMANCE @THOMASJ

  64. PRE-TRAINED MODELS COST PERFORMANCE EASE OF USE @THOMASJ

  65. PRE-TRAINED MODELS COST PERFORMANCE EASE OF USE TRAINING @THOMASJ
  66. @THOMASJ

  67. PRE-TRAINED MODELS COST PERFORMANCE EASE OF USE TRAINING @THOMASJ
  68. CONCLUSION: TF.JS + SERVERLESS = SCALABLE ML IN THE CLOUD @THOMASJ
  69. @THOMASJ

  70. QUESTIONS? @THOMASJ jamesthom.as jthomas openwhisk.org bluemix.net/openwhisk