
Serving deep learning models through a serverless pipeline

Aletheia
September 22, 2020

A talk about the Neosperience Image Memorability architecture, presented at ServerlessDays Zurich on September 24th, 2020



Transcript

  1. www.neosperience.com | blog.neosperience.com | [email protected] Neosperience: Empathy in Technology. Serving deep learning models through a serverless pipeline. September 24th, 2020
  2. Luca Bianchi Who am I? github.com/aletheia https://it.linkedin.com/in/lucabianchipavia https://speakerdeck.com/aletheia Chief Technology

    Officer @ Neosperience Chief Technology Officer @ WizKey Serverless Meetup and ServerlessDays Italy co-organizer www.bianchiluca.com @bianchiluca
  3. Neosperience Cloud: Understand what makes every customer unique, Engage them in 1:1 experiences, and Grow your customer base.
  4. Evaluate marketing images to drive customer attention and make your campaign memorable: an AI-driven image evaluation solution. Neosperience Image Memorability is a business-driven implementation of memorability deep learning algorithms, trained on a business-specific dataset, that produces a memorability score and an analysis of the most relevant areas. The Image Memorability service can be tested at image.neosperience.com
  5. A high memorability score does not mean an image is good for marketing. How does it work? Well-authored images can be misleading when you analyze their memorability map (score 9.32 / 10): the red areas are the most relevant parts of the image, so they have the highest stickiness in the viewer's brain, while the part that is probably most interesting from a brand perspective does not attract attention (cold colors).
  6. Building and deploying a machine learning solution is not an easy task. Start from service requirements: - easy to build - easy to deploy and operate - easy to integrate - implement memorability algorithm (score, heatmap) - serverless (never pay for idle, scalability) - cost-effective - response time within seconds
  7. How to handle image processing with serverless? Serverless image pipeline. Amazon S3 • no need to use the AWS SDK on the client • upload is performed by the client through a POST request to the signed URL. AWS Lambda • checks the user's licence plan (business constraints) • synthesizes S3 signed URLs to allow image upload. Amazon API Gateway • offers authentication • sends the request to Lambda to build the image upload URL
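The URL-building step above can be sketched in a few lines of boto3. This is only a minimal, hypothetical version of that Lambda; the bucket name, key scheme and where the licence check happens are assumptions, not details from the talk.

```python
# Minimal sketch of the upload-URL Lambda; bucket name and key scheme are
# assumptions, not taken from the talk.
import json
import uuid

import boto3

s3 = boto3.client("s3")
UPLOAD_BUCKET = "memorability-uploads"  # hypothetical bucket name


def handler(event, context):
    # The real service would check the user's licence plan here.
    key = f"incoming/{uuid.uuid4()}.jpg"

    # A presigned POST lets the client upload straight to S3, no AWS SDK needed.
    post = s3.generate_presigned_post(
        Bucket=UPLOAD_BUCKET,
        Key=key,
        ExpiresIn=300,  # the URL stays valid for 5 minutes
    )
    return {"statusCode": 200, "body": json.dumps({"upload": post, "key": key})}
```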
  8. A fully managed service for computer vision: Amazon Rekognition • automated AI service available with pre-trained state-of-the-art deep learning models • ready to analyze images for classification, object detection and labelling • additional features such as logo and celebrity recognition • support for face detection, analysis and identification. How to leverage? • upload your image to S3 and pass the reference to the Rekognition API, or send a base64-encoded image directly from your client to the API.
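As an illustration of the "pass the S3 reference" option, here is a small boto3 sketch of a Rekognition label-detection call; the bucket and object names are placeholders.

```python
# Sketch: label detection on an image already uploaded to S3
# (bucket and key are placeholders).
import boto3

rekognition = boto3.client("rekognition")

response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "memorability-uploads", "Name": "incoming/photo.jpg"}},
    MaxLabels=10,
    MinConfidence=80.0,
)

for label in response["Labels"]:
    print(label["Name"], round(label["Confidence"], 1))
```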
  9. Provide business-specific labels to the Rekognition classifier: Amazon Rekognition with custom labels • provide a set of labelled images to fine-tune your training • images can be uploaded in batches using the Amazon SageMaker manifest format • it solves a classification problem, while we need a regression output (the score) plus something deeply related to how the network performs (the heatmap)
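For context, the manifest mentioned above is a JSON Lines file with one image per line. The snippet below writes such a file in the Ground Truth image-classification style; the label attribute name, class list and S3 URIs are illustrative assumptions, not the dataset used in the talk.

```python
# Illustrative Ground Truth-style manifest (JSON Lines, one image per line);
# the label attribute name, classes and S3 URIs are hypothetical.
import json

classes = {"billboard": 0, "product-shot": 1}
images = [
    ("s3://memorability-training/img-001.jpg", "billboard"),
    ("s3://memorability-training/img-002.jpg", "product-shot"),
]

with open("train.manifest", "w") as f:
    for uri, class_name in images:
        line = {
            "source-ref": uri,
            "campaign-label": classes[class_name],
            "campaign-label-metadata": {
                "class-name": class_name,
                "type": "groundtruth/image-classification",
            },
        }
        f.write(json.dumps(line) + "\n")
```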
  10. Check our requirements Solution evaluation ✓ easy to build ✓

    easy to deploy and operate ✓ easy to integrate ✗ implement memorability algorithm (score, heatmap) ✓ serverless (never pay for idle, scalability) ✓ cost-effective ✓ response time within seconds
  11. A GPU-powered EC2 instance: Amazon Deep Learning Instances. EC2 instances for deep learning • GPU instances (p2, p3 instance types) • model training and inference on the same cluster • bring your own deep learning framework (PyTorch, Keras) • can implement any ML model on this architecture. How to leverage? • needs an image processing pipeline to handle images • needs scheduling logic to process data • needs an S3 bridge to store and retrieve data. EC2 p3 instance, Deep Learning AMI
  12. Amazon Deep Learning Instances with Elastic Inference. EC2 instances for deep learning • cost-effective GPU acceleration (any instance type) • model training and inference on different clusters • bring your own deep learning framework (PyTorch, Keras) • can implement any ML model on this architecture. How to leverage? • needs an image processing pipeline to handle images • needs scheduling logic to process data • needs an S3 bridge to store and retrieve data. EC2, Deep Learning AMI, Elastic Inference
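A hedged sketch of how an Elastic Inference accelerator can be attached when launching an instance with boto3; the AMI id, instance type and accelerator size are placeholders, and the real setup also needs the usual VPC/IAM plumbing.

```python
# Sketch: a CPU instance launched with an Elastic Inference accelerator
# attached; AMI id, instance type and accelerator size are placeholders.
import boto3

ec2 = boto3.client("ec2")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # a Deep Learning AMI id for your region
    InstanceType="m5.large",          # cost-effective CPU instance
    MinCount=1,
    MaxCount=1,
    ElasticInferenceAccelerators=[{"Type": "eia2.medium", "Count": 1}],
)
```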
  13. Check our requirements: solution evaluation ✗ easy to build ✗ easy to deploy and operate - easy to integrate ✓ implement memorability algorithm (score, heatmap) ✗ serverless (never pay for idle, scalability) ✗ cost-effective ✓ response time within seconds. EC2, Deep Learning AMI, Elastic Inference
  14. Fully managed machine learning operations environment: Amazon SageMaker, managed notebook instances, training and inference platform • Jupyter notebooks managed in the cloud • scales training up and down across multiple instances • supports inference endpoint creation and model hosting • testing, debugging and fine-tuning through SageMaker Studio • allows building custom ML models such as AMNet. How to leverage? • needs an image processing pipeline to handle images • needs an S3 bridge to store and retrieve data. Amazon SageMaker
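On the inference side, a hosted SageMaker endpoint is typically called through the sagemaker-runtime API. The sketch below assumes a hypothetical endpoint name and a raw-image payload; the actual AMNet endpoint contract is not described in the slides.

```python
# Sketch: invoking a hosted SageMaker endpoint with an image payload
# (endpoint name and payload format are assumptions).
import boto3

runtime = boto3.client("sagemaker-runtime")

with open("photo.jpg", "rb") as f:
    payload = f.read()

response = runtime.invoke_endpoint(
    EndpointName="amnet-memorability",  # hypothetical endpoint name
    ContentType="application/x-image",
    Body=payload,
)

print(response["Body"].read())
```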
  15. Achieving near-human performance with AMNet: in common applications the memorability score is usually normalized within [0, 0.68]; a score of 1.0 means the image performs on par with human memory. Comparison to the state-of-the-art implementation: https://arxiv.org/pdf/1804.03115.pdf
  16. Check our requirements: solution evaluation ✓ easy to build ✓ easy to deploy and operate - easy to integrate ✓ implement memorability algorithm (score, heatmap) ✓ serverless (never pay for idle, scalability) - cost-effective ✓ response time within seconds. EC2, Deep Learning AMI, Elastic Inference
  17. Consider your model's intrinsics to improve the architecture: same model, two results. Image inference with a regression model • the memorability score is produced by a single evaluation (forward pass) of the network • shrinking the CNN has a minimal impact on accuracy (from 0.677 to 0.64) but a huge impact on performance • a smaller network makes a more cost-effective approach possible. Backpropagation to compute the image heatmap • is built by backpropagating the network weights and inspecting activations • is computationally intensive even on a GPU (15 s on an NVIDIA Tesla V100)
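The slides do not show how the heatmap is computed, so the following is only a generic gradient-based saliency sketch in PyTorch, using a stand-in backbone rather than AMNet: one forward pass yields the score, one backward pass yields input-level sensitivities that can be rendered as a heatmap.

```python
# Sketch of a gradient-based heatmap for a CNN regressor; this uses a generic
# torchvision backbone as a stand-in, not the actual AMNet implementation.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, 1)  # regression head (score)
model.eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # placeholder input

score = model(image).squeeze()  # one forward pass gives the memorability score
score.backward()                # one backward pass gives input sensitivities

# Collapse the per-channel gradients into a single-channel saliency heatmap.
heatmap = image.grad.abs().max(dim=1).values.squeeze(0)
print(float(score), heatmap.shape)
```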
  18. Introducing Amazon EFS is a huge leap forward. Considerations • source images are shared between Lambdas with no need to upload/download from S3 • the S3 interface is kept to ease image upload • Amazon SageMaker is still needed to compute the image heatmap • moving score computation to Lambda raised its execution time from 5 seconds to 12 seconds • overall image scoring improved from an average of 90 seconds to 12 seconds (queued jobs are processed in parallel by Lambdas) • no costs for idle endpoints • costs reduced by an order of magnitude (reduced load on SageMaker endpoints) • on-premise servers could be removed (SageMaker endpoints + inference on Lambda is cost-effective)
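A minimal sketch of what the Lambda-plus-EFS scoring path could look like; the mount path, model file name and input size are assumptions made for illustration, and the model is loaded once per container so warm invocations skip the expensive part.

```python
# Sketch of a scoring Lambda reading model and image from an EFS mount;
# mount path, model file name and input size are hypothetical.
import json
import os

import torch
from PIL import Image
from torchvision import transforms

EFS_ROOT = "/mnt/images"  # Lambda EFS mount point (configured on the function)
MODEL_PATH = os.path.join(EFS_ROOT, "models", "amnet_small.pt")

# Load once per container so that warm invocations skip the expensive part.
model = torch.jit.load(MODEL_PATH)
model.eval()

to_tensor = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])


def handler(event, context):
    image = Image.open(os.path.join(EFS_ROOT, event["key"])).convert("RGB")
    with torch.no_grad():
        score = float(model(to_tensor(image).unsqueeze(0)).squeeze())
    return {"statusCode": 200, "body": json.dumps({"score": score})}
```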
  19. Check our requirements Solution evaluation ✓ easy to build ✓

    easy to deploy and operate ✓ easy to integrate ✓ implement memorability algorithm (score, heatmap) ✓ serverless (never pay for idle image scoring, scalability improved) ✓ cost-effective ✓ response time within seconds
  20. We were able to build a cost-effective image processing pipeline for a custom ML model. Wrap up • custom ML models can be implemented on AWS Lambda (if no GPU is required or the slowdown is acceptable) • Amazon SageMaker is a great alternative, but more expensive (you pay for at least one idle instance) • model training is performed on Amazon SageMaker (consider using spot instances for training) • choose the right compute for your machine learning model's requirements • Amazon EFS is a game changer in many applications • be aware of AWS Lambda dependency size when using ML libraries