
Serving deep learning models through a serverless pipeline

Aletheia
September 22, 2020

A talk about the Neosperience Image Memorability architecture, presented at ServerlessDays Zurich on September 24th, 2020



Transcript

  1. www.neosperience.com | blog.neosperience.com | [email protected] Neosperience: Empathy in Technology. Serving deep learning models through a serverless pipeline. September 24th, 2020
  2. Luca Bianchi Who am I? github.com/aletheia https://it.linkedin.com/in/lucabianchipavia https://speakerdeck.com/aletheia Chief Technology

    Officer @ Neosperience Chief Technology Officer @ WizKey Serverless Meetup and ServerlessDays Italy co-organizer www.bianchiluca.com @bianchiluca
  3. Neosperience Cloud: Understand what makes every customer unique, Engage them in 1:1 experiences, and Grow your customer base.
  4. Evaluate marketing images to drive customer attention and make your campaign memorable: an AI-driven image evaluation solution. Neosperience Image Memorability is a business-driven implementation of memorability deep learning algorithms, trained on a business-specific dataset, that produces a memorability score and an analysis of the most relevant areas. The Image Memorability service can be tested at image.neosperience.com
  5. A high memorability score does not mean an image is good for marketing. How does it work? Well-authored images can be misleading when you analyze their memorability map (score 9.32 / 10): the red areas are the most relevant parts of the image, so they have the highest stickiness in the viewer's brain, while the part that is probably most interesting from a brand perspective does not attract attention (cold colors).
  6. Building and deploying a machine learning solution is not an easy task. Start from service requirements: - easy to build - easy to deploy and operate - easy to integrate - implement memorability algorithm (score, heatmap) - serverless (never pay for idle, scalability) - cost-effective - response time within seconds
  7. How to handle image processing with serverless? Serverless image pipeline. Amazon S3 • no need to use the AWS SDK on the client • upload is performed by the client through a POST request to the signed URL. AWS Lambda • checks the user's licence plan (business constraints) • synthesizes S3 signed URLs to allow image upload. Amazon API Gateway • offers authentication • sends the request to Lambda to build the image upload URL
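The URL-building step above can be sketched in a few lines of boto3. This is only a minimal, hypothetical version of that Lambda; the bucket name, key scheme and where the licence check happens are assumptions, not details from the talk.

```python
# Minimal sketch of the upload-URL Lambda; bucket name and key scheme are
# assumptions, not taken from the talk.
import json
import uuid

import boto3

s3 = boto3.client("s3")
UPLOAD_BUCKET = "memorability-uploads"  # hypothetical bucket name


def handler(event, context):
    # The real service would check the user's licence plan here.
    key = f"incoming/{uuid.uuid4()}.jpg"

    # A presigned POST lets the client upload straight to S3, no AWS SDK needed.
    post = s3.generate_presigned_post(
        Bucket=UPLOAD_BUCKET,
        Key=key,
        ExpiresIn=300,  # the URL stays valid for 5 minutes
    )
    return {"statusCode": 200, "body": json.dumps({"upload": post, "key": key})}
```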
  8. A fully managed service for computer vision: Amazon Rekognition • automated AI service available with pre-trained state-of-the-art deep learning models • ready to analyze images for classification, object detection and labelling • additional features such as logo and celebrity recognition • support for face detection, analysis and identification. How to leverage? • upload your image to S3 and pass the reference to the Rekognition API, or send a base64-encoded image directly from your client to the API.
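As an illustration of the "pass the S3 reference" option, here is a small boto3 sketch of a Rekognition label-detection call; the bucket and object names are placeholders.

```python
# Sketch: label detection on an image already uploaded to S3
# (bucket and key are placeholders).
import boto3

rekognition = boto3.client("rekognition")

response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "memorability-uploads", "Name": "incoming/photo.jpg"}},
    MaxLabels=10,
    MinConfidence=80.0,
)

for label in response["Labels"]:
    print(label["Name"], round(label["Confidence"], 1))
```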
  9. Provide business-specific labels to the Rekognition classifier: Amazon Rekognition with custom labels • provide a set of labelled images to fine-tune your training • images can be uploaded in batches using the Amazon SageMaker manifest format • it solves a classification problem, while we need a regression output (the score) plus something deeply related to how the network performs (the heatmap)
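For context, the manifest mentioned above is a JSON Lines file with one image per line. The snippet below writes such a file in the Ground Truth image-classification style; the label attribute name, class list and S3 URIs are illustrative assumptions, not the dataset used in the talk.

```python
# Illustrative Ground Truth-style manifest (JSON Lines, one image per line);
# the label attribute name, classes and S3 URIs are hypothetical.
import json

classes = {"billboard": 0, "product-shot": 1}
images = [
    ("s3://memorability-training/img-001.jpg", "billboard"),
    ("s3://memorability-training/img-002.jpg", "product-shot"),
]

with open("train.manifest", "w") as f:
    for uri, class_name in images:
        line = {
            "source-ref": uri,
            "campaign-label": classes[class_name],
            "campaign-label-metadata": {
                "class-name": class_name,
                "type": "groundtruth/image-classification",
            },
        }
        f.write(json.dumps(line) + "\n")
```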
  10. Check our requirements Solution evaluation ✓ easy to build ✓

    easy to deploy and operate ✓ easy to integrate ✗ implement memorability algorithm (score, heatmap) ✓ serverless (never pay for idle, scalability) ✓ cost-effective ✓ response time within seconds
  11. A GPU-powered EC2 instance: Amazon Deep Learning Instances. EC2 instances for deep learning • GPU instances (p2, p3 instance types) • model training and inference on the same cluster • bring your own deep learning framework (PyTorch, Keras) • can implement any ML model on this architecture. How to leverage? • needs an image processing pipeline to handle images • needs scheduling logic to process data • needs an S3 bridge to store and retrieve data. EC2 p3 instance, Deep Learning AMI
  12. Amazon Deep Learning Instances with Elastic Inference. EC2 instances for deep learning • cost-effective GPU acceleration (any instance type) • model training and inference on different clusters • bring your own deep learning framework (PyTorch, Keras) • can implement any ML model on this architecture. How to leverage? • needs an image processing pipeline to handle images • needs scheduling logic to process data • needs an S3 bridge to store and retrieve data. EC2, Deep Learning AMI, Elastic Inference
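A hedged sketch of how an Elastic Inference accelerator can be attached when launching an instance with boto3; the AMI id, instance type and accelerator size are placeholders, and the real setup also needs the usual VPC/IAM plumbing.

```python
# Sketch: a CPU instance launched with an Elastic Inference accelerator
# attached; AMI id, instance type and accelerator size are placeholders.
import boto3

ec2 = boto3.client("ec2")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # a Deep Learning AMI id for your region
    InstanceType="m5.large",          # cost-effective CPU instance
    MinCount=1,
    MaxCount=1,
    ElasticInferenceAccelerators=[{"Type": "eia2.medium", "Count": 1}],
)
```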
  13. Check our requirements: solution evaluation ✗ easy to build ✗ easy to deploy and operate - easy to integrate ✓ implement memorability algorithm (score, heatmap) ✗ serverless (never pay for idle, scalability) ✗ cost-effective ✓ response time within seconds. EC2, Deep Learning AMI, Elastic Inference
  14. Fully managed machine learning operations environment: Amazon SageMaker, managed notebook instances, training and inference platform • Jupyter notebooks managed in the cloud • scales training up and down across multiple instances • supports inference endpoint creation and model hosting • testing, debugging and fine-tuning through SageMaker Studio • allows building custom ML models such as AMNet. How to leverage? • needs an image processing pipeline to handle images • needs an S3 bridge to store and retrieve data. Amazon SageMaker
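On the inference side, a hosted SageMaker endpoint is typically called through the sagemaker-runtime API. The sketch below assumes a hypothetical endpoint name and a raw-image payload; the actual AMNet endpoint contract is not described in the slides.

```python
# Sketch: invoking a hosted SageMaker endpoint with an image payload
# (endpoint name and payload format are assumptions).
import boto3

runtime = boto3.client("sagemaker-runtime")

with open("photo.jpg", "rb") as f:
    payload = f.read()

response = runtime.invoke_endpoint(
    EndpointName="amnet-memorability",  # hypothetical endpoint name
    ContentType="application/x-image",
    Body=payload,
)

print(response["Body"].read())
```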
  15. Achieving near-human performance with AMNet: in common applications the memorability score is usually normalized within [0, 0.68]; a score of 1.0 means the image performs on par with human memory. Comparison to the state-of-the-art implementation: https://arxiv.org/pdf/1804.03115.pdf
  16. Check our requirements: solution evaluation ✓ easy to build ✓ easy to deploy and operate - easy to integrate ✓ implement memorability algorithm (score, heatmap) ✓ serverless (never pay for idle, scalability) - cost-effective ✓ response time within seconds. EC2, Deep Learning AMI, Elastic Inference
  17. Consider your model's intrinsics to improve the architecture: same model, two results. Image inference with a regression model • the memorability score is produced by a single evaluation (forward pass) of the network • shrinking the CNN has a minimal impact on accuracy (from 0.677 to 0.64) but a huge impact on performance • a smaller network makes a more cost-effective approach possible. Backpropagation to compute the image heatmap • is built by backpropagating the network weights and inspecting activations • is computationally intensive even on a GPU (15 s on an NVIDIA Tesla V100)
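The slides do not show how the heatmap is computed, so the following is only a generic gradient-based saliency sketch in PyTorch, using a stand-in backbone rather than AMNet: one forward pass yields the score, one backward pass yields input-level sensitivities that can be rendered as a heatmap.

```python
# Sketch of a gradient-based heatmap for a CNN regressor; this uses a generic
# torchvision backbone as a stand-in, not the actual AMNet implementation.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, 1)  # regression head (score)
model.eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # placeholder input

score = model(image).squeeze()  # one forward pass gives the memorability score
score.backward()                # one backward pass gives input sensitivities

# Collapse the per-channel gradients into a single-channel saliency heatmap.
heatmap = image.grad.abs().max(dim=1).values.squeeze(0)
print(float(score), heatmap.shape)
```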
  18. Introducing Amazon EFS is a huge leap forward. Considerations • source images are shared between Lambdas with no need to upload/download from S3 • the S3 interface is kept to ease image upload • Amazon SageMaker is still needed to compute the image heatmap • moving score computation to Lambda raised its execution time from 5 seconds to 12 seconds • overall image scoring improved from an average of 90 seconds to 12 seconds (queued jobs are processed in parallel by Lambdas) • no costs for idle endpoints • costs reduced by an order of magnitude (reduced load on SageMaker endpoints) • on-premise servers could be removed (SageMaker endpoints + inference on Lambda is cost-effective)
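A minimal sketch of what the Lambda-plus-EFS scoring path could look like; the mount path, model file name and input size are assumptions made for illustration, and the model is loaded once per container so warm invocations skip the expensive part.

```python
# Sketch of a scoring Lambda reading model and image from an EFS mount;
# mount path, model file name and input size are hypothetical.
import json
import os

import torch
from PIL import Image
from torchvision import transforms

EFS_ROOT = "/mnt/images"  # Lambda EFS mount point (configured on the function)
MODEL_PATH = os.path.join(EFS_ROOT, "models", "amnet_small.pt")

# Load once per container so that warm invocations skip the expensive part.
model = torch.jit.load(MODEL_PATH)
model.eval()

to_tensor = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])


def handler(event, context):
    image = Image.open(os.path.join(EFS_ROOT, event["key"])).convert("RGB")
    with torch.no_grad():
        score = float(model(to_tensor(image).unsqueeze(0)).squeeze())
    return {"statusCode": 200, "body": json.dumps({"score": score})}
```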
  19. Check our requirements Solution evaluation ✓ easy to build ✓

    easy to deploy and operate ✓ easy to integrate ✓ implement memorability algorithm (score, heatmap) ✓ serverless (never pay for idle image scoring, scalability improved) ✓ cost-effective ✓ response time within seconds
  20. We were able to build a cost-effective image processing pipeline for a custom ML model. Wrap up • custom ML models can be implemented on AWS Lambda (if no GPU is required or the slowdown is acceptable) • Amazon SageMaker is a great alternative, but more expensive (you pay for at least one idle instance) • model training is performed on Amazon SageMaker (consider using spot instances for training) • choose the right compute for your machine learning model's requirements • Amazon EFS is a game changer in many applications • be aware of AWS Lambda dependency size when using ML libraries