Building Serverless Machine Learning Models in the Cloud

Alex Casalboni
October 01, 2016

[ServerlessConf @ Tokyo - 10/01/2016]

Here I describe the main challenges faced by data scientists involved in deploying machine learning models into real production environments.

I include references to and examples of Python libraries, as well as multi-model systems that require advanced features such as A/B testing and high scalability and availability.

While discussing the limitations of traditional deployment strategies, I demonstrate how serverless computing can simplify your deployment workflow.

Transcript

  1. Building Serverless Machine Learning Models in the Cloud

     clda.co/serverless-tokyo | 10/01/2016 | Japan
  2. What does a real Data Scientist look like?

     Data Scientist
     Very smart & curious
     Numbers lover (i.e. Data)
     Great teamwork skills
     40% analysis, 30% design, 30% code
  3. What's the outcome of an ML pipeline?

     Data Scientist + Data + Time → ML Model + Data Visualisation + Prototype
  4. What's the outcome of an ML pipeline?

     Data Scientist + Data + Time → ML Model + Data Visualisation + Prototype
     Production Code
  5. What's the outcome of an ML pipeline?

     Data Scientist + Data + Time → ML Model + Data Visualisation + Prototype
     Web Developer + DevOps + A lot of Time
  6. The Lack of Ownership

     Data Scientist != DevOps
     Data Scientist: Mathematical modeling, Statistical analysis, Data mining
     DevOps: (Cloud) Operations, System administration, Software best practices
  7. What about MLaaS?

     Machine Learning as a Service
     Data Exploration does not come for free
     Parameter tuning becomes blind guessing
     Very little control over your models
     You wouldn't need a Data Scientist at all, but…
     Not everybody likes black-box abstractions
  8. Traditional deployment strategies

     1. Web-app controller
     Simplest solution, but…
     how do you update your model(s)?
     same auth layer?
     shared uWSGI processes?
  9. Traditional deployment strategies

     2. Fleet of servers
     Load Balancing
     Bigger capacity and no code changes, but…
     same problems as before
     many more servers to maintain
     no elasticity (over-provisioning)
  10. Traditional deployment strategies

     3. Auto Scaling
     AWS ELB + Auto Scaling (or maybe Elastic Beanstalk?)
     Elastic and highly available, but…
     still shared resources?
     even bigger lack of ownership
     what about caching, versioning and auth?
  11. Serverless Machine Learning

     Versioning, staging & caching
     1 model = 1 microservice
     Intuitive RESTful interface
     High Availability (no downtime)
     Very little operational effort
     Transparent elasticity (PAYG)
     Failure isolation / Decentralisation
     Offline training phase
     Production-ready prototypes
     A/B testing through composition
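
The "A/B testing through composition" point above can be illustrated with a minimal sketch. It assumes each model is exposed as its own function and that a third, composite function routes each user to one of them. The names `model_a`, `model_b`, `choose_variant`, `ab_handler` and the `b_share` parameter are hypothetical and do not come from the talk; in a real deployment the two models would more likely be separate Lambda functions called over their own endpoints.

```python
import hashlib


def model_a(features):
    """Hypothetical stand-in for the current production model."""
    return {"model": "A", "score": sum(features)}


def model_b(features):
    """Hypothetical stand-in for the candidate model under test."""
    return {"model": "B", "score": max(features)}


def choose_variant(user_id, b_share=0.2):
    """Deterministically assign a user to variant "A" or "B".

    The same user_id always lands in the same bucket, so a user
    keeps seeing the same model across requests.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 10_000  # 0..9999
    return "B" if bucket < b_share * 10_000 else "A"


def ab_handler(event, context=None):
    """Composite function: pick a variant, then delegate to that model."""
    variant = choose_variant(event["user_id"], event.get("b_share", 0.2))
    model = model_b if variant == "B" else model_a
    return model(event["features"])
```

Hashing the user id, rather than drawing a random number per request, keeps the assignment stable without storing any state, which suits stateless functions.
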
  12. A real example: Serverless ML @ Cloud Academy

     Multi-model architecture
     RESTful interface for each ML model
     1 Lambda Function for each ML model
     S3 + RDS for storage
     Periodic training (offline)
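
As a rough sketch of "1 Lambda Function for each ML model" behind a RESTful interface, the handler below assumes the API Gateway proxy event shape (a JSON string in `event["body"]`, and a response with `statusCode` and `body`). It is not the Cloud Academy implementation: `MeanModel`, `load_model` and the `rows` field are hypothetical placeholders, and a real function would presumably fetch a serialized model produced by the offline training step from S3.

```python
import json

_MODEL = None  # reused across invocations while the container stays warm


class MeanModel:
    """Hypothetical placeholder for a trained model."""

    def predict(self, rows):
        # Return the mean of each input row.
        return [sum(row) / len(row) for row in rows]


def load_model():
    """Load the model once per container.

    Assumption: a real function would download and deserialize a
    model file from S3 here instead of building a placeholder.
    """
    global _MODEL
    if _MODEL is None:
        _MODEL = MeanModel()
    return _MODEL


def lambda_handler(event, context):
    """Parse the request body, run the model, return a JSON response."""
    try:
        payload = json.loads(event.get("body") or "{}")
        rows = payload["rows"]
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON body with a 'rows' field"}
        return {"statusCode": 400, "body": json.dumps(error)}

    predictions = load_model().predict(rows)
    return {"statusCode": 200, "body": json.dumps({"predictions": predictions})}
```

Keeping the model in a module-level variable means only the first request in a fresh container pays the loading cost.
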
  13. Limitations of Serverless ML

     AWS Lambda
     No real-time models (only pseudo real-time)
     Deployment package management: size limit and OS libraries
     Not suitable for model training yet (5 min max execution time)
     Cold start time is long and hard to avoid
     Unit/integration tests help, but not enough
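
The execution time limit above can at least be detected from inside a function. The sketch below assumes the work can be split into batches and uses the Lambda context method `get_remaining_time_in_millis()` to stop before the timeout. `process_batches`, `SAFETY_MARGIN_MS` and `FakeContext` are hypothetical names introduced for illustration; this does not make Lambda suitable for training, it only shows how a long job could stop cleanly and report where to resume.

```python
import time

SAFETY_MARGIN_MS = 10_000  # hypothetical margin kept before the timeout


def process_batches(batches, context, work=sum):
    """Process batches until done or until the time budget runs low.

    Returns the partial results plus the index of the next batch, so a
    follow-up invocation could resume from there.
    """
    results = []
    for index, batch in enumerate(batches):
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            return {"done": False, "next_batch": index, "results": results}
        results.append(work(batch))
    return {"done": True, "next_batch": len(batches), "results": results}


class FakeContext:
    """Local stand-in for the Lambda context object, for testing only."""

    def __init__(self, budget_ms):
        self._deadline = time.monotonic() + budget_ms / 1000.0

    def get_remaining_time_in_millis(self):
        remaining = (self._deadline - time.monotonic()) * 1000
        return max(0, int(remaining))
```

With `FakeContext(60_000)` the loop runs to completion; with `FakeContext(0)` it stops before the first batch and reports `next_batch` as 0.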