Alex Casalboni
October 01, 2016

Building Serverless Machine Learning Models in the Cloud

[ServerlessConf @ Tokyo - 10/01/2016]

Here I describe the main challenges faced by data scientists involved in deploying machine learning models into real production environments.

I include references to, and examples of, Python libraries and multi-model systems that require advanced features such as A/B testing, high scalability and high availability.

While discussing the limitations of traditional deployment strategies, I demonstrate how serverless computing can simplify your deployment workflow.

Transcript

  1. Building Serverless Machine Learning models in the Cloud
     clda.co/serverless-tokyo
     10/01/2016
  2. What does a real Data Scientist look like?
     Data Scientist:
     - Very smart & curious
     - Numbers lover (i.e. Data)
     - Great teamwork skills
     - 40% analysis, 30% design, 30% code
  3. What's the outcome of a ML pipeline?
     Data Scientist + Data + Time
     ML Model + Data Visualisation + Prototype
  4. What's the outcome of a ML pipeline?
     Production Code
     Data Scientist + Data + Time
     ML Model + Data Visualisation + Prototype
  5. What's the outcome of a ML pipeline?
     Data Scientist + Data + Time
     ML Model + Data Visualisation + Prototype
     Web Developer + DevOps + A lot of Time
  6. The Lack of Ownership
     Data Scientist != DevOps
     Data Scientist:
     - Mathematical modeling
     - Statistical analysis
     - Data mining
     DevOps:
     - (Cloud) Operations
     - System administration
     - Software best practices
  7. What about MLaaS?
     Machine Learning as a Service
     You wouldn't need a Data Scientist at all, but…
     - Data Exploration does not come for free
     - Parameters tuning becomes blind guessing
     - Very little control over your models
     - Not everybody likes black-box abstractions
  8. Traditional deployment strategies
     1. Web-app controller
     Simplest solution, but…
     - how do you update your model(s)?
     - same auth layer?
     - shared uWSGI processes?
  9. Traditional deployment strategies
     2. Fleet of servers (Load Balancing)
     Bigger capacity and no code changes, but…
     - same problems as before
     - many more servers to maintain
     - no elasticity (over-provisioning)
  10. Traditional deployment strategies
     3. Auto Scaling
     AWS ELB + Auto Scaling (or maybe Elastic Beanstalk?)
     Elastic and highly available, but…
     - still shared resources?
     - even bigger lack of ownership
     - what about caching, versioning and auth?
  11. Serverless Machine Learning
     - Versioning, staging & caching
     - 1 model = 1 microservice
     - Intuitive RESTful interface
     - High Availability (no downtime)
     - Very little operational effort
     - Transparent elasticity (PAYG)
     - Failure isolation / Decentralisation
     - Offline training phase
     - Production-ready prototypes
     - A/B testing through composition
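The "A/B testing through composition" point on slide 11 could be sketched roughly as follows: a thin routing function sits in front of two independently deployed model functions and decides which one serves each request. This is only an illustrative sketch, not the setup shown in the talk. The names `model_a`, `model_b`, `choose_variant`, `ab_handler` and the `share_b` parameter are hypothetical, and the two "models" are toy stand-ins for calls to separately deployed functions.

```python
import hashlib
from typing import Callable, Dict, List


# Hypothetical stand-ins for two independently deployed model functions.
def model_a(features: List[float]) -> float:
    return sum(features) / len(features)


def model_b(features: List[float]) -> float:
    return max(features)


VARIANTS: Dict[str, Callable[[List[float]], float]] = {"A": model_a, "B": model_b}


def choose_variant(user_id: str, share_b: float = 0.5) -> str:
    """Deterministically map a user to a variant so repeated calls agree."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "B" if bucket < share_b else "A"


def ab_handler(event: dict, context: object = None) -> dict:
    """Route the request to one variant and report which one answered."""
    variant = choose_variant(event["user_id"], event.get("share_b", 0.5))
    prediction = VARIANTS[variant](event["features"])
    return {"variant": variant, "prediction": prediction}
```

Hashing the user identifier, rather than drawing a random number per request, keeps each user on the same variant across calls, which makes the comparison between the two models easier to interpret.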
  12. A real example: Serverless ML @ Cloud Academy
     - Multi-model architecture
     - RESTful interface for each ML model
     - 1 Lambda Function for each ML model
     - S3 + RDS for storage
     - Periodic training (offline)
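As a rough illustration of the "1 Lambda Function for each ML model" idea on slide 12, the sketch below shows a handler that loads a model artifact once and reuses it across warm invocations. It is not Cloud Academy's actual code. Everything here is an assumption made for the example: the event is assumed to arrive in the API Gateway proxy format (a JSON string under `body`), the model is a toy linear model stored as JSON, and `fetch_model_bytes` is a hypothetical placeholder for downloading the offline-trained artifact from storage such as S3.

```python
import json
from typing import Callable, Dict, List

# Module-level cache: survives between invocations while the container is warm.
_MODEL_CACHE: Dict[str, dict] = {}


def fetch_model_bytes(model_key: str) -> bytes:
    """Hypothetical placeholder for downloading a trained artifact from storage."""
    return json.dumps({"weights": [0.5, 0.25], "bias": 1.0}).encode("utf-8")


def get_model(model_key: str, fetch: Callable[[str], bytes] = fetch_model_bytes) -> dict:
    """Load the model on first use only, then serve it from the cache."""
    if model_key not in _MODEL_CACHE:
        _MODEL_CACHE[model_key] = json.loads(fetch(model_key).decode("utf-8"))
    return _MODEL_CACHE[model_key]


def predict(model: dict, features: List[float]) -> float:
    """Toy linear model: bias plus the weighted sum of the features."""
    return model["bias"] + sum(w * x for w, x in zip(model["weights"], features))


def lambda_handler(event: dict, context: object = None) -> dict:
    """Entry point for one model; the key below is an illustrative name."""
    body = json.loads(event["body"])
    model = get_model("models/linear-v1.json")
    result = {"prediction": predict(model, body["features"])}
    return {"statusCode": 200, "body": json.dumps(result)}
```

Because training happens offline, the function only ever reads the artifact. Publishing a newly trained model then amounts to writing a new artifact and pointing the function at its key.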
  13. Limitations of Serverless ML
     AWS Lambda:
     - No real-time models (only pseudo real-time)
     - Deployment package management: size limit and OS libraries
     - Not suitable for model training yet (5 min max execution time)
     - Cold start time is long and hard to avoid
     - Unit/integration tests help, but not enough
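The execution-time limit mentioned on slide 13 can be worked around for batch scoring, at least in principle, by stopping before the deadline and reporting where to resume. The sketch below assumes the Lambda context object's `get_remaining_time_in_millis()` method. `FakeContext`, `score_until_deadline` and `safety_margin_ms` are hypothetical names introduced only so the example runs locally. It does not make long training jobs feasible; it only shows how a function might exit cleanly.

```python
from typing import Any, Callable, Dict, Iterable, List


class FakeContext:
    """Local stand-in for the Lambda context object (hypothetical helper)."""

    def __init__(self, remaining_ms_values: Iterable[int]) -> None:
        self._values = iter(remaining_ms_values)

    def get_remaining_time_in_millis(self) -> int:
        return next(self._values)


def score_until_deadline(
    items: List[Any],
    predict: Callable[[Any], Any],
    context: Any,
    safety_margin_ms: int = 10_000,
) -> Dict[str, Any]:
    """Score items in order, stopping early when little execution time is left."""
    done: List[Any] = []
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            # Report where a follow-up invocation should resume.
            return {"results": done, "next_index": index}
        done.append(predict(item))
    return {"results": done, "next_index": None}
```

A caller could re-invoke the function with `next_index` as the starting offset until it comes back as `None`.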