Cloud Academy & AWS: how we use Amazon Web Services for machine learning and data collection

Speak with Alex Casalboni, Roberto Turrin, and Luca Baroffio from the Engineering team at Cloud Academy, and learn how they use AWS to handle daily challenges and build a machine learning system.

Alex Casalboni

April 27, 2016

Transcript

  1. Cloud Academy & AWS: how we use Amazon Web Services for machine learning and data collection
     cloudacademy.com, 4/27/2016
  2. About us
     Alex Casalboni, Sr. Software Engineer, @alex_casalboni
     Roberto Turrin, Sr. Data Scientist (PhD), @robytur
     Luca Baroffio, Data Scientist (PhD), @lucabaroffio
     clda.co/webinar-ML
  3. What is Machine Learning (ML)?
     Back to 1959 (A. Samuel)
     Decision problems that can be modeled from data
  4. Machine Learning pipeline
     Diagram labels: Training, Prediction, batch, real-time, Feature extraction, batch, data, information, features, ML models
  5. Machine Learning taxonomy
     Supervised Learning: classification, regression
     Unsupervised Learning
     Figure labels: "?", "170 cm"
  6. Machine Learning taxonomy
     Supervised Learning
     Unsupervised Learning: clustering, rule extraction
     Figure labels: "group A", "group B", "A, B", "C"
  7. What problems can ML solve for you?
     Supervised Learning: classification, regression
     Unsupervised Learning: clustering, rule extraction
  8. What problems can ML solve for you?
     Supervised Learning: classification, regression
     Unsupervised Learning: clustering, rule extraction
     Examples: fraud detection, price of a stock over time, purchase likelihood, user segmentation
  9. Word cloud: Learning, Data, Machine, Cloud, Big, Science, Information, Internet, Statistics, Technology, Python, Future, Mining, Social, Deep, IOT, Algorithms, Management, Storage, Petabytes, Parallel, Network, Privacy, Million, NoSQL, PaaS, SQL, Database, Exabytes, Billion, Dataset, Hadoop, R
  10. Machine learning and Big data
      "90% of the data in the world today has been created in the last two years alone" - IBM
      "300+ hours worth of video content is being uploaded to the site every minute" - YouTube
  11. Big data challenges
      This much data can't be manually inspected
      Data-driven decisions
      Distributed/parallel computing
      The curse of dimensionality
  12. Why is deploying ML models a challenge?
      1. Prototyping != Production-ready
      2. We need Elasticity
      3. Too many nice-to-have features
      4. Avoid lack of ownership
  13. Where is the lack of ownership?
      Data Scientist != DevOps
      Data Scientist: Machine Learning, Data mining, Statistical analysis
      DevOps: System administration, (Cloud) Operations, Software engineering
  14. Many options and tools offered by AWS
      ELB, Auto Scaling, Elastic Beanstalk, Amazon ML, ECS, EMR, Lambda, EC2, API Gateway
  15. Serverless computing to the rescue! (AWS Lambda)
      Transparent scalability, elasticity and availability
      Developer-friendly maintenance (versioning + aliases)
      Event-driven approach & never pay for idle
      1 function = 1 model
      A/B testing via composition
  16. How is "Serverless" possible?
      There is always a server somewhere, you just don't have to worry about it :)
  17. AWS Lambda + Amazon API Gateway
      RESTful & auth layer
      Global CDN and caching (CloudFront)
      Staging & versioning & mocking
      API Decoupling
  18. AWS Lambda limitations
      No real-time models (only pseudo real-time)
      Deployment package management: size limit and OS libraries
      Not suitable for model training yet (5 min max execution time)
  19. What about Amazon Machine Learning?
      One of the first MLaaS solutions (1 year old)
      Great service for classification and regression
      Only linear models (linear & logistic regression + SGD)
      No support for advanced scenarios yet (collaborative recommendation, multimedia, online learning, etc.)
  20. Key Takeaways
      Data-driven decisions and user-centered ML will make your product smarter
      Maximize ownership by removing obstacles between prototype and production
      Eliminate tradeoffs between high scalability and nice-to-have features
      Go Serverless and stop worrying about Ops
      MLaaS makes your life even simpler, unless you need more control
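
The "1 function = 1 model" and "A/B testing via composition" points from the AWS Lambda slide can be illustrated with a small sketch. This is not the speakers' code: the model parameters, feature names (`visits`, `cart_items`, `user_id`), and the even/odd routing rule are all hypothetical, and the event and response shapes assume an API Gateway integration that passes the request body as a JSON string and expects `statusCode` and `body` back. A tiny logistic-regression scorer stands in for a real trained model so the example has no dependencies.

```python
import json
import math

# Hypothetical model parameters (illustrative values only). In a real
# deployment they would ship in the package or be loaded at cold start.
MODEL_A = {"bias": -1.0, "weights": {"visits": 0.8, "cart_items": 1.5}}
MODEL_B = {"bias": -0.5, "weights": {"visits": 0.4, "cart_items": 2.0}}


def predict(model, features):
    """Return a logistic-regression score in (0, 1) for a dict of features."""
    z = model["bias"] + sum(
        weight * float(features.get(name, 0.0))
        for name, weight in model["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def make_handler(model):
    """Build a Lambda-style handler (event, context) bound to one model."""
    def handler(event, context):
        # Assumed event shape: the request body arrives as a JSON string.
        features = json.loads(event.get("body") or "{}")
        score = predict(model, features)
        return {"statusCode": 200, "body": json.dumps({"score": score})}
    return handler


# One function per model: each could be deployed as its own Lambda function.
handler_a = make_handler(MODEL_A)
handler_b = make_handler(MODEL_B)


def ab_handler(event, context):
    """A/B test by composition: delegate each request to one of two handlers."""
    features = json.loads(event.get("body") or "{}")
    user_id = str(features.get("user_id", ""))
    # Hypothetical stable split: even character-code sum -> A, odd -> B.
    bucket = sum(ord(ch) for ch in user_id) % 2
    chosen = handler_a if bucket == 0 else handler_b
    response = chosen(event, context)
    # Tag the response so the two variants can be compared later.
    body = json.loads(response["body"])
    body["variant"] = "A" if bucket == 0 else "B"
    response["body"] = json.dumps(body)
    return response
```

Calling `ab_handler({"body": json.dumps({"user_id": "ab", "visits": 1, "cart_items": 1})}, None)` locally returns a response dict whose body carries the score and the variant that served it. Keeping each model behind its own handler means the router can be changed or removed without touching the models.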