AI in the Cloud, So seductive, but so overwhelming !

dicaormu
November 28, 2019

AI in the Cloud is one of the major technology shifts currently under way in Data Science. All the major Cloud providers now promise to democratize access to Data Science and to provide tools that ease the development and industrialization of Machine Learning models, whether or not you are a Data Scientist.

But things are clearly not that simple, and many pitfalls stand in the way of using the full potential of AI in the Cloud. In this talk, we present the main pitfalls and the key lessons to draw from them so that your Cloud AI projects succeed.

Transcript

  1. "Do machine learning like the great engineer you are, not like the great machine learning expert you aren’t."
    Martin Zinkevich (Research Scientist at Google)
  2. We are in the cloud!
    • Easily create ML workflows
    • Easy to scale up and down depending on service demand
    • Service availability, cutting out the high cost of hardware
    • Access to the latest technologies, machines and updates
    • Access to managed services that you can use easily
    • Pay for what you use
  3. Pipeline of a data science project
    Identify the problem -> Collect & explore data -> Train & test models -> Deploy & use models
  4. 1. Do I need the cloud all the time?
    • Is your use case strategic for the business?
    • Is it technically feasible? Do I need to prototype first?
    LESSON 1: Don’t be afraid to launch or test your product locally.
  5. 2. What are my options?
    • Is there a managed service doing what I want to do?
    LESSON 2: Explore existing managed services. It is difficult to create new algorithms. It is easier to use existing ones.
  6. 2. What are my options?
    • Am I a provider of the service I’m creating?
    • Do I want to be cloud-agnostic?
      ◦ If so, what would I have to do to reimplement what already exists?
    LESSON 2: Explore existing managed services. It is difficult to create new algorithms. It is easier to use existing ones.
  7. Examples of AWS machines
    Instance type   Virtual CPUs   Mem (G)   Price (hour)
    p3.2xlarge      8              61        $3.06
    p3.8xlarge      32             244       $12.24
    p3.16xlarge     64             488       $24.48
    p2.xlarge       4              61        $0.90
    p2.8xlarge      32             488       $7.20
    p2.16xlarge     64             768       $14.40
  8. Examples of GCP machines with GPU
    GPU type   Mem/GPU   GPUs         Price/GPU (hour)
    V100       16        1, 8         $2.48
    P100       16        1, 2, 4      $1.46
    K80        12        1, 2, 4, 8   $0.45
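
The per-GPU prices on the GCP slide can be turned into a rough job cost with a one-line multiplication. The following is a minimal sketch, not a pricing tool: the prices and allowed GPU counts are copied from the slide (2019 figures), the function name `gpu_cost` is hypothetical, and the result covers GPUs only, ignoring the host machine, storage and network.

```python
# Hourly price per GPU in dollars, copied from the GCP slide (2019 figures).
GCP_PRICE_PER_GPU_HOUR = {"V100": 2.48, "P100": 1.46, "K80": 0.45}

# Number of GPUs each type can be attached in, also copied from the slide.
GCP_ALLOWED_GPU_COUNTS = {"V100": (1, 8), "P100": (1, 2, 4), "K80": (1, 2, 4, 8)}


def gpu_cost(gpu_type: str, gpu_count: int, hours: float) -> float:
    """Return the GPU-only cost in dollars of `gpu_count` GPUs for `hours` hours."""
    if gpu_count not in GCP_ALLOWED_GPU_COUNTS[gpu_type]:
        raise ValueError(f"{gpu_type} is not listed with {gpu_count} GPUs")
    return round(GCP_PRICE_PER_GPU_HOUR[gpu_type] * gpu_count * hours, 2)


if __name__ == "__main__":
    # Hypothetical 10-hour job on the largest listed configuration of each type.
    for gpu_type, counts in GCP_ALLOWED_GPU_COUNTS.items():
        n = max(counts)
        print(f"{n} x {gpu_type} for 10 h: ${gpu_cost(gpu_type, n, 10):.2f}")
```

Comparing the totals side by side makes the question "do I really need the latest GPU?" concrete before any machine is started.
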
  9. A new paradigm for choosing your machines
    Paradigm #1: Storage VS compute
    • Dissociating storage from computing capabilities
    • Computing capabilities become ephemeral
  10. A new paradigm for choosing your machines
    Paradigm #1: Storage VS compute
    • Dissociating storage from computing capabilities
    • Computing capabilities become ephemeral
    Paradigm #2: Notebook VS training VS inference
    • Dissociating the machine that runs the notebook instance from the machines needed for training models and for inference
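
A small piece of arithmetic shows why ephemeral compute matters. The sketch below assumes the p3.2xlarge price of $3.06 per hour from the AWS slide, a simplified 30-day month, and a hypothetical 40 hours of actual training per month; the function name `monthly_cost` is illustrative.

```python
HOURS_PER_MONTH = 24 * 30  # simplification: a 30-day month

P3_2XLARGE_PRICE = 3.06  # dollars per hour, from the AWS slide (2019 figure)


def monthly_cost(price_per_hour: float, hours_used: float) -> float:
    """Return the monthly compute cost in dollars for the hours actually billed."""
    return round(price_per_hour * hours_used, 2)


# Machine left running all month, e.g. because the notebook lives on it.
always_on = monthly_cost(P3_2XLARGE_PRICE, HOURS_PER_MONTH)

# Machine started only for training jobs (hypothetical 40 hours per month).
ephemeral = monthly_cost(P3_2XLARGE_PRICE, 40)

if __name__ == "__main__":
    print(f"always on: ${always_on:.2f}  ephemeral: ${ephemeral:.2f}")
```

Under these assumptions, the GPU machine is billed for 720 hours when it also hosts the notebook, against 40 hours when the notebook runs on a separate, cheaper instance.
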
  11. 3. Save energy... and money
    • Are there machines or Docker images ready to use?
    • Do I have to use all services?
    • Do I really need to use the latest trendy GPU?
    • Do I even need a GPU?
    LESSON 3: Reuse existing images. Embrace new paradigms for machine sizing.
  12. 4. Think about versioning
    • Do you have to leave all your notebooks open?
      ◦ You can version your notebooks in GitHub with SageMaker
    • Are you going to save all your models?
      ◦ You need to iterate to create your models. Storage has a cost.
    • Stop everything you don't use!
    LESSON 4: Versioning is not just about the code. Manage your notebooks and models like the great craftsman you are.
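
One way to read "versioning is not just about the code" is to record, for every model you keep, which code and which data produced it. The following is a minimal sketch using only the Python standard library. It is not a feature of SageMaker or of any provider; the function names, field names and example values are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def model_fingerprint(model_bytes: bytes) -> str:
    """Return a SHA-256 digest that identifies one serialized model artifact."""
    return hashlib.sha256(model_bytes).hexdigest()


def build_manifest_entry(model_bytes: bytes, code_version: str,
                         data_version: str, metrics: dict) -> dict:
    """Describe a trained model: what it is, and which code and data produced it."""
    return {
        "model_sha256": model_fingerprint(model_bytes),
        "code_version": code_version,   # e.g. a Git commit hash
        "data_version": data_version,   # e.g. a dataset snapshot label
        "metrics": metrics,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    # Hypothetical values; in practice model_bytes is the serialized model file.
    entry = build_manifest_entry(b"serialized-model", "abc123",
                                 "sales-2019-11", {"accuracy": 0.9})
    print(json.dumps(entry, indent=2))
```

A manifest like this also helps decide which stored models can be deleted, since storage has a cost.
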
  13. Managed VS In-House
    • Managed
      ◦ Advantages: pre-tested algorithms/models; optimisation; external infrastructure; auto-scalable
      ◦ Drawbacks: vendor lock-in; no control of the training dataset
    • In-house
      ◦ Advantages: can be installed / migrated on any provider; total control of the generated models
      ◦ Drawbacks: you need to think about infrastructure; more time spent on model and infrastructure optimisation
  14. Choose your customisation needs
    From most flexible to easiest to use:
    • Custom implementations on dedicated machines
    • Custom algorithms within a platform
    • Dedicated algorithms within a platform
    • Fully managed services
  15. 5. Benefit from Cloud Provider in-house implementations
    • On vendor lock-in: choose your fights!
    • Specify your customisation needs and select the most suitable strategy
    LESSON 5: There is a tradeoff between delegation and implementation. Focus on your strategic advantage. Don’t reinvent the wheel for the rest.
  16. Some days later...
    • I can’t access my endpoint
    • Why is it so slow?
    • How much is it going to cost me?
  17. 6. Think about observability
    • Check your endpoint status
    • Failed jobs
    • How many invocations do you have?
    • Latency
    • Memory, disk and CPU
    • Logs
    LESSON 6: You should be able to see at all times what is going on with your AI model, with your pipeline and with the resources you are using.
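
Several of the signals listed above (invocation count, failures, latency) can be summarised from raw invocation records. The sketch below assumes you can export such records from your provider's monitoring service. The `Invocation` type and `summarize` function are hypothetical, and the demo data is synthetic, not a measurement.

```python
from dataclasses import dataclass
from statistics import quantiles
from typing import Dict, List


@dataclass
class Invocation:
    latency_ms: float  # time taken to answer one request
    ok: bool           # False if the endpoint returned an error


def summarize(invocations: List[Invocation]) -> Dict[str, float]:
    """Summarise endpoint health: volume, error rate and 95th percentile latency."""
    if len(invocations) < 2:
        raise ValueError("need at least two invocations to compute percentiles")
    latencies = [inv.latency_ms for inv in invocations]
    failures = sum(1 for inv in invocations if not inv.ok)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    p95 = quantiles(latencies, n=100, method="inclusive")[94]
    return {
        "invocations": len(invocations),
        "error_rate": failures / len(invocations),
        "p95_latency_ms": p95,
    }


if __name__ == "__main__":
    # Synthetic records: latencies of 1..100 ms, every 20th call failing.
    demo = [Invocation(latency_ms=float(i), ok=(i % 20 != 0)) for i in range(1, 101)]
    print(summarize(demo))
```

A percentile is used rather than the mean because a few very slow requests are exactly what "why is it so slow?" is about, and a mean tends to hide them.
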
  18. Take Away: before starting your project
    Do I need the cloud all the time?
    Don’t be afraid to launch or test your product locally.
  19. Take Away: before starting your project
    What are my options?
    Explore existing managed services. It is difficult to create new algorithms. It is easier to use existing ones.
  20. Take Away: while implementing your project
    Save energy... and money
    Reuse existing images. Embrace new paradigms for machine sizing.
  21. Take Away: while implementing your project
    Think about versioning
    Versioning is not just about the code. Manage your notebooks and models like the great craftsman you are.
  22. Take Away: while implementing your project
    Choose your fights
    There is a tradeoff between delegation and implementation. Focus on your strategic advantage. Don’t reinvent the wheel for the rest.
  23. Take Away: while running in production
    Think about observability
    You should be able to see at all times what is going on with your AI model, with your pipeline and with the resources you are using.
  24. AI in the Cloud is very seductive and can help solve many problems... But AI in the Cloud, without Software Engineering best practices, is just a buzzword.
    Diana Ortega & Yoann Benoit (XebiCon 2019)