GCP for data scientists

Giulia
November 08, 2018

15-minute presentation given at DevFest Toulouse 2018.

An overview of GCP services that can be used for an ML task, illustrated through a use case.


Transcript

  1. GCP for Data Scientists, @Giuliabianchl, 8 Nov 2018

  2. @Giuliabianchl, giulbia, Data Scientist

  4. Google announcement; GCP Podcast episode 117

  5. Democratising the cloud

  6. What's the cloud?

  7. Twitter

  8. What I know about the cloud
    • Remote physical machines
    • Managed by someone else → no need to care about the infrastructure and architecture
    • Pay for usage → not for rent
  9. Why

  10. Why
    • Access to theoretically unlimited resources
    • Fast provisioning
    • Reliability
    → PROCESS DATA FASTER!
    How reliable is Google Cloud
  11. What

  12. GCP tools for machine learning
    Cloud AutoML Vision
    Image from conceptdraw
  13. Cloud ML Engine (ml-engine-docs)

  14. How

  15. Machine Learning in one slide
    • Automatically learn a set of rules from a known dataset → TRAINING
    • Assess new data → PREDICTION
    • Data have to be processed before training and prediction → FEATURE ENGINEERING
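
The three steps on the slide above can be illustrated with a toy example. This is only a sketch under loud assumptions: the features (mean and peak amplitude), the nearest-centroid rule, and the toy signals are all invented for illustration and are not taken from the talk or from the baby cry project.

```python
from statistics import mean


def extract_features(signal):
    """FEATURE ENGINEERING: summarise a raw signal as (mean, peak amplitude)."""
    return (mean(signal), max(abs(x) for x in signal))


def train(signals, labels):
    """TRAINING: learn one centroid per label from a known dataset."""
    by_label = {}
    for signal, label in zip(signals, labels):
        by_label.setdefault(label, []).append(extract_features(signal))
    return {
        label: tuple(mean(column) for column in zip(*features))
        for label, features in by_label.items()
    }


def predict(model, signal):
    """PREDICTION: assign new data to the label with the nearest centroid."""
    features = extract_features(signal)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


# Toy data, invented for illustration only.
model = train(
    signals=[[0.1, -0.1, 0.0], [0.2, -0.2, 0.1], [0.9, -0.8, 0.7], [1.0, -0.9, 0.8]],
    labels=["quiet", "quiet", "cry", "cry"],
)
print(predict(model, [0.8, -0.9, 0.9]))
```

Note that the same `extract_features` function runs before both training and prediction, which is the point of the third bullet.
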
  16. Baby cry project
    • Train a model to detect a baby crying and start a lullaby if need be
    • Feature engineering and model training locally
    • Recording, feature engineering and prediction on Raspberry Pi
    giulbia/baby_cry_detection
    https://www.youtube.com/watch?v=N-LXrheCIKM
  17. Baby cry project
    • Train a model to detect a baby crying and start a lullaby if need be
    • Feature engineering and model training locally
    • Recording, feature engineering and prediction on Raspberry Pi
    giulbia/baby_cry_detection
    https://www.youtube.com/watch?v=N-LXrheCIKM
    It takes about 45 seconds...
  18. The intuition
    • Recording on Raspberry Pi
    • Feature engineering and prediction in the cloud
  20. • FE → feature engineering
    • P → prediction
    • Maj. vote & Final pred. → final prediction is positive iff at least 3 subsequences have a positive prediction
    Diagram: five FE steps, five P steps, then Majority vote and Final prediction
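
The majority-vote rule on the slide above can be sketched as follows. The function name, the 0/1 encoding of per-subsequence predictions, and the `min_positive` parameter are assumptions made for illustration; only the "at least 3 positive subsequences" rule comes from the slide.

```python
def final_prediction(sub_predictions, min_positive=3):
    """Return True iff at least `min_positive` subsequences were predicted positive.

    `sub_predictions` holds one 0/1 prediction per subsequence (assumed encoding).
    """
    positives = sum(1 for p in sub_predictions if p == 1)
    return positives >= min_positive


# Five subsequences, as in the diagram.
print(final_prediction([1, 1, 1, 0, 0]))
print(final_prediction([1, 1, 0, 0, 0]))
```

With five subsequences, a threshold of 3 is a strict majority, so a single noisy subsequence cannot flip the final prediction on its own.
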
  21. Diagram: five FE steps, five P steps, then Majority vote and Final prediction
    QUESTIONS
    - Dependencies?
    - How to trigger each step?
    - How to send the answer back to the Raspberry Pi?
  22. Diagram labels: FE, P, Majority vote, Final prediction, MV + FP
  23. The code that works: giulbia/baby_cry_rpi, giulbia/gcp-rpi

  24. The code that works
    • Background function
    • Trigger type → Cloud Storage
      ◦ Bucket → parenting-3-recording
      ◦ Event type → Finalize/Create
    • Dependencies → requirements.txt
    • Runtime → Python 3.7
    • Easy access to Cloud ML Engine API
    giulbia/baby_cry_rpi, giulbia/gcp-rpi
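
A minimal sketch of what such a Cloud Storage-triggered background function might look like. It is not the code from giulbia/gcp-rpi. The helper names (`ml_engine_predictor`, `handle_recording`, `extract_features`), the object name `recording.wav`, and the model path placeholder are hypothetical. The `(event, context)` signature and the `bucket` / `name` event keys are what the Python runtime passes to a Cloud Storage background function, and the prediction call assumes `google-api-python-client` is listed in requirements.txt.

```python
def ml_engine_predictor(model_path):
    """Build a predict callable backed by Cloud ML Engine online prediction.

    `model_path` is expected to look like "projects/<project>/models/<model>"
    (placeholder values, to be replaced with real ones).
    """
    # Lazy import so the handler below can be exercised without the client installed.
    from googleapiclient import discovery

    service = discovery.build("ml", "v1")

    def predict(instances):
        request = service.projects().predict(
            name=model_path, body={"instances": instances}
        )
        return request.execute()["predictions"]

    return predict


def handle_recording(event, context, extract_features, predict):
    """Handle one Finalize/Create event: featurise the new object, then predict.

    `extract_features(bucket, name)` is a hypothetical helper that downloads the
    recording and returns a list of feature vectors, one per subsequence.
    """
    bucket, name = event["bucket"], event["name"]
    instances = extract_features(bucket, name)
    return predict(instances)


# Local dry run with fakes instead of real GCP calls (illustrative values only).
fake_event = {"bucket": "parenting-3-recording", "name": "recording.wav"}
result = handle_recording(
    fake_event,
    None,
    extract_features=lambda bucket, name: [[0.0]] * 5,
    predict=lambda instances: [1] * len(instances),
)
print(result)
```

In a deployed function, the entry point would be a thin `(event, context)` wrapper that calls `handle_recording` with a real feature extractor and `ml_engine_predictor(...)`. Passing the two steps in as arguments keeps the trigger logic testable without cloud credentials.
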
  25. giulbia/baby_cry_rpi, giulbia/gcp-rpi. It takes about 5 seconds!

  26. Conclusion

  27. • Huge potential for data science
    • A data scientist can help exploit it to the fullest
    • It's not data-scientist-proof yet
  28. Questions will be taken in the hallway, thank you
    Icons from kisspng or made by Freepik from flaticon
    @Giuliabianchl
  29. Annex
    • Online resources
    • MOOC Coursera - Data Engineering on Google Cloud Platform
    • GCP documentation
    • Google Cloud Platform Podcast