Slide 1

Slide 1 text

GCP for Data Scientists
@Giuliabianchl
8 Nov 2018

Slide 2

Slide 2 text

@Giuliabianchl
giulbia
Data Scientist

Slide 3

Slide 3 text


Slide 4

Slide 4 text

Google announcement
GCP Podcast episode 117

Slide 5

Slide 5 text

Democratising the cloud

Slide 6

Slide 6 text

What's the cloud?

Slide 7

Slide 7 text

Twitter

Slide 8

Slide 8 text

What I know about the cloud
● Remote physical machines
● Managed by someone else → no need to care about the infrastructure and architecture
● Pay per use → not a fixed rent

Slide 9

Slide 9 text

Why

Slide 10

Slide 10 text

Why
● Access to theoretically unlimited resources
● Fast provisioning
● Reliability
→ PROCESS DATA FASTER!
How reliable is Google Cloud

Slide 11

Slide 11 text

What

Slide 12

Slide 12 text

GCP tools for machine learning
From conceptdraw
Cloud AutoML Vision

Slide 13

Slide 13 text

Cloud ML Engine
ml-engine-docs

Slide 14

Slide 14 text

How

Slide 15

Slide 15 text

Machine Learning in one slide
● Automatically learn a set of rules from a known dataset → TRAINING
● Assess new data → PREDICTION
● Data has to be processed before training and prediction → FEATURE ENGINEERING
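The three steps on this slide can be illustrated with a toy sketch. Everything in it is a hypothetical illustration, not the baby cry project's actual pipeline: the `mean_abs_amplitude` feature, the midpoint-threshold "model" and the sample values are all assumptions chosen to keep the example small.

```python
from statistics import mean


def mean_abs_amplitude(signal):
    """FEATURE ENGINEERING: reduce a raw signal to one number (hypothetical feature)."""
    return mean(abs(x) for x in signal)


def train(signals, labels):
    """TRAINING: learn a threshold halfway between the two class means.

    Assumes positive examples (label 1) have the larger feature value.
    """
    feats = [mean_abs_amplitude(s) for s in signals]
    pos = mean(f for f, y in zip(feats, labels) if y == 1)
    neg = mean(f for f, y in zip(feats, labels) if y == 0)
    return (pos + neg) / 2


def predict(threshold, signal):
    """PREDICTION: apply the same feature engineering, then the learned rule."""
    return int(mean_abs_amplitude(signal) > threshold)


# Made-up training data: label 1 = loud, label 0 = quiet.
train_signals = [[0.9, -0.8, 0.7], [0.8, -0.9, 1.0], [0.1, -0.1, 0.0], [0.2, 0.0, -0.1]]
train_labels = [1, 1, 0, 0]
threshold = train(train_signals, train_labels)
```

The point of the sketch is only that the same feature engineering function is applied both before training and before prediction.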

Slide 16

Slide 16 text

Baby cry project
● Train a model to detect a baby crying and start a lullaby if need be
● Feature engineering and model training locally
● Recording, feature engineering and prediction on Raspberry Pi
giulbia/baby_cry_detection
https://www.youtube.com/watch?v=N-LXrheCIKM

Slide 17

Slide 17 text

Baby cry project
● Train a model to detect a baby crying and start a lullaby if need be
● Feature engineering and model training locally
● Recording, feature engineering and prediction on Raspberry Pi
giulbia/baby_cry_detection
https://www.youtube.com/watch?v=N-LXrheCIKM
It takes about 45 seconds...

Slide 18

Slide 18 text

The intuition
● Recording on Raspberry Pi
● Feature engineering and prediction in the cloud
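One way the Raspberry Pi could hand a recording over to the cloud is by uploading it to a Cloud Storage bucket, which matches the Cloud Storage trigger described on slide 24. The sketch below assumes the `google-cloud-storage` client library is installed on the Pi. The function name `upload_recording` and the injectable `client` parameter are hypothetical, added so the logic can be tried without credentials.

```python
import os


def upload_recording(local_path, bucket_name, client=None):
    """Upload a local recording to a bucket and return its gs:// URI.

    `client` is injectable for local testing (an assumption of this sketch).
    By default a real google.cloud.storage.Client is created.
    """
    if client is None:
        from google.cloud import storage  # requires google-cloud-storage

        client = storage.Client()
    blob_name = os.path.basename(local_path)
    blob = client.bucket(bucket_name).blob(blob_name)
    blob.upload_from_filename(local_path)
    return "gs://{}/{}".format(bucket_name, blob_name)
```

On the Pi this might be called as `upload_recording("/tmp/recording.wav", "parenting-3-recording")`, where the local path is a made-up example and the bucket name is the one shown on slide 24.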

Slide 19

Slide 19 text


Slide 20

Slide 20 text

FE → feature engineering
P → prediction
Maj. vote & Final pred. → the final prediction is positive iff at least 3 subsequences have a positive prediction
[Diagram: five FE steps, five P steps, Majority vote, Final prediction]
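The voting rule on this slide can be sketched in a few lines. The function name `final_prediction` and the `min_positive` parameter are hypothetical. The threshold of 3 comes from the slide, and predictions are assumed to be encoded as 1 (positive) or 0 (negative).

```python
def final_prediction(sub_predictions, min_positive=3):
    """Return 1 if at least `min_positive` subsequence predictions are positive, else 0."""
    positives = sum(1 for p in sub_predictions if p == 1)
    return int(positives >= min_positive)
```

For example, `final_prediction([1, 0, 1, 1, 0])` returns 1, while `final_prediction([1, 0, 0, 1, 0])` returns 0.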

Slide 21

Slide 21 text

[Diagram: five FE steps, five P steps, Majority vote, Final prediction]
QUESTIONS
- Dependencies?
- How to trigger each step?
- How to send the answer back to the Raspberry Pi?

Slide 22

Slide 22 text

[Diagram: FE and P steps, Majority vote, Final prediction, MV + FP]

Slide 23

Slide 23 text

The code that works
giulbia/baby_cry_rpi
giulbia/gcp-rpi

Slide 24

Slide 24 text

The code that works
● Background function
● Trigger type → Cloud Storage
  ○ Bucket → parenting-3-recording
  ○ Event type → Finalize/Create
● Dependencies → requirements.txt
● Runtime → Python 3.7
● Easy access to Cloud ML Engine API
giulbia/baby_cry_rpi
giulbia/gcp-rpi
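A background function with this configuration could look roughly like the sketch below. It is not the code from the repositories named on this slide. The project and model names, the `load_features` helper and the assumption that the model returns 0/1 predictions are all hypothetical. The `(event, context)` signature and the `bucket` and `name` event fields are what Cloud Functions passes for a Cloud Storage trigger, and the Cloud ML Engine call assumes `google-api-python-client` is listed in requirements.txt.

```python
def majority_vote(predictions, min_positive=3):
    """Final prediction is positive iff at least `min_positive` predictions are 1."""
    return int(sum(predictions) >= min_positive)


def make_ml_engine_predictor(project, model):
    """Build a predict(instances) callable backed by Cloud ML Engine online prediction."""
    from googleapiclient import discovery  # imported lazily; needs google-api-python-client

    service = discovery.build("ml", "v1")
    name = "projects/{}/models/{}".format(project, model)

    def predict(instances):
        response = service.projects().predict(
            name=name, body={"instances": instances}
        ).execute()
        if "error" in response:
            raise RuntimeError(response["error"])
        return response["predictions"]

    return predict


def load_features(bucket, name):
    """Hypothetical helper: download the recording and return one feature
    vector per subsequence. The real feature engineering is project-specific."""
    raise NotImplementedError


def handle_recording(bucket, name, load_features, predict):
    """Core logic, kept free of GCP imports so it can be tested locally."""
    instances = load_features(bucket, name)
    predictions = predict(instances)  # assumed to be a list of 0/1 values
    return {
        "file": "gs://{}/{}".format(bucket, name),
        "crying": majority_vote(predictions),
    }


def on_recording_finalized(event, context):
    """Entry point for a Cloud Storage finalize event."""
    predict = make_ml_engine_predictor("my-project", "baby-cry-model")  # hypothetical names
    result = handle_recording(event["bucket"], event["name"], load_features, predict)
    print(result)
```

Keeping `handle_recording` separate from the entry point is a design choice of this sketch: it lets the voting logic be exercised with stub functions before anything is deployed.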

Slide 25

Slide 25 text

giulbia/baby_cry_rpi
giulbia/gcp-rpi
It takes about 5 seconds!

Slide 26

Slide 26 text

Conclusion

Slide 27

Slide 27 text

● Huge potential for data science
● A data scientist can help make the most of it
● It's not data-scientist-proof yet

Slide 28

Slide 28 text

Questions are in the hallway, thank you
Icons from kisspng or made by Freepik from flaticon
@Giuliabianchl

Slide 29

Slide 29 text

Annex
● Online resources
● MOOC Coursera - Data Engineering on Google Cloud Platform
● GCP documentation
● Google Cloud Platform Podcast