Slide 1

Slide 1 text

Getting into Data Science
@MarcoBonzanini
Hisar Coding Summit 2021

Slide 2

Slide 2 text

Nice to meet you
• Data Science consultant: Natural Language Processing, Machine Learning, Data Engineering
• Corporate training: Python + Data Science
• PyData London chairperson

Slide 3

Slide 3 text

My Goals for Today
• Answer some questions: What is Data Science? What do Data Scientists do? (and more)
• Inspire some of you to learn more about Data Science

Slide 4

Slide 4 text

WHAT IS DATA SCIENCE

Slide 5

Slide 5 text


Slide 6

Slide 6 text


Slide 7

Slide 7 text

Data Value 🦄

Slide 8

Slide 8 text

Data Value 🦄 ???

Slide 9

Slide 9 text

Value:
• Insights
• Decision making
• Data products

Slide 10

Slide 10 text


Slide 11

Slide 11 text


Slide 12

Slide 12 text


Slide 13

Slide 13 text

Data Value 🦄 ???

Slide 14

Slide 14 text

🦄:
• Coding
• Math
• Modelling
• Visualisation
• Reporting

Slide 15

Slide 15 text

Source: Doing Data Science (Cathy O’Neil & Rachel Schutt, 2013)
[Diagram of the data science process, with these stages:]
• Raw Data
• Processing Data
• Clean Data
• Exploratory Analysis
• Models & Algorithms
• Communicate, Visualise, Report
• Data Product
• Decision Making

Slide 16

Slide 16 text

Source: https://medium.com/hackernoon/the-ai-hierarchy-of-needs-18f111fcc007

Slide 17

Slide 17 text

Source: http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

Slide 18

Slide 18 text

Computer Science?

Slide 19

Slide 19 text

Software Engineering?

Slide 20

Slide 20 text

Business Intelligence?

Slide 21

Slide 21 text

Do you need 3 PhDs?

Slide 22

Slide 22 text

🤝 Computer Science, Statistics, Domain Expertise

Slide 23

Slide 23 text

ARE WE ALL DATA SCIENTISTS?

Slide 24

Slide 24 text

Source: https://medium.com/hackernoon/the-ai-hierarchy-of-needs-18f111fcc007

Slide 25

Slide 25 text

Software Engineers, Data Engineers

Slide 26

Slide 26 text

ML Engineers

Slide 27

Slide 27 text

Data Analysts, Business Analysts

Slide 28

Slide 28 text

Data Scientists

Slide 29

Slide 29 text

Research Scientists

Slide 30

Slide 30 text

Data Scientists at small corp

Slide 31

Slide 31 text

DATA SCIENCE APPLICATIONS

Slide 32

Slide 32 text

Weather

Slide 33

Slide 33 text

https://www.youtube.com/watch?v=_3sVA-_zIrc

Slide 34

Slide 34 text

Healthcare

Slide 35

Slide 35 text

https://www.youtube.com/watch?v=B5n8Uavhl00

Slide 36

Slide 36 text

Biology

Slide 37

Slide 37 text

https://www.youtube.com/watch?v=_9x4cmQWZ6g

Slide 38

Slide 38 text

Journalism

Slide 39

Slide 39 text

https://www.youtube.com/watch?v=yPJhj855tvQ

Slide 40

Slide 40 text

Product Recommendations

Slide 41

Slide 41 text

https://www.youtube.com/watch?v=qpbELUmbDIk

Slide 42

Slide 42 text

Other Cool Stuff

Slide 43

Slide 43 text

https://www.youtube.com/watch?v=3k96HLqvhc0

Slide 44

Slide 44 text

https://www.youtube.com/watch?v=S9PcPbtTcPc

Slide 45

Slide 45 text

https://www.youtube.com/watch?v=UzGZtgu3PBM

Slide 46

Slide 46 text

DATA SCIENCE SKILLS

Slide 47

Slide 47 text

Getting into Data Science

Slide 48

Slide 48 text

Getting into Data Science
You 🦄 Data Scientist

Slide 49

Slide 49 text

Getting into Data Science
You 🦄 Data Scientist
What they tell you

Slide 50

Slide 50 text

Getting into Data Science
You 🦄 Data Scientist
How it feels

Slide 51

Slide 51 text

Getting into Data Science
“It depends”

Slide 52

Slide 52 text

Getting into Data Science
Where are you? Where do you want to go?

Slide 53

Slide 53 text

🤝 Computer Science, Statistics, Domain Expertise

Slide 54

Slide 54 text

Computer Science

Slide 55

Slide 55 text

Computer Science
• Basic coding in 1 language (e.g. Python)
• Data Manipulation
• Optional: out-of-the-box Machine Learning
• Database technologies (e.g. SQL, NoSQL, etc.)
• “Behind the scenes” of Machine Learning
• More programming languages (R, Scala, …), data processing tools (Spark, Elasticsearch, …), and other shiny toys
Start / Next
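
To make the "basic coding" and "data manipulation" bullets above concrete, here is a minimal Python sketch that uses only the standard library. The records, the field names, and the `mean_by_group` helper are all invented for this illustration and are not part of the talk.

```python
from collections import defaultdict

# Hypothetical toy records (invented for illustration, not from the talk).
RECORDS = [
    {"city": "London", "temp_c": 12.0},
    {"city": "Istanbul", "temp_c": 18.0},
    {"city": "London", "temp_c": 14.0},
    {"city": "Istanbul", "temp_c": None},  # a missing value to clean up
]


def mean_by_group(records, key, value):
    """Drop rows where `value` is missing, then average `value` per `key`."""
    groups = defaultdict(list)
    for row in records:
        if row[value] is not None:
            groups[row[key]].append(row[value])
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}


if __name__ == "__main__":
    print(mean_by_group(RECORDS, "city", "temp_c"))
```

The same filter, group, and aggregate steps are what a data manipulation library such as pandas would typically do for you on larger datasets; writing them by hand once is one way to practise the basics.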

Slide 56

Slide 56 text

Math / Stats

Slide 57

Slide 57 text

Math / Stats
• Basic descriptive statistics
• Data visualisation techniques
• Linear algebra (vector/matrix computation)
• Calculus
• Mathematical optimisation
• More advanced probability / stats
Start / Next
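
As a small illustration of the "basic descriptive statistics" bullet above, the sketch below summarises a list of numbers with Python's built-in `statistics` module. The sample values and the `describe` helper are assumptions made up for this example, not material from the talk.

```python
import statistics

# Hypothetical sample (invented for illustration): daily step counts,
# including one unusually active day.
STEPS = [4200, 5100, 4800, 12000, 5000, 4700, 5300]


def describe(values):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    for name, value in describe(STEPS).items():
        print(f"{name}: {value}")
```

In this toy sample the single large value pulls the mean above the median, which is one reason to look at more than one summary number before drawing conclusions.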

Slide 58

Slide 58 text

Domain Expertise

Slide 59

Slide 59 text

Domain Expertise
• Speak the language
• Basic data analysis
• Deeper domain understanding
• Communicate with business stakeholders (non-technical roles)
Start / Next

Slide 60

Slide 60 text

Soft Skills

Slide 61

Slide 61 text

Soft Skills
• They should be called “Core Skills” really
• Communication
• Storytelling
• Problem solving
• Learning to learn

Slide 62

Slide 62 text

What’s Next

Slide 63

Slide 63 text

What’s Next
1. Find a topic you like
2. Find a dataset about the topic *
3. 🦄
* links at the end

Slide 64

Slide 64 text

SUMMARY

Slide 65

Slide 65 text

Summary
• Data Science lets you work in any domain
• What kind of data scientist do you want to be?
• You don’t need to be an expert in everything

Slide 66

Slide 66 text

Resources
• Datasets: kaggle.com
• Datasets: archive.ics.uci.edu
• Datasets: “awesome data” on GitHub.com
• Book: Doing Data Science (O’Neil and Schutt)
• Videos: youtube.com/user/PyDataTV

Slide 67

Slide 67 text

Thank You
• Twitter: @MarcoBonzanini
• Blog: marcobonzanini.com
• Newsletter: marcobonzanini.com/newsletter
• Questions?