Getting into Data Science @ HisarCS 2021

Slides for my introduction to Data Science given at the Hisar Coding Summit 2021: http://event.hisarcs.com/en.html

Talk description:
This is a brief introduction to Data Science with an overview of some interesting problems that it can tackle. The talk is mainly aimed at students who want to know more about Data Science. It tries to answer the question "what do data scientists do all day?", to offer some insight into whether you should consider a career in Data Science and how to start building your Data Science skill set.

Marco Bonzanini

April 16, 2021

Transcript

  1. Getting into Data Science @MarcoBonzanini Hisar Coding Summit 2021

  2. Nice to meet you • Data Science consultant: Natural Language Processing, Machine Learning, Data Engineering • Corporate training: Python + Data Science • PyData London chairperson

  3. My Goals for Today • Answer some questions: What is Data Science? What do Data Scientists do? (and more) • Inspire some of you to learn more about Data Science

  4. WHAT IS DATA SCIENCE

  7. Data • Value • 🦄

  8. Data • Value • 🦄 ???

  9. Value: Insights • Decision making • Data products

  13. Data • Value • 🦄 ???

  14. 🦄: Coding • Math • Modelling • Visualisation • Reporting

  15. Raw Data • Processing Data • Clean Data • Exploratory Analysis • Models & Algorithms • Communicate, Visualise, Report • Data Product • Decision Making (Source: Doing Data Science, Cathy O’Neil & Rachel Schutt, 2013)

  16. Source: https://medium.com/hackernoon/the-ai-hierarchy-of-needs-18f111fcc007

  17. Source: http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

  18. Computer Science?

  19. Software Engineering?

  20. Business Intelligence?

  21. Do you need 3 PhDs?

  22. 🤝 Computer Science • Statistics • Domain Expertise

  23. ARE WE ALL DATA SCIENTISTS?

  24. Source: https://medium.com/hackernoon/the-ai-hierarchy-of-needs-18f111fcc007

  25. Software Engineers • Data Engineers

  26. ML Engineers

  27. Data Analysts • Business Analysts

  28. Data Scientists

  29. Research Scientists

  30. Data Scientists at small corp

  31. DATA SCIENCE APPLICATIONS

  32. Weather

  33. https://www.youtube.com/watch?v=_3sVA-_zIrc

  34. Healthcare

  35. https://www.youtube.com/watch?v=B5n8Uavhl00

  36. Biology

  37. https://www.youtube.com/watch?v=_9x4cmQWZ6g

  38. Journalism

  39. https://www.youtube.com/watch?v=yPJhj855tvQ

  40. Product Recommendations

  41. https://www.youtube.com/watch?v=qpbELUmbDIk

  42. Other Cool Stuff

  43. https://www.youtube.com/watch?v=3k96HLqvhc0

  44. https://www.youtube.com/watch?v=S9PcPbtTcPc

  45. https://www.youtube.com/watch?v=UzGZtgu3PBM

  46. DATA SCIENCE SKILLS

  47. Getting into Data Science

  48. Getting into Data Science • You • 🦄 • Data Scientist

  49. Getting into Data Science • You • 🦄 • Data Scientist • What they tell you

  50. Getting into Data Science • You • 🦄 • Data Scientist • How it feels like

  51. Getting into Data Science • “It depends”

  52. Getting into Data Science • Where are you? • Where do you want to go?

  53. 🤝 Computer Science • Statistics • Domain Expertise

  54. Computer Science

  55. Computer Science • Basic coding in 1 language (e.g. Python) • Data Manipulation • Optional: out-of-the-box Machine Learning • Database technologies (e.g. SQL, NoSQL, etc.) • “Behind the scenes” of Machine Learning • More programming languages (R, Scala, …), data processing tools (Spark, Elasticsearch, …), and other shiny toys (slide labels: Start, Next)

  56. Math / Stats

  57. Math / Stats • Basic descriptive statistics • Data visualisation techniques • Linear algebra (vector/matrix computation) • Calculus • Mathematical optimisation • More advanced probability / stats (slide labels: Start, Next)

  58. Domain Expertise

  59. Domain Expertise • Speak the language • Basic data analysis • Deeper domain understanding • Communicate with business stakeholders (non-technical roles) (slide labels: Start, Next)

  60. Soft Skills

  61. Soft Skills • They should be called “Core Skills” really • Communication • Storytelling • Problem solving • Learning to learn

  62. What’s Next

  63. What’s Next • 1. Find a topic you like • 2. Find a dataset about the topic * • 3. 🦄 (* links at the end)

  64. SUMMARY

  65. Summary • Data Science lets you work in any domain • What kind of data scientist do you want to be? • You don’t need to be an expert in everything

  66. Resources • Datasets: kaggle.com • Datasets: archive.ics.uci.edu • Datasets: “awesome data” on GitHub.com • Book: Doing Data Science (O’Neil and Schutt) • Videos: youtube.com/user/PyDataTV

  67. Thank You • Twitter: @MarcoBonzanini • Blog: marcobonzanini.com • Newsletter: marcobonzanini.com/newsletter • Questions?
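
As a small illustration of the first steps on slide 15 (raw data, processing, clean data) and the "basic descriptive statistics" starting point on slide 57, here is a minimal sketch using only the Python standard library. It is not taken from the talk: the toy dataset, the column names (`city`, `temperature_c`) and the helper functions `load_rows`, `clean` and `describe` are all made up for illustration, and a real project would more likely start from one of the dataset sources listed on slide 66.

```python
import csv
import io
import statistics

# Hypothetical toy dataset, invented for this example only.
# One row has a missing temperature to show why a cleaning step is needed.
RAW_CSV = """city,temperature_c
Istanbul,14.0
Istanbul,16.5
London,9.0
London,
London,11.0
"""


def load_rows(text):
    """Processing step: parse raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Cleaning step: drop rows with a missing temperature, convert to float."""
    return [
        {"city": row["city"], "temperature_c": float(row["temperature_c"])}
        for row in rows
        if row["temperature_c"].strip()
    ]


def describe(values):
    """Basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


rows = clean(load_rows(RAW_CSV))
summary = describe([row["temperature_c"] for row in rows])
print(summary)
```

Swapping `RAW_CSV` for the contents of a real file is one way to try steps 1 and 2 of slide 63 on a topic you like; libraries such as pandas offer richer versions of the same steps.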