How to successfully become a data scientist

Talk presented at Code.Talks Hamburg & Global DevSlam Dubai

Recording: https://youtu.be/LbBmvOj4o7M?si=gLF3Gs2Lk_qJSGe_

It has been 15 years since data scientist became one of the most sought-after professions. Nevertheless, if you ask data scientists what their profession is, you will get very different answers, which mostly depend on the kind of company they work for.

So how does one learn data science when the definition of the field is open to interpretation? Taken together, the job descriptions suggest that to become a data scientist one would need to know all the theory, keep up with every new approach, and be able to use hundreds of tools.

In this talk, we will explore the fundamentals needed to be a data scientist, from the perspective of theory, tooling, and approaches. We will talk about some of the common misconceptions held by people starting to learn data science, and about some of the reframing I have seen successful learners do on their path to data science.

And what role do large language models play in the future of learning data science?

Tereza Iofciu

November 11, 2023

Transcript

  1. HELP WANTED - SURVEY TIME

    WHICH SKILLS DID YOU NOTICE IN DATA PEOPLE YOU ENJOYED WORKING WITH? HTTP://ETC.CH/Q7YT
  2. WHAT IS A DATA SCIENTIST? AN “OFFICIAL” PROFESSION SINCE 2008

    DJ Patil and Jeff Hammerbacher, of LinkedIn and Facebook respectively, made "Data Scientist" an official buzzword. They were looking for a job title that sounded neither too Wall Street (Data Analyst) nor too academic (researcher). HTTP://ETC.CH/Q7YT
  3. WHAT IS A DATA SCIENTIST? AN “OFFICIAL” PROFESSION SINCE 2008

    DJ Patil and Jeff Hammerbacher, of LinkedIn and Facebook respectively, made "Data Scientist" an official buzzword. They were looking for a job title that sounded neither too Wall Street (Data Analyst) nor too academic (researcher). “THE TITLE SOUNDS SOPHISTICATED AND JUST VAGUE ENOUGH TO TRANSCEND INDUSTRIES AND BE TAKEN SERIOUSLY, EVEN BY PEOPLE WHO HAVE NO IDEA WHAT IT IS.”
  4. WHAT IS DATA SCIENCE? FIRST USED IN 1974 IN “THE CONCISE SURVEY OF COMPUTER METHODS”

    Peter Naur defined Data Science as “THE USEFULNESS OF DATA AND DATA PROCESSES DERIVES FROM THEIR APPLICATION IN BUILDING AND HANDLING MODELS OF REALITY.”
  5. SUCCESSFULLY BECOMING A DATA SCIENTIST

    DO NOT TRY TO BE A UNICORN… NO MATTER WHAT THE T-SHIRTS SAY
  6. HELP WANTED - SURVEY TIME

    WHICH SKILLS DID YOU NOTICE IN DATA PEOPLE YOU ENJOYED WORKING WITH? HTTP://ETC.CH/Q7YT
  7. SOME PEOPLE THINK… YOU NEED TO BE A THEORETICAL EXPERT

    Lots of interviews still ask theoretical questions … Conceptual understanding & generalist mindset -> more useful on the job /IN/TEREZA-IOFCIU
  8. SOME PEOPLE THINK… YOU NEED TO BE A PROGRAMMING EXPERT

    - The pace at which new tools appear in the DS space is only increasing
    - Focus on the basics
    - If you learned those from scratch -> helpful with knowledge transfer to another stack
  9. SOME PEOPLE THINK… YOU NEED TO BE A UNICORN, EXCELLENT AT PROGRAMMING, STATISTICS AND LINEAR ALGEBRA AND DATABASES/SQL, VISUALISATION AND KNOW A LOT ABOUT INDUSTRY

    - Not really. Be good at 1-2 skills to get started, and know that you need to care about all of them. DS is a team effort.
    - You can seem mediocre, but you do need good communication skills and examples of how you are effective in a team or when facing a new problem. Practice working in team projects.
  10. SOME PEOPLE THINK… YOU NEED TO TRY 100 MODELS TO PRODUCE A USEFUL SOLUTION

    This kind of work can really be replaced by autoML or LLM-based tools in the future. Practice interpreting your data and your charts; build and use domain and statistics knowledge by asking yourself questions.
  11. SOME PEOPLE THINK… YOU NEED TO SHOW ALL THE THINGS YOU KNOW

    - Hardly ever. You need to show how you understand and tackle a problem with one approach: create a useful baseline, then work iteratively towards a better solution.
    - In an interview you have limited time. Ask what matters most (speed, impact, tech debt) and suggest a solution that addresses it.
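As an illustration of "create a useful baseline", here is a minimal sketch of a majority-class baseline in plain Python. It is not from the talk: the function names and the churn labels are hypothetical, chosen only to show the idea that any later model should beat a trivial predictor.

```python
from collections import Counter


def majority_class_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return lambda rows: [most_common_label for _ in rows]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical churn labels, for illustration only.
train_labels = ["stay", "stay", "churn", "stay", "stay", "churn"]
test_labels = ["stay", "churn", "stay", "stay"]

predict = majority_class_baseline(train_labels)
baseline_accuracy = accuracy(test_labels, predict(test_labels))
print(f"baseline accuracy: {baseline_accuracy:.2f}")
```

A number like this gives the iteration a starting point: each more complex model is only worth its cost if it clearly improves on the baseline.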
  12. THEORY - DATA STATISTICS & PROBABILITY

    - Understand distributions
    - Understand insights from other specialists, develop your analytic business knowledge
    - You often start with getting data, cleaning data, doing EDA, and feature engineering. You get better at these when you understand your data and the business
    - You might need to design an A/B test to validate your solution
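To make the A/B testing point concrete, here is a minimal sketch of one common way to evaluate such a test: a two-sided, two-proportion z-test using only the Python standard library. The function name and the conversion counts are assumptions made up for illustration; the talk does not prescribe a specific test.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: variant A converts 200 of 2000 users, variant B 250 of 2000.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

The normal approximation used here assumes reasonably large samples; designing the test also means fixing the sample size and significance level before looking at the data.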
  13. THEORY - ML BASICS ARE ESSENTIAL

    - Really understand the basic models (linear and logistic regression, random forest, gradient boosting, ridge regression ...) rather than rushing to LLMs and NNs
    - Decision-making in industry is mostly based on tabular data, which calls for models that are not just accurate but also efficient and interpretable
  14. TOOLING BASICS ARE ESSENTIAL

    - Strong 🐍 Python skills, beyond the notebook, including code testing and code reviews
    - Essential: effective pandas, numpy, scipy and matplotlib or other viz libraries
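One way to read "beyond the notebook" is moving a cleaning step into a plain function with tests next to it. The sketch below is an assumption about what that could look like, not code from the talk: `fill_missing_with_median` is a hypothetical helper, and the `test_` functions follow the naming convention that a test runner such as pytest discovers.

```python
def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = sorted(v for v in values if v is not None)
    if not observed:
        raise ValueError("need at least one observed value")
    mid = len(observed) // 2
    if len(observed) % 2:
        median = observed[mid]
    else:
        median = (observed[mid - 1] + observed[mid]) / 2
    return [median if v is None else v for v in values]


def test_fills_gaps_with_median():
    assert fill_missing_with_median([1, None, 3]) == [1, 2.0, 3]


def test_keeps_complete_data_unchanged():
    assert fill_missing_with_median([4, 5, 6]) == [4, 5, 6]


def test_rejects_all_missing_input():
    try:
        fill_missing_with_median([None, None])
    except ValueError:
        return
    raise AssertionError("expected ValueError for all-missing input")
```

Small, named functions like this are also easier to discuss in a code review than a long notebook cell.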
  15. TOOLING COMMUNICATION IS 🔑

    Data is cross-functional. No matter what, you will need to be able to
    • Clearly gather requirements and scope your problem
    • Use storytelling and viz to demonstrate impact
    • Get projects/budgets approved
  16. GETTING STARTED

    PICK AN APPROACH THAT WORKS FOR YOU AND YOUR EXPERIENCE / OR A COMBINATION
    - Short programs
    - Long programs / degrees
    - Self-learning
  17. THANK YOU

    Special thanks to Noa Tamir for helping out with this talk 🫶 @VIS.SOCIAL@TEREZAIF /IN/TEREZA-IOFCIU 📧 [email protected] ICONS FROM THENOUNPROJECT & ICONS8