Getting a job in data science

lanzani
January 18, 2017

The 21st century's sexiest job sure seems... sexy. Data scientists are more in demand than ever, and it's hard to find a newcomer's CV that doesn't mention Kaggle or Machine Learning (Udacity/Coursera/insert your own MOOC here). But what do companies really need, and how can you exceed that need? A hand-wavy introduction to the job market in applied data science (in the Dutch landscape).

Transcript

  1. SO YOU WANT TO WORK IN DATA SCIENCE
    Giovanni Lanzani
    @gglanzani

  2. WHO AM I
    • Imported from Italy ca. 2006
    • Master & PhD in Theoretical Physics in Leiden (2006-2012)
    • Consultant Software Quality @ KPMG (2012-2013)
    • Data Whisperer @ GoDataDriven (2013-2016)
    • Chief Science Officer @ GoDataDriven (2016-…)
    • Father of 5

  4. WHAT DO YOU WANT
    • Stimulating environment
    • Great team
    • Space to experiment and to grow
    • Develop yourself
    • Learn new things
    • Salary (?)

  5. WHAT COMPANIES (THOUGHT THEY) WANTED
    • All the things big data
    • Predictive modeling & Advanced Analytics
    • Make moar money
    • Do all the cool things the others are doing

  6. THREE KINDS OF COMPANIES
    • Heavy R&D department
    • Tech company, software driven, internet first (only)
    • All the others
    • This talk is mostly about the last group (the majority)

  7. WHAT COMPANIES GOT
    • A lot of POCs
    • A lot of screenshots/presentations/dashboards on a laptop
    • Extra mouths to feed with no returns
    • Nice stories to tell to their network, about those screenshots and especially
    those dashboards
    • Headaches with data and infra even more scattered

  8. BUT…
    • We got a data scientist working on trees, and forests
    • Neural networks
    • Deeply convoluted neural networks
    • Deep learning!
    • All the above, and more, is taught in popular MOOCs

  9. WHAT DO COMPANIES ACTUALLY NEED
    • Put things into production
    • They don’t teach that in any data science MOOC (that I know of)

  10. JOB MARKET 2016: US
    • Ask HN: What's the state of the job market in data science and machine learning?
    • https://news.ycombinator.com/item?id=13232883
    • The supply-demand dynamics have changed a lot in the last couple years.
    • Two groups: people with work experience + strong software development skills, and those without
    • The first group is in higher demand than ever
    • The second group has gotten extremely crowded [from people] […] who have completed MOOCs or
    bootcamps
    • Supply keeps growing while demand is flat or shrinking
    • especially as executives get burned by “data scientists” who don't know how to help them build
    things of value

  11. JOB MARKET 2016: US
    • The biggest differentiator I've seen is to be able to participate in actually building
    production quality systems vs being proficient enough in R or python to hack
    together a prototype on a very small dataset

  12. JOB MARKET 2017: NL
    • I am seeing the same things happening
    • We (GoDataDriven) are definitely only interested in these profiles (people
    who are already there, or that are getting there)
    • Many of our clients are in the same position

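  13. WHICH COMPANIES CAN REALLY DO APPLIED DATA SCIENCE
    These are the companies you should be aiming to work for!
    • They have a business case: the value of true positives (TP) and true negatives (TN), and the cost of false positives (FP) and false negatives (FN)
    • Data {insert something here} should be professional grade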
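
The TP/TN/FP/FN business case above can be made concrete with a small calculation. The sketch below is only an illustration: the class name, the function name, and every number in it are made up for this example and do not come from the talk. It shows two hypothetical models with identical accuracy that nevertheless differ a lot in business value once each outcome gets a price.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts of each outcome on an evaluation set (hypothetical helper)."""
    tp: int
    tn: int
    fp: int
    fn: int

    def accuracy(self) -> float:
        total = self.tp + self.tn + self.fp + self.fn
        return (self.tp + self.tn) / total


def net_value(counts, value_tp, value_tn, cost_fp, cost_fn):
    """Value earned by correct predictions minus the cost of mistakes."""
    gains = counts.tp * value_tp + counts.tn * value_tn
    losses = counts.fp * cost_fp + counts.fn * cost_fn
    return gains - losses


# Made-up numbers: a missed positive (FN) is assumed to cost ten times
# more than a false alarm (FP).
prices = dict(value_tp=50.0, value_tn=0.0, cost_fp=10.0, cost_fn=100.0)

model_a = ConfusionCounts(tp=80, tn=900, fp=15, fn=5)
model_b = ConfusionCounts(tp=70, tn=910, fp=5, fn=15)

value_a = net_value(model_a, **prices)
value_b = net_value(model_b, **prices)

print(model_a.accuracy(), value_a)
print(model_b.accuracy(), value_b)
```

Both models get 980 out of 1000 cases right, so accuracy alone cannot separate them; the assumed prices can.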

  14. WHAT THESE COMPANIES EXPECT FROM YOU
    • Good software (code + non-functional requirements)
    • Monitor your models
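
The slide does not say how to monitor a model, so what follows is just one possible reading, sketched under assumptions: compare the average score the model produces in production with the average it produced on a reference period, and raise a flag when the gap is too large to be noise. The function name, the threshold, and the sample scores are all hypothetical.

```python
from statistics import mean, stdev


def scores_have_drifted(baseline, live, max_z=3.0):
    """Return True when the mean live score is far from the baseline mean.

    The distance is measured in standard errors of the live mean,
    assuming live scores follow the baseline distribution.
    """
    base_mean = mean(baseline)
    stderr = stdev(baseline) / len(live) ** 0.5
    if stderr == 0:
        # Constant baseline: any change in the mean counts as drift.
        return mean(live) != base_mean
    z = abs(mean(live) - base_mean) / stderr
    return z > max_z


# Made-up model scores from a reference week and from two later days.
baseline_scores = [0.1, 0.2, 0.3, 0.2, 0.1, 0.3, 0.2, 0.2]
quiet_day = [0.2, 0.1, 0.3, 0.2]
odd_day = [0.8, 0.9, 0.7, 0.85]

print(scores_have_drifted(baseline_scores, quiet_day))
print(scores_have_drifted(baseline_scores, odd_day))
```

A real setup would also track input features and business outcomes; this only shows the basic idea of comparing production behaviour against a baseline.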

  15. GOOD SOFTWARE?
    • Testable (and tested)
    • Modular (otherwise you cannot test it)
    • DRY
    • Efficient
    • Performant
    • Maintainable (clear code!)
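
As a rough illustration of "testable, modular, DRY" (this is not the demo from the talk, and all names and data are invented for the example), the parsing logic below lives in one small function that the aggregation reuses, and each piece has its own test.

```python
def parse_amount(raw):
    """Convert an amount in Dutch notation, e.g. '1.234,50', to a float."""
    # Drop thousands separators, then turn the decimal comma into a dot.
    return float(raw.replace(".", "").replace(",", "."))


def total_revenue(rows):
    """Sum the 'amount' field of every row, reusing parse_amount (DRY)."""
    return sum(parse_amount(row["amount"]) for row in rows)


def test_parse_amount():
    assert parse_amount("1.234,50") == 1234.5
    assert parse_amount("7") == 7.0


def test_total_revenue():
    rows = [{"amount": "10,00"}, {"amount": "2,50"}]
    assert total_revenue(rows) == 12.5
    assert total_revenue([]) == 0


if __name__ == "__main__":
    test_parse_amount()
    test_total_revenue()
    print("all tests passed")
```

Because `parse_amount` is a pure function with no hidden state, it can be tested without a database or a notebook session, which is the point of the "modular (otherwise you cannot test it)" bullet.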

  16. INTERMEZZO: BEST SOFTWARE
    • Demo here! Some real Python code!

  17. I’VE TOLD YOU SO
    • https://blog.godatadriven.com/production-ready-ds
    • Many data scientists approach the problem at hand with a Kaggle-like mentality:
    delivering the best model in absolute terms, no matter what the practical implications
    are.
    • In reality it's not the best model that we implement, but the one that combines
    quality and practicality.
    • Netflix competition

  18. BUT WAIT, I DON’T WANT TO DO THAT
    • There is a simple solution to this: companies should hire Machine Learning
    Engineers, who help the data scientists productionize ML/DS
    • The role currently doesn’t (really) exist
    • That means (almost) nobody has them!

  19. BUT WAIT, I DON’T WANT TO DO THAT
    • Intermezzo
    • What’s your experience?
    • You can tell me how (much) I’m wrong
    • Are you hiring/are you searching?

  20. OK, HOW CAN I LEARN THAT
    • Code, code, code
    • If you are already in industry, start a project with developers or work alongside them
    • Open source contributions
    • Mini sales pitch, you can leave for beer already!
    • GoDataDriven offers the data science accelerator program
    • 12 + 12 (or 5 + 5) modules, with lecture + hands-on day
    • The hands-on day helps you put what you learn into practice
    • Also productionizing your code!
    • /salespitch

  21. OK, BUT I REALLY CAN’T LEARN THAT
    • Don’t despair
    • Find your niche (finance, biology, marine, energy, etc.)
    • Find a voice
    • Exploratory data analysis with storytelling
    • Convincing stakeholders is still one of the most important skills (not only in DS)

  22. TECH COMPANY, SOFTWARE DRIVEN, INTERNET FIRST
    • Where there’s a mature, professional software engineering culture, data
    scientists don’t have to worry about all of the above (though still about a lot)
    • But they still have to code to be understood
    • No software engineer will save you from being lazy about good code
    • Booking.com, Bol.com, Marktplaats/eBay
    • Probably many others as well

  23. HEAVY R&D
    • There the focus is much more on research
    • Productionizing is in the (distant) future
    • Domain expertise is (often) more important than ML/DS
    • People working there have the chops to learn ML/DS
    • (Hint: They don’t always do it properly)

  24. FINAL NOTE
    • If you have no previous experience, you probably won’t be called every week with
    job offers
    • You might land a data science job, but instead of doing ML, you might end up
    doing ETL, writing glue code and SQL, etc.
    • Hang in there: learn the skills to get where you want
    • After the first 1-2 years of experience, it’s usually downhill (it gets easier)

  25. QUESTIONS?
    • We’re hiring
    • Data scientists & Machine Learning Engineers!
    [email protected]
