Getting a job in data science

lanzani
January 18, 2017

The 21st century's sexiest job sure seems... sexy. Data scientists are more in demand than ever, and it's hard to find a newcomer's CV that doesn't say Kaggle or Machine Learning (Udacity/Coursera/insert your own MOOC here). But what do companies really need, and how can you exceed that need? A hand-wavy introduction to the job market in applied data science (in the Dutch landscape).

Transcript

  1. SO YOU WANT TO WORK IN DATA SCIENCE
    Giovanni Lanzani
    @gglanzani

  2. WHO AM I
    • Imported from Italy ca. 2006
    • Master & PhD in Theoretical Physics in Leiden (2006-2012)
    • Consultant Software Quality @ KPMG (2012-2013)
    • Data Whisperer @ GoDataDriven (2013-2016)
    • Chief Science Officer @ GoDataDriven (2016-…)
    • Father of 5

  3. WHAT DO YOU WANT
    • Stimulating environment
    • Great team
    • Space to experiment and to grow
    • Develop yourself
    • Learn new things
    • Salary (?)

  4. WHAT COMPANIES (THOUGHT THEY) WANTED
    • All the things big data
    • Predictive modeling & Advanced Analytics
    • Make moar money
    • Do all the cool things the others are doing

  5. THREE KINDS OF COMPANIES
    • Heavy R&D department
    • Tech company, software driven, internet first (only)
    • All the others
    • This talk is about the last group (the majority)

  6. WHAT COMPANIES GOT
    • A lot of POCs
    • A lot of screenshots/presentations/dashboards on a laptop
    • Extra mouths to feed with no returns
    • Nice stories to tell to their network, about those screenshots and especially
    those dashboards
    • Headaches with data and infra even more scattered

  7. BUT…
    • We got a data scientist working on trees, and forests
    • Neural networks
    • Deeply convoluted neural networks
    • Deep learning!
    • All the above, and more, is taught in popular MOOCs

  8. WHAT DO COMPANIES ACTUALLY NEED
    • Put things into production
    • They don’t teach that in any data science MOOC (that I know of)

  9. JOB MARKET 2016: US
    • Ask HN: What's the state of the job market in data science and machine learning?
    • https://news.ycombinator.com/item?id=13232883
    • The supply-demand dynamics have changed a lot in the last couple years.
    • Two groups: people with work experience + strong software development skills, and those without
    • The first group is in higher demand than ever
    • The second group has gotten extremely crowded [from people] […] who have completed MOOCs or
    bootcamps
    • Supply keeps growing while demand is flat or shrinking
    • especially as executives get burned by “data scientists” who don't know how to help them build
    things of value

  10. JOB MARKET 2016: US
    • The biggest differentiator I've seen is to be able to participate in actually building
    production quality systems vs being proficient enough in R or python to hack
    together a prototype on a very small dataset

  11. JOB MARKET 2017: NL
    • I am seeing the same things happening
    • We (GoDataDriven) are definitely only interested in these profiles (people who are already there, or who are getting there)
    • Many of our clients are in the same position

  12. WHICH COMPANIES CAN REALLY DO APPLIED DATA SCIENCE
    These are the companies you should be aiming to work for!
    • A business case for true positives (TP) and true negatives (TN), and a known cost of false positives (FP) and false negatives (FN)
    • Data {insert something here} should be pro grade
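
To make the TP/TN/FP/FN point concrete, here is a minimal sketch of scoring predictions by business value instead of accuracy. The `OutcomeValues` class, the `business_value` function and all the amounts are hypothetical, made up for illustration; they are not from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutcomeValues:
    """Hypothetical monetary value per prediction outcome (illustrative only)."""
    tp: float  # value gained by a true positive
    tn: float  # value gained by a true negative
    fp: float  # cost of a false positive (negative number)
    fn: float  # cost of a false negative (negative number)


def business_value(y_true, y_pred, values):
    """Sum the value of every prediction, given binary labels (0 or 1)."""
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            total += values.tp
        elif truth == 0 and pred == 0:
            total += values.tn
        elif truth == 0 and pred == 1:
            total += values.fp
        else:  # truth == 1 and pred == 0
            total += values.fn
    return total


# Assumed amounts: a hit earns 50, a false alarm costs 5, a miss costs 20.
values = OutcomeValues(tp=50.0, tn=0.0, fp=-5.0, fn=-20.0)
y_true = [1, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print(business_value(y_true, y_pred, values))  # 75.0
```

With numbers like these, a model that misses fewer positives can be worth more than one with higher accuracy, which is the kind of trade-off a company needs to be able to state before applied data science pays off.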

  13. WHAT THESE COMPANIES EXPECT FROM YOU
    • Good software (code + non-functional requirements)
    • Monitor your models
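
As one possible reading of "monitor your models", the sketch below compares the mean of live prediction scores with a baseline recorded at training time. The function name `score_drift`, the threshold of 3 standard errors and the sample scores are assumptions for illustration; real monitoring would usually track more than one statistic.

```python
from statistics import mean, stdev


def score_drift(baseline_scores, live_scores, threshold=3.0):
    """Return (drifted, z): whether the live mean is more than `threshold`
    standard errors away from the baseline mean, and the z value itself."""
    spread = stdev(baseline_scores)
    if spread == 0:
        raise ValueError("baseline scores have no spread; cannot compute z")
    std_err = spread / len(live_scores) ** 0.5
    z = (mean(live_scores) - mean(baseline_scores)) / std_err
    return abs(z) > threshold, z


# Hypothetical scores: the baseline from validation, the live batch from production.
baseline = [0.1, 0.2, 0.3, 0.4, 0.5]
drifted, z = score_drift(baseline, [0.8, 0.9, 0.85, 0.95])
print(drifted)
```

A check like this can run on every batch of predictions and raise an alert, so a silently degrading model gets noticed before the business does.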

  14. GOOD SOFTWARE?
    • Testable (and tested)
    • Modular (otherwise you cannot test it)
    • DRY
    • Efficient
    • Performant
    • Maintainable (clear code!)
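
A small illustration of "testable (and tested)" and "modular": one pure function that does one thing, with a test next to it. The `normalize` function is a hypothetical example, not the code demoed in the talk.

```python
def normalize(values):
    """Scale numbers linearly to the range [0, 1].

    A constant input has no range to scale, so it maps to all zeros.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_normalize():
    # Regular case: smallest value maps to 0, largest to 1.
    assert normalize([2, 4, 6]) == [0.0, 0.5, 1.0]
    # Edge case: constant input must not divide by zero.
    assert normalize([3, 3]) == [0.0, 0.0]


test_normalize()
```

Because the function takes plain values and returns plain values, it can be tested without a database, a notebook or a trained model, which is what makes modularity a precondition for testing.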

  15. INTERMEZZO: BEST SOFTWARE
    • Demo here! Some real Python code!

  16. I’VE TOLD YOU SO
    • https://blog.godatadriven.com/production-ready-ds
    • Many data scientists approach the problem at hand with a Kaggle-like mentality:
    delivering the best model in absolute terms, no matter what the practical implications
    are.
    • In reality it's not the best model that we implement, but the one that combines
    quality and practicality.
    • Netflix competition

  17. BUT WAIT, I DON’T WANT TO DO THAT
    • There is a simple solution to this: companies should hire Machine Learning Engineers, who help the data scientists productionize ML/DS
    • The role currently doesn’t (really) exist
    • That means (almost) nobody has them!

  18. BUT WAIT, I DON’T WANT TO DO THAT
    • Intermezzo
    • What’s your experience?
    • You can tell me how (much) I’m wrong
    • Are you hiring/are you searching?

  19. OK, HOW CAN I LEARN THAT
    • Code, code, code
    • If you are already in industry, start a project or work with developers
    • Open source contributions
    • Mini sales pitch, you can leave for beer already!
    • GoDataDriven offers the data science accelerator program
    • 12 + 12 (or 5 + 5) modules, with lecture + hands-on day
    • The hands-on day helps you put what you learn into practice
    • Also productionizing your code!
    • /salespitch

  20. OK, BUT I REALLY CAN’T LEARN THAT
    • Don’t despair
    • Find your niche (finance, biology, marine, energy, etc.)
    • Find a voice
    • Exploratory data analysis with storytelling
    • Convincing stakeholders is still one of the most important skills (not only in DS)

  21. TECH COMPANY, SOFTWARE DRIVEN, INTERNET FIRST
    • Where there’s a mature, professional software engineering culture, data scientists don’t have to worry about all of the above (though still about a lot)
    • But they still have to code to be understood
    • No SE will save you from being lazy about good code
    • Booking.com, Bol.com, Marktplaats/eBay
    • Probably many others as well

  22. HEAVY R&D
    • There the focus is much more on research
    • Productionizing is in the (distant) future
    • Domain expertise is (often) more important than ML/DS
    • People working there have the chops to learn ML/DS
    • (Hint: They don’t always do it properly)

  23. FINAL NOTE
    • If you have no previous experience, you likely won’t be called every week with job offers
    • You might land a data science job, but instead of doing ML, you might end up doing ETL, writing glue code and SQL, etc.
    • Hang in there: learn the skills to get where you want
    • After the first 1-2 years of experience, it usually gets easier

  24. QUESTIONS?
    • We’re hiring
    • Data scientists & Machine Learning Engineers!
    [email protected]
