
Getting a job in data science

lanzani
January 18, 2017


The 21st century's sexiest job sure seems... sexy. Data scientists are more in demand than ever, and it's hard to find a newcomer's CV that doesn't mention Kaggle or Machine Learning (Udacity/Coursera/insert your own MOOC here). But what do companies really need, and how can you exceed that need? A hand-wavy introduction to the job market in applied data science (in the Dutch landscape).


Transcript

  1. WHO AM I
    • Imported from Italy ca. 2006
    • Master & PhD in Theoretical Physics in Leiden (2006-2012)
    • Consultant Software Quality @ KPMG (2012-2013)
    • Data Whisperer @ GoDataDriven (2013-2016)
    • Chief Science Officer @ GoDataDriven (2016-…)
    • Father of 5
  2. WHAT DO YOU WANT
    • Stimulating environment
    • Great team
    • Space to experiment and to grow
    • Develop yourself
    • Learn new things
    • Salary (?)
    • <Insert more here>
  3. WHAT COMPANIES (THOUGHT THEY) WANTED
    • All the things big data
    • Predictive modeling & Advanced Analytics
    • Make moar money
    • Do all the cool things the others are doing
  4. THREE KINDS OF COMPANIES
    • Heavy R&D department
    • Tech company, software driven, internet first (only)
    • All the others
    • This talk is mostly about the last group (the majority)
  5. WHAT COMPANIES GOT
    • A lot of POCs
    • A lot of screenshots/presentations/dashboards on a laptop
    • Extra mouths to feed with no returns
    • Nice stories to tell to their network, about those screenshots and especially those dashboards
    • Headaches with data and infra even more scattered
  6. BUT…
    • We got a data scientist working on trees, and forests
    • Neural networks
    • Deeply convoluted neural networks
    • Deep learning!
    • All the above, and more, is taught in popular MOOCs
  7. WHAT DO COMPANIES ACTUALLY NEED
    • Put things into production
    • They don’t teach that in any data science MOOC (that I know of)
  8. JOB MARKET 2016: US
    • Ask HN: What's the state of the job market in data science and machine learning?
    • https://news.ycombinator.com/item?id=13232883
    • The supply-demand dynamics have changed a lot in the last couple years.
    • Two groups: people with work experience + strong software development skills, and those without
    • The first group is in higher demand than ever
    • The second group has gotten extremely crowded [from people] […] who have completed MOOCs or bootcamps
    • Supply keeps growing while demand is flat or shrinking
    • especially as executives get burned by “data scientists” who don't know how to help them build things of value
  9. JOB MARKET 2016: US
    • The biggest differentiator I've seen is to be able to participate in actually building production quality systems vs being proficient enough in R or python to hack together a prototype on a very small dataset
  10. JOB MARKET 2017: NL
    • I am seeing the same things happening
    • We (GoDataDriven) are definitely only interested in these profiles (people who are already there, or that are getting there)
    • Many of our clients are in the same position
  11. WHICH COMPANIES CAN REALLY DO APPLIED DATA SCIENCE
    These are the companies you should be aiming to work for!
    • Business case for TP, TN & Cost of FP and FN
    • Data {insert something here} should be pro grade
  12. WHAT THESE COMPANIES EXPECT FROM YOU
    • Good software (code + non-functional aspects)
    • Monitor your models
  13. GOOD SOFTWARE?
    • Testable (and tested)
    • Modular (otherwise you cannot test it)
    • DRY
    • Efficient
    • Performant
    • Maintainable (clear code!)
  14. I’VE TOLD YOU SO
    • https://blog.godatadriven.com/production-ready-ds
    • Many data scientists approach the problem at hand with a Kaggle-like mentality: delivering the best model in absolute terms, no matter what the practical implications are.
    • In reality it's not the best model that we implement, but the one that combines quality and practicality.
    • Netflix competition
  15. BUT WAIT, I DON’T WANT TO DO THAT
    • There is a simple solution to this: companies should hire Machine Learning Engineers to help the data scientists productionize ML/DS
    • The role currently doesn’t (really) exist
    • That means (almost) nobody has them!
  16. BUT WAIT, I DON’T WANT TO DO THAT
    • Intermezzo
    • What’s your experience?
    • You can tell me how (much) I’m wrong
    • Are you hiring/are you searching?
  17. OK, HOW CAN I LEARN THAT
    • Code, code, code
    • If already in industry, start a project or start working with developers
    • Open source contributions
    • Mini sales pitch, you can leave for beer already!
    • GoDataDriven offers the data science accelerator program
    • 12 + 12 (or 5 + 5) modules, with lecture + hands-on day
    • The hands-on day helps you put what you learn into practice
    • Also productionizing your code!
    • /salespitch
  18. OK, BUT I REALLY CAN’T LEARN THAT
    • Don’t despair
    • Find your niche (finance, biology, marine, energy, etc.)
    • Find a voice
    • Exploratory data analysis with storytelling
    • Convincing stakeholders is still one of the most important skills (not only in DS)
  19. TECH COMPANY, SOFTWARE DRIVEN, INTERNET FIRST
    • Where there’s a mature, professional, software engineering culture, data scientists don’t have to worry about all of the above (though still about a lot of it)
    • But they still have to code to be understood
    • No software engineer will save you from being lazy about good code
    • Booking.com, Bol.com, Marktplaats/eBay
    • Probably many others as well
  20. HEAVY R&D
    • There the focus is much more on research
    • Productionizing is in the (distant) future
    • Domain expertise is (often) more important than ML/DS
    • People working there have the chops to learn ML/DS
    • (Hint: They don’t always do it properly)
  21. FINAL NOTE
    • If you have no previous experience, you probably won’t be called every week with job offers
    • You might land a data science job, but instead of doing ML you might end up doing ETL, writing glue code, writing SQL, etc.
    • Hang in there: learn the skills to get where you want
    • After the first 1-2 years of experience, it usually gets easier
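
Slide 11 asks for a business case for TP and TN and a cost for FP and FN. A minimal sketch of what that could look like in Python follows. The `Payoffs` class, the `business_value` function, and every number in it are made up for illustration and do not come from the talk. The idea is only that each cell of the confusion matrix gets a value in money, so that two models can be compared on what they are worth rather than on accuracy alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payoffs:
    """Hypothetical value (e.g. in euros) of each outcome of a binary classifier."""
    tp: float  # gain from correctly acting on a positive
    tn: float  # gain (often 0) from correctly leaving a negative alone
    fp: float  # cost of acting on a false alarm (a negative number)
    fn: float  # cost of missing a positive (a negative number)


def business_value(tp: int, tn: int, fp: int, fn: int, payoffs: Payoffs) -> float:
    """Total value of a model, given its confusion-matrix counts and the payoffs."""
    return tp * payoffs.tp + tn * payoffs.tn + fp * payoffs.fp + fn * payoffs.fn


if __name__ == "__main__":
    # Illustrative payoffs and counts only; real ones come from the business.
    payoffs = Payoffs(tp=50.0, tn=0.0, fp=-5.0, fn=-20.0)
    model_a = business_value(tp=80, tn=850, fp=50, fn=20, payoffs=payoffs)
    model_b = business_value(tp=85, tn=800, fp=100, fn=15, payoffs=payoffs)
    print(f"model A: {model_a:.0f}, model B: {model_b:.0f}")
```

Keeping the calculation in a small pure function also touches on slide 13: it is modular, so it can be tested without training a model first.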