A Starter Data Science Process for Software Engineers

ianozsvald
June 15, 2019

In this talk from PyLondinium 2019 (https://ianozsvald.com/2019/06/15/a-starter-data-science-process-for-software-engineers-talk-at-pylondinium-2019/), we look at what's required for a valuable data science project and how to approach it (make a spec!), then step into a live demo using Jupyter, with Altair & matplotlib for visualisations, a Widget driving predictions for interactivity, and Voila to serve it up.

Transcript

  1. A starter data science process for software engineers
    @IanOzsvald, ianozsvald.com
    Ian Ozsvald, PyLondinium 2019
  2. Introductions
    - Interim Chief Data Scientist
    - 19+ years experience
    - Quickly build strategic data science plans
    - Team coaching & public courses
    By [ian]@ianozsvald[.com] Ian Ozsvald
  3. Data Science shows value when...
    - Numerate management ask good data-driven questions
    - You have suitable data
    - Achievable outcomes are well defined
    - Change is enabled by these projects
  4. Checking business need
    - What’s the driver? Is there a fire under it?
    - Joonatan’s example from PyDataLT (OCR)
    - Cost/benefit estimate accepting uncertainty
    - Automatable
  5. You need a Project Specification
    - States a clearly defined problem
    - Guesses at unknowns (and project torpedoes!)
    - Proposed milestones and Gold Standard/metrics
    - Clear “definition of done”
    - Story from 10 years back
  6. A pretend example & live demo
    - Want to automate “MPG estimates” to help engineers
    - It only needs to be good enough for ranking, to assist the team in prioritising their investigations
    - We need to gain the team’s trust in stages
    - Pandas, sklearn, Yellowbrick, custom estimator
  7. Resources
    - “Software Engineering for Data Scientists” - early July
  8. Thank your organisers
    - Your organisers are volunteers
    - Thank all volunteers & speakers please
    - Get a free signed book around 3.30pm
  9. Summary
    - Automate parts of a high value problem
    - Deliver value incrementally
    - Communicate early & often
    - Join my thoughts+jobs list for tips and my training list
    - Lots of past talks on ianozsvald.com
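
Slide 6 says the MPG estimate "only needs to be good enough for ranking". The following is a minimal sketch of what checking that could look like; it is not the code from the live demo. It assumes a single feature (vehicle weight), uses only the Python standard library rather than the Pandas/sklearn/Yellowbrick stack named on the slide, and the data points and the helper names `fit_line`, `ranks` and `spearman` are invented for illustration.

```python
# Toy (weight_kg, measured_mpg) pairs. These numbers are invented for
# illustration; they are not data from the talk.
CARS = [
    (900, 41.0), (1100, 36.5), (1300, 31.0), (1500, 27.5),
    (1700, 24.0), (1900, 20.5), (2100, 18.0), (2300, 15.5),
]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ranks(values):
    """Rank values from 1 (smallest) upwards. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(a, b):
    """Spearman rank correlation for equal-length sequences without ties."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hold out every second car so the ranking check uses unseen vehicles.
train, test = CARS[0::2], CARS[1::2]
slope, intercept = fit_line([w for w, _ in train], [m for _, m in train])

predicted = [slope * w + intercept for w, _ in test]
actual = [m for _, m in test]
rho = spearman(predicted, actual)

print(f"slope={slope:.4f} mpg/kg, held-out rank correlation={rho:.2f}")
```

Scoring the model on rank agreement rather than absolute error matches the stated goal: a model whose MPG values are off can still put vehicles in the right order for the engineers to prioritise. With real data, the same check would typically use a library routine that handles ties.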