
On the Delivery of Data Science Projects

February 25, 2019

Talk at the Business, Analytics and Data Science meetup (2019-02), based on my training course, on the topics you should focus on to improve the deliverability and impact of your data science projects: https://www.meetup.com/Business-Analytics-and-Data-Science/events/258531525/

ianozsvald


Transcript

  1. On the Delivery of Data Science Projects
    @IanOzsvald – ianozsvald.com
    Ian Ozsvald, Business, Analytics and Data Science meetup, 2019-02
  2. Introductions
    • Interim Chief Data Scientist
    • 19+ years experience
    • Quickly build strategic data science plans
    • Team coaching & public courses
    By [ian]@ianozsvald[.com] Ian Ozsvald
  3. Data Science shows value when...
    • Numerate management ask good data-driven questions
    • You have suitable data
    • Well-defined, achievable outcomes are set
    • Change is enabled by these projects
  4. Common delivery problems
    • “Make us more [money/…]” - give me magic!
    • Desire over need – vanity projects!
    • Lack of technical leadership – poor/missing specs
    • Bad data – lies, mistakes and confusion
    • Lack of client buy-in – no burning need
  5. You need a Project Specification
    • States a clearly defined problem
    • Guesses at unknowns (and project torpedoes!)
    • Proposed milestones and Gold Standard/metrics
    • Clear “definition of done”
    • Story from 10 years back
  6. “Data Story”
    • Do you understand your data?
      – What’s good and bad?
      – What relationships exist?
    • Build exportable Notebook as html artefact
    • Read Bertil’s piece on Medium
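The "what's good and bad?" question on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the `rows` records, the column names and the `summarise` helper are hypothetical and not from the talk, and a real Data Story would normally live in a Notebook alongside plots.

```python
from collections import Counter

# Hypothetical records standing in for a real dataset.
rows = [
    {"customer_id": 1, "age": 34, "country": "UK"},
    {"customer_id": 2, "age": None, "country": "UK"},
    {"customer_id": 3, "age": 29, "country": None},
    {"customer_id": 3, "age": 29, "country": "FR"},
]


def summarise(rows, key="customer_id"):
    """Count missing values per column and list key values seen more than once."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None:
                missing[column] += 1
    key_counts = Counter(row[key] for row in rows)
    duplicate_keys = sorted(k for k, n in key_counts.items() if n > 1)
    return dict(missing), duplicate_keys


missing, duplicate_ids = summarise(rows)
print("missing values per column:", missing)
print("duplicated customer_id values:", duplicate_ids)
```

Writing down findings like these, with a sentence of interpretation for each, is one plausible starting point for the exportable report the slide describes.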
  7. Standardised Approaches
    • Reduce mental load for common decisions
      – Cookiecutter data-science
      – Watermark
      – Pandas-profiling
      – Anaconda
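Watermark is a Notebook extension that stamps a Notebook with the date and the versions of the software in use. The sketch below is not watermark's API; it is a standard-library illustration of the same idea, and the `environment_record` helper and the package names passed to it are assumptions made for the example.

```python
import platform
from datetime import datetime, timezone
from importlib import metadata


def environment_record(packages):
    """Collect Python, platform and package versions for a reproducibility stamp."""
    record = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            record["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence rather than failing, so the stamp is always produced.
            record["packages"][name] = "not installed"
    return record


record = environment_record(["pandas", "scikit-learn"])
for key, value in record.items():
    print(key, value)
```

Printing a stamp like this at the top of every Notebook is one way to make the environment question a standard step rather than a per-project decision.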
  8. Continuously improving code quality
    • Encode assumptions using asserts
    • Refactor to modules
    • Add unit-tests
    • Visual reports with analyst interpretations
    • Diagnostics e.g. yellowbrick for sklearn
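A minimal sketch of the first three points on this slide: assumptions about the data are written as asserts inside a function that could be moved into a module, and a small unit test exercises it. The `clean_prices` function, its field names and the price rules are hypothetical examples, not taken from the talk.

```python
import unittest


def clean_prices(rows):
    """Convert prices from pounds to pence, asserting our assumptions about the data."""
    assert len(rows) > 0, "expected at least one row"
    cleaned = []
    for row in rows:
        # Assumptions about the input, stated where the data is used.
        assert row["price_gbp"] is not None, f"missing price in {row}"
        assert row["price_gbp"] >= 0, f"negative price in {row}"
        cleaned.append({"sku": row["sku"], "price_pence": round(row["price_gbp"] * 100)})
    # Assumption about the output: cleaning must not drop or add rows.
    assert len(cleaned) == len(rows), "row count changed during cleaning"
    return cleaned


class TestCleanPrices(unittest.TestCase):
    def test_converts_to_pence(self):
        rows = [{"sku": "A1", "price_gbp": 1.25}]
        self.assertEqual(clean_prices(rows), [{"sku": "A1", "price_pence": 125}])

    def test_negative_price_is_rejected(self):
        with self.assertRaises(AssertionError):
            clean_prices([{"sku": "A1", "price_gbp": -1.0}])


# Run the tests in-process so this also works when pasted into a Notebook cell.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCleanPrices)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Note that Python skips assert statements when run with the `-O` flag, so asserts suit exploratory and analysis code; checks that must always run in production are better raised as explicit exceptions.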
  9. Continuously improving project quality
    • Code review (with a check-list & PEP8)
    • nbdime for diffs
    • “Data Defences” - regular critiques by colleagues on your project
  10. Contributing to Open Source gets you
    • Exposure to new processes
    • Enforced clear communication
    • Balanced consumption & contribution
    • You’re more visible & valuable
  11. Continuous delivery to clients
    • Easy first deliveries – reports
    • Get to a minimal working delivery as soon as possible
    • Consider papermill for deployable Notebooks
  12. Resources
    • My “Successfully Delivering Data Science Projects” course
      – sold out
      – join my training list via ianozsvald.com
  13. Summary
    • Derisk early and often
    • Communicate visually, all the time
    • Honesty throughout your work
    • Strive for continuous improvement
    • Consider speaking at PyDataLondon 2019, July 12-14