"Creating correct and capable classifiers" at PyDataAmsterdam 2018

My thoughts on starting with baseline estimators, visualising their capabilities, diagnosing where they might not work so well and digging to see if we trust the underlying data.

Blog: http://ianozsvald.com/2018/05/26/creating-correct-and-capable-classifiers-at-pydataamsterdam-2018/

ianozsvald

May 25, 2018

Transcript

  1. [email protected] @IanOzsvald[.com] PyDataAmsterdam 2018
    Introductions
    • I’m an engineering data scientist
    • Consulting in AI + Data Science for 15+ years
    • Blog->IanOzsvald.com
  2. NumFOCUS
    • Have you thanked a speaker, a volunteer and a NumFOCUS organiser yet? Lots of volunteered time – please say thanks
    • Thank contributors too!
  3. Goals today
    • Get a baseline model
    • Visualise errors & diagnose problem areas
    • Explain decisions
    • Github for examples:
  4. TSNE by features
    Features for this cluster - lots of imputed ages!
    We’ve filtered by x, y region on the TSNE plot
  5. Closing...
    • Diagnose your ML just like you debug your code – explain its working to colleagues
    • Do you want training on topics like this?
    • Write-up + more: http://ianozsvald.com/
    • Questions in exchange for beer :-)
    • Learnt something? Please send me a postcard!
    • See my longer diagnosis Notebook on github:
  6. Appendix
    • Ian’s “Machine Learning Libraries You’d Wish You’d Knew” @ PyConUK 2017
    • Ian’s “Using Machine Learning to solve a classification problem with scikit-learn” @ PyConUK 2016
    • Gael Varoquaux’s tutorial “Understanding and diagnosing your machine-learning models” @ PyDataLondon 2018 http://gael-varoquaux.info/interpreting_ml_tuto/
    • Also see Kat Jarmul’s keynote @ PyDataWarsaw 2017: https://blog.kjamistan.com/towards-interpretable-reliable-models
    • Michał Łopuszyński @ PyDataWarsaw https://www.slideshare.net/lopusz/debugging-machinelearning
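
The "Get a baseline model" goal from slide 3 can be illustrated with a minimal sketch. The transcript does not show which baseline the talk used, so this assumes the simplest option: always predict the most frequent training class. The function names and the toy labels below are hypothetical and only there to make the idea concrete; scikit-learn's `DummyClassifier(strategy="most_frequent")` expresses the same idea.

```python
from collections import Counter


def fit_majority_baseline(labels):
    """Return the most frequent label seen in the training labels."""
    if not labels:
        raise ValueError("need at least one training label")
    return Counter(labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Hypothetical toy labels (1 = positive class, 0 = negative class).
y_train = [0, 0, 0, 1, 1, 0, 1, 0]
y_test = [0, 1, 0, 0, 1]

# The "model" is just the majority class from the training data.
majority = fit_majority_baseline(y_train)
baseline_pred = [majority] * len(y_test)
baseline_acc = accuracy(y_test, baseline_pred)

print(f"majority class={majority}, baseline accuracy={baseline_acc:.2f}")
```

Any real classifier should beat this score on the same test split; if it does not, that is a prompt to look at the features and the underlying data before tuning the model.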