Debugging Classifiers

Lekan
November 17, 2018

Transcript

  1. Classifier Failure Modes
     1. Classifier does not work at all
     2. Classifier works but it can be better
     3. Classifier works on test data but fails in production
     4. Classifier deteriorates over time
  2. Classifier does not work at all
     • Check everything
       • Check your code
       • Read the docs
     • Check your data
       • Check your data quality
         • Data providers tend to overestimate the quality of their data
       • Understand your data
         • Do an exploration
         • Visualize
         • Talk to domain experts
         • Check for outliers
       • Make sure your data is
         • Correct
         • Complete
         • Coherent
         • Stationary
         • De-duplicated
         • Up to date
         • Representative*
     • Ask questions [Stack Overflow, Twitter, forums]
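A few of the data checks on this slide (de-duplicated, complete, outliers) can be sketched in code. This is a minimal illustration and not the speaker's method: the function name `audit_rows`, the sample rows, and the choice of Tukey's 1.5 × IQR rule for outliers are all assumptions made for the example. It uses only the standard library; on real datasets a dataframe library would be the more usual tool.

```python
from collections import Counter
from statistics import quantiles


def audit_rows(rows, numeric_field):
    """Count duplicate rows, rows with missing values, and IQR outliers.

    `rows` is a list of dicts; `numeric_field` names the column to scan
    for outliers. Both names are hypothetical, chosen for this sketch.
    """
    # Duplicates: identical rows beyond their first occurrence.
    counts = Counter(tuple(sorted(r.items())) for r in rows)
    duplicates = sum(n - 1 for n in counts.values())

    # Missing: any field that is None or an empty string.
    missing = sum(
        1 for r in rows if any(v is None or v == "" for v in r.values())
    )

    # Outliers: values further than 1.5 * IQR from the quartiles.
    values = [r[numeric_field] for r in rows if r.get(numeric_field) is not None]
    q1, _, q3 = quantiles(values, n=4)
    spread = 1.5 * (q3 - q1)
    outliers = [v for v in values if v < q1 - spread or v > q3 + spread]

    return {"duplicates": duplicates, "missing": missing, "outliers": outliers}


# Made-up sample data: one repeated row, one missing label, one implausible age.
rows = [
    {"age": 34, "label": "spam"},
    {"age": 29, "label": "ham"},
    {"age": 31, "label": "ham"},
    {"age": 29, "label": "ham"},
    {"age": 36, "label": None},
    {"age": 33, "label": "spam"},
    {"age": 30, "label": "ham"},
    {"age": 420, "label": "spam"},
]

report = audit_rows(rows, "age")
print(report)
```

Checks such as "stationary", "up to date", and "representative" depend on how the data was collected and are not covered by this sketch.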
  3. Classifier works well but it can be better
     • The Featurists
     • The Speculators
     • The Localizers
     • The “Convolutors”
     • The “Trainalyzers”
  4. Featurists
     • Find features important to your model
     • Do some feature selection
     • Tools:
       • eli5
       • lime
       • skopt
       • yellowbrick
       • pandas-profiling
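One way to "find features important to your model" is permutation importance: shuffle one feature at a time on held-out data and measure how much the score drops. The sketch below uses scikit-learn's `permutation_importance` instead of the tools listed on the slide, and the synthetic dataset (only the first two of five features determine the label) is an assumption made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: the label depends only on features 0 and 1;
# features 2, 3 and 4 are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature on held-out data and record the drop in accuracy.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

# Feature indices sorted from most to least important.
ranking = np.argsort(result.importances_mean)[::-1]
for idx in ranking:
    mean = result.importances_mean[idx]
    std = result.importances_std[idx]
    print(f"feature {idx}: {mean:.3f} +/- {std:.3f}")
```

Features whose importance is indistinguishable from zero are candidates for removal, although correlated features can share importance and each look weaker than they are.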
  5. Classifier works on test data but fails in production
     • Check for overfitting
     • Check for data leakage
     • Check for covariate shift
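A common way to check for covariate shift is a "domain classifier": label training rows 0 and production rows 1, then see whether a model can tell them apart. A cross-validated ROC AUC near 0.5 suggests the two samples look alike, while an AUC near 1.0 suggests the inputs have shifted. The sketch below is one possible implementation, not the speaker's; the function name `domain_classifier_auc` and the synthetic Gaussian data are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def domain_classifier_auc(train_X, prod_X):
    """Cross-validated ROC AUC of a model separating training from production rows."""
    X = np.vstack([train_X, prod_X])
    # Origin label: 0 = training data, 1 = production data.
    origin = np.concatenate([np.zeros(len(train_X)), np.ones(len(prod_X))])
    clf = LogisticRegression(max_iter=1000)
    scores = cross_val_score(clf, X, origin, cv=5, scoring="roc_auc")
    return float(scores.mean())


# Synthetic stand-ins for real feature matrices.
rng = np.random.default_rng(0)
train_X = rng.normal(loc=0.0, scale=1.0, size=(500, 3))
prod_same = rng.normal(loc=0.0, scale=1.0, size=(500, 3))     # same distribution
prod_shifted = rng.normal(loc=1.5, scale=1.0, size=(500, 3))  # mean has moved

auc_same = domain_classifier_auc(train_X, prod_same)
auc_shifted = domain_classifier_auc(train_X, prod_shifted)
print(f"no shift:   AUC = {auc_same:.2f}")
print(f"with shift: AUC = {auc_shifted:.2f}")
```

A linear model like this mainly picks up shifts in feature means; a tree-based classifier in its place would also catch changes in spread or in feature interactions.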