Evaluating Machine Learning Models

Ike Okonkwo

March 01, 2016

Transcript

  1. Evaluating Machine Learning Models
     Ike Okonkwo - @ikeondata
     Data Scientist @6senseInc
     3.1.16 Metis Data Science Speaker Series
  2. About Me
     • Data Scientist
     • 6sense - B2B multichannel predictive intelligence engine for marketing and sales
     • Data Science mentor and writer
     • Background : Physics / Electrical Engineering, Industrial & Systems Engineering
  3. Evaluation Metrics
     Model evaluation answers the question : how do I choose, in an objective manner, between different models for a particular use case?
     Generalization >>> Memorization
  4. Evaluation Metrics
     • Classification : Accuracy, Confusion Matrix, ROC/AUC, Logloss, Decile Chart, Fixed Bucket Decile, Lift
     • Regression : R^2, MSE, RMSE
     • Ranking : NDCG
  5. Classification
     • Accuracy : measures how often the classifier makes the correct prediction
     • PROS : easy to calculate
     • CONS : it doesn't tell us anything about the distribution of the target values, nor what types of errors the classifier is making
     • a useful model should have accuracy > null accuracy (the baseline of always predicting the most frequent class)
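
A minimal sketch of that baseline check, assuming scikit-learn; the arrays y_test and y_pred are made-up toy data, not from the talk:

    import numpy as np
    from sklearn.metrics import accuracy_score

    y_test = np.array([0, 0, 0, 0, 1, 1, 0, 0, 1, 0])  # toy labels (70% class 0)
    y_pred = np.array([0, 0, 0, 1, 1, 0, 0, 0, 1, 0])  # toy predictions

    accuracy = accuracy_score(y_test, y_pred)                  # 0.80
    # Null accuracy: the score you get by always predicting the majority class.
    null_accuracy = max(np.mean(y_test), 1 - np.mean(y_test))  # 0.70

    print(accuracy > null_accuracy)  # True -- the model beats the baseline
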
  6. Classification
     • Accuracy : (TP + TN) / (TP + TN + FP + FN)
     • Error Rate : 1 - Accuracy
     • Sensitivity (TPR / Recall) : when the actual value is +ve, how often is the prediction correct. TP / (TP + FN)
     • Specificity : when the actual value is -ve, how often is the prediction correct. TN / (TN + FP)
     • FPR (1 - Specificity) : when the actual value is -ve, how often is the prediction incorrect. FP / (TN + FP)
     • Precision : when a +ve value is predicted, how often is the prediction correct. TP / (TP + FP)
     • f1-score : harmonic mean of Precision and Recall. (2 * P * R) / (P + R)
     • MCC : correlation coefficient between observed and predicted results, in [-1, +1]
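
All of these fall out directly from the four confusion-matrix counts. A sketch with made-up counts:

    # Made-up counts for illustration.
    TP, TN, FP, FN = 40, 45, 10, 5

    accuracy    = (TP + TN) / (TP + TN + FP + FN)
    error_rate  = 1 - accuracy
    sensitivity = TP / (TP + FN)   # TPR / recall
    specificity = TN / (TN + FP)
    fpr         = FP / (TN + FP)   # 1 - specificity
    precision   = TP / (TP + FP)
    f1          = 2 * precision * sensitivity / (precision + sensitivity)

    # MCC: +1 = perfect prediction, 0 = random, -1 = total disagreement.
    mcc = (TP * TN - FP * FN) / ((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)) ** 0.5
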
  7. Classification
     • Confusion Matrix
     • PROS : lets you calculate many other metrics, is useful for multi-class problems, and supports expected value (cost) calculations (Type I [FP] vs Type II [FN] errors)
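
A sketch of building one with scikit-learn, on toy labels:

    from sklearn.metrics import confusion_matrix

    y_true = [0, 0, 1, 1, 0, 1, 0, 1]
    y_pred = [0, 1, 1, 1, 0, 0, 0, 1]

    # For binary labels, rows are actual classes and columns are predicted:
    # [[TN, FP],
    #  [FN, TP]]
    print(confusion_matrix(y_true, y_pred))  # [[3 1]
                                             #  [1 3]]
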
  8. Classification
     • ROC curve : a plot of TPR (sensitivity) vs FPR (1 - Specificity) for every possible classification threshold
     • PROS : single-graph summary of classifier performance, also useful for cases of high class imbalance, enables you to understand the tradeoff in classifier performance
     • CONS : less interpretable for multi-class problems, sometimes doesn't tell the entire story
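
A sketch with scikit-learn and matplotlib; the synthetic imbalanced dataset and logistic regression model here are illustrative stand-ins, not from the talk:

    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_curve
    from sklearn.model_selection import train_test_split

    # Imbalanced toy problem: ~90% of samples are class 0.
    X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = LogisticRegression().fit(X_train, y_train)
    y_scores = clf.predict_proba(X_test)[:, 1]  # P(class = 1)

    # One (FPR, TPR) point per candidate threshold:
    fpr, tpr, thresholds = roc_curve(y_test, y_scores)

    plt.plot(fpr, tpr, label="classifier")
    plt.plot([0, 1], [0, 1], "--", label="chance")
    plt.xlabel("FPR (1 - specificity)")
    plt.ylabel("TPR (sensitivity)")
    plt.legend()
    plt.show()
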
  9. Classification
     • AUC : the area under the ROC curve
     • PROS : single-number summary of classifier performance, also useful for cases of high class imbalance
     • CONS : less interpretable for multi-class problems
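
A minimal sketch; the labels and scores are toy values:

    from sklearn.metrics import roc_auc_score

    y_true   = [0, 0, 1, 1]
    y_scores = [0.1, 0.4, 0.35, 0.8]  # predicted probabilities for class 1

    # AUC = fraction of (positive, negative) pairs ranked correctly.
    print(roc_auc_score(y_true, y_scores))  # 0.75
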
  10. Classification
     • Logloss : a measure of accuracy that incorporates probabilistic confidence
     • PROS : gauges the extra noise that comes from using predicted probabilities instead of the true labels, so confident wrong predictions are penalized heavily
     • CONS : predictions must be probabilities
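
A sketch of that penalty, with toy probabilities:

    from sklearn.metrics import log_loss

    y_true = [1, 1, 0, 0]

    confident_right = [0.9, 0.9, 0.1, 0.1]
    hedged          = [0.6, 0.6, 0.4, 0.4]
    confident_wrong = [0.1, 0.1, 0.9, 0.9]

    for name, probs in [("confident right", confident_right),
                        ("hedged", hedged),
                        ("confident wrong", confident_wrong)]:
        print(name, round(log_loss(y_true, probs), 3))
    # confident right ~0.105, hedged ~0.511, confident wrong ~2.303
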
  11. Classification
     • Lift : how much better the classifier performs than random guessing on a targeted segment of the data
     • profit / lift curves : 2x..5x
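
A sketch of top-decile lift under one common definition (response rate among the top 10% of scores divided by the overall rate); the function name is mine, not from the talk:

    import numpy as np

    def decile_lift(y_true, y_scores, decile=0.1):
        """Response rate in the top decile of scores vs. the overall rate."""
        y_true = np.asarray(y_true, dtype=float)
        order = np.argsort(y_scores)[::-1]           # highest scores first
        top_k = order[: max(1, int(len(y_true) * decile))]
        return y_true[top_k].mean() / y_true.mean()  # e.g. 3.0 means 3x lift
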
  12. Regression
     • RMSE : square root of the average squared distance between actual and predicted values
       • the Euclidean distance between the actual and predicted vectors, normalized by the number of data points
       • not robust to outliers, since it's an average; gives the distance of a data point from the fitted line, on average
     • MSE : average squared distance between actual and predicted values
       • the squared distance of a data point from the fitted line, on average
     • R^2 : proportion of the variability in Y that can be explained by the model
       • a measure of correlation between actual and predicted values
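
A sketch of all three side by side, assuming scikit-learn; the numbers are made up:

    import numpy as np
    from sklearn.metrics import mean_squared_error, r2_score

    y_true = np.array([3.0, -0.5, 2.0, 7.0])   # toy actual values
    y_pred = np.array([2.5,  0.0, 2.0, 8.0])   # toy predictions

    mse  = mean_squared_error(y_true, y_pred)  # 0.375
    rmse = np.sqrt(mse)                        # ~0.612, same units as y
    r2   = r2_score(y_true, y_pred)            # ~0.949
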
  13. Ranking
     • NDCG : Normalized Discounted Cumulative Gain. Sums up the relevance of the top k ranked items
     • ex. search engine results : the top few answers matter more / are more relevant than those lower down the list
     • important in information retrieval, where the positioning of the returned items is very important
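
A from-scratch sketch of NDCG@k; the helper names and toy relevance scores are mine:

    import numpy as np

    def dcg(relevances, k):
        """Relevance of each result, discounted by its rank position."""
        relevances = np.asarray(relevances)[:k]
        positions = np.arange(1, len(relevances) + 1)
        return np.sum(relevances / np.log2(positions + 1))

    def ndcg(relevances, k):
        """DCG normalized by the best possible ordering, so 1.0 is perfect."""
        ideal = sorted(relevances, reverse=True)
        return dcg(relevances, k) / dcg(ideal, k)

    # Relevance of each result as ranked by a hypothetical search engine:
    print(round(ndcg([3, 2, 3, 0, 1, 2], k=6), 3))  # 0.961
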
  14. References
     • An Introduction to ROC Analysis - Tom Fawcett : https://ccrma.stanford.edu/workshops/mir2009/references/ROCintro.pdf
     • Simple guide to Confusion Matrix Terminology : http://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/
     • Model Evaluation : http://scikit-learn.org/stable/modules/model_evaluation.html
  15. Data Science @ 6sense
     • ETL Layer : Hive, Presto, Feature Eng.
     • Data Layer : HDFS, Hadoop
     • Automation Layer : Python, R, Shell
     • ML Layer : h2o, Python, R, Shell, SQL
     • SAAS Layer : REST API
     • R&D : Python, R, Scala, C++, Shell, Java, UDFs, Scaling / Distributing models, etc
  16. Interview Tips
     • Get good at doing data take-home challenges
     • Get really good at doing data take-home challenges
     • Take-home challenges are the new phone interview
     • Network, network : meetups, LinkedIn, etc
     • Build a portfolio of interesting data science projects
     • Really understand how most of the major ML algorithms work under the hood
     • Work on open-source data-related libraries; if you don't find any interesting ones, start writing your own
     • Become more visible : blog, contribute to open source ML, etc