Slide 1

Evaluating Machine Learning Models
Ike Okonkwo - @ikeondata
Data Scientist @6senseInc
3.1.16 - Metis Data Science Speaker Series

Slide 2

About Me
• Data Scientist
• 6sense - B2B multichannel predictive intelligence engine for marketing and sales
• Data Science mentor and writer
• Background
• Physics / Electrical Engineering
• Industrial & Systems Engineering

Slide 3

Agenda
• Evaluation Metrics
• Classification
• Regression
• Ranking

Slide 4

Evaluation Metrics
Model evaluation answers the question: how do I objectively choose between different models for a particular use case?
Generalization >>> Memorization

Slide 5

Evaluation Metrics
• Classification: Accuracy, Confusion Matrix, ROC/AUC, Logloss, Decile Chart, Fixed Bucket Decile, Lift
• Regression: R^2, MSE, RMSE
• Ranking: NDCG

Slide 6

Classification
• Accuracy: measures how often the classifier makes the correct prediction
• PROS: easy to calculate
• CONS: tells us nothing about the distribution of the response (dependent) values, and nothing about what type of errors the classifier is making
• accuracy > null accuracy (the baseline from always predicting the most frequent class)
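
A minimal sketch of checking accuracy against the null accuracy baseline with scikit-learn; the labels and predictions below are made-up placeholders:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# toy ground truth and predictions (placeholders for illustration)
y_true = np.array([0, 0, 0, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])

accuracy = accuracy_score(y_true, y_pred)

# null accuracy: the score from always predicting the most frequent class
null_accuracy = max(np.mean(y_true), 1 - np.mean(y_true))

print(f"accuracy:      {accuracy:.2f}")
print(f"null accuracy: {null_accuracy:.2f}")  # a useful classifier should beat this
```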

Slide 7

Classification
• Confusion Matrix: a table that summarizes the performance of a classifier by tabulating actual vs. predicted classes
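
For example, with scikit-learn (toy labels, purely illustrative):

```python
from sklearn.metrics import confusion_matrix

# toy labels and predictions for illustration
y_true = [0, 0, 0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]

# sklearn convention: rows = actual class, columns = predicted class
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()  # binary case
print(cm)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```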

Slide 8

Classification
• Accuracy: (TP + TN) / (TP + TN + FP + FN)
• Error Rate: 1 - Accuracy
• Sensitivity (TPR / Recall): when the actual value is +ve, how often is the prediction correct? TP / (TP + FN)
• Specificity: when the actual value is -ve, how often is the prediction correct? TN / (TN + FP)
• FPR (1 - Specificity): when the actual value is -ve, how often is the prediction incorrect? FP / (TN + FP)
• Precision: when a +ve value is predicted, how often is the prediction correct? TP / (TP + FP)
• f1-score: harmonic mean of Precision and Recall. (2 * P * R) / (P + R)
• MCC: correlation coefficient between observed and predicted results, in [-1, +1]
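
A sketch of deriving these metrics by hand from the four confusion-matrix counts (the counts below are toy values); scikit-learn's classification_report and matthews_corrcoef return the same quantities directly:

```python
# toy confusion-matrix counts (illustrative only)
tp, tn, fp, fn = 30, 50, 10, 10

accuracy    = (tp + tn) / (tp + tn + fp + fn)
error_rate  = 1 - accuracy
sensitivity = tp / (tp + fn)            # TPR / recall
specificity = tn / (tn + fp)
fpr         = fp / (tn + fp)            # 1 - specificity
precision   = tp / (tp + fp)
f1          = 2 * precision * sensitivity / (precision + sensitivity)

# Matthews correlation coefficient, in [-1, +1]
mcc = (tp * tn - fp * fn) / ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5

print(accuracy, sensitivity, specificity, precision, f1, mcc)
```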

Slide 9

Classification
• Confusion Matrix
• PROS: allows you to calculate other metrics, useful for multi-class problems, allows expected value (cost) calculations (Type I [FP] vs Type II [FN] errors)
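
One way to do the expected value (cost) calculation, assuming a hypothetical cost/benefit assigned to each cell of the confusion matrix (all numbers below are illustrative):

```python
import numpy as np

# toy confusion matrix: rows = actual class, columns = predicted class
cm = np.array([[50, 10],    # actual 0 -> TN=50, FP=10
               [10, 30]])   # actual 1 -> FN=10, TP=30

# hypothetical value of each outcome, same layout as cm:
# a TN costs nothing, a FP (Type I) costs 5, a FN (Type II) costs 50, a TP earns 100
value = np.array([[0,   -5],
                  [-50, 100]])

# expected value per case = sum over cells of P(cell) * value(cell)
expected_value = (cm / cm.sum() * value).sum()
print(expected_value)
```

Weighting each cell by its probability is what lets you compare classifiers on business impact rather than on raw error counts.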

Slide 10

Classification
• ROC curve: a plot of TPR (Sensitivity) vs FPR (1 - Specificity) for every possible classification threshold
• PROS: single-graph summary of classifier performance, also useful for cases of high class imbalance, enables you to understand the tradeoff in classifier performance
• CONS: less interpretable for multi-class problems, sometimes doesn't tell the entire story
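
A minimal sketch of computing the ROC curve from predicted probabilities with scikit-learn (toy arrays for illustration):

```python
from sklearn.metrics import roc_curve

# toy true labels and predicted probabilities of the positive class
y_true   = [0, 0, 1, 1, 0, 1, 0, 1]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]

# one (FPR, TPR) point per classification threshold
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
```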

Slide 11

Classification
• AUC: the area under the ROC curve
• PROS: single-number summary of classifier performance, also useful for cases of high class imbalance
• CONS: less interpretable for multi-class problems
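
The corresponding single-number summary with scikit-learn (same style of toy labels and scores):

```python
from sklearn.metrics import roc_auc_score

y_true   = [0, 0, 1, 1, 0, 1, 0, 1]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]

# 0.5 is roughly random guessing, 1.0 is a perfect ranking of +ve above -ve
print(roc_auc_score(y_true, y_scores))
```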

Slide 12

Classification
• Logloss: a measure of accuracy that incorporates probabilistic confidence
• PROS: penalizes confident but wrong predictions, gauging the extra noise that comes from using predictions in place of the true labels
• CONS: predictions must be probabilities
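
A sketch of the log loss computation, by hand and via scikit-learn (toy labels and predicted probabilities):

```python
import numpy as np
from sklearn.metrics import log_loss

y_true = np.array([0, 1, 1, 0, 1])
p      = np.array([0.1, 0.8, 0.6, 0.3, 0.9])  # predicted P(y = 1)

# logloss = -(1/N) * sum( y*log(p) + (1-y)*log(1-p) )
manual = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

print(manual, log_loss(y_true, p))  # the two agree
```

Confident-but-wrong probabilities (p near 1 when y = 0, or near 0 when y = 1) blow up the loss, which is exactly the probabilistic-confidence penalty that plain accuracy lacks.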

Slide 13

Classification • Decile Chart

Slide 14

Classification • Fixed Bucket Decile Chart

Slide 15

Classification
• Lift: the performance of a classifier over random guessing
• profit / lift curves: e.g. lifts of 2x..5x over the baseline
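
A rough sketch of a decile-based lift calculation, assuming binary outcomes and predicted probabilities (data and variable names are illustrative): sort by score, cut into ten equal-sized buckets, and compare each bucket's response rate to the overall rate.

```python
import numpy as np
import pandas as pd

# toy scores and outcomes; the outcome loosely follows the score
rng = np.random.default_rng(0)
scores = rng.random(1000)
y = (rng.random(1000) < scores).astype(int)

df = pd.DataFrame({"score": scores, "y": y})
df["decile"] = pd.qcut(df["score"], 10, labels=False)  # 0 = lowest scores, 9 = highest

overall_rate = df["y"].mean()
by_decile = df.groupby("decile")["y"].mean().sort_index(ascending=False)

lift = by_decile / overall_rate  # top deciles should sit well above 1x
print(lift)
```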

Slide 16

Regression
• RMSE: square root of the average squared distance between actual and predicted values
• the Euclidean distance between predictions and actuals, normalized by the number of data points
• not robust to outliers, since it averages how far each data point sits from the fitted line
• MSE: average squared distance between actual and predicted values
• the squared distance of a data point from the fitted line
• R^2: proportion of the variability in Y that can be explained by the model
• a measure of correlation between the model's predictions and the observed values
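
For example, with scikit-learn (toy actual vs. predicted values):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# toy actual and predicted values (illustrative only)
y_true = np.array([3.0, 5.0, 2.5, 7.0, 4.2])
y_pred = np.array([2.8, 5.4, 2.9, 6.5, 4.0])

mse  = mean_squared_error(y_true, y_pred)   # average squared distance
rmse = np.sqrt(mse)                         # back in the units of y
r2   = r2_score(y_true, y_pred)             # fraction of variance explained

print(mse, rmse, r2)
```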

Slide 17

Ranking
• NDCG: Normalized Discounted Cumulative Gain. Sums the relevance of the top-k ranked items, discounted by position and normalized by the ideal ordering
• ex. search engine results: the top few answers matter more / are more relevant than those lower down the list
• important in information retrieval, where the positioning of the returned items is very important
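
A minimal sketch of DCG/NDCG over the top-k results, assuming graded relevance scores listed in the order the ranker returned them (the grades below are made up):

```python
import numpy as np

def dcg(relevances, k):
    """Discounted cumulative gain of the top-k items (positions are 1-based)."""
    rel = np.asarray(relevances, dtype=float)[:k]
    positions = np.arange(1, rel.size + 1)
    return np.sum(rel / np.log2(positions + 1))

def ndcg(relevances, k):
    """DCG normalized by the ideal ordering, so the score lands in [0, 1]."""
    ideal = sorted(relevances, reverse=True)
    return dcg(relevances, k) / dcg(ideal, k)

# relevance of results in the order the engine returned them (grades 0-3)
returned_order = [3, 2, 3, 0, 1, 2]
print(ndcg(returned_order, k=6))
```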

Slide 18

References
• An Introduction to ROC Analysis - Tom Fawcett: https://ccrma.stanford.edu/workshops/mir2009/references/ROCintro.pdf
• Simple Guide to Confusion Matrix Terminology: http://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/
• Model Evaluation: http://scikit-learn.org/stable/modules/model_evaluation.html

Slide 19

No content

Slide 20

Data Science @ 6sense
• ETL Layer: Hive, Presto, Feature Eng.
• Data Layer: HDFS, Hadoop
• Automation Layer: Python, R, Shell
• ML Layer: h2o, Python, R, Shell, SQL
• SAAS Layer: REST API
• R&D: Python, R, Scala, C++, Shell, Java, UDFs, Scaling / Distributing models, etc.

Slide 21

Interview Tips
• Get good at doing data take-home challenges
• Get really good at doing data take-home challenges
• Take-home challenges are the new phone interview
• Network.. network: meetups, LinkedIn, etc.
• Build a portfolio of interesting data science projects
• Really understand how most of the major ML algorithms work under the hood
• Work on open-source data-related libraries.. if you don't find any interesting ones, start writing your own
• Become more visible: blog, contribute to open-source ML, etc.