Slide 1

How to Design and Build a Recommendation Pipeline in Python
Jill Cates
PyCon Canada, November 10th, 2018

Slide 2

Overview of the Recommender Pipeline
1. Pre-processing
2. Hyperparameter Tuning
3. Model Training and Prediction
4. Post-processing
5. Evaluation

Slide 3

Spotify Discover Weekly

Slide 4

Netflix “Because you watched this TV show…”

Slide 5

Amazon “Frequently bought together” “Customers who bought this item also bought”

Slide 6

OkCupid “Finding your best match”

Slide 7

Recommender Systems in the Wild
- Spotify: Discover Weekly
- Amazon: "Customers who bought this item also bought"
- Netflix: "Because you watched this show…"
- OkCupid: finding your best match
- LinkedIn: jobs recommended for you
- New York Times: recommended articles for you
- Medicine: facilitating clinical decision making
- GitHub: repos "based on your interest"

Slide 8

Before e-commerce, things were sold exclusively in brick-and-mortar stores: limited inventory, mainstream products.

Slide 9

Before e-commerce, things were sold exclusively in brick-and-mortar stores: limited inventory, mainstream products. With e-commerce: unlimited inventory, niche products.

Slide 11

Recommender Systems in the Wild: The Tasting Booth Experiment
6 jam samples vs. 24 jam samples

Slide 12

Recommender Systems in the Wild: The Tasting Booth Experiment
6 jam samples vs. 24 jam samples
"30% of the consumers in the limited-choice condition subsequently purchased a jar of jam; in contrast, only 3% of the consumers in the extensive-choice condition did so"

Slide 13

Recommender Crash Course
Machine Learning Model: Data → Predictions

Slide 14

Recommender Crash Course
Recommender System: User Preferences → Recommendations

Slide 15

Recommender Crash Course
Recommender System: User Preferences (explicit feedback, implicit feedback) → Recommendations (predicting future behaviour)

Slide 16

Recommender Crash Course
Two main approaches:
- Collaborative filtering: similar users like similar things (based on user-item interactions)
- Content-based filtering: considers item and user features

Slide 17

Recommender Crash Course
Collaborative filtering: similar users like similar things.
- "Because you watched Movie X"
- "Customers who bought this item also bought"

Slide 18

Recommender Crash Course
Content-based filtering: uses user and item features.
- user features: age, gender, country, spoken language, kids?, religion
- item features: movie genre (scary, funny, family, anime, drama, indie), year of release, cast

Slide 19

Overview of the Recommender Pipeline
1. Pre-processing
2. Hyperparameter Tuning
3. Model Training and Prediction
4. Post-processing
5. Evaluation

Slide 20

Pipeline: Pre-processing → Hyperparameter Tuning → Model Training → Post-processing → Evaluation

Step 1: Data Pre-processing

user_id  movie_id  rating
      2       439     4.0
     10       368     4.5
     14       114     5.0
     19       371     1.0
      2       371     3.0
     19       114     4.5
      3       439     3.5
     54       421     2.0
     32       114     3.0
     10       369     1.0

Slide 21

Step 1: Data Pre-processing
Transform the original ratings data into a user-item (utility) matrix: users as rows, items as columns, ratings as values, with most entries missing.

Slide 22

Step 1: Data Pre-processing
Store the (mostly empty) user-item matrix as a sparse matrix: scipy.sparse.csr_matrix.
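A minimal sketch of building that sparse matrix with scipy, using the sample ratings from the table above (the id-to-index mapping dictionaries are my own scaffolding, not part of the deck):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Ratings triples from the slide's table: (user_id, movie_id, rating).
user_ids = [2, 10, 14, 19, 2, 19, 3, 54, 32, 10]
movie_ids = [439, 368, 114, 371, 371, 114, 439, 421, 114, 369]
ratings = [4.0, 4.5, 5.0, 1.0, 3.0, 4.5, 3.5, 2.0, 3.0, 1.0]

# Map raw ids to contiguous row/column indices.
uniq_users = sorted(set(user_ids))
uniq_movies = sorted(set(movie_ids))
user_index = {u: i for i, u in enumerate(uniq_users)}
movie_index = {m: i for i, m in enumerate(uniq_movies)}

rows = [user_index[u] for u in user_ids]
cols = [movie_index[m] for m in movie_ids]

# Users as rows, items as columns; missing entries stay implicit (zero).
X = csr_matrix((ratings, (rows, cols)),
               shape=(len(uniq_users), len(uniq_movies)))
print(X.shape)  # (7, 6)
print(X.nnz)    # 10 stored ratings
```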

Slide 23

Step 1: Data Pre-processing
Calculate matrix sparsity: sparsity = (# of ratings) / (total # of elements in the matrix).
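The sparsity calculation, sketched on a small hypothetical csr_matrix (the slide's "sparsity" is really the fill ratio: stored ratings over total cells):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical 7 x 6 user-item matrix holding 10 ratings.
rows = [0, 1, 2, 3, 4, 5, 6, 0, 1, 2]
cols = [0, 1, 2, 3, 4, 5, 0, 1, 2, 3]
X = csr_matrix((np.ones(10), (rows, cols)), shape=(7, 6))

# Sparsity as defined on the slide: # of ratings over total # of elements.
sparsity = X.nnz / (X.shape[0] * X.shape[1])
print(f"{sparsity:.1%}")  # 23.8%
```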

Slide 24

Step 1: Data Pre-processing
Normalization
- Optimists rate everything 4 or 5; pessimists rate everything 1 or 2
- Need to normalize ratings by accounting for user and item bias
- Mean normalization: subtract the baseline b_ui = μ + b_i + b_u from each user's rating for a given item, where μ is the global average rating, b_i is the item's average rating deviation, and b_u is the user's average rating deviation
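A sketch of mean normalization on a tiny dense matrix (the toy ratings are hypothetical; zeros mark missing entries):

```python
import numpy as np

# Toy dense ratings matrix; 0 marks a missing rating.
R = np.array([[5.0, 4.0, 0.0],
              [1.0, 0.0, 2.0],
              [4.0, 5.0, 4.0]])
mask = R > 0

mu = R[mask].mean()  # global average rating
user_bias = np.array([R[u, mask[u]].mean() - mu for u in range(R.shape[0])])
item_bias = np.array([R[mask[:, i], i].mean() - mu for i in range(R.shape[1])])

# Baseline b_ui = mu + b_u + b_i, subtracted from each known rating.
baseline = mu + user_bias[:, None] + item_bias[None, :]
R_norm = np.where(mask, R - baseline, 0.0)
```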

Slide 25

Pick a Model: Matrix Factorization
- Factorize the user-item matrix to get two latent factor matrices: a user-factor matrix P and an item-factor matrix Q
- Missing ratings are predicted from the inner product of these two factor matrices
- X_{m×n} ≈ P_{m×k} × Q_{n×k}^T = X̂
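One way to illustrate the factorization is a truncated SVD with numpy (in practice, ALS or SGD handle the partially observed sparse matrix; this sketch assumes a fully observed toy matrix, and all names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical fully observed ratings matrix: m users x n items.
X = rng.integers(1, 6, size=(6, 4)).astype(float)

k = 2  # number of latent factors
U, s, Vt = np.linalg.svd(X, full_matrices=False)
P = U[:, :k] * s[:k]  # user-factor matrix, m x k
Q = Vt[:k, :].T       # item-factor matrix, n x k

X_hat = P @ Q.T       # rank-k approximation of X
# The predicted rating for user u and item i is the inner product
# of their factor vectors:
u, i = 0, 2
pred = P[u] @ Q[i]
```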

Slide 26

Pick a Model: Matrix Factorization
Algorithms that perform matrix factorization:
- Alternating Least Squares (ALS)
- Stochastic Gradient Descent (SGD)
- Singular Value Decomposition (SVD)

Slide 28

Pick an Evaluation Metric: Precision@K
- Of the top K recommendations, what proportion are relevant to the user?

Slide 29

Pick an Evaluation Metric: Precision@10
- Of the top 10 recommendations, what proportion are relevant to the user?

Slide 30

Step 2: Hyperparameter Tuning
What is a hyperparameter? A model configuration value that is external to the model: set before training rather than learned from the data.

Slide 31

Step 2: Hyperparameter Tuning
Alternating Least Squares' hyperparameters:
- k (# of factors)
- λ (regularization parameter)
Goal: find the hyperparameters that give the best precision@10* (*or any other evaluation metric that you want to optimize)

Slide 32

Step 2: Hyperparameter Tuning
Grid Search vs. Random Search over k (# of factors) and λ (regularization):
- Grid Search: sklearn.model_selection.GridSearchCV
- Random Search: sklearn.model_selection.RandomizedSearchCV
(source: blog.kaggle.com)
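GridSearchCV and RandomizedSearchCV wrap scikit-learn estimators; for an ALS model from another library, a hand-rolled random search looks like the sketch below, where `evaluate` is a placeholder standing in for "train the model with (k, lam) and return precision@10 on a validation split":

```python
import random

# Placeholder scoring function: in a real pipeline this would fit ALS on a
# train split and return precision@10 on a validation split.
def evaluate(k, lam):
    return 1.0 / (1 + abs(k - 32) + abs(lam - 0.01) * 100)  # fake score

search_space = {
    "k": [8, 16, 32, 64, 128],
    "lam": [0.001, 0.01, 0.1, 1.0],
}

random.seed(0)
best_score, best_params = -1.0, None
for _ in range(10):  # 10 random draws from the search space
    params = {name: random.choice(vals) for name, vals in search_space.items()}
    score = evaluate(**params)
    if score > best_score:
        best_score, best_params = score, params

print(best_params)
```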

Slide 33

Step 2: Hyperparameter Tuning
Sequential Model-Based Optimization tools:
- scikit-optimize (skopt)
- hyperopt
- Metric Optimization Engine (MOE)

Slide 34

Step 3: Model Training
Fit the model on the sparse user-item matrix; the trained model fills in predicted ratings for the missing entries.
AlternatingLeastSquares(k=8, regularization=0.001)
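The deck uses an AlternatingLeastSquares model (e.g., from the implicit library). As a rough illustration of what the alternating updates do, here is a bare-numpy ALS sketch for explicit ratings; the function, names, and toy data are my own, not the library's implementation:

```python
import numpy as np

def als(R, mask, k=8, reg=0.001, iters=20, seed=0):
    """Alternate ridge-regression solves for user and item factors."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    P = rng.normal(scale=0.1, size=(m, k))  # user-factor matrix
    Q = rng.normal(scale=0.1, size=(n, k))  # item-factor matrix
    I = reg * np.eye(k)
    for _ in range(iters):
        for u in range(m):        # fix Q, solve for user u's factors
            idx = mask[u]
            P[u] = np.linalg.solve(Q[idx].T @ Q[idx] + I, Q[idx].T @ R[u, idx])
        for i in range(n):        # fix P, solve for item i's factors
            idx = mask[:, i]
            Q[i] = np.linalg.solve(P[idx].T @ P[idx] + I, P[idx].T @ R[idx, i])
    return P, Q

# Toy explicit-ratings matrix; zeros are unobserved.
R = np.array([[5, 4, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0
P, Q = als(R, mask, k=2)
R_hat = P @ Q.T  # dense predictions, including the previously missing cells
```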

Slide 35

Step 4: Post-processing
- Sort predicted ratings and take the top N
- Filter out items that a user has already purchased, watched, or interacted with
- Item-item recommendations: use a similarity metric (e.g., cosine similarity) to power "Because you watched Movie X"
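The post-processing steps above can be sketched as follows (the predicted-ratings matrix, the `seen` sets, and the helper names are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
R_hat = rng.random((4, 6))  # predicted ratings: 4 users x 6 items
seen = {0: {1, 3}, 1: set(), 2: {0}, 3: {2, 4}}  # items each user already has

def top_n(user, n=3):
    """Sort predictions, drop already-seen items, return the top n item ids."""
    scores = R_hat[user].copy()
    scores[list(seen[user])] = -np.inf   # filter out already-seen items
    return np.argsort(scores)[::-1][:n]  # highest predicted ratings first

def item_similarity(Q):
    """Cosine similarity between item factor vectors (item-item recs)."""
    unit = Q / np.linalg.norm(Q, axis=1, keepdims=True)
    return unit @ unit.T

print(top_n(0))
```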

Slide 36

Step 5: Evaluation
How do we evaluate recommendations? Traditional ML metrics vs. metrics designed for recommendation systems.

Slide 37

Step 5: Evaluation Metrics
RMSE = sqrt( (1/N) · Σ_{i=1}^{N} (y_i − ŷ_i)² )
precision = TP / (TP + FP)
recall = TP / (TP + FN)
F1 = 2 · (precision · recall) / (precision + recall)
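The formulas translate directly into a few lines of numpy (the example numbers are arbitrary):

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean squared error between true and predicted ratings."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.sqrt(((y - y_hat) ** 2).mean())

def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

print(rmse([4, 3, 5], [3.5, 3, 4]))           # ≈ 0.6455
print(precision_recall_f1(tp=8, fp=2, fn=8))  # (0.8, 0.5, 0.615...)
```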

Slide 38

Step 5: Evaluation
Precision@K: of the top K recommendations, what proportion are actually "relevant"?
Recall@K: what proportion of all relevant items appear in the top K recommendations?
Confusion matrix (Predicted vs. Reality, liked / did not like): true positives, false positives, false negatives, true negatives.
precision = TP / (TP + FP)    recall = TP / (TP + FN)
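Precision@K and Recall@K as plain functions over a ranked list (the item ids are hypothetical):

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    top_k = recommended[:k]
    return len(set(top_k) & set(relevant)) / k

def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items found in the top-k recommendations."""
    top_k = recommended[:k]
    return len(set(top_k) & set(relevant)) / len(relevant)

recommended = [10, 4, 7, 1, 9]  # ranked recommendations
relevant = {4, 9, 20}           # items the user actually liked
print(precision_at_k(recommended, relevant, k=5))  # 2/5 = 0.4
print(recall_at_k(recommended, relevant, k=5))     # 2/3 ≈ 0.667
```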

Slide 40

Important Considerations
- Interpretability
- Efficiency and scalability
- Diversity
- Serendipity

Slide 41

Python Tools
- import surprise (@NicolasHug)
- import implicit (@benfred)
- import lightfm (@lyst)
- import pyspark.mllib.recommendation

Slide 42

Thank you!
Jill Cates
twitter: @jillacates
github: @topspinj
[email protected]