Timeline Post Recommender System

LINE Timeline Post Recommender System
> Jihong Lee / Data Science Dev

Agenda
> What is a Recommender System?
> Problems, Solutions, and Lessons Learned
• User Embeddings
• Feedback Loop
• Importance of Evaluation
• Model Architecture
• Increasing Post Pool Size

What Is a Recommender System?

LINE Timeline

Background Knowledge
> Collaborative filtering and content-based filtering
> Two big approaches to Recommender Systems

Collaborative Filtering
[Figure: users' ratings of Movie A, Movie B, Movie C, and Movie D; rating values shown: 5 4 5 2 3 4 4 2 1 1; callout: "Recommend!"]
Ratings: 5 - Excellent, 4 - Good, 3 - Average, 2 - Not Bad, 1 - Bad

Problem Description
> Given: A user and a post with context
> Goal: Predict the probability that the user will click the post

Model Training Pipeline
User Interactions → Raw Data → Preprocess Data → Model Training → Model Deployment → User Interactions

Raw Data
User View History Log: User | Post | Author | Time
User Click History Log: User | Post | Author | Time
> Join!

Raw Data → Labeled Data
Columns: User | Post | Author | Time | Label
Labels shown: 1 0 1 1
> 1 - Positive Label (Clicked)
> 0 - Negative Label (Not Clicked)
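The join of the two logs into labeled rows can be sketched as follows. This is a minimal illustration, not the pipeline from the talk: the `LogRow` type, the sample rows, and the choice of `(user, post)` as the join key are all assumptions.

```python
from typing import NamedTuple

class LogRow(NamedTuple):
    # Hypothetical schema mirroring the slide's columns.
    user: str
    post: str
    author: str
    time: int  # assumed to be a timestamp in seconds

def label_views(views, clicks):
    """Left-join views against clicks: label 1 if the viewed post was clicked, else 0."""
    clicked = {(c.user, c.post) for c in clicks}  # assumed join key
    return [(v, 1 if (v.user, v.post) in clicked else 0) for v in views]

# Made-up sample logs.
views = [
    LogRow("u1", "p1", "a1", 100),
    LogRow("u1", "p2", "a2", 101),
    LogRow("u2", "p1", "a1", 102),
]
clicks = [LogRow("u1", "p1", "a1", 105)]

labeled = label_views(views, clicks)
for row, label in labeled:
    print(row.user, row.post, label)
```

Every viewed post becomes one training row, so posts that were viewed but never clicked supply the negative labels.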

Model Training Pipeline
User Interactions → Raw Data → Preprocess Data → Model Training → Model Deployment → User Interactions
> Simple!

Recommender System 101: Embeddings
Raw features:
Item 1 [ 3 0 2 0 … ]
Item 2 [ 2 1 2 4 … ]
Item 3 [ 3 4 1 0 … ]
> Feature dimensionality (# of columns) too high!
> No meaning of categories
Embedding representation:
Item 1 [ -0.242 0.218 0.848 -0.887 … ]
> Reduced dimensionality
> Has category meanings

Problems & Solutions > User Embeddings

Creating Embeddings
[Figure: user-by-post interaction matrix with columns Post 1, Post 2, Post 3, Post 4; binary entries shown: 0 1 1 1 0 1 0 1 1]
> Extremely Sparse!! Density < 0.0001%
Post Embedding Vectors:
Post 1 [ -0.242 0.218 0.848 -0.887 … ]
Post 2 [ 0.581 -0.859 0.006 -0.598 … ]
Post 3 [ 0.344 -0.834 -0.651 0.524 … ]
Post 4 [ 0.255 0.963 -0.127 -0.959 … ]
… [ … ]
User Embedding Vectors:
[ 0.324 -0.192 -0.453 0.004 … ]
[ -0.187 0.394 -0.225 0.022 … ]
[ 0.177 0.718 -0.239 -0.422 … ]
[ -0.725 0.090 -0.353 0.228 … ]
… [ … ]
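To make the sparsity figure concrete, the sketch below computes the density of an interaction matrix as the share of (user, post) cells that hold an observed interaction. The function name and the tiny example are hypothetical; the real matrix sizes are not given in the talk.

```python
def density(interactions, n_users, n_posts):
    """Fraction of (user, post) cells with at least one observed interaction."""
    observed = len(set(interactions))  # duplicates count once
    return observed / (n_users * n_posts)

# Made-up toy data: 3 observed cells in a 3 x 4 matrix.
interactions = [(0, 1), (1, 0), (2, 3), (2, 3)]
toy_density = density(interactions, n_users=3, n_posts=4)
print(f"{toy_density:.2%}")
```

A density below 0.0001% means fewer than one observed cell per million, so almost every user row is empty for almost every post.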

Mitigate the Issue
Post-by-post matrix:
         Post 1  Post 2  Post 3  Post 4
Post 1     3       0       2       0
Post 2     1       1       3       2
Post 3     1       0       2       1
Post 4     0       1       4       1
Post Embedding Vectors:
Post 1 [ -0.242 0.218 0.848 -0.887 … ]
Post 2 [ 0.581 -0.859 0.006 -0.598 … ]
Post 3 [ 0.344 -0.834 -0.651 0.524 … ]
Post 4 [ 0.255 0.963 -0.127 -0.959 … ]
… [ … ]
User Embedding Vectors:
[ linear combination of user history ]
[ linear combination of user history ]
[ linear combination of user history ]
[ linear combination of user history ]
… [ … ]
> Worked Much Better!
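A user vector built as a linear combination of the user's history might look like the sketch below. The talk does not say which weights were used, so the uniform default (a plain average) and the optional `weights` argument are assumptions, as are the two-dimensional toy vectors.

```python
import numpy as np

def user_vector(history, post_vectors, weights=None):
    """Weighted average of the embeddings of the posts in a user's history."""
    vecs = np.stack([post_vectors[p] for p in history])
    if weights is None:
        weights = np.ones(len(history))  # assumption: uniform weights
    weights = np.asarray(weights, dtype=float)
    return weights @ vecs / weights.sum()

# Made-up post embeddings.
post_vectors = {
    "post_1": np.array([1.0, 0.0]),
    "post_2": np.array([0.0, 1.0]),
    "post_3": np.array([1.0, 1.0]),
}

u = user_vector(["post_1", "post_3"], post_vectors)
u_weighted = user_vector(["post_1", "post_2"], post_vectors, weights=[3, 1])
print(u, u_weighted)
```

Because the user vector is derived from post vectors, only post embeddings have to be learned, which sidesteps the sparse user rows.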

Lesson Learned
> Different algorithms are suitable for different types of data
> Must understand the nature of your data!

Problems & Solutions > Feedback Loop

Feedback Loop
User Interactions → Raw Data → Preprocess Data → Model Training → Model Deployment → User Interactions
> Problematic!

Feedback Loop
Model 0 → recommendations for user → user interacts → train → Model 1
> User interactions are caused by Model 0
> Training data is biased!!

Feedback Loop
User Interactions → Raw Data → Preprocess Data → Model Training → Model Deployment → User Interactions
> User interactions caused by friends’ shares
> Not used in training!

Problems & Solutions > Importance of Evaluation

AUROC (Area Under ROC Curve)
[Figure: ROC curve; x-axis: False Positive Rate; y-axis: True Positive Rate; curves: Trained Classifier, Random Classifier]
> True Positive Rate: proportion of positive labels correctly classified
> False Positive Rate: proportion of negative labels incorrectly classified as positive
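Written as formulas, the two rates and the area use the standard confusion-matrix counts. The symbols TP, FN, FP, and TN (true positives, false negatives, false positives, true negatives) do not appear on the slide and are introduced here for illustration.

```latex
\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad
\mathrm{FPR} = \frac{FP}{FP + TN}, \qquad
\mathrm{AUROC} = \int_0^1 \mathrm{TPR}\; d(\mathrm{FPR})
```

A classifier that scores at random has an expected AUROC of 0.5, which is the diagonal reference line in the ROC plot.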

Global AUROC
Predictions for every (User, Post) pair, sorted by Pr(Click):
Post  Pr(Click)  Label
A     0.992      1
B     0.981      0
A     0.977      1
C     0.964      1
B     0.951      0
A     0.924      0
C     0.918      1
A     0.908      1
C     0.905      0
C     0.900      1
B     0.898      1
B     0.891      0
…     …          …
> Calculate AUROC

Average AUROC per User
First user:
Post  Pr(Click)  Label
A     0.992      1
B     0.981      0
C     0.977      1
…     …          …
Second user:
Post  Pr(Click)  Label
A     0.990      1
C     0.985      0
D     0.972      1
…     …          …
Third user:
Post  Pr(Click)  Label
B     0.991      1
D     0.970      0
C     0.967      1
…     …          …
> Calculate AUROC for each user, then average
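The difference between the two metrics can be sketched in a few lines. This is an illustrative implementation, not the evaluation code from the talk: AUROC is computed through its pairwise interpretation (the probability that a random positive outscores a random negative, with ties counted as one half), and the four sample rows are made up. Skipping users who have only one class of label is an assumption.

```python
from collections import defaultdict

def auroc(scores, labels):
    """Pairwise AUROC; returns None when only one class is present."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_auroc_per_user(rows):
    """rows: iterable of (user, score, label); averages AUROC over users."""
    by_user = defaultdict(list)
    for user, score, label in rows:
        by_user[user].append((score, label))
    values = []
    for pairs in by_user.values():
        value = auroc([s for s, _ in pairs], [y for _, y in pairs])
        if value is not None:  # assumption: skip single-class users
            values.append(value)
    return sum(values) / len(values)

# Made-up predictions: u2's scores are lower overall but correctly ordered.
rows = [
    ("u1", 0.9, 1), ("u1", 0.8, 0),
    ("u2", 0.3, 1), ("u2", 0.2, 0),
]
global_auroc = auroc([s for _, s, _ in rows], [y for _, _, y in rows])
per_user_auroc = average_auroc_per_user(rows)
print(global_auroc, per_user_auroc)
```

In this toy case each user's posts are ranked perfectly, yet the global number is lower because scores are compared across users. The per-user average measures ranking quality inside each user's own list.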

Problems & Solutions > Model Architecture

Problems With a Sole Ranking Model
> Computationally expensive
> Concentrated post distribution

Pareto Optimality
[Figure: axes Preference Criterion A and Preference Criterion B, with the Pareto frontier marked]
> A state of maximum efficiency in the allocation of resources
> Any point on the Pareto frontier is Pareto optimal
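For a finite set of candidate states, the Pareto frontier can be found by discarding every point that another point dominates. The sketch below assumes two criteria where higher is better on both; the function name and the sample points are hypothetical.

```python
def pareto_frontier(points):
    """Return the points that no other point dominates (higher is better on both axes)."""
    def dominates(q, p):
        # q is at least as good on both criteria and is not the same point
        return q[0] >= p[0] and q[1] >= p[1] and q != p
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Made-up (criterion A, criterion B) values.
points = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (4, 1)]
frontier = pareto_frontier(points)
print(frontier)
```

A point inside the frontier, such as `(2, 2)` here, can be improved on one criterion without giving up anything on the other.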

Current State
[Figure: axes Personalization and KPI, with points A and B]
> Inside the Pareto frontier
> Converging at KPI’s local maxima

In Search of Pareto Frontier
[Figure: axes Personalization and KPI]
> Need a change in model architecture

Candidate Generation and Ranking
[Figure: pipeline with stages Raw Data, Preprocess Data (twice), Candidate Generation, Ranking, Model Deployment, User Interactions]

Candidate Generation: Training
Co-occurrence matrix:
         Post 1  Post 2  Post 3  Post 4
Post 1     3       0       2       0
Post 2     1       1       3       2
Post 3     1       0       2       1
Post 4     0       1       4       1
> Train post embeddings
Post Embedding Vectors:
Post 1 [ -0.242 0.218 0.848 -0.887 … ]
Post 2 [ 0.581 -0.859 0.006 -0.598 … ]
Post 3 [ 0.344 -0.834 -0.651 0.524 … ]
Post 4 [ 0.255 0.963 -0.127 -0.959 … ]
… [ … ]
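One way to go from interaction histories to post embeddings is sketched below. Both steps are assumptions rather than the method used in the talk: co-occurrence is counted as "two different posts appear in the same user's history" (which gives a symmetric matrix with a zero diagonal, unlike the slide's example), and the embeddings come from a truncated SVD of that matrix.

```python
import numpy as np
from itertools import permutations

def cooccurrence_matrix(histories, n_posts):
    """Count, for each ordered pair of distinct posts, the histories containing both."""
    C = np.zeros((n_posts, n_posts))
    for history in histories:
        for i, j in permutations(set(history), 2):
            C[i, j] += 1
    return C

def train_post_embeddings(C, dim):
    """Assumed technique: keep the top `dim` singular directions of C."""
    U, S, _ = np.linalg.svd(C)
    return U[:, :dim] * np.sqrt(S[:dim])

# Made-up histories over post indices 0..3.
histories = [[0, 1], [0, 1, 2], [2, 3]]
C = cooccurrence_matrix(histories, n_posts=4)
post_embeddings = train_post_embeddings(C, dim=2)
print(C)
print(post_embeddings.shape)
```

The matrix is post-by-post, so its density depends on how often posts are seen together rather than on how many posts each user has touched.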

Candidate Generation: Inference
> user ID = interaction history
> user vector = linear combination of history
> nearest neighbor search → candidates
Item Embedding Vectors:
Post 1 [ -0.242 0.218 0.848 -0.887 … ]
Post 2 [ 0.581 -0.859 0.006 -0.598 … ]
Post 3 [ 0.344 -0.834 -0.651 0.524 … ]
Post 4 [ 0.255 0.963 -0.127 -0.959 … ]
… [ … ]
User vector:
User ID [ -0.242 0.218 0.848 -0.887 … ]
Candidates: Post 1, Post 2, Post 3, Post 4, …
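The nearest neighbor step can be illustrated with a brute-force search. This sketch makes several assumptions that the slide does not state: cosine similarity as the distance, an unweighted mean as the linear combination, and exclusion of posts already in the history. At production scale an approximate nearest neighbor index would probably replace the exhaustive scan.

```python
import numpy as np

def top_k_candidates(user_vec, post_matrix, post_ids, k, exclude=()):
    """Rank posts by cosine similarity to the user vector and keep the top k."""
    norms = np.linalg.norm(post_matrix, axis=1) * np.linalg.norm(user_vec)
    sims = post_matrix @ user_vec / np.maximum(norms, 1e-12)  # guard against zero vectors
    order = np.argsort(-sims)  # most similar first
    ranked = [post_ids[i] for i in order if post_ids[i] not in exclude]
    return ranked[:k]

# Made-up item embeddings.
post_ids = ["post_1", "post_2", "post_3", "post_4"]
post_matrix = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])

history = ["post_1"]
user_vec = post_matrix[[post_ids.index(p) for p in history]].mean(axis=0)
candidates = top_k_candidates(user_vec, post_matrix, post_ids, k=2, exclude=history)
print(candidates)
```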

Inference Pipeline
User ID (query) → Candidate Generation → Ranking → Recommendations
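The two-stage flow can be sketched as a function that takes the two models as callables. Everything here is hypothetical: the `recommend` signature, the stub candidate generator, and the score table stand in for the real models.

```python
def recommend(user_id, generate_candidates, score, n_candidates, k):
    """Cheap candidate generation first, then the ranking model on that shortlist only."""
    candidates = generate_candidates(user_id, n_candidates)
    ranked = sorted(candidates, key=lambda post: score(user_id, post), reverse=True)
    return ranked[:k]

# Stub models for illustration.
def generate_candidates(user_id, n):
    return ["p1", "p2", "p3", "p4"][:n]

fake_scores = {"p1": 0.2, "p2": 0.9, "p3": 0.5, "p4": 0.99}

def score(user_id, post):
    return fake_scores[post]

recommendations = recommend("u1", generate_candidates, score, n_candidates=3, k=2)
print(recommendations)
```

The ranking model only sees the shortlist, which bounds its cost per request. In this toy run `p4` is never scored because candidate generation did not return it.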

Problems & Solutions > Increasing Post Pool

Aligning Embeddings Trained in Batches
Candidate Generation Post Embedding Vectors (t=0):
Post 1 [ -0.242 0.218 0.848 -0.887 … ]
Post 2 [ 0.581 -0.859 0.006 -0.598 … ]
Post 3 [ 0.344 -0.834 -0.651 0.524 … ]
Post 4 [ 0.255 0.963 -0.127 -0.959 … ]
… [ … ]
Candidate Generation Post Embedding Vectors (t=1):
Post 3 [ -0.242 0.218 0.848 -0.887 … ]
Post 4 [ 0.581 -0.859 0.006 -0.598 … ]
Post 5 [ 0.344 -0.834 -0.651 0.524 … ]
Post 6 [ 0.255 0.963 -0.127 -0.959 … ]
… [ … ]
Candidate Generation Post Embedding Vectors (t=2):
Post 5 [ -0.242 0.218 0.848 -0.887 … ]
Post 6 [ 0.581 -0.859 0.006 -0.598 … ]
Post 7 [ 0.344 -0.834 -0.651 0.524 … ]
Post 8 [ 0.255 0.963 -0.127 -0.959 … ]
… [ … ]
> Each batch has a different post pool

Aligning Embeddings Trained in Batches
[Same post embedding batches: t=0 (Post 1 to Post 4, …), t=1 (Post 3 to Post 6, …), t=2 (Post 5 to Post 8, …)]
> We want everything
> Need to align!

Orthogonal Procrustes Problem
> Given: matrices A and B of the same size
> Find: orthogonal matrix W such that $\|BW - A\|_F^2$ is minimized
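The problem has a well-known closed-form solution through the singular value decomposition. The derivation below uses the slide's A, B, and W; the symbols U, Σ, and V for the SVD factors are introduced here.

```latex
\begin{aligned}
\lVert BW - A \rVert_F^2
  &= \lVert B \rVert_F^2 + \lVert A \rVert_F^2
     - 2\,\operatorname{tr}\!\left(W^\top B^\top A\right)
     \quad \text{since } W^\top W = I, \\
B^\top A &= U \Sigma V^\top \quad \text{(singular value decomposition)}, \\
W^\star &= \arg\max_{W^\top W = I} \operatorname{tr}\!\left(W^\top B^\top A\right) = U V^\top .
\end{aligned}
```

Minimizing the distance is therefore the same as maximizing the trace term, and the maximizer is obtained by replacing the singular values of $B^\top A$ with ones.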

Orthogonal Procrustes Problem
[Figure: point sets A and B]
> 1-to-1 correspondence of points
> Find rotation and/or reflection matrix that maps B into A

Orthogonal Procrustes Problem
[Figure: point sets A and B after alignment]
> Aligned A & B!
> How can we use this method?

Orthogonal Procrustes Problem
Post Embedding Vectors (t=0):
Post 1 [ -0.242 0.218 0.848 -0.887 … ]
Post 2 [ 0.581 -0.859 0.006 -0.598 … ]
Post 3 [ 0.344 -0.834 -0.651 0.524 … ]
Post 4 [ 0.255 0.963 -0.127 -0.959 … ]
Post 5 [ 0.239 -0.646 0.002 -0.702 … ]
Post 6 [ -0.612 -0.408 0.052 0.064 … ]
Post 7 [ 0.139 0.118 -0.142 -0.157 … ]
… [ … ]
Post Embedding Vectors (t=1):
Post 3 [ -0.242 0.218 0.848 -0.887 … ]
Post 4 [ 0.581 -0.859 0.006 -0.598 … ]
Post 5 [ 0.344 -0.834 -0.651 0.524 … ]
Post 6 [ 0.255 0.963 -0.127 -0.959 … ]
Post 7 [ -0.299 0.808 0.677 -0.604 … ]
Post 8 [ 0.992 -0.795 0.062 -0.490 … ]
Post 9 [ 0.855 0.622 -0.793 -0.329 … ]
… [ … ]
> A (in t=0) and B (in t=1): 1-to-1 correspondence of points
> Find orthogonal matrix W that maps B into A

Orthogonal Procrustes Problem
[Same post embedding tables: t=0 (Post 1 to Post 7, …) and t=1 (Post 3 to Post 9, …)]
> 1-to-1 correspondence of points
> Find orthogonal matrix W that maps B into A
> Transform whole t=1 embedding matrix using W

Orthogonal Procrustes Problem
Post Embedding Vectors (t=0):
Post 1 [ -0.242 0.218 0.848 -0.887 … ]
Post 2 [ 0.581 -0.859 0.006 -0.598 … ]
Post 3 [ 0.344 -0.834 -0.651 0.524 … ]
Post 4 [ 0.255 0.963 -0.127 -0.959 … ]
Post 5 [ 0.239 -0.646 0.002 -0.702 … ]
Post 6 [ -0.612 -0.408 0.052 0.064 … ]
Post 7 [ 0.139 0.118 -0.142 -0.157 … ]
… [ … ]
Post Embedding Vectors (t=1′):
Post 3 [ 0.344 -0.834 -0.651 0.524 … ]
Post 4 [ 0.255 0.963 -0.127 -0.959 … ]
Post 5 [ 0.239 -0.646 0.002 -0.702 … ]
Post 6 [ -0.612 -0.408 0.052 0.064 … ]
Post 7 [ 0.139 0.118 -0.142 -0.157 … ]
Post 8 [ 0.992 -0.795 0.062 -0.490 … ]
Post 9 [ 0.855 0.622 -0.793 -0.329 … ]
… [ … ]
> t=0 and t=1′ are in same vector space
> Add embeddings only in t=1′ to t=0
> Aligned Embeddings!
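The alignment and merge steps can be sketched with NumPy. This is an illustration under stated assumptions, not the production code: the posts present in both batches are taken as the 1-to-1 correspondence, W is computed with the standard SVD solution of the orthogonal Procrustes problem, and the synthetic data assumes the t=1 batch is an exact rotation or reflection of a shared space (real batches would only match approximately). The names `procrustes_rotation` and `align_and_merge` are hypothetical.

```python
import numpy as np

def procrustes_rotation(A, B):
    """Orthogonal W minimizing ||B W - A||_F, via the SVD of B^T A."""
    U, _, Vt = np.linalg.svd(B.T @ A)
    return U @ Vt

def align_and_merge(emb_t0, emb_t1):
    """Map the t=1 batch into the t=0 space and add the posts that t=0 lacks."""
    shared = sorted(set(emb_t0) & set(emb_t1))    # assumed correspondence
    A = np.stack([emb_t0[p] for p in shared])
    B = np.stack([emb_t1[p] for p in shared])
    W = procrustes_rotation(A, B)
    merged = dict(emb_t0)                         # keep t=0 vectors as they are
    for post, vec in emb_t1.items():
        if post not in merged:
            merged[post] = vec @ W                # transformed t=1 vector
    return merged, W

# Synthetic data: posts 1..9 in one "true" space; t=1 is an orthogonal transform of it.
rng = np.random.default_rng(0)
dim = 3
truth = {p: rng.normal(size=dim) for p in range(1, 10)}
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # random orthogonal matrix
emb_t0 = {p: truth[p] for p in range(1, 8)}         # posts 1..7
emb_t1 = {p: truth[p] @ Q.T for p in range(3, 10)}  # posts 3..9, transformed

merged, W = align_and_merge(emb_t0, emb_t1)
print(sorted(merged))
```

The shared posts need to span the embedding space for W to be determined uniquely, so the overlap between consecutive batches should contain at least as many posts as there are embedding dimensions.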

Summary
> Understand the nature of your data
> Dual importance of quantitative and qualitative evaluation
> “Perfection is not attainable. But if we chase perfection, we can catch excellence.” - Vince Lombardi
> Model architecture is essential
> Understanding your evaluation metric

Thank You
