Hybrid Recommender Systems at PyData Amsterdam 2016

Maciej Kula

March 13, 2016
Transcript

  1. HYBRID RECOMMENDER SYSTEMS IN PYTHON THE WHYS AND WHEREFORES

  2. @maciej_kula I'M MACIEJ

  3. I'M A DATA SCIENTIST AT LYST

    I mainly build recommendations, but have dabbled in other systems
  4. I'M GOING TO TALK ABOUT HYBRID RECOMMENDERS What they are,

    and why you might want one.
  5. COLLABORATIVE FILTERING IS THE WORKHORSE OF RECOMMENDER SYSTEMS Use historical

    data on co-purchasing behaviour 'Users who bought X also bought...'
  6. USER-ITEM INTERACTIONS AS A SPARSE MATRIX

    I = ⎛ 1.0 0.0 ⋯ 1.0 ⎞
        ⎜ 0.0 1.0 ⋯ 0.0 ⎟
        ⎜  ⋮   ⋮  ⋱  ⋮  ⎟
        ⎝ 1.0 1.0 ⋯ 1.0 ⎠
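    As an illustration of the interaction matrix above, a minimal numpy sketch with made-up user and item ids (a real system would use scipy.sparse, since most entries are zero):

    ```python
    import numpy as np

    # Hypothetical purchase log: (user_id, item_id) pairs.
    purchases = [(0, 0), (0, 2), (1, 1), (2, 0), (2, 1), (2, 2)]

    n_users, n_items = 3, 3
    I = np.zeros((n_users, n_items))
    for user, item in purchases:
        I[user, item] = 1.0  # 1.0 = interacted, 0.0 = no interaction
    ```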
  7. IN THE SIMPLEST CASE, THAT'S ENOUGH TO MAKE RECOMMENDATIONS find

    similar users by calculating the distance between the rows that represent them; recommend items similar users have bought, weighted by the degree of similarity
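    The neighbourhood approach described here can be sketched in a few lines; the toy interaction matrix and the choice of cosine similarity are illustrative assumptions:

    ```python
    import numpy as np

    # Toy interaction matrix: rows are users, columns are items.
    I = np.array([[1.0, 0.0, 1.0],
                  [1.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0]])

    def recommend(user, I, top_n=1):
        # Cosine similarity between the target user's row and every row.
        norms = np.linalg.norm(I, axis=1)
        sims = (I @ I[user]) / (norms * norms[user])
        sims[user] = 0.0  # exclude the user themselves
        # Score items by similarity-weighted purchases of the other users,
        # then mask out items the user has already bought.
        scores = sims @ I
        scores[I[user] > 0] = -np.inf
        return np.argsort(scores)[::-1][:top_n]
    ```

    For user 0 (bought items 0 and 2) this recommends item 1, the only item their nearest neighbours bought that they have not.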
  8. MOST APPLICATIONS USE SOME FORM OF MATRIX FACTORIZATION

    Represent I as a product of two reduced-rank matrices U and P.
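    The factorization is just two low-rank matrices whose product approximates the interaction matrix; a dimensional sketch (the sizes and random values are made up):

    ```python
    import numpy as np

    rng = np.random.default_rng(42)
    n_users, n_items, n_components = 100, 50, 10

    # Reduced-rank user matrix U and item matrix P.
    U = rng.normal(size=(n_users, n_components))
    P = rng.normal(size=(n_items, n_components))

    # The model's reconstruction of the interaction matrix:
    # one predicted score per user-item pair.
    I_hat = U @ P.T
    ```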
  9. THIS WORKS REMARKABLY WELL IF YOU HAVE A LOT OF

    DATA domain-agnostic: don't need to know anything about the users and items; easy to understand and implement; chief component of the Netflix-prize-winning ensemble; MF yields nice, low-dimensional item representations, useful if you want to do related products
  10. BUT WHAT IF YOUR DATA IS SPARSE? large product inventory,

    short-lived products, lots of new users
  11. CAN'T COMPUTE SIMILARITIES most users haven't bought much; most items haven't

    been bought
  12. PERFORMS NO BETTER THAN RANDOM

  13. CONTENT-BASED MODELS TO THE RESCUE collect metadata about items construct

    a classifier for each user
  14. PROBLEMS need to have plenty of data for each user;

    no information sharing across users; doesn't provide compact representations for item similarity
  15. DOESN'T CAPTURE SIMILARITY between 'Gucci Evening Dress' and 'Givenchy Ball Gown'

  16. SOLUTION: USE A HYBRID MODEL

  17. DISCLAIMER: THIS IS WHERE I TRY TO

    CONVINCE YOU TO USE MY RECOMMENDER PACKAGE. It's called LightFM.
  18. A VARIANT OF MATRIX FACTORIZATION Instead of estimating a latent

    vector per user and item, estimate latent vectors for user and item metadata. User and items ids can also be included if you have enough data.
  19. The representation for 'Givenchy Ball Gown' is the element-wise

    sum of representations for 'givenchy', 'ball', and 'gown'. The representation for a female user with id 100 is the element-wise sum of representations for 'female' and 'ID 100'.
  20. The prediction for a user-item pair is given by the

    inner product of their representations.
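    Slides 19-20 condense into a few lines of code; the feature names and random vectors below are purely illustrative:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_components = 4

    # Hypothetical latent vectors, one per metadata *feature*
    # rather than one per user or item.
    features = ['givenchy', 'ball', 'gown', 'female', 'ID 100']
    embeddings = {f: rng.normal(size=n_components) for f in features}

    # Item representation: element-wise sum of its feature vectors.
    item_repr = embeddings['givenchy'] + embeddings['ball'] + embeddings['gown']
    # User representation: element-wise sum of 'female' and the id feature.
    user_repr = embeddings['female'] + embeddings['ID 100']

    # Predicted score for this user-item pair: the inner product.
    score = float(user_repr @ item_repr)
    ```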
  21. NEURAL NETWORK PERSPECTIVE Two independent fully-connected layers, one with user, the other with

    item features as inputs, connected via a dot product.
  22. BENEFITS fewer parameters to estimate; can make predictions for new

    items and new users; captures synonymy; produces nice dense item representations; reduces to a standard MF model as a special case
  23. EXAMPLE: CROSS VALIDATED Try to predict which questions users will answer

    A ranking task, measured by AUC
  24. PURE COLLABORATIVE FILTERING AUC of 0.43, worse than random: little

    data, lots of parameters, massive overfitting
  25. PURE CONTENT-BASED SOLUTION fit a separate logistic regression model for

    each user; AUC of 0.66, a lot better
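    A per-user content-based classifier of the kind described can be sketched as plain logistic regression trained by gradient descent; the toy question features and answered/ignored labels below are invented:

    ```python
    import numpy as np

    def fit_user_classifier(X, y, lr=0.1, epochs=200):
        """Logistic regression via gradient descent: one model per user,
        trained on that user's answered (1) / ignored (0) questions."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            p = 1.0 / (1.0 + np.exp(-(X @ w)))  # predicted probabilities
            w -= lr * X.T @ (p - y) / len(y)    # gradient of the log loss
        return w

    # Hypothetical metadata features for four questions shown to one user.
    X = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]])
    y = np.array([1.0, 1.0, 0.0, 0.0])  # this user answered the first two
    w = fit_user_classifier(X, y)
    ```

    The model learns a positive weight for the feature that appears only in answered questions, so new questions with that feature score higher for this user.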
  26. HYBRID SOLUTION AUC of 0.71, the best result; get tag embeddings

    as an extra benefit
  27. TAG SIMILARITY 'bayesian': 'mcmc', 'variational-bayes' 'survival': 'cox-model', 'odds-ratio', 'kaplan-meier'

  28. SIMILAR TO WORD2VEC Both are essentially matrix factorization algorithms

  29. IN SUMMARY If you have lots of new users or new items,

    you will benefit from a hybrid algorithm
  30. Even if you don't face cold-start, you might still want

    to use LightFM.
  31. EASY TO USE

    from lightfm import LightFM

    model = LightFM(loss='warp',
                    learning_rate=0.01,
                    learning_schedule='adagrad',
                    no_components=30)

    model.fit(interactions,
              item_features=item_features,
              user_features=user_features,
              num_threads=4,
              epochs=epochs)
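    A sketch of how the item_features argument might be prepared from tag metadata; the tags are invented, and in practice you would build a scipy.sparse matrix rather than a dense numpy array:

    ```python
    import numpy as np

    # Invented tag metadata for three items.
    item_tags = [['givenchy', 'gown'], ['gucci', 'dress'], ['gown', 'dress']]

    # Map each distinct tag to a feature column.
    tags = sorted({t for ts in item_tags for t in ts})
    col = {t: i for i, t in enumerate(tags)}

    # Binary (n_items x n_features) indicator matrix.
    item_features = np.zeros((len(item_tags), len(tags)))
    for i, ts in enumerate(item_tags):
        for t in ts:
            item_features[i, col[t]] = 1.0
    ```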
  32. FAST Written in Cython Supports multicore training via Hogwild

  33. LEARNING-TO-RANK Supports learning-to-rank objectives: BPR, WARP

  34. ASIDE: LEARNING-TO-RANK IS A GREAT IDEA In NN parlance, a Siamese network with

    triplet loss. WARP is especially effective.
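    One way to picture WARP's sampling trick (a simplified sketch, not LightFM's implementation): keep drawing random items until one scores within a margin of the positive item, and weight the update by the rank that the number of draws implies:

    ```python
    import numpy as np

    def warp_sample(scores, positive, rng, max_trials=100):
        """WARP-style sampling sketch: draw random items until one violates
        the margin; fewer draws imply a badly-ranked positive and a larger
        loss weight."""
        n_items = len(scores)
        for trials in range(1, max_trials + 1):
            candidate = rng.integers(n_items)
            if candidate != positive and scores[candidate] > scores[positive] - 1.0:
                # Rough rank estimate from the number of draws taken.
                estimated_rank = max((n_items - 1) // trials, 1)
                return candidate, np.log(estimated_rank)
        return None, 0.0  # no violating negative found: skip this update
    ```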
  35. PER-PARAMETER LEARNING RATES Adagrad and Adadelta
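    A minimal sketch of the Adagrad rule (illustrative, not LightFM's code): each parameter's step is scaled by the inverse square root of its own accumulated squared gradients, so frequently-updated parameters take smaller steps:

    ```python
    import numpy as np

    def adagrad_update(param, grad, accum, lr=0.05, eps=1e-8):
        """Adagrad: per-parameter learning rates from accumulated
        squared gradients."""
        accum += grad ** 2
        param -= lr * grad / (np.sqrt(accum) + eps)
        return param, accum

    param = np.zeros(2)
    accum = np.zeros(2)
    # Despite very different gradient magnitudes, the first step has
    # roughly equal size per parameter: each is normalised by its own history.
    param, accum = adagrad_update(param, np.array([10.0, 0.1]), accum)
    ```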

  36. pip install lightfm github.com/lyst/lightfm