Hybrid Recommender Systems at PyData Amsterdam 2016

Maciej Kula

March 13, 2016

Transcript

  1. COLLABORATIVE FILTERING IS THE WORKHORSE OF RECOMMENDER SYSTEMS

    Use historical data on co-purchasing behaviour: 'Users who bought X also bought...'
  2. USER-ITEM INTERACTIONS AS A SPARSE MATRIX

         ⎛ 1.0  0.0  ⋯  1.0 ⎞
     I = ⎜ 0.0  1.0  ⋯  0.0 ⎟
         ⎜  ⋮    ⋮   ⋱   ⋮  ⎟
         ⎝ 1.0  1.0  ⋯  1.0 ⎠
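A matrix like this is almost entirely zeros, so in practice it is stored sparsely; a minimal sketch with scipy.sparse, using made-up toy data rather than anything from the talk:

```python
import numpy as np
from scipy.sparse import coo_matrix

# Toy (user, item) purchase pairs: user 0 bought items 0 and 2, and so on.
users = np.array([0, 0, 1, 2, 2])
items = np.array([0, 2, 1, 0, 1])
data = np.ones(len(users), dtype=np.float32)

# Rows are users, columns are items; absent entries are implicit zeros.
I = coo_matrix((data, (users, items)), shape=(3, 3)).tocsr()
print(I.toarray())
```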
  3. IN THE SIMPLEST CASE, THAT'S ENOUGH TO MAKE RECOMMENDATIONS

    - find similar users by calculating the distance between the rows that represent them
    - recommend items similar users have bought, weighted by the degree of similarity
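The nearest-neighbour recipe above can be sketched in a few lines of numpy, using cosine similarity between user rows (toy data, illustrative only):

```python
import numpy as np

# Toy interaction matrix: rows = users, columns = items (1.0 = bought).
I = np.array([[1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])

def recommend(user, I, top_n=2):
    # Cosine similarity between the target user's row and every other row.
    norms = np.linalg.norm(I, axis=1)
    sims = I @ I[user] / (norms * norms[user])
    sims[user] = 0.0  # exclude the user themselves
    # Score items by similarity-weighted counts of other users' purchases,
    # masking out items the user already owns.
    scores = sims @ I
    scores[I[user] > 0] = -np.inf
    return np.argsort(-scores)[:top_n]

print(recommend(0, I))  # items ranked for user 0
```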
  4. MOST APPLICATIONS USE SOME FORM OF MATRIX FACTORIZATION

    Represent I as a product of two reduced-rank matrices U and P.
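The factorization I ≈ U·Pᵀ can be fitted with plain gradient descent on squared reconstruction error; a self-contained toy sketch (not how production systems, or LightFM, actually train):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary interaction matrix I (users x items).
I = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

k, lr = 2, 0.1
U = 0.1 * rng.normal(size=(I.shape[0], k))  # user latent factors
P = 0.1 * rng.normal(size=(I.shape[1], k))  # item latent factors

# Gradient descent on squared error of the reconstruction I ≈ U @ P.T
for _ in range(500):
    err = I - U @ P.T
    U += lr * err @ P
    P += lr * err.T @ U

print(np.round(U @ P.T, 1))
```

With k smaller than the matrix rank the reconstruction is approximate, which is the point: each user and item is compressed to a k-dimensional vector.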
  5. THIS WORKS REMARKABLY WELL IF YOU HAVE A LOT OF DATA

    - domain-agnostic: don't need to know anything about the users and items
    - easy to understand and implement
    - chief component of the Netflix-prize-winning ensemble
    - MF yields nice, low-dimensional item representations, useful if you want to do related products
  6. BUT WHAT IF YOUR DATA IS SPARSE?

    - large product inventory
    - short-lived products
    - lots of new users
  7. PROBLEMS

    - need to have plenty of data for each user
    - no information sharing across users
    - doesn't provide compact representations for item similarity
  8. DISCLAIMER: THIS IS WHERE I TRY TO CONVINCE YOU TO USE MY RECOMMENDER PACKAGE

    It's called LightFM.
  9. A VARIANT OF MATRIX FACTORIZATION

    Instead of estimating a latent vector per user and item, estimate latent vectors for user and item metadata. User and item IDs can also be included if you have enough data.
  10. The representation for 'Givenchy Ball Gown' is the element-wise sum of representations for 'givenchy', 'ball', and 'gown'. The representation for a female user with ID 100 is the element-wise sum of representations for 'female' and 'ID 100'.
  11. The prediction for a user-item pair is given by the

    inner product of their representations.
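Putting slides 9-11 together: a toy sketch of scoring with summed feature embeddings, where random vectors stand in for learned ones and the feature names follow the examples above:

```python
import numpy as np

rng = np.random.default_rng(42)
dim = 4

# Hypothetical latent vectors for metadata features (not LightFM internals).
feature_vectors = {f: rng.normal(size=dim)
                   for f in ['givenchy', 'ball', 'gown', 'female', 'id_100']}

# Item and user representations: element-wise sums of their feature vectors.
item_repr = (feature_vectors['givenchy']
             + feature_vectors['ball']
             + feature_vectors['gown'])
user_repr = feature_vectors['female'] + feature_vectors['id_100']

# Prediction for the user-item pair: inner product of the two representations.
score = user_repr @ item_repr
print(score)
```

Because every feature has its own vector, a brand-new gown can be scored from 'ball' and 'gown' alone, with no interaction history.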
  12. NEURAL NETWORK PERSPECTIVE

    Two independent fully-connected layers, one with user features, the other with item features as inputs, connected via a dot product.
  13. BENEFITS

    - fewer parameters to estimate
    - can make predictions for new items and new users
    - captures synonymy
    - produces nice dense item representations
    - reduces to a standard MF model as a special case
  14. PURE COLLABORATIVE FILTERING

    - AUC of 0.43: worse than random
    - little data, lots of parameters
    - massive overfitting
  15. IN SUMMARY

    If you have lots of new users or new items, you will benefit from a hybrid algorithm.
  16. EASY TO USE

    from lightfm import LightFM

    model = LightFM(loss='warp',
                    learning_rate=0.01,
                    learning_schedule='adagrad',
                    no_components=30)

    model.fit(interactions,
              item_features=item_features,
              user_features=user_features,
              num_threads=4,
              epochs=epochs)
  17. ASIDE: LEARNING-TO-RANK IS A GREAT IDEA

    - a Siamese network with triplet loss, in NN parlance
    - WARP is especially effective
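WARP (Weighted Approximate-Rank Pairwise) samples negative items until it finds one scored within a margin of the positive, then weights the update by an estimate of the positive item's rank. A rough numpy sketch of that sampling step (a simplification for illustration, not LightFM's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def warp_weight(scores, positive, max_trials=100):
    """Sample negatives until one violates the margin against `positive`;
    return that negative and a log-rank weight (or None, 0.0 if none found)."""
    pos_score = scores[positive]
    for trials in range(1, max_trials + 1):
        neg = rng.integers(len(scores))
        if neg != positive and scores[neg] > pos_score - 1.0:  # margin violated
            # Few trials needed -> positive is ranked low -> large update.
            rank_estimate = (len(scores) - 1) // trials
            return neg, np.log(rank_estimate + 1)
    return None, 0.0

scores = np.array([0.9, 0.1, 0.5, 2.0, -0.3])  # model scores for 5 items
neg, weight = warp_weight(scores, positive=0)
print(neg, weight)
```

The intuition: if the first random negative already outranks the positive, the positive is probably ranked badly and deserves a large gradient step; if violations are hard to find, the update is small.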