Amazon's Item-to-item Recommendation Algorithm

Recommendation algorithms are best known for their use on e-commerce Web sites, where they use input about a customer’s interests to generate a list of recommended items. Many applications use only the items that customers purchase and explicitly rate to represent their interests, but they can also use other attributes, including items viewed, demographic data, subject interests, and favorite artists. At Amazon.com, recommendation algorithms are used to personalize the online store for each customer.

Roberto Zen

May 07, 2015

Transcript

  1. Item-to-item Collaborative Filtering Recommendation Algorithm

    Zamboni Luca, Zen Roberto. May 7, 2015
  2. Topics

    Recommendation Algorithms: targets and problems; Collaborative Filtering Recommendation Algorithms (Memory-Based and Model-Based); Item-to-item Recommendation Algorithm; Experimental results; Conclusion.
  3. Recommendation Algorithms

    Recommendation algorithms are used in e-commerce, on websites, and in email advertising. They apply data analysis techniques to help users find the items they would like to purchase, producing either a predicted likeliness score or a list of Top-N recommended items. At Amazon.com, recommendation algorithms are used to personalize the online store for each customer.
  4. Recommendation Algorithms: Challenges

    Improve scalability: modern systems must search tens of millions of potential neighbors. Improve quality: users need recommendations they can trust to help them find items they will like.
  5. Collaborative Filtering Recommendation Algorithms

    These algorithms use a database of user preferences to predict additional topics or products that a user (the active user) might like. They are built on the assumption that a good way to find interesting content is to find other people with similar interests and then recommend titles that those similar users like.

    The preferences form an m × n user-item matrix with one row per user (u1, ..., um) and one column per item (i1, ..., in). A cell holds R where the user has rated the item and is empty otherwise.

    CF-based algorithms can be divided into two main categories: memory-based and model-based.
  6. Memory-based systems

    These systems use the entire user-item database to generate a prediction. The main idea is to compute the similarities between users and/or items and use them as weights to predict a rating for a user-item pair.

    Advantages: The quality of predictions is high. The algorithm is relatively simple and allows database updates.

    Disadvantages: Very slow, because the entire database is used for every prediction, even when it is held in memory. Not as fast or scalable as needed on very large datasets.
  7. Model-based systems

    The main idea is the same as in memory-based systems, but model-based systems overcome the memory-based drawbacks by building a model from the dataset of ratings. The model is built with machine learning algorithms such as Bayesian networks, clustering, and rule-based approaches.

    Advantages: Scalability, because models are much smaller than the actual dataset. Prediction is fast, because it only requires querying the model.

    Disadvantages: The quality of predictions depends heavily on how the model is built. The model does not adapt easily to database updates.
  8. Item-to-item Recommendation Algorithm

    Item-to-item algorithms avoid the bottleneck of searching for neighbors among users by exploring the relationships between items first, rather than the relationships between users. Recommendations for a user are computed by finding items that are similar to other items the active user has liked. The algorithm has two steps: similarity computation and prediction computation.
  9. Item Similarity Computation: Cosine-based Similarity

    Two items i, j are thought of as two vectors in the m-dimensional user space.

    $$\mathrm{sim}(i, j) = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\lVert \vec{i} \rVert \, \lVert \vec{j} \rVert}$$
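
The cosine formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ratings[user][item]` dictionary layout, the function name `cosine_similarity`, and the toy data are invented here, and a missing rating is treated as 0, which the slide does not specify.

```python
import math

def cosine_similarity(ratings, i, j):
    """Cosine of the angle between the rating vectors of items i and j.

    `ratings` maps user -> {item: rating}. Assumption: a missing rating
    counts as 0 in the item's vector.
    """
    dot = norm_i = norm_j = 0.0
    for user_ratings in ratings.values():
        r_i = user_ratings.get(i, 0.0)
        r_j = user_ratings.get(j, 0.0)
        dot += r_i * r_j
        norm_i += r_i * r_i
        norm_j += r_j * r_j
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0  # an item nobody rated has no direction to compare
    return dot / (math.sqrt(norm_i) * math.sqrt(norm_j))

# Toy data (hypothetical): items "a" and "b" are rated proportionally.
ratings = {
    "u1": {"a": 4, "b": 2, "c": 3},
    "u2": {"a": 2, "b": 1},
}
sim_ab = cosine_similarity(ratings, "a", "b")  # close to 1.0 (parallel vectors)
sim_ac = cosine_similarity(ratings, "a", "c")  # smaller: only u1 rated "c"
```

Because the vectors for "a" and "b" point in the same direction, their similarity is close to 1 even though the absolute ratings differ.
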
  10. Item Similarity Computation: Correlation-based Similarity

    Also called Pearson-r correlation. Only users that rated both items i, j are considered; U denotes this set of users and $\bar{R}_i$ the average rating of item i.

    $$\mathrm{sim}(i, j) = \mathrm{corr}_{i,j} = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_i)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_j)^2}}$$
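
A minimal Python sketch of the correlation-based similarity follows. The `ratings[user][item]` layout, the function name, and the toy data are assumptions made for illustration. The item means are taken over the co-rating users only, which is one common reading of the formula and not something the slide states.

```python
import math

def pearson_similarity(ratings, i, j):
    """Pearson-r correlation between items i and j over co-rating users.

    Assumption: the item averages are computed over the users who rated
    both items, not over all users who rated each item.
    """
    common = [u for u, r in ratings.items() if i in r and j in r]
    if len(common) < 2:
        return 0.0  # correlation is undefined with fewer than two points
    mean_i = sum(ratings[u][i] for u in common) / len(common)
    mean_j = sum(ratings[u][j] for u in common) / len(common)
    num = sum((ratings[u][i] - mean_i) * (ratings[u][j] - mean_j) for u in common)
    den_i = sum((ratings[u][i] - mean_i) ** 2 for u in common)
    den_j = sum((ratings[u][j] - mean_j) ** 2 for u in common)
    if den_i == 0 or den_j == 0:
        return 0.0  # constant ratings carry no correlation signal
    return num / math.sqrt(den_i * den_j)

# Toy data (hypothetical): "b" rises with "a", "c" falls as "a" rises.
ratings = {
    "u1": {"a": 1, "b": 2, "c": 3},
    "u2": {"a": 2, "b": 4, "c": 2},
    "u3": {"a": 3, "b": 6, "c": 1},
    "u4": {"a": 5},  # ignored: did not rate "b" or "c"
}
corr_ab = pearson_similarity(ratings, "a", "b")
corr_ac = pearson_similarity(ratings, "a", "c")
```

In this toy data `corr_ab` is positive and `corr_ac` is negative, and user `u4` does not influence either value.
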
  11. Item Similarity Computation: Adjusted Cosine Similarity

    The difference in rating scale between users is now taken into account by subtracting each user's average rating $\bar{R}_u$.

    $$\mathrm{sim}(i, j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}}$$
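
The adjusted cosine similarity can be sketched as below. The data layout, function name, and toy ratings are assumptions for illustration; each user's mean is taken over all items that user rated.

```python
import math

def adjusted_cosine_similarity(ratings, i, j):
    """Cosine similarity after subtracting each user's own mean rating.

    `ratings` maps user -> {item: rating}. Only users who rated both
    items contribute to the sums.
    """
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            mean_u = sum(user_ratings.values()) / len(user_ratings)
            d_i = user_ratings[i] - mean_u
            d_j = user_ratings[j] - mean_u
            num += d_i * d_j
            den_i += d_i * d_i
            den_j += d_j * d_j
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / math.sqrt(den_i * den_j)

# Toy data (hypothetical): u1 rates on a high scale, u2 on a lower one,
# yet both place "a" and "b" on the same side of their personal mean.
ratings = {
    "u1": {"a": 5, "b": 5, "c": 2},  # mean 4
    "u2": {"a": 2, "b": 2, "c": 5},  # mean 3
}
adj_ab = adjusted_cosine_similarity(ratings, "a", "b")
adj_ac = adjusted_cosine_similarity(ratings, "a", "c")
```

Here "a" and "b" always deviate from each user's mean in the same direction, so `adj_ab` is positive, while "c" deviates in the opposite direction and `adj_ac` is negative.
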
  12. Performance implications

    The similarity computation is the performance bottleneck. The similarities can be computed offline and stored in a table that requires $O(n^2)$ space, where n is the number of items. To compute a prediction for a particular item, only a small set of similar items is needed. If only the k most similar items are stored for each item (k << n), the space required drops to $O(kn)$, which is $O(n)$ for a fixed k. We call k the model size.
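
As a rough sketch of the offline step described above, the following Python keeps only the k most similar items per item. The function name `build_model`, the `similarity` callback, and the toy similarity table are hypothetical and stand in for whichever similarity measure is actually used.

```python
import heapq

def build_model(similarity, items, k):
    """For each item, keep only its k most similar other items.

    `similarity(i, j)` is any item-item similarity function. The result
    holds at most k entries per item, so it grows linearly with the
    number of items for a fixed k.
    """
    model = {}
    for i in items:
        candidates = ((similarity(i, j), j) for j in items if j != i)
        model[i] = [(j, s) for s, j in heapq.nlargest(k, candidates)]
    return model

# Toy symmetric similarity table (hypothetical values).
table = {
    ("a", "b"): 0.9, ("a", "c"): 0.1, ("a", "d"): 0.5,
    ("b", "c"): 0.4, ("b", "d"): 0.2, ("c", "d"): 0.7,
}

def toy_similarity(i, j):
    return table[(i, j)] if (i, j) in table else table[(j, i)]

model = build_model(toy_similarity, ["a", "b", "c", "d"], k=2)
# model["a"] holds the two nearest items of "a", most similar first.
```

With four items and k = 2 the model stores eight entries instead of the full table, and each list is ordered from most to least similar.
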
  13. Prediction Computation: Weighted Sum

    The most important step in a collaborative filtering system is to generate the output in terms of a prediction. Once the most similar items have been identified, the prediction for a user-item pair is computed as follows.

    $$P_{u,i} = \frac{\sum_{j \in \mathrm{SimilarItems}} s_{i,j} \, R_{u,j}}{\sum_{j \in \mathrm{SimilarItems}} \lvert s_{i,j} \rvert}$$
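
The weighted sum can be sketched in Python as follows. The function name, argument layout, and toy numbers are assumptions for illustration. The sketch also assumes that similar items the user has not rated are skipped, and that no prediction is returned when none of the similar items has been rated.

```python
def predict_rating(user_ratings, similar_items):
    """Weighted-sum prediction for one user-item pair.

    `user_ratings` maps item -> rating for the active user.
    `similar_items` is a list of (item, similarity) pairs for the target
    item. Assumption: similar items the user has not rated are skipped.
    """
    num = den = 0.0
    for j, s in similar_items:
        if j in user_ratings:
            num += s * user_ratings[j]
            den += abs(s)
    if den == 0.0:
        return None  # no rated neighbor, so no prediction is possible
    return num / den

# Toy example (hypothetical values): predict a rating for item "x".
neighbors_of_x = [("a", 0.8), ("b", 0.2), ("c", 0.5)]
active_user = {"a": 5, "b": 1}  # "c" has not been rated
prediction = predict_rating(active_user, neighbors_of_x)
```

The prediction is pulled toward the rating of "a" because "a" is the neighbor most similar to "x". Dividing by the sum of absolute similarities keeps the result within the range of the user's ratings when all similarities are positive.
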
  14. Experimental Results

    Dataset: 43,000 users and 3,000 movies. Only users who had rated at least 20 movies were considered. Training/test ratio x = 0.8.
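
One way to obtain a training/test ratio of x = 0.8 is a random split of the rating records, sketched below. The slide does not say how the split was made, so the shuffling, the fixed seed, the function name, and the toy records are all assumptions.

```python
import random

def split_ratings(records, train_ratio=0.8, seed=0):
    """Randomly split rating records into a training and a test set.

    `records` is a list of (user, item, rating) tuples. A fixed seed
    keeps the split reproducible between runs.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Ten toy records (hypothetical): 8 go to training, 2 to testing.
records = [("u%d" % n, "movie", 3) for n in range(10)]
train, test = split_ratings(records, train_ratio=0.8)
```

Every record lands in exactly one of the two sets, so the test ratings are never seen while the item similarities are computed.
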
  15. Experimental Results: Mean Absolute Error (MAE)

    For each rating-prediction pair $\langle p_i, q_i \rangle$, this metric takes the absolute error between the two values and averages it over all N pairs.

    $$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \lvert p_i - q_i \rvert}{N}$$

    Note: the lower the MAE, the more accurately the recommendation engine predicts user ratings.
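
The MAE formula translates directly into Python. The function name and the sample numbers below are illustrative assumptions, not values from the experiments.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and true ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - q) for p, q in zip(predicted, actual))
    return total / len(predicted)

# Toy values (hypothetical): absolute errors are 1, 0, 1 and 2.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]
mae = mean_absolute_error(predicted, actual)  # (1 + 0 + 1 + 2) / 4 = 1.0
```
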
  16. Conclusion

    The item-to-item collaborative filtering algorithm used by Amazon provides the same prediction quality as the user-user k-nearest neighbor approach. The item neighborhood is fairly static, so it can be pre-computed offline, which results in very high online performance even on large datasets.
  17. THANK YOU FOR YOUR ATTENTION
  18. References

    [1] Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web (WWW ’01). ACM, New York, NY, USA, 285-295.
    [2] G. Linden, B. Smith, and J. York. 2003. Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Computing, Volume 7, Issue 1, 76-80.
    [3] John S. Breese, David Heckerman, and Carl Kadie. 1998. Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI ’98), Gregory F. Cooper and Serafín Moral (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 43-52.
    [4] http://www.cs.carleton.edu/cs_comps/0607/recommend/recommender/index.html