Slide 1

Slide 1 text

Item-to-item Collaborative Filtering Recommendation Algorithm Zamboni Luca Zen Roberto Zamboni Luca, Zen Roberto Item-to-item Collaborative Filtering Alg. May 7, 2015 1 / 21

Slide 2

Slide 2 text

Topics
- Recommendation Algorithms: targets and problems
- Collaborative Filtering Recommendation Algorithms
- Memory-Based and Model-Based
- Item-to-item Recommendation Algorithm
- Experimental results
- Conclusion

Slide 3

Slide 3 text

Recommendation Algorithms
Recommendation algorithms are used in e-commerce, on web sites, and in email advertising. They apply data analysis techniques to the problem of helping users find the items they would like to purchase, by producing either a predicted likeliness score or a list of Top-N recommended items. At Amazon.com, recommendation algorithms are used to personalize the online store for each customer.

Slide 4

Slide 4 text

Recommendation Algorithms
Challenges:
- Improve scalability: modern systems must search tens of millions of potential neighbors.
- Improve quality: users need recommendations they can trust to help them find items they will like.

Slide 5

Slide 5 text

Collaborative Filtering Recommendation Algorithms
These algorithms use a database of user preferences to predict additional topics or products that a user (the active user) might like. They are built on the assumption that a good way to find interesting content is to find other people with similar interests, and then recommend the titles that those similar users like.
[Figure: m × n user-item rating matrix, with rows u1 ... um (users), columns i1 ... in (items), and an entry R wherever a user has rated an item.]
CF-based algorithms can be divided into two main categories:
- Memory-based
- Model-based
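
As an illustration of the user-item matrix described on this slide, here is a minimal sketch in Python. The nested-dictionary layout, the toy ratings, and the helper name `to_dense` are assumptions made for the example; they do not come from the slides.

```python
# Hypothetical toy data: ratings[user][item] = rating on a 1-5 scale.
# Only rated cells are stored, since most users rate few items.
ratings = {
    "u1": {"i1": 5, "i3": 3},
    "u2": {"i1": 4, "i2": 2},
    "u3": {"i2": 1, "i3": 4},
}

def to_dense(ratings, users, items, missing=None):
    """Return the m x n matrix as a list of rows; unrated cells hold `missing`."""
    return [[ratings.get(u, {}).get(i, missing) for i in items] for u in users]

users = sorted(ratings)
items = sorted({i for user_ratings in ratings.values() for i in user_ratings})
matrix = to_dense(ratings, users, items)
```

Each row of `matrix` corresponds to a user and each column to an item; the sparse dictionary form is usually the more practical one for large catalogs.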

Slide 6

Slide 6 text

Memory-based systems
These systems use the entire user-item database to generate a prediction. The main idea is to compute the similarities between users and/or items and use them as weights to predict a rating for a user-item pair.
Advantages:
- The quality of predictions is high.
- The algorithm is relatively simple and accommodates database updates.
Disadvantages:
- Very slow: they use the entire database every time they make a prediction (even when it is held in memory).
- Not as fast and scalable as we would like on very large datasets.
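
The memory-based idea can be sketched as a similarity-weighted average over the other users' ratings. This is only an illustrative sketch: the function names, the toy data, and the simple `agreement` similarity are hypothetical and are not taken from the slides. Note the loop over every user, which is the scalability problem listed above.

```python
def predict_user_based(ratings, active_user, item, user_similarity):
    """Similarity-weighted average of the other users' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():  # full scan of the database
        if other == active_user or item not in other_ratings:
            continue
        w = user_similarity(active_user, other)
        num += w * other_ratings[item]
        den += abs(w)
    return num / den if den else None  # None: nobody comparable rated the item

# Hypothetical toy data and similarity measure, for illustration only.
ratings = {
    "u1": {"a": 5},
    "u2": {"a": 5, "b": 4},
    "u3": {"a": 1, "b": 2},
}

def agreement(u, v):
    """Toy similarity: 1 / (1 + mean absolute difference on co-rated items)."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    diff = sum(abs(ratings[u][i] - ratings[v][i]) for i in common) / len(common)
    return 1.0 / (1.0 + diff)

prediction = predict_user_based(ratings, "u1", "b", agreement)
```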

Slide 7

Slide 7 text

Model-based systems
Their main idea is the same as that of memory-based systems. However, they overcome the memory-based drawbacks by building a model from the dataset of ratings. The model is built with machine learning algorithms such as Bayesian networks, clustering, and rule-based approaches.
Advantages:
- Scalability: models are much smaller than the actual dataset.
- Prediction speed is high, since a prediction only requires querying the model.
Disadvantages:
- The quality of predictions depends heavily on how the model is built.
- The model is not flexible with respect to database updates.

Slide 8

Slide 8 text

Item-to-item Recommendation Algorithm
Item-to-item algorithms avoid the bottleneck of searching for neighbors by exploring the relationships between items first, rather than the relationships between users. Recommendations for a user are computed by finding items that are similar to other items the active user has liked.
The algorithm has two steps:
- Similarity Computation
- Prediction Computation

Slide 9

Slide 9 text

Item Similarity Computation

Slide 10

Slide 10 text

Item Similarity Computation
Cosine-based Similarity: Two items i, j are thought of as two vectors in the m-dimensional user space.
\[ sim(i, j) = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\|_2 \, \|\vec{j}\|_2} \]
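
A minimal sketch of the cosine-based similarity, assuming ratings are stored as `ratings[user][item]` and that a missing rating counts as 0 in the item vector. Both the data layout and the zero-fill are assumptions of this example, not something the slides specify.

```python
import math

def cosine_similarity(ratings, i, j):
    """Cosine of the angle between the rating vectors of items i and j.

    Each item is a vector with one component per user; a user who has not
    rated the item contributes 0 (an assumption of this sketch).
    """
    dot = norm_i = norm_j = 0.0
    for user_ratings in ratings.values():
        r_i = user_ratings.get(i, 0.0)
        r_j = user_ratings.get(j, 0.0)
        dot += r_i * r_j
        norm_i += r_i * r_i
        norm_j += r_j * r_j
    if norm_i == 0 or norm_j == 0:
        return 0.0  # an item nobody rated has no direction
    return dot / (math.sqrt(norm_i) * math.sqrt(norm_j))

# Hypothetical toy data.
ratings = {
    "u1": {"a": 1, "b": 1},
    "u2": {"a": 2, "b": 2, "c": 0},
    "u3": {"d": 3},
}
sim_ab = cosine_similarity(ratings, "a", "b")  # identical vectors
sim_ad = cosine_similarity(ratings, "a", "d")  # no common raters
```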

Slide 11

Slide 11 text

Item Similarity Computation
Correlation-based Similarity: Also called Pearson-r correlation. Only the users who rated both items i and j (the set U) are considered.
\[ sim(i, j) = corr_{i,j} = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_i)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_j)^2}} \]
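
The correlation-based similarity can be sketched as follows. The `ratings[user][item]` layout and the toy data are assumptions; the item means are computed here over the co-rating users only, which is one possible reading of the formula rather than something the slides state.

```python
import math

def pearson_similarity(ratings, i, j):
    """Pearson-r correlation between items i and j over users who rated both."""
    co_raters = [u for u, r in ratings.items() if i in r and j in r]
    if len(co_raters) < 2:
        return 0.0  # too little overlap to measure a correlation
    # Assumption: item means are taken over the co-rating users only.
    mean_i = sum(ratings[u][i] for u in co_raters) / len(co_raters)
    mean_j = sum(ratings[u][j] for u in co_raters) / len(co_raters)
    num = sum((ratings[u][i] - mean_i) * (ratings[u][j] - mean_j)
              for u in co_raters)
    sq_i = sum((ratings[u][i] - mean_i) ** 2 for u in co_raters)
    sq_j = sum((ratings[u][j] - mean_j) ** 2 for u in co_raters)
    if sq_i == 0 or sq_j == 0:
        return 0.0  # constant ratings carry no correlation signal
    return num / (math.sqrt(sq_i) * math.sqrt(sq_j))

# Hypothetical toy data: b rises with a, c falls as a rises.
ratings = {
    "u1": {"a": 1, "b": 2, "c": 5},
    "u2": {"a": 2, "b": 4, "c": 3},
    "u3": {"a": 3, "b": 6, "c": 1},
    "u4": {"a": 5},  # ignored: did not rate b or c
}
sim_ab = pearson_similarity(ratings, "a", "b")
sim_ac = pearson_similarity(ratings, "a", "c")
```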

Slide 12

Slide 12 text

Item Similarity Computation
Adjusted Cosine Similarity: The difference in rating scale between users is now taken into account by subtracting each user's average rating \bar{R}_u.
\[ sim(i, j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}} \]
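
A minimal sketch of the adjusted cosine similarity, under the assumptions that ratings are stored as `ratings[user][item]`, that U is the set of users who rated both items, and that each user's mean is taken over all items that user rated. The toy data is hypothetical.

```python
import math

def adjusted_cosine_similarity(ratings, i, j):
    """Cosine similarity after subtracting each user's own mean rating."""
    num = sq_i = sq_j = 0.0
    for user_ratings in ratings.values():
        if i not in user_ratings or j not in user_ratings:
            continue  # only co-rating users contribute
        mean_u = sum(user_ratings.values()) / len(user_ratings)
        d_i = user_ratings[i] - mean_u
        d_j = user_ratings[j] - mean_u
        num += d_i * d_j
        sq_i += d_i * d_i
        sq_j += d_j * d_j
    if sq_i == 0 or sq_j == 0:
        return 0.0
    return num / (math.sqrt(sq_i) * math.sqrt(sq_j))

# Hypothetical toy data: both users have mean rating 3.
ratings = {
    "u1": {"a": 4, "b": 4, "c": 1},
    "u2": {"a": 2, "b": 2, "c": 5},
}
sim_ab = adjusted_cosine_similarity(ratings, "a", "b")
sim_ac = adjusted_cosine_similarity(ratings, "a", "c")
```

In this toy data, items a and b always deviate from a user's mean in the same direction, while a and c deviate in opposite directions, so the two similarities have opposite signs.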

Slide 13

Slide 13 text

Performance implications
The similarity computation is the performance bottleneck. The item-item similarities can be computed offline and stored in a table that requires O(n^2) space for n items. To compute a prediction for a particular item, only a small set of similar items is needed. For each item, only the k most similar items are stored (k << n), so the space required is O(kn), which is O(n) for a fixed k. We call k the model size.
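
The offline step described on this slide can be sketched as building a table that keeps only the k most similar items per item. The function name `build_model`, the callable `similarity`, and the toy similarity values are all hypothetical; any of the similarity measures from the previous slides could be plugged in.

```python
import heapq

def build_model(items, similarity, k):
    """For each item keep its k most similar items as (item, score) pairs.

    The resulting table holds at most n * k entries instead of n * n.
    """
    model = {}
    for i in items:
        scored = ((similarity(i, j), j) for j in items if j != i)
        model[i] = [(j, s) for s, j in heapq.nlargest(k, scored)]
    return model

# Hypothetical precomputed pairwise similarities for three items.
pair_sim = {("a", "b"): 0.9, ("a", "c"): 0.2, ("b", "c"): 0.5}

def similarity(i, j):
    return pair_sim.get((i, j), pair_sim.get((j, i), 0.0))

model = build_model(["a", "b", "c"], similarity, k=1)
```

Here `k` plays the role of the model size: a larger k stores more neighbors per item and costs more space.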

Slide 14

Slide 14 text

Prediction Computation: Weighted Sum
The most important step in a collaborative filtering system is generating the output in terms of predictions. Once the most similar items have been identified, the prediction for a user-item pair (u, i) is computed as follows.
\[ P_{u,i} = \frac{\sum_{j \in SimilarItems} (s_{i,j} \cdot R_{u,j})}{\sum_{j \in SimilarItems} |s_{i,j}|} \]
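
A minimal sketch of the weighted-sum prediction. It assumes the similar items of i are given as `(item, similarity)` pairs and that neighbors the user has not rated are skipped; both choices, and the names used, are assumptions of this example.

```python
def predict(user_ratings, neighbors):
    """Weighted sum P_{u,i} over the items similar to i that user u rated.

    user_ratings: dict mapping item j to R_{u,j}
    neighbors:    list of (item j, similarity s_{i,j}) pairs for the target item i
    """
    num = den = 0.0
    for j, s in neighbors:
        if j in user_ratings:  # assumption: unrated neighbors are skipped
            num += s * user_ratings[j]
            den += abs(s)
    return num / den if den else None  # None: no rated neighbor available

# Hypothetical toy data: the user rated b and c but not d.
user_ratings = {"b": 4, "c": 2}
neighbors = [("b", 0.9), ("c", 0.1), ("d", 0.5)]
p = predict(user_ratings, neighbors)  # (0.9 * 4 + 0.1 * 2) / (0.9 + 0.1)
```

Dividing by the sum of absolute similarities keeps the prediction within the range of the ratings that contribute to it.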

Slide 15

Slide 15 text

Experimental Results
Dataset:
- 43,000 users.
- 3,000 movies.
- Only users who had rated at least 20 movies were considered.
- Training/test ratio x = 0.8 (80% of the data used for training, 20% for testing).
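
As a sketch of how a training/test ratio of x = 0.8 might be applied, the following splits a list of (user, item, rating) triples at random. The function name, the fixed seed, and the synthetic triples are assumptions; the slides do not say how the split was actually drawn.

```python
import random

def split_ratings(triples, x=0.8, seed=0):
    """Shuffle the rating triples and put a fraction x in the training set."""
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(x * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical synthetic data: 5 users x 4 items = 20 ratings.
triples = [("u%d" % u, "i%d" % i, (u + i) % 5 + 1)
           for u in range(5) for i in range(4)]
train, test = split_ratings(triples, x=0.8)
```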

Slide 16

Slide 16 text

Experimental Results
Mean Absolute Error (MAE): For each rating-prediction pair (p_i, q_i), this metric takes the absolute error between the two values and averages it over all N pairs.
\[ MAE = \frac{\sum_{i=1}^{N} |p_i - q_i|}{N} \]
Note: the lower the MAE, the more accurately the recommendation engine predicts user ratings.
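
The MAE formula translates directly into a few lines of Python. The function name and the sample values are illustrative only.

```python
def mean_absolute_error(predictions, actuals):
    """Average of |p_i - q_i| over all N rating-prediction pairs."""
    if len(predictions) != len(actuals) or not predictions:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - q) for p, q in zip(predictions, actuals))
    return total / len(predictions)

# Hypothetical example: errors of 1, 0 and 2 over three pairs.
mae = mean_absolute_error([4, 3, 5], [5, 3, 3])
```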

Slide 17

Slide 17 text

Experimental Results

Slide 18

Slide 18 text

Experimental Results

Slide 19

Slide 19 text

Conclusion
The Item-to-item Collaborative Filtering Algorithm used by Amazon provides the same prediction quality as the user-user k-nearest neighbor algorithm. The item neighborhood is fairly static, so it can be pre-computed offline, which results in very high online performance even on large data sets.

Slide 20

Slide 20 text

THANK YOU FOR YOUR ATTENTION
