Recommender Systems Part 2 - 2021.12.20

Y. Yamamoto
December 10, 2021

1. Programming assignments review
2. Problems on user-based collaborative filtering
3. Item-based collaborative filtering
4. Programming work

Transcript

  1. Item-based Collaborative Filtering

    Yusuke Yamamoto, Associate Professor, Faculty of Informatics
    yusuke_yamamoto@acm.org
    Data Engineering (Recommender Systems 2), 2021.12.20
  2. Section 0: Review of the programming assignment from the last lecture

  3. Visit the following URL: https://recsys2021.hontolab.org/

  4. Click this link to see my sample answers

  5. Section 1: Problems on User-based Collaborative Filtering

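  6. User-based Collaborative Filtering

    Predicts a target user’s rating for an item based on the rating tendency of similar users.

                 Item5   sim    Average rating
        Alice    ?       1      4
        User1    3       0.85   2.4
        User2    5       0.71   3.8

    User1 and User2 are the similar users.

    $$\mathit{predict}(u_a, i) = \bar{r}_{u_a} + \frac{\sum_{u_b \in U_a} \mathit{sim}(u_a, u_b) \cdot (r_{u_b,i} - \bar{r}_{u_b})}{\sum_{u_b \in U_a} \mathit{sim}(u_a, u_b)}$$

    Here $U_a$ is the set of users similar to the target user $u_a$, and $\bar{r}_{u}$ is the average rating of user $u$.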
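As a quick check of the prediction formula, the following sketch plugs in the numbers from the table on this slide (Alice's average of 4, and the similarity, Item5 rating, and average rating of User1 and User2). The function name `predict_user_based` and the tuple layout are illustrative choices, not something defined in the lecture.

```python
def predict_user_based(target_avg, neighbors):
    """Mean-centered weighted average over similar users.

    neighbors: list of (similarity, rating_for_item, neighbor_average) tuples.
    """
    numerator = sum(sim * (rating - avg) for sim, rating, avg in neighbors)
    denominator = sum(sim for sim, _, _ in neighbors)
    return target_avg + numerator / denominator


# Values taken from the slide's table:
# (similarity to Alice, rating for Item5, average rating)
neighbors = [
    (0.85, 3, 2.4),  # User1
    (0.71, 5, 3.8),  # User2
]

alice_item5 = predict_user_based(4, neighbors)
print(round(alice_item5, 2))  # 4.87
```

With these inputs the predicted rating for Alice on Item5 comes out slightly below 5, because both similar users rated Item5 above their own averages.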
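  7. Computation of similarity between users

    Pearson’s correlation coefficient

    $$\mathit{sim}(u_a, u_b) = \frac{\sum_{i \in I} (r_{u_a,i} - \bar{r}_{u_a})(r_{u_b,i} - \bar{r}_{u_b})}{\sqrt{\sum_{i \in I} (r_{u_a,i} - \bar{r}_{u_a})^2} \, \sqrt{\sum_{i \in I} (r_{u_b,i} - \bar{r}_{u_b})^2}}$$

                 Item1   Item2   Item3   Item4
        Alice    5       3       4       4
        User1    3       1       2       3
        User2    4       3       4       3
        User3    3       3       1       5
        User4    1       5       5       2

    sim(Alice, User2) = 0.71, sim(Alice, User4) = -0.79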
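The two similarity values on this slide can be reproduced with a few lines of Python. This is a minimal sketch that assumes each user's mean is taken over the four items shown in the table; the helper name `pearson` is illustrative.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation of two equally long lists of co-rated item scores."""
    mean_a = sum(ratings_a) / len(ratings_a)
    mean_b = sum(ratings_b) / len(ratings_b)
    dev_a = [r - mean_a for r in ratings_a]
    dev_b = [r - mean_b for r in ratings_b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = sqrt(sum(x * x for x in dev_a)) * sqrt(sum(y * y for y in dev_b))
    return numerator / denominator


# Ratings for Item1..Item4 from the slide's table
ratings = {
    "Alice": [5, 3, 4, 4],
    "User1": [3, 1, 2, 3],
    "User2": [4, 3, 4, 3],
    "User3": [3, 3, 1, 5],
    "User4": [1, 5, 5, 2],
}

sims = {
    user: round(pearson(ratings["Alice"], scores), 2)
    for user, scores in ratings.items()
    if user != "Alice"
}
print(sims["User2"], sims["User4"])  # 0.71 -0.79
```

Under the same assumption, User1 comes out at 0.85, which matches the similarity used for User1 on the previous slide.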
  8. Problems on User-based Collaborative Filtering (1/2)

    [Rating matrix: Item1 to Item6 rated by Bob and User1 to User4. Bob has only two ratings (3 and 2); the other users have four or five ratings each.]

    • It is rare for two users to have rated the same items
    • User similarity changes drastically when only a few ratings are added

    Impossible to compute similarity
    • If two users have not yet rated any of the same items, their similarity cannot be computed
    • Can user similarity be computed precisely from the rating scores of only one common item?
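The sparsity problem can be made concrete with a small sketch. The ratings below are hypothetical (they are not the slide's matrix, whose cell positions are not recoverable here), and `pearson_on_common_items` is an illustrative helper that returns `None` whenever the Pearson correlation is undefined.

```python
from math import sqrt


def pearson_on_common_items(ratings_a, ratings_b):
    """Pearson correlation over co-rated items; None when it is undefined."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if not common:
        return None  # no item in common: nothing to compare
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    dev_a = [ratings_a[i] - mean_a for i in common]
    dev_b = [ratings_b[i] - mean_b for i in common]
    denominator = sqrt(sum(x * x for x in dev_a)) * sqrt(sum(y * y for y in dev_b))
    if denominator == 0:
        return None  # e.g. a single common item: every deviation is zero
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denominator


# Hypothetical sparse ratings (item -> score), for illustration only
bob = {"Item5": 3, "Item6": 2}
user_x = {"Item1": 3, "Item2": 1, "Item3": 2, "Item4": 3}
user_y = {"Item1": 1, "Item2": 5, "Item5": 4}

no_overlap = pearson_on_common_items(bob, user_x)   # None: no common item
one_overlap = pearson_on_common_items(bob, user_y)  # None: one common item

# A single additional rating makes the similarity defined, and extreme
bob_later = {**bob, "Item2": 1}
two_overlap = pearson_on_common_items(bob_later, user_y)
print(no_overlap, one_overlap, two_overlap)
```

With two common items the correlation can only be +1 or -1 (here it is close to -1), which illustrates how unstable user similarity is when it rests on very few shared ratings.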
  9. Problems on User-based Collaborative Filtering (2/2)

    #Users >> #Items
    • In general, the number of users is much larger than the number of items
    • Finding the nearest neighbors (similar users) is computationally expensive

    Unstable user preference
    • User preferences (user features) often change, while item features rarely change
  10. Section 2: Item-based Collaborative Filtering

  11. Idea of Item-based Collaborative Filtering

    Predicts unknown scores based on the rating tendency for similar items.

                 Item1   Item2   Item3   Item4   Item5
        Alice    5       3       4       4       ?
        User1    3       1       2       3       3
        User2    4       3       4       3       5
        User3    3       3       1       5       4
        User4    1       5       5       2       1

    [The slide marks which item columns are similar to each other.]
  12. Advantages of Item-based Collaborative Filtering

    Computational cost
    • In general, the number of items is much smaller than the number of users, so the computational cost of item-based CF is much lower than that of user-based CF

    Stable similarity computation
    • Item features (vectors) rarely change and are stable
    • In a rating matrix, item features (vectors) have fewer N/A dimensions than user features (vectors)
    • Similarity between items can therefore be computed from enough information
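  13. Computation of Similarity between Items (1/2)

    Cosine similarity

    $$\mathit{sim}(i_a, i_b) = \cos\theta = \frac{\boldsymbol{v}_{i_a} \cdot \boldsymbol{v}_{i_b}}{|\boldsymbol{v}_{i_a}| \, |\boldsymbol{v}_{i_b}|}$$

    • Focuses on the angle between two vectors
    • The similarity ranges between -1 and 1
    • Known to give the best performance for item similarity calculation

    $i_a, i_b$: items a and b
    $\boldsymbol{v}_{i_a}, \boldsymbol{v}_{i_b}$: rating vectors of items a and b
    $\theta$: angle between $\boldsymbol{v}_{i_a}$ and $\boldsymbol{v}_{i_b}$
    $|\boldsymbol{v}|$: length of vector $\boldsymbol{v}$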
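  14. Computation of Similarity between Items (2/2)

                 Item1   Item2   Item3   Item4   Item5
        Alice    5       3       4       4       ?
        User1    3       1       2       3       3
        User2    4       3       4       3       5
        User3    3       3       1       5       4
        User4    1       5       5       2       1

    Cosine similarity between Item1 and Item5, using the ratings of User1 to User4 (Alice has not rated Item5):

    $$\mathit{sim}(i_1, i_5) = \frac{3 \times 3 + 4 \times 5 + 3 \times 4 + 1 \times 1}{\sqrt{3^2 + 4^2 + 3^2 + 1^2} \times \sqrt{3^2 + 5^2 + 4^2 + 1^2}} = 0.99$$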
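The worked example on this slide can be checked with a short sketch. It uses the Item1 and Item5 columns for User1 to User4, since Alice's rating for Item5 is unknown; the function name `cosine_similarity` is an illustrative choice.

```python
from math import sqrt


def cosine_similarity(v_a, v_b):
    """Cosine of the angle between two rating vectors of equal length."""
    dot = sum(x * y for x, y in zip(v_a, v_b))
    norm_a = sqrt(sum(x * x for x in v_a))
    norm_b = sqrt(sum(y * y for y in v_b))
    return dot / (norm_a * norm_b)


# Ratings by User1..User4, read column-wise from the slide's table
item1 = [3, 4, 3, 1]
item5 = [3, 5, 4, 1]

sim_1_5 = cosine_similarity(item1, item5)
print(round(sim_1_5, 2))  # 0.99
```

Because all ratings are positive, raw rating vectors point in broadly the same direction and the cosine tends to be high, as it is here.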
  15. Problem of using basic cosine similarity

    [Chart: rating scores (axis from 0 to 6) given by Alice and User1 to Item1 through Item4]

    Basic cosine similarity does not take differences in users' average rating behavior into account.

    Alice rates leniently, and User1 rates strictly. However, measured as differences from each user's own average, the ratings for each item hardly differ between Alice and User1.
  16. Adjusted Cosine Similarity (1/3)

    Subtracts each user's average rating from that user's ratings and calculates cosine similarity using the adjusted rating matrix.

                 Item1   Item2   Item3   Item4   Item5   Avg.
        Alice    5       3       4       4       ?       4
        User1    3       1       2       3       3       2.4
        User2    4       3       4       3       5       3.8
        User3    3       3       1       5       4       3.2
        User4    1       5       5       2       1       2.8
  17. Adjusted Cosine Similarity (2/3)

    Subtracts the user average from the ratings and calculates cosine similarity using the adjusted rating matrix.

                 Item1     Item2     Item3     Item4     Item5     Avg.
        Alice    5 - 4     3 - 4     4 - 4     4 - 4     ?         4
        User1    3 - 2.4   1 - 2.4   2 - 2.4   3 - 2.4   3 - 2.4   2.4
        User2    4 - 3.8   3 - 3.8   4 - 3.8   3 - 3.8   5 - 3.8   3.8
        User3    3 - 3.2   3 - 3.2   1 - 3.2   5 - 3.2   4 - 3.2   3.2
        User4    1 - 2.8   5 - 2.8   5 - 2.8   2 - 2.8   1 - 2.8   2.8
  18. Adjusted Cosine Similarity (3/3)

    Subtracts the user average from the ratings and calculates cosine similarity using the adjusted rating matrix.

                 Item1   Item2   Item3   Item4   Item5   Avg.
        Alice    1.0     -1.0    0.0     0.0     ?       4
        User1    0.6     -1.4    -0.4    0.6     0.6     2.4
        User2    0.2     -0.8    0.2     -0.8    1.2     3.8
        User3    -0.2    -0.2    -2.2    1.8     0.8     3.2
        User4    -1.8    2.2     2.2     -0.8    -1.8    2.8

    Adjusted cosine similarity between Item1 and Item5:

    $$\mathit{sim}(i_1, i_5) = \frac{0.6 \times 0.6 + 0.2 \times 1.2 + (-0.2) \times 0.8 + (-1.8) \times (-1.8)}{\sqrt{0.6^2 + 0.2^2 + (-0.2)^2 + (-1.8)^2} \times \sqrt{0.6^2 + 1.2^2 + 0.8^2 + (-1.8)^2}} = 0.80$$
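The adjusted cosine computation can be sketched directly from the raw ratings. This example assumes each user's average is taken over all five of that user's ratings (as in the Avg. column) and leaves Alice out because she has not rated Item5; `adjusted_cosine` is an illustrative name.

```python
from math import sqrt

# Raw ratings for Item1..Item5 by the users who rated every item
ratings = {
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}


def adjusted_cosine(ratings, col_a, col_b):
    """Cosine similarity of two item columns after subtracting user averages."""
    dev_a, dev_b = [], []
    for user_ratings in ratings.values():
        avg = sum(user_ratings) / len(user_ratings)
        dev_a.append(user_ratings[col_a] - avg)
        dev_b.append(user_ratings[col_b] - avg)
    dot = sum(x * y for x, y in zip(dev_a, dev_b))
    norm_a = sqrt(sum(x * x for x in dev_a))
    norm_b = sqrt(sum(y * y for y in dev_b))
    return dot / (norm_a * norm_b)


# Column indices are zero-based: 0 is Item1, 4 is Item5
sim_item1_item5 = adjusted_cosine(ratings, 0, 4)
print(round(sim_item1_item5, 2))  # 0.8
```

The result is noticeably lower than the 0.99 obtained with basic cosine similarity for the same pair of items, which is the effect of removing each user's rating bias.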
  19. Rating Prediction based on Item Similarity

    Prediction function (predicted scores are adjusted)

    $$\mathit{predict}(u_a, i_t) = \frac{\sum_{i \in I_t} \mathit{sim}(i_t, i) \cdot r_{u_a,i}}{\sum_{i \in I_t} \mathit{sim}(i_t, i)}$$

    $u_a$: target user a
    $r_{u,i}$: rating score of user u for item i
    $i_t$: target item t
    $I_t$: the set of items similar to the target item
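To see the prediction function at work, the sketch below predicts Alice's rating for Item5 from her ratings of two similar items. Several things here are assumptions rather than lecture content: the neighbor set (Item1 and Item4), the similarity 0.43 for Item4 (obtained by applying the adjusted cosine to the same table; only the 0.80 for Item1 appears on the slides), and the use of Alice's raw ratings as $r_{u_a,i}$.

```python
def predict_item_based(user_ratings, similarities):
    """Similarity-weighted average of the user's ratings for similar items.

    user_ratings: item -> rating given by the target user
    similarities: similar item -> similarity to the target item
    """
    numerator = sum(sim * user_ratings[item] for item, sim in similarities.items())
    denominator = sum(similarities.values())
    return numerator / denominator


# Alice's known ratings from the rating matrix
alice = {"Item1": 5, "Item2": 3, "Item3": 4, "Item4": 4}

# Assumed neighbor set for Item5 (0.43 is an assumption, see the lead-in)
similar_to_item5 = {"Item1": 0.80, "Item4": 0.43}

alice_item5 = predict_item_based(alice, similar_to_item5)
print(round(alice_item5, 2))  # 4.65
```

The prediction lands between Alice's ratings for the two neighbor items and closer to her rating for Item1, the more similar of the two.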
  20. Selection of Similar Items (nearest-neighbor items)

    Set a threshold for item similarity
    • If an item's similarity is higher than the threshold, it is regarded as a “similar” item

    Focus on the top K similar items (kNN method)
    • If an item ranks in the top K by similarity, it is regarded as a similar item
    • K is often set between 50 and 200
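Both selection strategies fit in a few lines. In this sketch the similarity values for Item5 are illustrative numbers, and the threshold of 0.5 and K of 2 are chosen only to suit a four-item toy example; the function names are hypothetical.

```python
def select_by_threshold(similarities, threshold):
    """Items whose similarity to the target item exceeds the threshold."""
    return [item for item, sim in similarities.items() if sim > threshold]


def select_top_k(similarities, k):
    """The k items with the highest similarity to the target item."""
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    return ranked[:k]


# Illustrative similarities of each item to the target item Item5
sims_to_item5 = {"Item1": 0.80, "Item2": -0.91, "Item3": -0.76, "Item4": 0.43}

above_threshold = select_by_threshold(sims_to_item5, 0.5)
top_two = select_top_k(sims_to_item5, 2)
print(above_threshold)  # ['Item1']
print(top_two)          # ['Item1', 'Item4']
```

A threshold yields a neighbor set whose size varies per item, while top-K fixes the size but may admit weakly similar items when few good neighbors exist.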
  21. Summary of Item-based Collaborative Filtering

    Basic approach
    • Item similarities are obtained from a rating matrix
    • The system predicts the target user's rating for a target item from that user's ratings of similar items

    Similarity calculation
    • Cosine similarity is known to work best in practice

    Selection of similar items
    • The top K items with the highest similarity are often selected as similar items
  22. Section 3: Programming Work

  23. Visit the following URL: https://recsys2021.hontolab.org/

  24. Click this link to learn today’s contents