Y. Yamamoto
December 10, 2021

# Recommender Systems Part 2 - 2021.12.20

1. Programming assignments review
2. Problems on user-based collaborative filtering
3. Item-based collaborative filtering
4. Programming work

## Transcript

1. ### Item-based Collaborative Filtering

Yusuke Yamamoto, Associate Professor, Faculty of Informatics
yusuke_yamamoto@acm.org
Data Engineering (Recommender Systems 2), 2021.12.20

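6. ### User-based Collaborative Filtering

Predicts a target user's rating for an item based on the rating tendency of similar users.

| | Item5 | sim | Average rating |
|---|---|---|---|
| Alice | ? | 1 | 4 |
| User1 (similar user) | 3 | 0.85 | 2.4 |
| User2 (similar user) | 5 | 0.71 | 3.8 |

$$
\mathrm{predict}(u_a, i) = \bar{r}_{u_a} + \frac{\sum_{u_b \in U_a} \mathrm{sim}(u_a, u_b) \cdot (r_{u_b,i} - \bar{r}_{u_b})}{\sum_{u_b \in U_a} \mathrm{sim}(u_a, u_b)}
$$

Here $U_a$ is the set of users similar to $u_a$, $r_{u,i}$ is the rating of user $u$ for item $i$, and $\bar{r}_u$ is the average rating of user $u$.
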
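
As a worked example of the prediction formula above, the following minimal sketch plugs in the values from the table (Alice's average of 4, and the similarity, Item5 rating, and average of User1 and User2). The function name `predict_user_based` and the fallback to the user's own average when the similarities sum to zero are assumptions made for illustration, not part of the slides.

```python
def predict_user_based(target_avg, neighbors):
    """Predict a rating from similar users' mean-centered ratings.

    neighbors: list of (similarity, neighbor_rating, neighbor_avg) tuples.
    """
    numerator = sum(sim * (rating - avg) for sim, rating, avg in neighbors)
    denominator = sum(sim for sim, _, _ in neighbors)
    if denominator == 0:
        # Assumption: fall back to the target user's own average.
        return target_avg
    return target_avg + numerator / denominator


# (similarity with Alice, rating for Item5, average rating), taken from the table
similar_users = [
    (0.85, 3, 2.4),  # User1
    (0.71, 5, 3.8),  # User2
]
alice_item5 = predict_user_based(4.0, similar_users)
print(round(alice_item5, 2))  # 4.87
```

Under these values the predicted rating lands above Alice's average, because both similar users rated Item5 above their own averages.
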
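7. ### Computation of Similarity between Users

Pearson's correlation coefficient:

$$
\mathrm{sim}(u_a, u_b) = \frac{\sum_{i \in I} (r_{u_a,i} - \bar{r}_{u_a})(r_{u_b,i} - \bar{r}_{u_b})}{\sqrt{\sum_{i \in I} (r_{u_a,i} - \bar{r}_{u_a})^2} \sqrt{\sum_{i \in I} (r_{u_b,i} - \bar{r}_{u_b})^2}}
$$

where $I$ is the set of items rated by both users.

| | Item1 | Item2 | Item3 | Item4 |
|---|---|---|---|---|
| Alice | 5 | 3 | 4 | 4 |
| User1 | 3 | 1 | 2 | 3 |
| User2 | 4 | 3 | 4 | 3 |
| User3 | 3 | 3 | 1 | 5 |
| User4 | 1 | 5 | 5 | 2 |

For example, sim(Alice, User2) = 0.71 and sim(Alice, User4) = -0.79.
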
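
The Pearson computation above can be sketched in plain Python as follows. This is a minimal illustration that assumes every user has rated all four items, so the means are taken over those four ratings; the function name `pearson` and the choice to return 0.0 when a user's ratings have no variance are assumptions, not something the slides specify.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equally long rating lists (co-rated items)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: treat an undefined correlation as 0.
        return 0.0
    return cov / (norm_a * norm_b)


# Ratings for Item1..Item4 from the table above
ratings = {
    "Alice": [5, 3, 4, 4],
    "User1": [3, 1, 2, 3],
    "User2": [4, 3, 4, 3],
    "User3": [3, 3, 1, 5],
    "User4": [1, 5, 5, 2],
}

sims = {
    user: pearson(ratings["Alice"], row)
    for user, row in ratings.items()
    if user != "Alice"
}
for user, sim in sims.items():
    print(user, round(sim, 2))
```

With these ratings the sketch yields 0.71 for User2 and -0.79 for User4, matching the two values shown on the slide.
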
8. ### Problems on User-based Collaborative Filtering (1/2)

Sparse rating matrix over Item1 to Item6 (ratings listed per user in column order; empty cells omitted):

- Bob: 3, 2
- User1: 3, 1, 2, 3
- User2: 4, 3, 4, 3
- User3: 3, 3, 1, 5
- User4: 1, 5, 5, 2, 5

Problems:

- It is rare that two users rate the same items. If two users have not rated any item in common yet, their similarity cannot be computed.
- User similarity changes drastically when only a few ratings are added. Is it possible to compute a precise user similarity from the rating scores for only one common item?

9. ### Problems on User-based Collaborative Filtering (2/2)

#Users >> #Items

- In general, the number of users is much larger than the number of items.
- Finding nearest neighbors (similar users) has a large computational cost.

Unstable user preference

- User preferences (user features) often change, while item features do not change often.

11. ### Idea about Item-based Collaborative Filtering

| | Item1 | Item2 | Item3 | Item4 | Item5 |
|---|---|---|---|---|---|
| Alice | 5 | 3 | 4 | 4 | ? |
| User1 | 3 | 1 | 2 | 3 | 3 |
| User2 | 4 | 3 | 4 | 3 | 5 |
| User3 | 3 | 3 | 1 | 5 | 4 |
| User4 | 1 | 5 | 5 | 2 | 1 |

Predicts unknown scores based on the rating tendency for similar items.

12. ### Advantages of Item-based Collaborative Filtering

Computational cost

- In general, the number of items is much smaller than the number of users, so the computational cost of item-based CF is much smaller than that of user-based CF.

Stable similarity computation

- Item features (vectors) do not change often and are stable.
- Compared to user features (vectors) on a rating matrix, item features (vectors) have fewer N/A dimensions.
- It is therefore possible to compute the similarity between items from enough information.

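13. ### Computation of Similarity between Items (1/2)

Cosine similarity:

$$
\mathrm{sim}(i_a, i_b) = \cos\theta = \frac{\boldsymbol{v}_{i_a} \cdot \boldsymbol{v}_{i_b}}{|\boldsymbol{v}_{i_a}| \, |\boldsymbol{v}_{i_b}|}
$$

- $i_a, i_b$: items a and b
- $\boldsymbol{v}_{i_a}, \boldsymbol{v}_{i_b}$: rating vectors of items a and b
- $\theta$: angle between $\boldsymbol{v}_{i_a}$ and $\boldsymbol{v}_{i_b}$
- $|\boldsymbol{v}|$: length of vector $\boldsymbol{v}$

Properties:

- Focuses on the angle between two vectors
- The similarity ranges between -1 and 1
- Best performance for item similarity calculation
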
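14. ### Computation of Similarity between Items (2/2)

| | Item1 | Item2 | Item3 | Item4 | Item5 |
|---|---|---|---|---|---|
| Alice | 5 | 3 | 4 | 4 | ? |
| User1 | 3 | 1 | 2 | 3 | 3 |
| User2 | 4 | 3 | 4 | 3 | 5 |
| User3 | 3 | 3 | 1 | 5 | 4 |
| User4 | 1 | 5 | 5 | 2 | 1 |

Similarity between Item1 and Item5, using the ratings of User1 to User4:

$$
\mathrm{sim}(\text{Item1}, \text{Item5}) = \frac{3 \times 3 + 4 \times 5 + 3 \times 4 + 1 \times 1}{\sqrt{3^2 + 4^2 + 3^2 + 1^2} \times \sqrt{3^2 + 5^2 + 4^2 + 1^2}} = 0.99
$$
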
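
The cosine computation for Item1 and Item5 can be reproduced with a short sketch. It uses the ratings of User1 to User4 only, since Alice has not rated Item5; the function name `cosine_similarity` and the choice to return 0.0 for a zero-length vector are assumptions for illustration.

```python
from math import sqrt


def cosine_similarity(v1, v2):
    """Cosine of the angle between two equally long rating vectors."""
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = sqrt(sum(x * x for x in v1))
    norm2 = sqrt(sum(y * y for y in v2))
    if norm1 == 0 or norm2 == 0:
        # Assumption: similarity with an all-zero vector is treated as 0.
        return 0.0
    return dot / (norm1 * norm2)


# Ratings by User1..User4 (Alice is left out: her Item5 rating is unknown)
item1 = [3, 4, 3, 1]
item5 = [3, 5, 4, 1]

sim_item1_item5 = cosine_similarity(item1, item5)
print(round(sim_item1_item5, 2))  # 0.99
```
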
15. ### Problem of Using Basic Cosine Similarity

[Figure: rating scores (axis 0 to 6) of Alice and User1 for Item1 to Item4]

Basic cosine similarity does not take differences in the users' average rating behavior into account. Alice rates leniently, and User1 rates strictly. However, when the difference from each user's average is considered, the ratings for each item hardly differ between Alice and User1.

16. ### Adjusted Cosine Similarity (1/3)

| | Item1 | Item2 | Item3 | Item4 | Item5 | Avg. |
|---|---|---|---|---|---|---|
| Alice | 5 | 3 | 4 | 4 | ? | 4 |
| User1 | 3 | 1 | 2 | 3 | 3 | 2.4 |
| User2 | 4 | 3 | 4 | 3 | 5 | 3.8 |
| User3 | 3 | 3 | 1 | 5 | 4 | 3.2 |
| User4 | 1 | 5 | 5 | 2 | 1 | 2.8 |

Subtracts each user's average rating score from each of that user's ratings and calculates cosine similarity using the adjusted rating matrix.

17. ### Adjusted Cosine Similarity (2/3)

Subtracts the user average from the ratings and calculates cosine similarity using the adjusted rating matrix.

| | Item1 | Item2 | Item3 | Item4 | Item5 | Avg. |
|---|---|---|---|---|---|---|
| Alice | 5 - 4 | 3 - 4 | 4 - 4 | 4 - 4 | ? | 4 |
| User1 | 3 - 2.4 | 1 - 2.4 | 2 - 2.4 | 3 - 2.4 | 3 - 2.4 | 2.4 |
| User2 | 4 - 3.8 | 3 - 3.8 | 4 - 3.8 | 3 - 3.8 | 5 - 3.8 | 3.8 |
| User3 | 3 - 3.2 | 3 - 3.2 | 1 - 3.2 | 5 - 3.2 | 4 - 3.2 | 3.2 |
| User4 | 1 - 2.8 | 5 - 2.8 | 5 - 2.8 | 2 - 2.8 | 1 - 2.8 | 2.8 |

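18. ### Adjusted Cosine Similarity (3/3)

Subtracts the user average from the ratings and calculates cosine similarity using the adjusted rating matrix.

| | Item1 | Item2 | Item3 | Item4 | Item5 | Avg. |
|---|---|---|---|---|---|---|
| Alice | 1.0 | -1.0 | 0.0 | 0.0 | ? | 4 |
| User1 | 0.6 | -1.4 | -0.4 | 0.6 | 0.6 | 2.4 |
| User2 | 0.2 | -0.8 | 0.2 | -0.8 | 1.2 | 3.8 |
| User3 | -0.2 | -0.2 | -2.2 | 1.8 | 0.8 | 3.2 |
| User4 | -1.8 | 2.2 | 2.2 | -0.8 | -1.8 | 2.8 |

Similarity between Item1 and Item5 on the adjusted ratings of User1 to User4:

$$
\mathrm{sim}(\text{Item1}, \text{Item5}) = \frac{0.6 \times 0.6 + 0.2 \times 1.2 + (-0.2) \times 0.8 + (-1.8) \times (-1.8)}{\sqrt{0.6^2 + 0.2^2 + (-0.2)^2 + (-1.8)^2} \times \sqrt{0.6^2 + 1.2^2 + 0.8^2 + (-1.8)^2}} = 0.80
$$
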
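
The adjusted cosine similarity between Item1 and Item5 can be sketched as below. The sketch mean-centers each user's row with that user's average over all five items and then takes the cosine of the two item columns; Alice is left out because her Item5 rating is unknown. The helper names `mean_center` and `adjusted_cosine` are hypothetical, chosen only for this illustration.

```python
from math import sqrt

# Ratings for Item1..Item5 by the users who rated all five items
ratings = {
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}


def mean_center(rows):
    """Subtract each user's average rating from that user's ratings."""
    centered = {}
    for user, row in rows.items():
        avg = sum(row) / len(row)
        centered[user] = [r - avg for r in row]
    return centered


def adjusted_cosine(rows, a, b):
    """Cosine similarity of item columns a and b on mean-centered ratings."""
    centered = mean_center(rows)
    col_a = [row[a] for row in centered.values()]
    col_b = [row[b] for row in centered.values()]
    dot = sum(x * y for x, y in zip(col_a, col_b))
    norm_a = sqrt(sum(x * x for x in col_a))
    norm_b = sqrt(sum(y * y for y in col_b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: similarity with an all-zero column is treated as 0.
        return 0.0
    return dot / (norm_a * norm_b)


# Column indices are 0-based: 0 is Item1, 4 is Item5
sim_item1_item5 = adjusted_cosine(ratings, 0, 4)
print(round(sim_item1_item5, 2))  # 0.8
```

Compared with the basic cosine value of 0.99 for the same pair, the adjusted value of about 0.80 reflects that the users' different rating levels have been removed before comparing the items.
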
19. ### Rating Prediction based on Item Similarity

Prediction function (predicted scores are adjusted):

$$
\mathrm{predict}(u_a, i_t) = \frac{\sum_{i \in I_t} \mathrm{sim}(i_t, i) \cdot r_{u_a,i}}{\sum_{i \in I_t} \mathrm{sim}(i_t, i)}
$$

- $u_a$: target user a
- $r_{u,i}$: rating score of user u for item i
- $i_t$: target item t
- $I_t$: set of similar items for the target item

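
A minimal sketch of the item-based prediction function follows, applied to Alice and Item5. It makes several assumptions that the slides do not state: the raw ratings of the target user are used as $r_{u_a,i}$, the similar items are taken to be Item1 and Item4, and the similarity 0.43 for Item4 was computed for this sketch with the same adjusted-cosine procedure (only the 0.80 for Item1 appears in the slides). The function name `predict_item_based` is hypothetical.

```python
def predict_item_based(similarities, user_ratings):
    """Similarity-weighted average of the user's ratings for similar items.

    similarities: {item: similarity to the target item} for the similar items.
    user_ratings: {item: rating} given by the target user.
    """
    numerator = sum(sim * user_ratings[item] for item, sim in similarities.items())
    denominator = sum(similarities.values())
    if denominator == 0:
        # Assumption: no prediction when the weights cancel out.
        return None
    return numerator / denominator


# Similarities of Item5 to its assumed neighbors (0.43 is computed, not from the slides)
neighbor_sims = {"Item1": 0.80, "Item4": 0.43}
alice_ratings = {"Item1": 5, "Item2": 3, "Item3": 4, "Item4": 4}

alice_item5 = predict_item_based(neighbor_sims, alice_ratings)
print(round(alice_item5, 2))  # 4.65
```

Because the weights are normalized by their sum, the prediction always lies between the lowest and highest rating the user gave to the similar items when all similarities are positive.
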
20. ### Selection of Similar Items (Nearest Neighbor Items)

Set a threshold for item similarity

- If an item has a higher similarity than the threshold, it is regarded as a "similar" item.

Focus on the top K similar items (kNN method)

- If an item ranks within the top K by similarity, it is regarded as a similar item.
- K is often set between 50 and 200.

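
Both selection strategies can be sketched in a few lines. The function names `select_by_threshold` and `select_top_k` are hypothetical, and the similarity values for Item5 below were computed for this sketch with adjusted cosine from the five-item rating table; only the 0.80 for Item1 appears in the slides. A tiny K is used here because the example has only four candidate items.

```python
def select_by_threshold(similarities, threshold):
    """Keep the items whose similarity is higher than the threshold."""
    return [item for item, sim in similarities.items() if sim > threshold]


def select_top_k(similarities, k):
    """Keep the k items with the highest similarity (kNN method)."""
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    return ranked[:k]


# Similarity of each item to the target item Item5 (illustrative values)
sims_to_item5 = {"Item1": 0.80, "Item2": -0.91, "Item3": -0.76, "Item4": 0.43}

above_threshold = select_by_threshold(sims_to_item5, 0.5)
top_two = select_top_k(sims_to_item5, 2)
print(above_threshold)  # ['Item1']
print(top_two)          # ['Item1', 'Item4']
```

A threshold gives a variable number of neighbors per item, while top K fixes the neighborhood size regardless of how similar the K-th item actually is.
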
21. ### Summary of Item-based Collaborative Filtering

Basic approach

- Item similarities are obtained from a rating matrix.
- Based on the rating scores of similar items, the system predicts the target user's rating score for a target item.

Similarity calculation

- Cosine similarity is known to work best in practice.

Selection of similar items

- The top K items with the highest similarity are often selected as similar items.