PM K-LightGCN: Optimizing for Accuracy and Popularity Match in Course Recommendation

Yiding Ran, Hengchang Hu, Min-Yen Kan
MORS-2022

2 Most course recommenders optimize for … Accuracy

3 Does accuracy optimization optimize user experience?
Study on a course recommender designed for serendipity [1]:
- Introduces relevant courses that are previously unknown to students.
- Accuracy is not the best metric for user experience.
- When given more information, students improve their experience by adjusting selections.
[1]: Zachary A. Pardos and Weijie Jiang. 2020. Designing for serendipity in a university course recommendation system. In Proceedings of the Tenth International Conference on Learning Analytics & Knowledge (LAK '20).

4 Multi-Objective Course Recommender
Objective 1: Accuracy Optimization. Metrics: HR@K, NDCG@K
Objective 2: User Experience Optimization. Metrics:

5 Small-scale user study
Students demonstrated disparate interests towards elective courses:
- Some consistently preferred popular courses.
- Others favored niche ones.
Student satisfaction is related to whether course popularity matches their interests.

6 Multi-Objective Course Recommender
Objective 1: Accuracy Optimization. Metrics: HR@K, NDCG@K
Objective 2: User Experience Optimization, framed as Preference-Popularity Match. Metric: Preference-Popularity Mismatch@K

Objective 1: Accuracy Optimization
Sections: Motivation, Accuracy Optimization, Preference-Popularity Match, Conclusion

8 Dataset
Anonymized listing of per-semester course-taking histories of graduated undergraduates from the year 2010 to 2020:
- 41,304 unique students
- 5,179 unique courses
- 1.4M enrollments
Example record (IDs masked due to privacy concerns):
Course ID: wing_modede529dfcbb2907e9760eea0875cdd12
Student ID: wing_mod412b5c6d4a88a03e91dfc16dd4d494ff
Faculty: School of Computing
Interaction Semester: 1910
Enrollment Semester: 1710

9 Preliminary Study
We experimented with non-domain-specific models as baselines that take in only course enrollment records.
Model      HR@10    NDCG@10
ItemKNN    0.7762   0.3337
UserKNN    0.7294   0.2521
MLP        0.6013   0.1946
NeuMF      0.6458   0.2156
LightGCN   0.7008   0.2542
Uncommon superiority of KNN models: it is possible that the nature of course consumption aligns with the underlying inductive bias of KNN models.
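The two accuracy metrics in the table can be sketched as follows. The deck does not state its evaluation protocol, so this sketch assumes a leave-one-out setup with a single held-out course per student; the names `hit_at_k`, `ndcg_at_k`, and `evaluate` are illustrative and do not come from the deck.

```python
import math

def hit_at_k(ranked_items, target, k):
    """Return 1.0 if the held-out target appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, target, k):
    """NDCG@k with one relevant item: 1 / log2(rank + 1), or 0.0 on a miss."""
    top_k = ranked_items[:k]
    if target not in top_k:
        return 0.0
    rank = top_k.index(target) + 1  # 1-based position in the list
    return 1.0 / math.log2(rank + 1)

def evaluate(recommendations, held_out, k=10):
    """Average HR@k and NDCG@k over all students (assumed leave-one-out)."""
    n = len(held_out)
    hr = sum(hit_at_k(recommendations[s], held_out[s], k) for s in held_out) / n
    ndcg = sum(ndcg_at_k(recommendations[s], held_out[s], k) for s in held_out) / n
    return hr, ndcg

# Toy data: s1's held-out course is ranked 2nd, s2's is missed.
recs = {"s1": ["c3", "c1", "c7"], "s2": ["c2", "c5", "c9"]}
truth = {"s1": "c1", "s2": "c4"}
hr, ndcg = evaluate(recs, truth, k=3)
```

HR@K only records whether the held-out course was found, while NDCG@K also rewards placing it near the top, which is why the two columns can rank models differently.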

10 Preliminary Study
We compared the structure of ItemKNN and LightGCN to identify the reasons behind ItemKNN’s strength*:
1. Importance of pairwise relations
2. Focus on close neighbors
*: For more information, see this deck’s appendix slides 25-29.

11 K-LightGCN Step 1: compute LightGCN similarity matrix
Compute the pairwise similarity matrix for students and courses using embeddings learnt by LightGCN.
[Figure: layer-1 neighbor aggregation in LightGCN for a student and a course, producing a student-student similarity matrix and a course-course similarity matrix. Legend: the user-user similarity matrix by UserKNN and the item-item similarity matrix by ItemKNN supervise user-user and item-item similarity.]
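A minimal sketch of Step 1, assuming cosine similarity between embedding vectors. The deck does not say which similarity function is used, so cosine is an assumption, and `cosine_similarity_matrix` is a hypothetical helper name.

```python
import math

def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between rows of `embeddings` (list of vectors).

    Cosine is an assumed choice; the deck only says "pairwise similarity".
    """
    norms = [math.sqrt(sum(x * x for x in v)) for v in embeddings]
    n = len(embeddings)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            denom = norms[i] * norms[j]
            sim[i][j] = dot / denom if denom > 0 else 0.0
    return sim

# Toy course embeddings standing in for vectors learnt by LightGCN.
course_emb = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
course_sim = cosine_similarity_matrix(course_emb)
```

The same function would be applied to the student embeddings to obtain the student-student matrix.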

12 K-LightGCN Step 2: compute differences between two sets of similarity matrix
Compute the differences between LightGCN similarity matrix and KNN similarity matrix.
[Figure: same layer-1 aggregation as in Step 1, with the student-student and course-course similarity matrices compared against the user-user similarity matrix by UserKNN and the item-item similarity matrix by ItemKNN. Legend: supervise user-user and item-item similarity.]
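One way to turn Step 2 into a single number is a mean squared element-wise difference between the two matrices. The deck does not specify how the difference is measured, so the squared form and the name `similarity_gap` are assumptions made for illustration.

```python
def similarity_gap(model_sim, knn_sim):
    """Mean squared element-wise difference between two square similarity matrices.

    The squared-error form is an assumption; the deck only says "differences".
    """
    n = len(model_sim)
    total = 0.0
    for i in range(n):
        for j in range(n):
            diff = model_sim[i][j] - knn_sim[i][j]
            total += diff * diff
    return total / (n * n)

# Toy matrices: a LightGCN-derived matrix versus an ItemKNN matrix.
lightgcn_sim = [[1.0, 0.5], [0.5, 1.0]]
itemknn_sim = [[1.0, 0.9], [0.9, 1.0]]
gap = similarity_gap(lightgcn_sim, itemknn_sim)
```

A gap of zero would mean the embedding-derived similarities already agree with the KNN similarities.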

13 K-LightGCN Step 3: modify pairwise loss
Add in differences between LightGCN similarity matrix and KNN similarity matrix to the LightGCN BPR loss.
[Figure: three-layer LightGCN with graph convolution (sum over neighbors at each layer), layer combination, and prediction. The item-item similarity matrix (y × y) and the user-user similarity matrix (x × x) supervise user-user and item-item similarity. The loss equation on the slide starts from the BPR loss; the added terms did not survive extraction.]
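A minimal sketch of a BPR loss with added similarity-difference terms. The slide's exact formula is not recoverable, so the additive form, the single `weight` hyperparameter, and the names `bpr_loss` and `revised_loss` are all assumptions; the L2 regularization usually paired with BPR is omitted for brevity.

```python
import math

def bpr_loss(pos_scores, neg_scores):
    """Mean BPR loss: -log(sigmoid(pos - neg)) over (positive, negative) pairs."""
    # -log(sigmoid(x)) equals log(1 + exp(-x)), computed with log1p.
    losses = [math.log1p(math.exp(-(p - n))) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)

def revised_loss(pos_scores, neg_scores, user_gap, item_gap, weight=0.1):
    """BPR loss plus weighted similarity gaps (assumed additive form).

    `user_gap` and `item_gap` are scalar differences between the model's
    similarity matrices and the UserKNN / ItemKNN matrices; `weight` is a
    hypothetical hyperparameter not taken from the deck.
    """
    return bpr_loss(pos_scores, neg_scores) + weight * (user_gap + item_gap)

base = bpr_loss([0.0], [0.0])
total = revised_loss([0.0], [0.0], user_gap=0.08, item_gap=0.02, weight=0.5)
```

With this form, the model is pushed both to rank taken courses above untaken ones and to keep its similarity structure close to that of the KNN models.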

14 K-LightGCN Step 4: combine revised LightGCN and ItemKNN

15 K-LightGCN for accuracy optimization
Model        HR@10          NDCG@10
ItemKNN      0.7762         0.3337
UserKNN      0.7294         0.2521
MLP          0.6013         0.1946
NeuMF        0.6458         0.2156
LightGCN     0.7008         0.2542
K-LightGCN   0.7905 (+2%)   0.3346 (+0.3%)
With a focus on both pairwise relations and close neighbors, K-LightGCN outperforms all baselines on both accuracy metrics.

Objective 2: Preference-Popularity Match

17 Quantify popularity and preference
We proposed continuous measures for course popularity and student preference.
- Course popularity: log of the average enrollment of a course.
- Student preference: average popularity of the courses the student has taken previously.
[Figure legend: top 25 percentile, bottom 75 percentile]
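The two measures can be sketched in a few lines. The deck does not give the log base or what the enrollment average is taken over, so this sketch assumes the natural log and an average over the semesters in which a course was offered; the function names and toy numbers are illustrative only.

```python
import math

def course_popularity(enrollments_per_semester):
    """Log of the average enrollment of a course.

    Assumptions: natural log, and the average is over offered semesters.
    """
    avg = sum(enrollments_per_semester) / len(enrollments_per_semester)
    return math.log(avg)

def student_preference(taken_courses, popularity):
    """Average popularity of the courses a student has taken previously."""
    return sum(popularity[c] for c in taken_courses) / len(taken_courses)

# Toy enrollment counts per semester (hypothetical, not from the dataset).
enrollment_counts = {"cA": [400, 600], "cB": [20, 30], "cC": [100]}
popularity = {c: course_popularity(v) for c, v in enrollment_counts.items()}
pref = student_preference(["cA", "cB"], popularity)
```

Taking the log compresses the gap between very large and very small courses, so a student's preference is not dominated by a single high-enrollment course.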

18 Measure preference-popularity match
We proposed a loss that measures the mismatch between student preference and popularity of recommended courses.
Popularity-Preference Mismatch: PP-Mismatch@K = the average difference between the target student’s preference and the popularity of the top K courses recommended.
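A minimal sketch of the measure for one student, read directly off the verbal definition. Using the absolute difference is an assumption (the slide's formula is not available), and `pp_mismatch_at_k` is a hypothetical name.

```python
def pp_mismatch_at_k(recommended, preference, popularity, k=10):
    """Average |popularity - preference| over the top-k recommended courses.

    The absolute difference is an assumed reading of "average difference".
    """
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    return sum(abs(popularity[c] - preference) for c in top_k) / len(top_k)

# Toy popularity scores on a log scale (hypothetical values).
popularity = {"c1": 6.0, "c2": 3.0, "c3": 5.0}
score = pp_mismatch_at_k(["c1", "c2", "c3"], preference=5.0, popularity=popularity, k=2)
```

A dataset-level figure such as those reported in the deck would presumably average this per-student value over all students.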

19 Preference-Match K-LightGCN (PM K-LightGCN)
Same model structure as K-LightGCN, with a selection component to mitigate popularity mismatch:
1. Take the top 50 recommendations by K-LightGCN for the target student.
2. Sort the top 50 recommendations based on the difference between the popularity of the recommended courses and the student's preference.
3. Keep only the top 10 courses with popularity closest to the target student’s preference.
Objective 1: Accuracy Optimization
Objective 2: Preference-popularity match
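The three selection steps can be sketched as a re-ranking function. The order of the final 10 courses is not stated in the deck; this sketch returns them sorted by closeness to the student's preference, which is an assumption, and `preference_match_rerank` is a hypothetical name.

```python
def preference_match_rerank(ranked_courses, preference, popularity, pool_size=50, k=10):
    """Re-rank an accuracy-ordered list by closeness of popularity to preference."""
    # Step 1: take the top `pool_size` recommendations from the base model.
    pool = ranked_courses[:pool_size]
    # Step 2: sort by |popularity - preference|. Python's sort is stable,
    # so ties keep their original accuracy-based order.
    by_closeness = sorted(pool, key=lambda c: abs(popularity[c] - preference))
    # Step 3: keep the k courses closest to the student's preference.
    return by_closeness[:k]

# Toy example with a pool of 4 and a final list of 2.
popularity = {"c1": 6.0, "c2": 3.0, "c3": 5.1, "c4": 4.8}
ranked = ["c1", "c2", "c3", "c4"]
final = preference_match_rerank(ranked, preference=5.0, popularity=popularity,
                                pool_size=4, k=2)
```

Because the pool is limited to the base model's top candidates, accuracy bounds which courses can be selected, and the popularity match only decides among them.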

20 PM K-LightGCN as multi-objective course recommender
Model           PP-Mismatch@10   HR@10          NDCG@10
ItemKNN         1.050            0.7762         0.3337
UserKNN         1.071            0.7294         0.2521
LightGCN        1.077            0.7008         0.2542
K-LightGCN      1.109            0.7905         0.3346
PM K-LightGCN   0.920 (-17%)     0.7570 (-4%)   0.3000 (-10%)
PM K-LightGCN achieves a 17% reduction in preference-popularity mismatch at the sacrifice of only 4% in HR@10.
Despite the fall in accuracy, user satisfaction can still improve: enrollment records may not reflect the best student experience, given students' limited knowledge of available courses.

21 Conclusion
Special nature of course recommendation
- Importance of pairwise relation
- Focus on close neighbors
PM K-LightGCN: Multi-objective Course Recommender
- Optimizes for accuracy and preference-popularity match
- Lightweight design allows incorporation of additional criteria in the future
Alternative representations of student preference
- What if the current measure underestimates student preference, as students are not aware of other niche courses?
- Take into consideration variations in the popularity of courses taken by the student
More extensive user study
- Need for a larger-scale user study to test the relation between preference-popularity match and user satisfaction
- Conduct user study for better model evaluation from users’ perspective
Role of course recommender in tertiary education
- Cater to students’ preferences or expose them to courses the educators think are useful?
- What is the bigger picture?

22 Thank You!
Yiding Ran, Hengchang Hu, Min-Yen Kan

23 References
[1]: Zachary A. Pardos and Weijie Jiang. 2020. Designing for serendipity in a university course recommendation system. In Proceedings of the Tenth International Conference on Learning Analytics & Knowledge (LAK '20).

Appendix

25 Compare the structure of ItemKNN and LightGCN
We identified structural differences between ItemKNN and LightGCN to check whether they can explain ItemKNN’s strength.
[Figure: ItemKNN and LightGCN structures side by side.]

26 Structural Difference #1: Importance of Pairwise Relation
ItemKNN: Considers only pairwise relations using user/item pairwise similarity matrix.
LightGCN: At layer #2, pairwise relation is considered.
[Figure: student and course nodes connected across Layer #1, Layer #2, and Layer #3.]
Hypothesis: A deep LightGCN structure is not necessary in course recommendation.

27 Structural Difference #1: Importance of Pairwise Relation
We experimented with LightGCN with different numbers of layers.
#layers   HR@10    NDCG@10
1         0.6588   0.2260
2         0.6950   0.2502
3         0.6938   0.2491
4         0.6897   0.2434
6         0.6798   0.2365
Capturing pairwise relation is critical to accuracy optimization.

28 Structural Difference #2: Focus on close neighbors
ItemKNN: Considers only top K neighbors.
LightGCN: Information from all neighbors contributes to embedding learning and final recommendation.
[Figure: student and course nodes connected across Layer #1, Layer #2, and Layer #3.]
Hypothesis: By considering all neighbors, LightGCN embedding is affected by noise in the data.

29 Structural Difference #2: Focus on close neighbors
We restrict LightGCN at the 2nd layer to propagate only from the closest K neighbors identified by KNN models. This revised LightGCN is called Constrain-Neighbor LightGCN (CN-LightGCN).
Model         HR@10             NDCG@10
LightGCN      0.7008            0.2542
CN-LightGCN   0.7287 (+3.98%)   0.2896 (+13.9%)
Neighborhood information is important, but it should be used selectively.
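The idea of restricting propagation to close neighbors can be illustrated with a simplified sketch: pick each node's K most similar peers from a KNN similarity matrix, then aggregate only over those. Mean aggregation is used here for brevity and is an assumption; LightGCN itself uses a degree-normalized sum, and the deck does not describe CN-LightGCN's implementation. Both function names are hypothetical.

```python
def top_k_neighbors(similarity, k):
    """For each node, return the indices of its k most similar other nodes."""
    neighbors = []
    for i, row in enumerate(similarity):
        candidates = [j for j in range(len(row)) if j != i]  # exclude self
        candidates.sort(key=lambda j: row[j], reverse=True)
        neighbors.append(candidates[:k])
    return neighbors

def propagate(embeddings, neighbors):
    """One aggregation step using only the restricted neighbor sets.

    Mean aggregation is a simplification, not LightGCN's normalization.
    """
    dim = len(embeddings[0])
    out = []
    for nbrs in neighbors:
        if not nbrs:
            out.append([0.0] * dim)
            continue
        out.append([sum(embeddings[j][d] for j in nbrs) / len(nbrs) for d in range(dim)])
    return out

# Toy course-course similarity (as from ItemKNN) and embeddings.
sim = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.4], [0.1, 0.4, 1.0]]
emb = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
nbrs = top_k_neighbors(sim, k=1)
updated = propagate(emb, nbrs)
```

Dropping low-similarity neighbors before aggregation is what keeps distant, potentially noisy nodes from influencing an embedding.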
