Slide 1

Slide 1 text

[Paper Introduction] Latent Factor Models and Aggregation Operators for Collaborative Filtering in Reciprocal Recommender Systems
RecSys2019 Paper Reading Group, 5 Oct. 2019
Yuya Matsumura @yu-ya4
J. Neve, I. Palomares. 2019. Proceedings of the 13th ACM Conference on Recommender Systems, 219-227.

Slide 2

Slide 2 text

Self-Introduction
✓ Yuya Matsumura
✓ Wantedly, Inc. Recommendation Team
✓ Data Scientist, Team Lead
✓ Interested in Information Retrieval, Machine Learning
@yu-ya4 @yu__ya4

Slide 3

Slide 3 text

What is a Reciprocal Recommender System (RRS)?
✓ User-to-user recommendation (not user-to-item recommendation)
✓ Aims to create a good match between two users
✓ Relatively unexplored area
✓ Little progress beyond kNN implementations

Slide 4

Slide 4 text

Definition of RRSs
Luiz Pizzato, Tomek Rej, Thomas Chung, Irena Koprinska, and Judy Kay. 2010. RECON: a reciprocal recommender for online dating. Proceedings of the fourth ACM conference on Recommender systems, pp. 207-214.

Slide 5

Slide 5 text

Overview of RRSs
[Diagram: the User A -> User B preference score and the User B -> User A preference score are combined by preference score aggregation into a bidirectional preference relation.]

Slide 6

Slide 6 text

Related Works
✓ RECON, a content-based filtering algorithm for reciprocal recommendation
‣ The first work to distinguish RRSs and define their properties.
‣ Extracts implicit preferences by looking at attributes in common amongst those to whom a given user has sent messages (e.g. body shape, personality, education…).
‣ However, physical appearance is often the main factor taken into consideration…
✓ Reciprocal Collaborative Filtering (RCF), a collaborative-filtering-based algorithm
‣ Uses nearest-neighbor collaborative filtering
‣ Significantly improves on RECON’s results
‣ SoTA!!
‣ However, it is limited by the computational complexity of calculating similarities between all pairs of users
Peng Xia, Benyuan Liu, Yizhou Sun, and Cindy Chen. 2015. Reciprocal Recommendation System for Online Dating. Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 234-241.


Slide 7

Slide 7 text

Contributions
1. Reciprocal recommendation based on latent factor models (often used in conventional RSs)
2. Efficient recommendation for large datasets (realistic for use in real-time systems)
3. Exploration of different aggregation functions

Slide 8

Slide 8 text

Latent Factor Model for Reciprocal Recommendation (LFRR)
✓ The dataset has over 10 million users; most users see only a small fraction of other users and give an opinion on an even smaller fraction.
✓ Imputing unknown data was likely to introduce a great deal of inaccuracy into the model.
✓ Using a learning algorithm to generate latent factors from only the known data was likely to produce much more accurate results.
➡ Use Stochastic Gradient Descent (SGD), which tends to give better results with sparse, explicit data
➡ Rather than other common methods such as Alternating Least Squares (ALS), which tends to give better results with dense, implicit data
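To make the "known data only" point concrete, here is a minimal sketch of SGD matrix factorization that iterates over observed (user, item, value) triples and never imputes missing entries. It is not the paper's implementation: the function names, the toy data, and the hyperparameters (`k`, `lr`, `reg`, `epochs`) are all hypothetical choices made for illustration.

```python
import random


def train_latent_factors(ratings, n_users, n_items, k=2, lr=0.05,
                         reg=0.01, epochs=500, seed=0):
    """Fit latent factors by SGD over observed (user, item, value) triples only.

    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Small random initialisation of user factors P and item factors Q.
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)
        for u, i, r in samples:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularisation.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted preference of user u for item i: dot product of factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def mse(ratings, P, Q):
    """Mean squared error over the observed triples."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)


# Toy sparse data: only 8 of the 16 possible (user, item) pairs are observed.
toy = [(0, 0, 1.0), (0, 1, 0.0), (1, 0, 1.0), (1, 2, 1.0),
       (2, 1, 1.0), (2, 2, 0.0), (3, 0, 0.0), (3, 3, 1.0)]
P, Q = train_latent_factors(toy, n_users=4, n_items=4)
print(round(mse(toy, P, Q), 4))
```

The cost of one epoch grows with the number of observed pairs rather than with the full user-by-user matrix, which is the property that matters for a sparse dataset of this size.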

Slide 9

Slide 9 text

Latent Factor Model for Reciprocal Recommendation (LFRR)
[Figure: System Architecture]

Slide 10

Slide 10 text

Preference Aggregation
1. Arithmetic Mean (1)
2. Geometric Mean (2)
3. Harmonic Mean (3)
4. Cross-Ratio Uninorm (4)
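For reference, the standard textbook definitions of these four operators, for two unidirectional preference scores $x, y \in [0, 1]$, can be written as follows. The symbols $x$, $y$ and the operator names $A$, $G$, $H$, $U$ are assumed here; the slide's own notation may differ.

```latex
\begin{align}
A(x, y) &= \frac{x + y}{2} \tag{1} \\
G(x, y) &= \sqrt{x\,y} \tag{2} \\
H(x, y) &= \frac{2\,x\,y}{x + y} \quad (x + y > 0) \tag{3} \\
U(x, y) &=
  \begin{cases}
    0 & \text{if } (x, y) \in \{(0, 1), (1, 0)\} \\
    \dfrac{x\,y}{x\,y + (1 - x)(1 - y)} & \text{otherwise}
  \end{cases} \tag{4}
\end{align}
```

The cross-ratio uninorm has neutral element $0.5$: $U(x, 0.5) = x$ for every $x$.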

Slide 11

Slide 11 text

Preference Aggregation
1. Arithmetic Mean (1)
2. Geometric Mean (2)
3. Harmonic Mean (3)
4. Cross-Ratio Uninorm (4)
Previous works used only the harmonic mean (3).

Slide 12

Slide 12 text

Preference Aggregation
1. Arithmetic Mean (1)
2. Geometric Mean (2)
3. Harmonic Mean (3)
4. Cross-Ratio Uninorm (4)
Behaviour of the cross-ratio uninorm (4):
(i) two low values are aggregated to produce a lower value (conjunctive behaviour)
(ii) two high values are aggregated to produce a higher value (disjunctive behaviour)
(iii) a high and a low value are aggregated to a value that lies in between both (averaging behaviour)
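The three behaviours can be checked numerically. The sketch below implements the four operators from their standard definitions and assumes that preference scores lie in [0, 1] and that "low" and "high" mean below and above the uninorm's neutral element 0.5; the function names are hypothetical.

```python
import math


def arithmetic_mean(x, y):
    return (x + y) / 2


def geometric_mean(x, y):
    return math.sqrt(x * y)


def harmonic_mean(x, y):
    # Defined as 0 when both scores are 0 to avoid division by zero.
    return 0.0 if x + y == 0 else 2 * x * y / (x + y)


def cross_ratio_uninorm(x, y):
    num = x * y
    den = num + (1 - x) * (1 - y)
    # den is 0 only for (0, 1) and (1, 0), where the uninorm is set to 0.
    return 0.0 if den == 0 else num / den


for x, y in [(0.2, 0.3), (0.8, 0.9), (0.2, 0.9)]:
    print((x, y),
          round(arithmetic_mean(x, y), 3),
          round(geometric_mean(x, y), 3),
          round(harmonic_mean(x, y), 3),
          round(cross_ratio_uninorm(x, y), 3))
```

The three means always return a value between the two inputs, whereas the uninorm pushes two low scores further down and two high scores further up, which is what separates it from the harmonic mean used in earlier work.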

Slide 13

Slide 13 text

Evaluation
✓ Dataset: Pairs (10M users, over 8 years, 9M Likes per week)
✓ Preference: Like
• a binary value
• takes little effort to send
✓ Limitations: the evaluation data is restricted to
1. Users who live in Tokyo and the surrounding areas. These users represent a significant majority of Pairs users.
2. Users between 18 and 30 years of age, for the same reason as above. Users outside this age range are outliers in the user base.
3. Users who have sent at least 10 Likes.
✓ Metrics
• Effectiveness evaluation: ROC curve, F1 score
• Efficiency evaluation
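As a reminder of what the two effectiveness metrics measure, here is a minimal sketch that computes an F1 score from binary predictions and the points of a ROC curve from real-valued scores. The slide does not state how positives are labelled or which thresholds were used, so the labels, scores, and function names below are made up for illustration.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_points(y_true, scores):
    """(FPR, TPR) pairs, one per distinct threshold from highest to lowest.

    Assumes y_true contains at least one positive and one negative label.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


# Hypothetical labels and aggregated reciprocal scores for five candidate pairs.
y_true = [1, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(f1_score(y_true, y_pred))
print(roc_points(y_true, scores))
```

The ROC curve is threshold-free, while the F1 score depends on the cut-off chosen to turn a score into a recommendation, so the two metrics complement each other.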

Slide 14

Slide 14 text

Results (ROC Curves)

Slide 15

Slide 15 text

Results (F1 Scores)

Slide 16

Slide 16 text

Results (Efficiency)

Slide 17

Slide 17 text

Conclusions
✓ Latent factor models produce precision and recall similar to kNN while reducing execution time (so they are a more realistic choice of algorithm).
✓ Aggregation functions significantly impact the results of RRSs.