
Modeling User Preferences in Multimodal Data. Hady W. Lauw, Maksim Tkachenko (Singapore Management University)

AvitoTech
April 18, 2018


Transcript


  5. Singapore Management University
     • 1 of 4 major research universities in Singapore
     • City university in downtown Singapore
     • Established in 2000
     • 10 thousand students (20% postgraduates)
  6. Web Mining Group (Hady, Maksim)
     Designing algorithms for mining user-generated data of various modalities, to understand the behaviors and preferences of users, individually and collectively, and to apply the mined knowledge in user-centric applications.
  7. So many choices …
     • 133,520 results for "Men's Shoes"
     • 220,721 results for "iPhone 7 case"
     • 1,627,213 results for "Kindle eBooks"
  8. Rating-based Preferences
     • Models
       – Matrix Factorization (Gaussian, Poisson)
       – Probabilistic Latent Semantic Analysis
       – Restricted Boltzmann Machines
       – Neighborhood-based recommendation
     • Sparsity
       – Most users have very few recorded interactions
       – Newly launched items have no history
     • Over-reliance on pointwise observations
       – Model overfitting
       – "More of the same" problem
     Strategy: going beyond ratings
  9. Multi-Modal Preference Signals
     [Diagram] User signals: metadata (structured text), review (unstructured text), rating (numerical), social network, photos (e.g., Instagram), images — linked through similarity and collaborative filtering
  10. Preferred.AI — Preferences and Recommendations from Data & AI
     • Data Infrastructure & Representation Learning: focused crawling framework; unified product catalogue; pre-trained features & resources
     • Preference Learning Algorithms: multi-modal; multi-relational; multi-faceted
     • Recommendation Retrieval Engine: real-time personalization; indexable representations; sessionization
     • Apps: ThriftCity (global search engine for offers); FoodRecce (food recommendation for groups); end-to-end recommendation framework
     5-year funding from the Singapore National Research Foundation (NRF) Fellowship
  11. LEARNING USER PREFERENCES FROM MULTI-MODAL DATA
     • Preference Signal from Review Images
     • Preference Signal from Review Text
     • Preference Signal from Social Networks
     Reference: Quoc-Tuan Truong and Hady W. Lauw, "Visual Sentiment Analysis for Review Images with Item-Oriented and User-Oriented CNN", ACM Multimedia (ACM MM'17), Oct 2017
  12. Preference Signal from Sentiment
     • Sentiment Analysis (text): "Akamaru Modern with Kakuni (braised pork belly) topping - Hands down THE best bowl of ramen I've had in my life!" → Positive or Negative?
     • Visual Sentiment Analysis (images): Positive or Negative? — an image classification problem
  13. Visual Sentiment CNN (VS-CNN) Architecture
     input 227×227×3 → conv1 (55×55×48) → conv2 (27×27×128) → conv3 (13×13×192) → conv4 (13×13×192) → conv5 (13×13×128) → fc6 (2048) → fc7 (2048) → fc8 (2 outputs: positive/negative)
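     The layer dimensions above follow standard convolution arithmetic. A minimal sketch that reproduces the spatial sizes 227 → 55 → 27 → 13, assuming AlexNet-like kernel sizes, strides, and padding (the slide only lists the resulting dimensions, not these hyperparameters):

```python
def conv_out(size, kernel, stride=1, pad=0):
    # Standard convolution/pooling output-size formula:
    # floor((size - kernel + 2*pad) / stride) + 1
    return (size - kernel + 2 * pad) // stride + 1

# Assumed AlexNet-like hyperparameters; the slide only gives the dims.
s = conv_out(227, 11, stride=4)  # conv1, 11x11 kernel, stride 4: 227 -> 55
s = conv_out(s, 3, stride=2)     # 3x3 max-pool, stride 2:       55 -> 27
s = conv_out(s, 5, pad=2)        # conv2, 5x5 kernel, padding 2:  27 -> 27
s = conv_out(s, 3, stride=2)     # 3x3 max-pool, stride 2:       27 -> 13
print(s)  # 13
```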
  14. Experiments on Yelp Dataset
     • Size: 96 thousand images, 8 thousand businesses, 27 thousand users
     • Coverage: Boston, Chicago, Houston, Los Angeles, New York, San Francisco, Seattle
     • Sentiment classes: Negative = ratings 1 and 2; Positive = ratings 4 and 5

     Metric             | Random | Naïve Bayes | VS-CNN
     Pointwise Accuracy | 0.500  | 0.539       | 0.544
     Pairwise Accuracy  | 0.500  | 0.551       | 0.572
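     The table reports two metrics. Pointwise accuracy is ordinary per-image classification accuracy; pairwise accuracy is read here as the fraction of (positive, negative) image pairs that the model orders correctly — an assumed definition, since the slide does not spell it out. A minimal sketch:

```python
def pointwise_accuracy(scores, labels, threshold=0.5):
    # Fraction of images whose thresholded score matches the sentiment label
    return sum((s >= threshold) == bool(y) for s, y in zip(scores, labels)) / len(labels)

def pairwise_accuracy(scores, labels):
    # Assumed definition: over all (positive, negative) image pairs,
    # the fraction where the positive image receives the higher score
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return sum(p > n for p in pos for n in neg) / (len(pos) * len(neg))

scores = [0.9, 0.4, 0.7, 0.2]
labels = [1, 0, 1, 0]
print(pointwise_accuracy(scores, labels))  # 1.0
print(pairwise_accuracy(scores, labels))   # 1.0
```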
  15. Item-oriented Parameters: Convolutional Layers
     (conv1/conv3/conv5 columns: Item-oriented VS-CNN with item-oriented parameters at that layer)

     Metric             | VS-CNN | conv1 | conv3 | conv5
     Pointwise Accuracy | 0.544  | 0.563 | 0.610 | 0.612
     Pairwise Accuracy  | 0.572  | 0.592 | 0.655 | 0.660
  16. Item-oriented Parameters: Fully-Connected Layer

     Metric             | VS-CNN | conv1 | conv3 | conv5 | fc7
     Pointwise Accuracy | 0.544  | 0.563 | 0.610 | 0.612 | 0.620
     Pairwise Accuracy  | 0.572  | 0.592 | 0.655 | 0.660 | 0.678
  17. User-oriented Parameters

     Metric             | VS-CNN | conv1 | conv3 | conv5 | fc7
     Pointwise Accuracy | 0.539  | 0.596 | 0.638 | 0.646 | 0.649
     Pairwise Accuracy  | 0.556  | 0.639 | 0.686 | 0.706 | 0.743
  18. LEARNING USER PREFERENCES FROM MULTI-MODAL DATA
     • Preference Signal from Review Images
     • Preference Signal from Review Text
     • Preference Signal from Social Networks
     Reference: Maksim Tkachenko and Hady W. Lauw, "Comparative Relation Generative Model", IEEE Transactions on Knowledge and Data Engineering (TKDE), 2017
  19. Turn to Review Text
     Identify and interpret comparisons expressed in texts: "Compared to the Canon 7D the Nikon D300s gives sharper pictures with less noise and great details over iso 400."
  20. Questions
     Given a set of comparative sentences, each about two products (e.g., 7D vs. D300S) and on a specific aspect (e.g., image quality):
     1. How can we understand the comparative direction in each sentence?
     2. Overall, taking into account all sentences, which entity is better?
  21. Insight: Better Together
     Assignment: complete the ranking if you do not know what "superior" means.
     Corpus of Comparisons: [examples shown on slide]
  22. Generative Model for Comparative Sentences
     • Generation of comparison outcomes (which entity is better) — related to competition models
     • Generation of words describing the comparison — related to Naïve Bayes
  23. Relation to Competition Model
     • Player i has a latent ability s_i
     • Bradley-Terry-Luce (BTL): the probability that i wins over j in a match is P(i ≻ j) = σ(s_i − s_j), where σ is the logistic (sigmoid) function
     • In our context:
       – Each comparative sentence simulates a match between two entities (players), with the outcome that one entity wins (is better)
       – The outcome itself is not given; it needs to be determined
       – The outcome depends on the text of the comparative sentence
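     The BTL win probability can be sketched directly (the symbol names s_i, s_j for latent abilities are ours):

```python
import math

def btl_win_prob(s_i, s_j):
    # Bradley-Terry-Luce: P(i beats j) = sigmoid(s_i - s_j)
    return 1.0 / (1.0 + math.exp(-(s_i - s_j)))

print(btl_win_prob(1.0, 1.0))        # 0.5 -- equal abilities give a coin flip
print(btl_win_prob(2.0, 0.0) > 0.5)  # True -- the stronger player is favored
```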
  24. Relation to Naïve Bayes
     • The meaning of a sentence changes if:
       – Words are different (better vs. worse)
       – Word order is different: "A is better than B" vs. "B is better than A"
     • We distinguish whether a word appears before the first-mentioned entity (#1), in between, or after the second-mentioned entity (#2):
       – #1 is favored: "#1 .. better .. #2", "#1 .. sharper .. #2"
       – #2 is favored: "#1 .. #2 .. better", "#1 .. #2 .. sharper"
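     The positional bucketing above can be sketched as follows (a simplified illustration with hypothetical tag names; the paper's exact feature scheme may differ):

```python
def positional_features(tokens, e1="#1", e2="#2"):
    # Tag each word by its position relative to the two entity mentions:
    # BEFORE the first-mentioned entity, BETWEEN the two, or AFTER the second.
    i, j = tokens.index(e1), tokens.index(e2)
    first, second = min(i, j), max(i, j)
    feats = []
    for k, w in enumerate(tokens):
        if k in (i, j):
            continue  # skip the entity placeholders themselves
        zone = "BEFORE" if k < first else ("BETWEEN" if k < second else "AFTER")
        feats.append(f"{zone}:{w}")
    return feats

print(positional_features("#1 is better than #2".split()))
# ['BETWEEN:is', 'BETWEEN:better', 'BETWEEN:than']
print(positional_features("Compared to #2 , #1 wins".split()))
# ['BEFORE:Compared', 'BEFORE:to', 'BETWEEN:,', 'AFTER:wins']
```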
  25. CompareGem (COMPArative RElation GEnerative Model)
     Generation of comparative sentences.
     • Latent parameters: entity ranks; comparison direction (which entity wins); feature distributions
     • Observations: features (the words of each sentence)

  27. Dataset
     • Amazon reviews for 180 digital cameras
     • Supervised settings: 50% training, 50% testing

     Aspect        | #sentences | #1 entity favored | #2 entity favored
     Functionality | 457        | 38.5%             | 61.5%
     Form Factor   | 78         | 61.3%             | 38.7%
     Image Quality | 129        | 58.1%             | 41.9%
     Price         | 165        | 52.1%             | 47.9%
  28. Comparative Direction
     Binary classification of each sentence (#1 entity is better or worse)

     Aspect        | CompareGem | SVM   | Naïve Bayes
     Functionality | 89.0%      | 76.6% | 74.4%
     Form Factor   | 71.5%      | 57.8% | 62.8%
     Image Quality | 73.8%      | 65.4% | 64.5%
     Price         | 68.7%      | 52.8% | 55.2%
  29. Entity Ranking
     Pairwise ranking of entities, with majority votes as ground truth

     Aspect        | CompareGem | SVM + BTL | Naïve Bayes + BTL
     Functionality | 89.7%      | 88.6%     | 88.8%
     Form Factor   | 82.7%      | 79.8%     | 82.7%
     Image Quality | 80.7%      | 78.7%     | 80.6%
     Price         | 79.0%      | 75.8%     | 76.7%
  30. LEARNING USER PREFERENCES FROM MULTI-MODAL DATA
     • Preference Signal from Review Images
     • Preference Signal from Review Text
     • Preference Signal from Social Networks
     Reference: Trong T. Nguyen and Hady W. Lauw, "Representation Learning for Homophilic Preferences", ACM Conference on Recommender Systems (RecSys'16), Sep 2016
  31. Preference Signal from Social Links (Lauw et al., Internet Computing 2010)
     [Diagram] Users A, B, C: social network + adoptions
     "Birds of a feather flock together"
  32. Restricted Boltzmann Machines — stochastic generative artificial neural networks
     • Let x be the binary vector of visible units
     • Let h be the binary vector of hidden units
     • a, b are biases, W are weights
     • Energy function, likelihood, and individual activation probabilities [formulas shown on slide]
     https://en.wikipedia.org/wiki/Restricted_Boltzmann_machine
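     The standard binary-RBM quantities referenced on this slide (the energy function and the hidden-unit activation probability) can be sketched in a few lines:

```python
import math

def rbm_energy(x, h, a, b, W):
    # E(x, h) = -a.x - b.h - x^T W h  (standard binary RBM energy)
    return (-sum(ai * xi for ai, xi in zip(a, x))
            - sum(bj * hj for bj, hj in zip(b, h))
            - sum(x[i] * W[i][j] * h[j]
                  for i in range(len(x)) for j in range(len(h))))

def p_hidden_on(x, b, W, j):
    # P(h_j = 1 | x) = sigmoid(b_j + sum_i x_i W_ij)
    return 1.0 / (1.0 + math.exp(-(b[j] + sum(x[i] * W[i][j] for i in range(len(x))))))

# Toy example: 2 visible units, 1 hidden unit, all-zero parameters
x, h = [1, 0], [1]
a, b, W = [0.0, 0.0], [0.0], [[0.0], [0.0]]
print(rbm_energy(x, h, a, b, W))
print(p_hidden_on(x, b, W, 0))  # 0.5 -- zero weights give an unbiased unit
```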
  33. RBM for Collaborative Filtering (Salakhutdinov et al., ICML 2007)
     • Each item corresponds to a visible unit
     • Values of visible units may be ratings (from 1 to 5): softmax instead of sigmoid; for simplicity, the subsequent discussion is on binary adoption
     • Each user corresponds to an RBM instance, with parameter sharing across users
     • The hidden layer serves as a latent user representation
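     The softmax visible unit mentioned above replaces the sigmoid when a unit takes one of five rating values; a sketch with hypothetical parameter names (b_k for per-rating biases, W_k for per-rating weights):

```python
import math

def p_visible_rating(k, h, b_k, W_k):
    # P(v = k | h) proportional to exp(b_k[k] + sum_j W_k[k][j] * h_j),
    # normalized over the 5 possible rating values (a softmax unit)
    scores = [b_k[r] + sum(W_k[r][j] * h[j] for j in range(len(h))) for r in range(5)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
    return exps[k] / sum(exps)

h = [1, 0]
b_k = [0.0] * 5
W_k = [[0.0, 0.0] for _ in range(5)]
print(p_visible_rating(0, h, b_k, W_k))  # 0.2 -- uniform over 5 ratings with zero parameters
```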
  34. SocialRBM: integrating the social network via hidden layers/representations in an RBM-based approach
     • No user-specific parameters for the social-network constraints
     • In the context of the item-adoption prediction task
     • Explores both user-item (UI) and user-user (UU) connections
  35. Model 1: SocialRBM-Wing — social network as observation
     Social connections and adoptions both play the role of observations, encoded jointly through a shared hidden layer. [Slide shows the energy function and activation probabilities.]
  36. Model 2: SocialRBM-Deep — social network as sharing of hidden units
     The top layer h2 has U hidden units, corresponding to U users; each user is represented by a single hidden unit on the top layer, with weights shared with their friends. [Slide shows the energy function and activation probabilities.]
  37. Network Randomization (Delicious)
     Comparison of the social network vs. a random network(*) in the prediction task.
     (*) Obtained by exchanging edges/links in the network while preserving node degrees.
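     The degree-preserving randomization in the footnote is the classic double-edge swap; a minimal sketch of our own (the slide does not give the exact procedure):

```python
import random

def degree_preserving_randomize(edges, n_swaps, seed=0):
    # Repeatedly pick two edges (a, b) and (c, d) and rewire them to
    # (a, d) and (c, b): every node keeps its degree, but structural
    # properties such as homophily are destroyed.
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    edge_set = set(edges)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        (a, b), (c, d) = rng.sample(edges, 2)
        # Skip swaps that would create self-loops or duplicate edges
        if len({a, b, c, d}) < 4 or (a, d) in edge_set or (c, b) in edge_set:
            continue
        i, j = edges.index((a, b)), edges.index((c, d))
        edges[i], edges[j] = (a, d), (c, b)
        edge_set = set(edges)
        done += 1
    return edges
```

     Each accepted swap keeps every node's degree intact, so comparing prediction quality on the original vs. randomized network isolates the contribution of who is connected to whom.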
  38. Conclusion
     • Harnessing multi-modal preference signals: images, text, and social networks, in addition to ratings/adoptions
     • Work in progress: still far from full personalization of user experiences
     • Future work: additional modalities (e.g., metadata), joint modalities; end-to-end recommendation framework
     • Opportunities to get involved: http://hadylauw.com, http://mtkachenko.info