
Modeling User Preferences in Multimodal Data. Hady W. Lauw, Maksim Tkachenko (Singapore Management University)

AvitoTech
April 18, 2018


Transcript


  5. Singapore Management University
     • 1 of 4 major research universities in Singapore
     • City university in downtown Singapore
     • Established in 2000
     • 10 thousand students (20% postgraduates)
  6. Web Mining Group (Hady, Maksim)
     Designing algorithms for mining user-generated data of various modalities, to understand the behaviors and preferences of users, individually and collectively, and to apply the mined knowledge in user-centric applications.
  7. So many choices …
     • 133,520 results for "Men's Shoes"
     • 220,721 results for "iPhone 7 case"
     • 1,627,213 results for "Kindle eBooks"
  8. Rating-based Preferences
     • Models
       – Matrix Factorization (Gaussian, Poisson)
       – Probabilistic Latent Semantic Analysis
       – Restricted Boltzmann Machines
       – Neighborhood-based recommendation
     • Sparsity
       – Most users have very few recorded interactions
       – Newly launched items have no history
     • Over-reliance on pointwise observations
       – Model overfitting
       – "More of the same" problem
     Strategy: going beyond ratings
  9. Multi-Modal Preference Signals
     [Diagram] User signals: metadata (structured text), review (unstructured text), rating (numerical), social network, photos (e.g., Instagram), images — linked through similarity and collaborative filtering
  10. Preferred.AI — Preferences and Recommendations from Data & AI
     • Data Infrastructure & Representation Learning: focused crawling framework; unified product catalogue; pre-trained features & resources
     • Preference Learning Algorithms: multi-modal; multi-relational; multi-faceted
     • Recommendation Retrieval Engine: real-time personalization; indexable representations; sessionization
     • Apps: ThriftCity (global search engine for offers); FoodRecce (food recommendation for groups); end-to-end recommendation framework
     5-year funding from the Singapore National Research Foundation (NRF) Fellowship
  11. LEARNING USER PREFERENCES FROM MULTI-MODAL DATA
     • Preference Signal from Review Images
     • Preference Signal from Review Text
     • Preference Signal from Social Networks
     Reference: Quoc-Tuan Truong and Hady W. Lauw, "Visual Sentiment Analysis for Review Images with Item-Oriented and User-Oriented CNN", ACM Multimedia (ACM MM'17), Oct 2017
  12. Preference Signal from Sentiment
     • Sentiment Analysis (text): "Akamaru Modern with Kakuni (braised pork belly) topping - Hands down THE best bowl of ramen I've had in my life!" → Positive or Negative?
     • Visual Sentiment Analysis (images): Positive or Negative? — an image classification problem
  13. Visual Sentiment CNN (VS-CNN) Architecture
     input 227×227×3 → conv1 (55×55×48) → conv2 (27×27×128) → conv3 (13×13×192) → conv4 (13×13×192) → conv5 (13×13×128) → fc6 (2048) → fc7 (2048) → fc8 (2 outputs: positive/negative)
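     The layer dimensions above follow standard convolution arithmetic. A minimal sketch that reproduces the spatial sizes 227 → 55 → 27 → 13, assuming AlexNet-like kernel sizes, strides, and padding (the slide only lists the resulting dimensions, not these hyperparameters):

```python
def conv_out(size, kernel, stride=1, pad=0):
    # Standard convolution/pooling output-size formula:
    # floor((size - kernel + 2*pad) / stride) + 1
    return (size - kernel + 2 * pad) // stride + 1

# Assumed AlexNet-like hyperparameters; the slide only gives the dims.
s = conv_out(227, 11, stride=4)  # conv1, 11x11 kernel, stride 4: 227 -> 55
s = conv_out(s, 3, stride=2)     # 3x3 max-pool, stride 2:       55 -> 27
s = conv_out(s, 5, pad=2)        # conv2, 5x5 kernel, padding 2:  27 -> 27
s = conv_out(s, 3, stride=2)     # 3x3 max-pool, stride 2:       27 -> 13
print(s)  # 13
```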
  14. Experiments on Yelp Dataset
     • Size: 96 thousand images, 8 thousand businesses, 27 thousand users
     • Coverage: Boston, Chicago, Houston, Los Angeles, New York, San Francisco, Seattle
     • Sentiment classes: Negative = ratings 1 and 2; Positive = ratings 4 and 5

     Metric             | Random | Naïve Bayes | VS-CNN
     Pointwise Accuracy | 0.500  | 0.539       | 0.544
     Pairwise Accuracy  | 0.500  | 0.551       | 0.572
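     The table reports two metrics. Pointwise accuracy is ordinary per-image classification accuracy; pairwise accuracy is read here as the fraction of (positive, negative) image pairs that the model orders correctly — an assumed definition, since the slide does not spell it out. A minimal sketch:

```python
def pointwise_accuracy(scores, labels, threshold=0.5):
    # Fraction of images whose thresholded score matches the sentiment label
    return sum((s >= threshold) == bool(y) for s, y in zip(scores, labels)) / len(labels)

def pairwise_accuracy(scores, labels):
    # Assumed definition: over all (positive, negative) image pairs,
    # the fraction where the positive image receives the higher score
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return sum(p > n for p in pos for n in neg) / (len(pos) * len(neg))

scores = [0.9, 0.4, 0.7, 0.2]
labels = [1, 0, 1, 0]
print(pointwise_accuracy(scores, labels))  # 1.0
print(pairwise_accuracy(scores, labels))   # 1.0
```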
  15. Item-oriented Parameters: Convolutional Layers
     (conv1/conv3/conv5 columns: Item-oriented VS-CNN with item-oriented parameters at that layer)

     Metric             | VS-CNN | conv1 | conv3 | conv5
     Pointwise Accuracy | 0.544  | 0.563 | 0.610 | 0.612
     Pairwise Accuracy  | 0.572  | 0.592 | 0.655 | 0.660
  16. Item-oriented Parameters: Fully-Connected Layer

     Metric             | VS-CNN | conv1 | conv3 | conv5 | fc7
     Pointwise Accuracy | 0.544  | 0.563 | 0.610 | 0.612 | 0.620
     Pairwise Accuracy  | 0.572  | 0.592 | 0.655 | 0.660 | 0.678
  17. User-oriented Parameters

     Metric             | VS-CNN | conv1 | conv3 | conv5 | fc7
     Pointwise Accuracy | 0.539  | 0.596 | 0.638 | 0.646 | 0.649
     Pairwise Accuracy  | 0.556  | 0.639 | 0.686 | 0.706 | 0.743
  18. LEARNING USER PREFERENCES FROM MULTI-MODAL DATA
     • Preference Signal from Review Images
     • Preference Signal from Review Text
     • Preference Signal from Social Networks
     Reference: Maksim Tkachenko and Hady W. Lauw, "Comparative Relation Generative Model", IEEE Transactions on Knowledge and Data Engineering (TKDE), 2017
  19. Turn to Review Text
     Identify and interpret comparisons expressed in texts: "Compared to the Canon 7D the Nikon D300s gives sharper pictures with less noise and great details over iso 400."
  20. Questions
     Given a set of comparative sentences, each about two products (e.g., 7D vs. D300S) and on a specific aspect (e.g., image quality):
     1. How can we understand the comparative direction in each sentence?
     2. Overall, taking into account all sentences, which entity is better?
  21. Insight: Better Together
     Assignment: complete the ranking if you do not know what "superior" means.
     Corpus of Comparisons: [examples shown on slide]
  22. Generative Model for Comparative Sentences
     • Generation of comparison outcomes (which entity is better) — related to competition models
     • Generation of words describing the comparison — related to Naïve Bayes
  23. Relation to Competition Model
     • Player i has a latent ability s_i
     • Bradley-Terry-Luce (BTL): the probability that i wins over j in a match is P(i ≻ j) = σ(s_i − s_j), where σ is the logistic (sigmoid) function
     • In our context:
       – Each comparative sentence simulates a match between two entities (players), with the outcome that one entity wins (is better)
       – The outcome itself is not given; it needs to be determined
       – The outcome depends on the text of the comparative sentence
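     The BTL win probability can be sketched directly (the symbol names s_i, s_j for latent abilities are ours):

```python
import math

def btl_win_prob(s_i, s_j):
    # Bradley-Terry-Luce: P(i beats j) = sigmoid(s_i - s_j)
    return 1.0 / (1.0 + math.exp(-(s_i - s_j)))

print(btl_win_prob(1.0, 1.0))        # 0.5 -- equal abilities give a coin flip
print(btl_win_prob(2.0, 0.0) > 0.5)  # True -- the stronger player is favored
```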
  24. Relation to Naïve Bayes
     • The meaning of a sentence changes if:
       – Words are different (better vs. worse)
       – Word order is different: "A is better than B" vs. "B is better than A"
     • We distinguish whether a word appears before the first-mentioned entity (#1), in between, or after the second-mentioned entity (#2):
       – #1 is favored: "#1 .. better .. #2", "#1 .. sharper .. #2"
       – #2 is favored: "#1 .. #2 .. better", "#1 .. #2 .. sharper"
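     The positional bucketing above can be sketched as follows (a simplified illustration with hypothetical tag names; the paper's exact feature scheme may differ):

```python
def positional_features(tokens, e1="#1", e2="#2"):
    # Tag each word by its position relative to the two entity mentions:
    # BEFORE the first-mentioned entity, BETWEEN the two, or AFTER the second.
    i, j = tokens.index(e1), tokens.index(e2)
    first, second = min(i, j), max(i, j)
    feats = []
    for k, w in enumerate(tokens):
        if k in (i, j):
            continue  # skip the entity placeholders themselves
        zone = "BEFORE" if k < first else ("BETWEEN" if k < second else "AFTER")
        feats.append(f"{zone}:{w}")
    return feats

print(positional_features("#1 is better than #2".split()))
# ['BETWEEN:is', 'BETWEEN:better', 'BETWEEN:than']
print(positional_features("Compared to #2 , #1 wins".split()))
# ['BEFORE:Compared', 'BEFORE:to', 'BETWEEN:,', 'AFTER:wins']
```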
  25. CompareGem (COMPArative RElation GEnerative Model)
     Generation of comparative sentences.
     • Latent parameters: entity ranks; comparison direction (which entity wins); feature distributions
     • Observations: features (the words of each sentence)

  27. Dataset
     • Amazon reviews for 180 digital cameras
     • Supervised settings: 50% training, 50% testing

     Aspect        | #sentences | #1 entity favored | #2 entity favored
     Functionality | 457        | 38.5%             | 61.5%
     Form Factor   | 78         | 61.3%             | 38.7%
     Image Quality | 129        | 58.1%             | 41.9%
     Price         | 165        | 52.1%             | 47.9%
  28. Comparative Direction
     Binary classification of each sentence (#1 entity is better or worse)

     Aspect        | CompareGem | SVM   | Naïve Bayes
     Functionality | 89.0%      | 76.6% | 74.4%
     Form Factor   | 71.5%      | 57.8% | 62.8%
     Image Quality | 73.8%      | 65.4% | 64.5%
     Price         | 68.7%      | 52.8% | 55.2%
  29. Entity Ranking
     Pairwise ranking of entities, with majority votes as ground truth

     Aspect        | CompareGem | SVM + BTL | Naïve Bayes + BTL
     Functionality | 89.7%      | 88.6%     | 88.8%
     Form Factor   | 82.7%      | 79.8%     | 82.7%
     Image Quality | 80.7%      | 78.7%     | 80.6%
     Price         | 79.0%      | 75.8%     | 76.7%
  30. LEARNING USER PREFERENCES FROM MULTI-MODAL DATA
     • Preference Signal from Review Images
     • Preference Signal from Review Text
     • Preference Signal from Social Networks
     Reference: Trong T. Nguyen and Hady W. Lauw, "Representation Learning for Homophilic Preferences", ACM Conference on Recommender Systems (RecSys'16), Sep 2016
  31. Preference Signal from Social Links (Lauw et al., Internet Computing 2010)
     [Diagram] Users A, B, C: social network + adoptions
     "Birds of a feather flock together"
  32. Restricted Boltzmann Machines — stochastic generative artificial neural networks
     • Let x be the binary vector of visible units
     • Let h be the binary vector of hidden units
     • a, b are biases, W are weights
     • Energy function, likelihood, and individual activation probabilities [formulas shown on slide]
     https://en.wikipedia.org/wiki/Restricted_Boltzmann_machine
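     The standard binary-RBM quantities referenced on this slide (the energy function and the hidden-unit activation probability) can be sketched in a few lines:

```python
import math

def rbm_energy(x, h, a, b, W):
    # E(x, h) = -a.x - b.h - x^T W h  (standard binary RBM energy)
    return (-sum(ai * xi for ai, xi in zip(a, x))
            - sum(bj * hj for bj, hj in zip(b, h))
            - sum(x[i] * W[i][j] * h[j]
                  for i in range(len(x)) for j in range(len(h))))

def p_hidden_on(x, b, W, j):
    # P(h_j = 1 | x) = sigmoid(b_j + sum_i x_i W_ij)
    return 1.0 / (1.0 + math.exp(-(b[j] + sum(x[i] * W[i][j] for i in range(len(x))))))

# Toy example: 2 visible units, 1 hidden unit, all-zero parameters
x, h = [1, 0], [1]
a, b, W = [0.0, 0.0], [0.0], [[0.0], [0.0]]
print(rbm_energy(x, h, a, b, W))
print(p_hidden_on(x, b, W, 0))  # 0.5 -- zero weights give an unbiased unit
```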
  33. RBM for Collaborative Filtering (Salakhutdinov et al., ICML 2007)
     • Each item corresponds to a visible unit
     • Values of visible units may be ratings (from 1 to 5): softmax instead of sigmoid; for simplicity, the subsequent discussion is on binary adoption
     • Each user corresponds to an RBM instance, with parameter sharing across users
     • The hidden layer serves as a latent user representation
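     The softmax visible unit mentioned above replaces the sigmoid when a unit takes one of five rating values; a sketch with hypothetical parameter names (b_k for per-rating biases, W_k for per-rating weights):

```python
import math

def p_visible_rating(k, h, b_k, W_k):
    # P(v = k | h) proportional to exp(b_k[k] + sum_j W_k[k][j] * h_j),
    # normalized over the 5 possible rating values (a softmax unit)
    scores = [b_k[r] + sum(W_k[r][j] * h[j] for j in range(len(h))) for r in range(5)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
    return exps[k] / sum(exps)

h = [1, 0]
b_k = [0.0] * 5
W_k = [[0.0, 0.0] for _ in range(5)]
print(p_visible_rating(0, h, b_k, W_k))  # 0.2 -- uniform over 5 ratings with zero parameters
```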
  34. SocialRBM: integrating the social network via hidden layers/representations in an RBM-based approach
     • No user-specific parameters for the social-network constraints
     • In the context of the item-adoption prediction task
     • Explores both user-item (UI) and user-user (UU) connections
  35. Model 1: SocialRBM-Wing — social network as observation
     Social connections and adoptions both play the role of observations, encoded jointly through a shared hidden layer. [Slide shows the energy function and activation probabilities.]
  36. Model 2: SocialRBM-Deep — social network as sharing of hidden units
     The top layer h2 has U hidden units, corresponding to U users; each user is represented by a single hidden unit on the top layer, with weights shared with their friends. [Slide shows the energy function and activation probabilities.]
  37. Network Randomization (Delicious)
     Comparison of the social network vs. a random network(*) in the prediction task.
     (*) Obtained by exchanging edges/links in the network while preserving node degrees.
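     The degree-preserving randomization in the footnote is the classic double-edge swap; a minimal sketch of our own (the slide does not give the exact procedure):

```python
import random

def degree_preserving_randomize(edges, n_swaps, seed=0):
    # Repeatedly pick two edges (a, b) and (c, d) and rewire them to
    # (a, d) and (c, b): every node keeps its degree, but structural
    # properties such as homophily are destroyed.
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    edge_set = set(edges)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        (a, b), (c, d) = rng.sample(edges, 2)
        # Skip swaps that would create self-loops or duplicate edges
        if len({a, b, c, d}) < 4 or (a, d) in edge_set or (c, b) in edge_set:
            continue
        i, j = edges.index((a, b)), edges.index((c, d))
        edges[i], edges[j] = (a, d), (c, b)
        edge_set = set(edges)
        done += 1
    return edges
```

     Each accepted swap keeps every node's degree intact, so comparing prediction quality on the original vs. randomized network isolates the contribution of who is connected to whom.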
  38. Conclusion
     • Harnessing multi-modal preference signals: images, text, and social networks, in addition to ratings/adoptions
     • Work in progress: still far from full personalization of user experiences
     • Future work: additional modalities (e.g., metadata), joint modalities; end-to-end recommendation framework
     • Opportunities to get involved: http://hadylauw.com, http://mtkachenko.info