Entity Ranking by Learning and Inferring Pairwise Preferences from User Reviews

AIRS2017
Shinryo Uchida, Takehiro Yamamoto, Makoto P. Kato, Hiroaki Ohshima, Katsumi Tanaka
Kyoto University, Japan

Which camera is more suitable for capturing a sports scene?

Attribute       Canon EOS 6D EF24-105   Nikon D7200 18-300
Price           $1,500                  $1,500
Sensor size     35mm                    APS-C
Shutter speed   1/4000 - 30sec          1/8000 - 30sec
ISO             100 - 102400            100 - 102400
Focal length    24 - 105mm              18 - 300mm
Weight          1,205g                  1,225g

Definition
Search Attribute
- An attribute whose value a user can know before using the product
- E.g., size, weight, ISO, price, …
Experience Attribute
- An attribute whose value a user cannot know unless they use the product
- E.g., capturing sports scenes, portability, usability, …
Nelson, P. (1970). Information and consumer behavior. The Journal of Political Economy, 311-329.

Motivation
- It is difficult to predict the quality of an experience attribute from search attributes, especially for non-experts
- Example: which camera is best for capturing beautiful photos? Sensor: APS-C, Sensor: APS-H, or Sensor: 35mm?

Research Goal
Learning to rank entities in terms of a given experience attribute based on their search attributes
- Entities are represented by their search attributes (Size, Weight, Rsl., F-value, …):
  o(1) = (0.6, 0.4, 0.2, 0.7, …)
  o(2) = (0.3, 0.2, 0.2, 0.4, …)
  o(3) = (0.8, 0.6, 0.5, 0.7, …)
- fq gives each entity a score for the query q, e.g. q = Portability (Experience Attribute); the scores shown are 0.2, 0.8, and 0.5
- Rank by Portability: 1st fq(o(2)), 2nd fq(o(1)), 3rd fq(o(3))

Basic Approach
Use pairwise preferences extracted from user reviews
- Extract: pairwise preferences (e.g., >Portable) are extracted from reviews
- Represent: entities are represented by their search attributes (Size, Weight, Rsl., F-value, …), e.g. o(1) = (0.6, 0.4, 0.2, 0.7, …)
- Learn: the training data (o(1), o(2)), (o(1), o(3)), (o(3), o(4)), … are used to train a Ranking SVM

Pairwise Preference
- Pairwise Preference (obj1 >attr obj2)
  - Order of objects in terms of a certain attribute
  - Extracted from comparative sentences written in user reviews
- Example: the comparative sentence "… is more portable than …" in a user review yields a >Portable pairwise preference
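The extraction step can be illustrated with a minimal sketch. The slides do not say how comparative sentences are detected, so the single regular expression below, the function name `extract_preference`, and the example sentence are all assumptions made for illustration only.

```python
import re

# Hypothetical pattern for one comparative form: "<A> is more <attr> than <B>".
# A real extractor would need many more patterns and entity-name matching.
COMPARATIVE = re.compile(
    r"(?P<a>.+?) is more (?P<attr>\w+) than (?P<b>.+?)[.!]?$",
    re.IGNORECASE,
)


def extract_preference(sentence):
    """Return (preferred, other, attribute), or None if the pattern does not match."""
    match = COMPARATIVE.match(sentence.strip())
    if match is None:
        return None
    return (
        match.group("a").strip(),
        match.group("b").strip(),
        match.group("attr").lower(),
    )


# Placeholder entity names, not taken from the reviews used in the talk.
pref = extract_preference("Camera A is more portable than Camera B.")
```

Here `pref` encodes Camera A >portable Camera B as a (preferred, other, attribute) triple.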

Challenge
- Few pairwise preferences in user reviews
  - Lack of training data
(Figure: >Portable pairwise preferences are extracted from comparative sentences in user reviews, e.g. "… is more portable than …")

Our Approach: Inferring Preferences
Infer pairwise preferences based on attribute dependencies to increase the size of the training data
- Attribute dependencies: Weight →− Portable, Size →− Portable
- Pairwise preferences: >Weight, >Size, <Weight
- Inferred preferences: <Portable, <Portable, >Portable

Attribute Dependency
- Attribute Dependency (attr1 →[+|−] attr2)
  - Dependency between two attributes
  - Extracted from resultative conjunctions written in user reviews
- Example: the resultative sentence "… is portable because it is light" in a user review yields the attribute dependency Weight →− Portable
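A rough sketch of how a resultative sentence could be turned into a signed dependency is given below. The slides do not describe the extraction rules, so the regular expression, the adjective lexicon `LEXICON`, and the sign rule (product of the two polarities) are assumptions chosen only to reproduce the Weight →− Portable example.

```python
import re

# Hypothetical lexicon: adjective -> (attribute, polarity).
# "light" means a low value of weight, so its polarity is -1.
LEXICON = {
    "light": ("weight", -1),
    "heavy": ("weight", +1),
    "small": ("size", -1),
    "large": ("size", +1),
    "portable": ("portable", +1),
}

# Hypothetical pattern for one resultative form: "is <effect> because it is <cause>".
RESULTATIVE = re.compile(
    r"is (?P<effect>\w+) because it is (?P<cause>\w+)", re.IGNORECASE
)


def extract_dependency(sentence):
    """Return (cause_attr, sign, effect_attr), or None if nothing is recognised."""
    match = RESULTATIVE.search(sentence)
    if match is None:
        return None
    cause = LEXICON.get(match.group("cause").lower())
    effect = LEXICON.get(match.group("effect").lower())
    if cause is None or effect is None:
        return None
    # Same polarity on both sides gives "+", opposite polarity gives "-".
    sign = "+" if cause[1] * effect[1] > 0 else "-"
    return (cause[0], sign, effect[0])


dep = extract_dependency("It is portable because it is light")
```

With this lexicon, `dep` corresponds to Weight →− Portable: a lower weight goes with higher portability.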

Inferring Preferences
Infer pairwise preferences based on attribute dependencies to increase the size of the training data
- Extracted from reviews: attribute dependencies (Weight →− Portability, Size →− Portability) and pairwise preferences (>Weight, >Size, <Weight)
- Inferred: <Portable, <Portable, >Portable
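The inference rule suggested by the figure can be sketched as follows: a positive dependency carries a preference over unchanged, and a negative dependency flips it. The tuple layouts and the function name `infer_preferences` are assumptions for illustration, not the authors' implementation.

```python
def infer_preferences(preferences, dependencies):
    """Infer new pairwise preferences from attribute dependencies.

    preferences:  list of (preferred, other, attribute)
    dependencies: list of (source_attribute, sign, target_attribute), sign is "+" or "-"
    """
    inferred = []
    for preferred, other, attribute in preferences:
        for source, sign, target in dependencies:
            if source != attribute:
                continue
            if sign == "+":
                # Positive dependency: the order carries over to the target attribute.
                inferred.append((preferred, other, target))
            else:
                # Negative dependency: the order is reversed for the target attribute.
                inferred.append((other, preferred, target))
    return inferred


# Placeholder entities A and B: A >weight B together with weight ->(-) portable
# yields B >portable A, i.e. A <portable B.
result = infer_preferences(
    [("A", "B", "weight")],
    [("weight", "-", "portable"), ("size", "-", "portable")],
)
```

Only the weight dependency applies in this usage, so a single preference is inferred.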

Confidence of Preferences
- Motivation
  - Inferred pairwise preferences may not be reliable
- Approach
  - Assign confidence to pairwise preferences
- Example confidence scores in the figure: 0.9, 0.8, 0.3, 0.2 (for pairwise preferences explicitly written in reviews and for inferred pairwise preferences)

Confidence of Preferences
- Confidence of a pairwise preference: $C_p = 1 - (1 - \theta_1)^{n_p}$, where $n_p$ is the number of sentences that contain the pairwise preference
- Confidence of an attribute dependency: $C_a = 1 - (1 - \theta_2)^{n_a}$, where $n_a$ is the number of sentences that contain the attribute dependency
- Confidence of an inferred preference: $C = C_p \times C_a$
- More sentences, more confident
- Example: the attribute dependency Weight →− Portable and a >Weight pairwise preference give an inferred <Portable preference; the confidences shown are 0.6, 0.5, and 0.3
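The confidence formulas can be checked numerically with a short sketch. The values of θ1 and θ2 are not given in the slides, so the value 0.3 used below is purely an assumption, as are the function names.

```python
def confidence(n_sentences, theta):
    """C = 1 - (1 - theta)^n: grows towards 1 as more supporting sentences are found."""
    return 1.0 - (1.0 - theta) ** n_sentences


def inferred_confidence(n_p, n_a, theta1, theta2):
    """Confidence of an inferred preference: C = C_p * C_a."""
    return confidence(n_p, theta1) * confidence(n_a, theta2)


# Assumed parameters (not from the talk): theta1 = theta2 = 0.3.
c_one = confidence(1, 0.3)    # one supporting sentence
c_three = confidence(3, 0.3)  # three supporting sentences, higher confidence
c_inferred = inferred_confidence(3, 1, 0.3, 0.3)
```

Because both factors are at most 1, an inferred preference never has higher confidence than the pairwise preference it was derived from.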

Method Overview
- Pairwise preferences (>Portability) are extracted from reviews with confidences, giving training data (o(1), o(2), 1.0), (o(1), o(3), 0.8), (o(4), o(3), 0.6), …
- Attribute dependencies (Weight →− Portability, Size →− Portability) are extracted from reviews and used to infer further preferences with confidences, e.g. (o(1), o(4), 0.4), (o(5), o(1), 0.3), (o(2), o(3), 0.2), …
- Inference increases the size of the training data
- Entities are represented by their search attributes
- A Fuzzy Ranking SVM is learned from the training data, using the confidence of each pairwise preference

Fuzzy Ranking SVM
- Ranking SVM [Joachims2002]
- Fuzzy SVM [Lin2002]: adds a weight to each training example
- Fuzzy Ranking SVM (Fuzzy SVM + Ranking SVM)
  - Training data: { (xi, xj, si) }
  - si: weight of training data (confidence)
[Equations: the minimize / subject to formulations of Ranking SVM, Fuzzy SVM, and Fuzzy Ranking SVM]
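The exact formulas are not preserved in this transcript. A common way to combine the two ideas is to weight each pair's hinge loss by its confidence, i.e. to minimise 0.5·||w||² + C·Σ s·max(0, 1 − w·(xi − xj)). The sketch below trains a linear model on that objective with plain subgradient descent; the objective, the optimiser, the hyperparameters, and the toy preferences are all assumptions, not the authors' solver or data.

```python
def train_weighted_pairwise(triples, dim, c=1.0, lr=0.1, epochs=200):
    """Subgradient descent on 0.5*||w||^2 + c * sum(s * max(0, 1 - w.(xi - xj))).

    triples: list of (xi, xj, s), meaning xi is preferred over xj with confidence s.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        grad = list(w)  # gradient of the regulariser 0.5*||w||^2
        for xi, xj, s in triples:
            diff = [a - b for a, b in zip(xi, xj)]
            margin = sum(wk * dk for wk, dk in zip(w, diff))
            if margin < 1.0:  # pair violates the margin, hinge loss is active
                for k in range(dim):
                    grad[k] -= c * s * diff[k]
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


def score(w, x):
    """Linear ranking score of an entity with feature vector x."""
    return sum(wk * xk for wk, xk in zip(w, x))


# Toy data: (size, weight) of three entities; the preferences and confidences are hypothetical.
o1, o2, o3 = (0.6, 0.4), (0.3, 0.2), (0.8, 0.6)
w = train_weighted_pairwise([(o2, o1, 1.0), (o1, o3, 0.8)], dim=2)
ranking = sorted([("o1", o1), ("o2", o2), ("o3", o3)],
                 key=lambda item: score(w, item[1]), reverse=True)
```

Setting every s to 1.0 reduces the objective to the unweighted pairwise hinge loss, which is what distinguishes a setting without confidence from one with confidence.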

Experiments
Dataset (collected from Kakaku.com)

Category      # Entities   # Reviews   # Attributes   Example experience attributes
Cameras       688          15,473      47             Usability, feeling of hold
Smartphones   624          33,731      186            Easy-to-hold, comfort
Headphones    2,229        13,117      304            Clearness, richness

Evaluation Metric
- Accuracy of learnt pairwise preferences, evaluated by leave-one-out cross validation
Ground Truth
- Pairwise preferences written in reviews
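The accuracy metric can be sketched as the fraction of ground-truth pairs that a learned scoring function orders correctly. The leave-one-out loop (retraining with each pair held out) is omitted here; the function name, data layout, and toy scores are assumptions for illustration.

```python
def pairwise_accuracy(scores, ground_truth):
    """Fraction of ground-truth pairs (preferred, other) ordered correctly by scores.

    scores:       dict mapping entity name -> predicted score
    ground_truth: list of (preferred, other) pairs taken from reviews
    """
    if not ground_truth:
        return 0.0
    correct = sum(1 for preferred, other in ground_truth
                  if scores[preferred] > scores[other])
    return correct / len(ground_truth)


# Toy example with hypothetical scores and pairs.
toy_scores = {"o1": 0.5, "o2": 0.8, "o3": 0.2}
toy_pairs = [("o2", "o1"), ("o1", "o3"), ("o3", "o2")]
accuracy = pairwise_accuracy(toy_scores, toy_pairs)
```

Two of the three toy pairs agree with the scores, so the accuracy is 2/3.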

Baselines and Proposed Methods
Baselines
- Ranking by Frequency (Freq)
  - # of reviews that contain the query (= experience attr.)
- Ranking by Regression (Reg)
  - Frequency as dependent variable
Proposed Methods
- Without inference and confidence (L2R)
  - Only uses pairwise preferences written in reviews
- Without confidence (L2R+Infer)
  - Infers preferences, but confidences are always set to 1.0
- Inference and confidence (L2R+Infer+Conf)

Experimental Results
- Learning to rank by pairwise preferences outperformed the baselines
  - The frequency of experience attributes in reviews is not useful for ranking
[Figure: bar chart (axis from 0 to 0.8) for Cameras, Smartphones, Headphones, and Macro Average; methods: Freq, Reg (baselines) and L2R, L2R+Infer, L2R+Infer+Conf (proposed)]

Experimental Results
- The inference (L2R+Infer(+Conf)) improved the performance of learning to rank (L2R)
  - The effect is small for headphones, however (few attribute dependencies found)
[Figure: bar chart (axis from 0 to 0.8) for Cameras, Smartphones, Headphones, and Macro Average; methods: Freq, Reg (baselines) and L2R, L2R+Infer, L2R+Infer+Conf (proposed)]

Summary
- Learning to rank entities in terms of a given experience attribute based on search attributes
  - Helpful to non-experts
- Key Ideas
  - Learning to rank by pairwise preferences
  - Inferring pairwise preferences
    - To increase the size of the training data
  - Confidence for preferences
    - Fuzzy Ranking SVM
