Slide 1

Slide 1 text

New and Improved: Modeling Versions to Improve App Recommendation
Jovian Lin / Kazunari Sugiyama / Min-Yen Kan / Tat-Seng Chua
National University of Singapore

Slide 2

Slide 2 text

“Change is the only constant” (at least in the app domain)
SIGIR 2014 | Session 7 | New and Improved: Modeling Versions to Improve App Recommendation | 2 of 32

Slide 3

Slide 3 text

Static: books, movies, music, etc.
Changes (with version updates): apps

Slide 4

Slide 4 text

App X: Version 1.0 → Version 2.0 (includes High Definition (HD) capabilities)
An app that was unfavorable in the past may become favorable for a user after a version update.

Slide 5

Slide 5 text

[Figure: App X and its versions 1.0, 1.1, 1.2, 2.0, and 3.0 (with a “New!” marker), each tagged with topic IDs drawn from Topics 1 to 5, shown alongside users Clark, Alex, and Bob. Legend entries: “An ID of a topic” and “A version of the app”.]

Slide 6

Slide 6 text

[Figure: the App X versions, topics, and users figure from Slide 5.]
So if Bob has a keen interest in Topic 5…

Slide 7

Slide 7 text

[Figure: the App X versions, topics, and users figure from Slide 5.]
So if Bob has a keen interest in Topic 5…
… the chance that he adopts Version 3.0 of App X will be higher.

Slide 8

Slide 8 text

Recent News (as of 27th June 2014)
•  Apple has added a new section, “Best New Game Updates”, to its App Store.
•  It highlights recently updated apps.
•  It makes it easier to discover apps that have just been significantly updated.

Slide 9

Slide 9 text

Our Approach
1.  Extracting Version Features
2.  Generating Latent Topics
3.  Identifying Important Latent Topics
4.  User Personalization
5.  Calculating Version Snippet Score

Slide 10

Slide 10 text

1. Extracting Version Features

Slide 11

Slide 11 text

1. Extracting Version Features
•  Version Snippets (snippet = document)

Slide 12

Slide 12 text

1. Extracting Version Features
•  Version Snippets (snippet = document)
•  Version Category: Major, Minor, Maintenance

Slide 13

Slide 13 text

1. Extracting Version Features
•  Version Snippets (snippet = document)
•  Version Category
•  Genre Mixture, e.g., the genres “photo”, “entertainment”, “social networking”, “utilities”

Slide 14

Slide 14 text

1. Extracting Version Features
•  Version Snippets (snippet = document)
•  Version Category
•  Genre Mixture
•  Ratings: a rating corresponds to a version of an app, i.e., “user u gives version v of app a a numerical rating of r”.

Slide 15

Slide 15 text

2. Generating Latent Topics
•  Interpret the text in version snippets (documents).
•  Use topic models (LDA) to achieve this.
•  Text in documents → interpretable representation.
•  Investigate 3 variants of LDA.
•  Each variant employs a different set of version features.

Topic Model          | Textual Description | Version Category | Genre Mixture | Modifying Corpus
LDA                  | ✔                   |                  |               |
Labeled LDA (LLDA)   | ✔                   | ✔                | ✔             |
Injection LDA/LLDA   | ✔                   | ✔                | ✔             | ✔

Slide 16

Slide 16 text

2. Generating Latent Topics
Variants: LDA | LLDA | Injection LDA/LLDA
•  LDA generates:
    •  Per-document topic distribution, i.e., p(z|d)
    •  Per-topic word distribution, i.e., p(w|z)
•  But LDA can’t incorporate “observed” information like:
    •  Version category
    •  Genre mixture
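As a reminder of how the two LDA outputs fit together, the probability of a word in a document is the usual mixture over topics. The symbol K (the number of topics) is introduced here for illustration and does not appear on the slide.

```latex
p(w \mid d) = \sum_{z=1}^{K} p(w \mid z)\, p(z \mid d)
```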

Slide 17

Slide 17 text

2. Generating Latent Topics
Variants: LDA | LLDA | Injection LDA/LLDA
•  LLDA: a supervised topic model that uses “observed labels” as topics [Ramage et al., 2009].
•  But it can be “hacked” to become semi-supervised.
•  Semi-supervised:
    •  Observed labels = version categories & genre mixture*
    •  Latent topics = discovered/generated from descriptions
*The number of observed labels varies across documents.
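One way to picture the semi-supervised setup is as a per-document list of allowed topics: the document's own observed labels plus a pool of latent topics shared by all documents. The sketch below assumes exactly that reading; the function name, the "cat:" and "genre:" label prefixes, and the example documents are hypothetical and are not taken from the slides or the paper.

```python
from typing import Dict, List

def build_label_sets(docs: Dict[str, dict], num_latent: int) -> Dict[str, List[str]]:
    """Return, per document, the topics that document is allowed to use."""
    # Latent topics are shared by every document (assumption of this sketch).
    latent = [f"latent_{k}" for k in range(num_latent)]
    label_sets = {}
    for doc_id, meta in docs.items():
        # Observed labels: one version category plus however many genres the app has.
        observed = [f"cat:{meta['category']}"] + [f"genre:{g}" for g in meta["genres"]]
        label_sets[doc_id] = observed + latent
    return label_sets

# Hypothetical version snippets with their metadata.
docs = {
    "appX-2.0": {"category": "major", "genres": ["games", "entertainment"]},
    "appY-1.1": {"category": "minor", "genres": ["utilities"]},
}
label_sets = build_label_sets(docs, num_latent=3)
```

Note that the two documents end up with label sets of different sizes, which matches the footnote that the number of observed labels varies across documents.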

Slide 18

Slide 18 text

2. Generating Latent Topics
Variants: LDA | LLDA | Injection LDA/LLDA
•  Enhance the corpus before using topic models.
•  Generate pseudo-terms from metadata* and incorporate them into each document.
[Figure: text from the original document + pseudo-terms = “enhanced” document.]
*metadata = version categories & genre mixture.

Slide 19

Slide 19 text

2. Generating Latent Topics
Variants: LDA | LLDA | Injection LDA/LLDA
•  Enhance the corpus before using topic models.
•  Generate pseudo-terms from metadata* and incorporate them into each document.
•  Then, perform topic modeling by using LDA/LLDA on the enhanced corpus.
•  Shorthand: “inj+LDA” and “inj+LLDA.”
*metadata = version categories & genre mixture.
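The injection step can be sketched as simple string manipulation: turn the metadata into tokens and append them to the snippet text before topic modeling. The token format is an assumption: "#minor" echoes the hashtag-style category names used later in the deck, while the "#genre_" prefix and the function name are hypothetical.

```python
from typing import List

def inject_pseudo_terms(text: str, category: str, genres: List[str]) -> str:
    """Append metadata-derived pseudo-terms to a version snippet."""
    # One pseudo-term for the version category, one per genre (hypothetical format).
    pseudo = [f"#{category}"] + [f"#genre_{g.replace(' ', '_')}" for g in genres]
    return text + " " + " ".join(pseudo)

enhanced = inject_pseudo_terms(
    "adds retina display support", "minor", ["photo", "social networking"]
)
print(enhanced)
```

The enhanced documents would then be passed to LDA or LLDA unchanged, which is why this variant needs no modification of the topic model itself.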

Slide 20

Slide 20 text

3. Identifying Important Latent Topics
•  So far, we can model each document as a distribution of topics.
•  But we do not know which topics are important for a recommendation.
[Figure: topic “Retina/HD graphics”. App X and App Y are shown with topic lists; one list is game centre, iPad support, Retina display, and the other is game centre, iPad support. Label: RECOMMEND.]

Slide 21

Slide 21 text

3. Identifying Important Latent Topics
•  So far, we can model each document as a distribution of topics.
•  But we do not know which topics are important for a recommendation.
•  Furthermore:
    •  Apps are classified into different genres.
    •  Each genre responds differently to the same type of version update.
[Figure: topic “Retina/HD graphics” shown against the Games genre and the Music genre, with a “More relevant” label.]

Slide 22

Slide 22 text

3. Identifying Important Latent Topics
•  Key components: (i) genres & (ii) topics
•  We weight every genre-topic pair with w_x(g, z),
   where g = genre, z = latent topic, and x ∈ {LDA, inj+LDA, LLDA, inj+LLDA}.

Slide 23

Slide 23 text

3. Identifying Important Latent Topics
•  Key components: (i) genres & (ii) topics
•  We weight every genre-topic pair with w_x(g, z).
•  w_x(g, z) is set according to the “popularity” of the genre-topic pair.
•  “Popularity” is scored based on existing user ratings.
•  In other words: the relevance/popularity/weight of a genre-topic pair is based on user ratings.
[Figure: topic “Retina/HD graphics” shown against the Games genre and the Music genre, with a “More relevant” label.]
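The slides do not give the formula for w_x(g, z), so the following is only one plausible reading of "popularity based on user ratings": accumulate each rated version's topic distribution, scaled by its rating, per genre, then normalize within the genre. The function name, input layout, and normalization are all assumptions of this sketch; the paper's definition may differ.

```python
from typing import Dict, List, Tuple

def genre_topic_weights(
    rated: List[Tuple[str, float, List[float]]]
) -> Dict[str, List[float]]:
    """rated holds (genre, rating, p(z|d)) triples, one per rated version."""
    totals: Dict[str, List[float]] = {}
    for genre, rating, topic_dist in rated:
        acc = totals.setdefault(genre, [0.0] * len(topic_dist))
        for z, p in enumerate(topic_dist):
            acc[z] += rating * p  # rating-weighted topic mass (assumed scheme)
    # Normalize so that each genre's weights sum to 1.
    return {g: [v / sum(acc) for v in acc] for g, acc in totals.items()}

# Hypothetical data: topic 0 could stand for "Retina/HD graphics".
rated = [
    ("games", 5.0, [0.8, 0.2]),
    ("games", 1.0, [0.2, 0.8]),
    ("music", 4.0, [0.5, 0.5]),
]
weights = genre_topic_weights(rated)
```

With this toy data, topic 0 ends up weighted more heavily for the games genre than for the music genre, which is the kind of genre-dependent relevance the slide describes.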

Slide 24

Slide 24 text

4. User Personalization
•  Learn each user’s preference w.r.t. the set of topics.
•  For all versions v of apps that user u has consumed:
    •  Sum up the topic probabilities of each v
    •  Normalize
•  We get p(z|u), i.e., the probability of user u being interested in topic z.
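The sum-then-normalize recipe above can be sketched in a few lines. The function name and the toy distributions are hypothetical; each inner list stands for p(z|d) of one version snippet the user has consumed.

```python
from typing import List

def user_topic_preference(consumed: List[List[float]]) -> List[float]:
    """consumed holds p(z|d) for each version snippet the user has consumed."""
    num_topics = len(consumed[0])
    # Sum the probability of each topic over all consumed versions.
    summed = [sum(dist[z] for dist in consumed) for z in range(num_topics)]
    # Normalize so the result is a distribution over topics.
    total = sum(summed)
    return [s / total for s in summed]

p_z_given_u = user_topic_preference([[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]])
print(p_z_given_u)  # [0.25, 0.5, 0.25]
```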

Slide 25

Slide 25 text

5. Calculating Version Snippet Score
•  We want: score_x(d, u), where
    •  d = document, a.k.a. version snippet
    •  u = user
    •  x ∈ {LDA, inj+LDA, LLDA, inj+LLDA}, i.e., the 4 topic models
•  To calculate score_x(d, u):
    •  Convert document d into a set of topics
    •  Integrate it with:
        •  genre-topic weights, i.e., w_x(g, z)
        •  user personalization, i.e., p(z|u)
•  Details in the paper :)
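The slide defers the exact scoring formula to the paper, so the sketch below only illustrates the idea of integrating the three ingredients. It assumes a simple sum over topics of p(z|d) · w_x(g, z) · p(z|u) for a single genre g; that product form, the function name, and the numbers are assumptions, not the authors' definition.

```python
from typing import List

def snippet_score(p_z_d: List[float], w_g: List[float], p_z_u: List[float]) -> float:
    """Combine document topics, genre-topic weights, and user preference per topic."""
    # Assumed combination: multiply the three factors per topic and sum over topics.
    return sum(d * w * u for d, w, u in zip(p_z_d, w_g, p_z_u))

score = snippet_score(
    p_z_d=[0.5, 0.5],  # topics of the version snippet
    w_g=[1.0, 0.5],    # genre-topic weights for the app's genre
    p_z_u=[0.5, 0.5],  # the user's topic preference
)
print(score)  # 0.375
```

Under this reading, a snippet scores highly only when its dominant topics are both important for its genre and interesting to the user.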

Slide 26

Slide 26 text

Evaluation
•  Dataset:
    •  App metadata (iTunes App Store):
        •  App ID
        •  Title & Description
        •  Genres
    •  User ratings (iTunes App Store’s Reviews)
    •  Version descriptions of apps (App Annie)
•  Collected:
    •  9,797 users
    •  6,524 apps
    •  109,338 versions
    •  1,000,809 ratings

Slide 27

Slide 27 text

Evaluation
•  Evaluation Metric: Recall
•  Zero ratings are uncertain:
    •  (i) either the user doesn’t know about the app; or
    •  (ii) the user doesn’t like the app (and didn’t rate it).
•  This makes it difficult to accurately compute precision.
•  But since the existing ratings are true positives, recall is a more pertinent measure: it only considers the positively rated apps.
•  Recall is also used in:
    •  Wang & Blei (KDD’11) and
    •  Lin et al. (SIGIR’13).
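Recall at a cutoff M, as used by Wang & Blei, is the fraction of a user's liked items that appear in the top M recommendations. The sketch below computes it for one user; the function name and the toy item IDs are hypothetical.

```python
from typing import List, Set

def recall_at_m(ranked: List[str], liked: Set[str], m: int) -> float:
    """Fraction of the user's liked items that appear in the top-m recommendations."""
    if not liked:
        return 0.0  # no positives to recover for this user
    hits = sum(1 for item in ranked[:m] if item in liked)
    return hits / len(liked)

ranked = ["a", "b", "c", "d"]  # recommendations, best first
liked = {"b", "d", "e"}        # items the user rated positively
recall_top2 = recall_at_m(ranked, liked, m=2)
recall_top4 = recall_at_m(ranked, liked, m=4)
```

Only positively rated items enter the denominator, so unrated items never count against the recommender, which is the reason the slide prefers recall over precision.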

Slide 28

Slide 28 text

Evaluation
•  Baselines:
    1.  Collaborative filtering (CF) achieved using probabilistic matrix factorization (PMF).
    2.  Content-based filtering (CBF) achieved using LDA on textual app descriptions only.
    3.  Hybrid baselines using Gradient Tree Boosting (GTB):
        a.  CF+CBF (collaborative & content)
        b.  CF+VSR (collaborative & VSR)
        c.  CBF+VSR (content & VSR)
        d.  CF+CBF+VSR (collaborative & content & VSR)
*VSR = version sensitive recommendation

Slide 29

Slide 29 text

Results & Analysis
1. Importance of Genre Information (on inj+LLDA)
Compare: WITHOUT genre info vs. WITH genre info
[Figure: results without genre information vs. with genre information.]
1.  Genre info is an important discriminatory factor…
2.  … as each genre affects the same type of topic differently.
3.  For example: the topic “Retina/HD graphics” in the Games genre vs. the Music genre.

Slide 30

Slide 30 text

Results & Analysis
2. Comparison of Different Topic Models
Local comparison between various topic models: supervised LLDA vs LDA vs LLDA vs inj+LDA vs inj+LLDA

Slide 31

Slide 31 text

Results & Analysis
2. Comparison of Different Topic Models
Local comparison between various topic models
1.  Recall improves as more information is used.
2.  Best = inj+LLDA
3.  Both LLDA models (yellow & red) outperform their LDA counterparts (blue & green).
4.  This is because LLDA is semi-supervised and makes use of observed data.
5.  Enhancing the corpus generally improves recall:
    –  inj+LLDA > LLDA
    –  inj+LDA > LDA

Topic Model          | Textual Description | Version Category | Genre Mixture | Modifying Corpus
LDA                  | ✔                   |                  |               |
Labeled LDA (LLDA)   | ✔                   | ✔                | ✔             |
inj+LDA / inj+LLDA   | ✔                   | ✔                | ✔             | ✔

Slide 32

Slide 32 text

Results & Analysis
3a. Comparison Against Other Techniques (Individual)
:: Individual/Standalone Techniques :: VSR vs CF vs CBF
1.  Our VSR underperformed CF.
2.  But VSR outperformed CBF.
3.  Noisy textual app descriptions affect CBF’s performance (Lin et al. 2013).
4.  Among content-based techniques (app domain), content from version snippets could replace app descriptions.

Slide 33

Slide 33 text

Results & Analysis
3b. Comparison Against Other Techniques (Combined)
:: Hybrid Techniques :: CBF vs CBF+VSR vs CF vs CF+CBF vs CF+VSR vs CF+CBF+VSR
1.  CF+VSR > CF and CBF+VSR > CBF
    Combining with VSR improves on the individual CF and CBF techniques.
2.  CF+VSR > CF+CBF
    Version features are better content representations than app descriptions (this further strengthens the point in the previous slide).
3.  CF+VSR ≈ CF+CBF+VSR

Slide 34

Slide 34 text

Discussion: Dissecting Specific Topics (of the inj+LLDA topic model)
Display-related topic
•  “retina”, “display”, “graphic”, “resolut”
•  Version category: minor
•  Genres:
    •  Utilities
    •  Productivity
Travel-related topic
•  “map”, “traffic”, “rout”, “locat”, “trip”, “road”, “address”, “poi”
•  Genres:
    •  Navigation
    •  Traveling
Observed topic
•  Because inj+LLDA incorporates observed labels such as genre info.
•  “pain”, “medic”, “drug”, “pregnanc”, “period”, “health”, “track”
•  Genres:
    •  Medical
    •  Health & Fitness

Slide 35

Slide 35 text

Discussion: Importance of Version Categories
We calculated the importance of each of the 3 version categories:
•  #major: 0.128
•  #minor: 0.656 **
•  #maintenance: 0.216
Why?
•  #major updates are buggy.
•  #maintenance updates resolve trivial issues.
•  #minor updates introduce important bug fixes.
Can we improve recommendation if we further augment the version categories?
We incorporate a more comprehensive list of version categories.

Slide 36

Slide 36 text

Discussion: Importance of Version Categories
:: Global Comparison :: “standard” vs “advanced” version categories
1.  “advanced” > “standard”
2.  The improvement is more obvious at lower recommendation ranks (“M”).

Slide 37

Slide 37 text

Conclusion
•  Explore the use of version features in recommendation.

Slide 38

Slide 38 text

Conclusion
•  Explore the use of version features in recommendation.
•  Utilize a semi-supervised variant of LDA that accounts for textual descriptions and observed metadata.

Slide 39

Slide 39 text

Conclusion
•  Explore the use of version features in recommendation.
•  Utilize a semi-supervised variant of LDA that accounts for textual descriptions and observed metadata.
•  Observe that genre information is a key factor in discriminating the topic distribution.

Slide 40

Slide 40 text

Conclusion
•  Explore the use of version features in recommendation.
•  Utilize a semi-supervised variant of LDA that accounts for textual descriptions and observed metadata.
•  Observe that genre information is a key factor in discriminating the topic distribution.
•  Version sensitive recommendation (VSR) can be combined with conventional techniques for further improvement.

Slide 41

Slide 41 text

Conclusion
•  Explore the use of version features in recommendation.
•  Utilize a semi-supervised variant of LDA that accounts for textual descriptions and observed metadata.
•  Observe that genre information is a key factor in discriminating the topic distribution.
•  Version sensitive recommendation (VSR) can be combined with conventional techniques for further improvement.
•  Future work: treat versions as inter-dependent and use a decaying exponential approach to model the sequence of versions.

Slide 42

Slide 42 text

Thank You! Any questions?
Also, I’m finishing my Ph.D. and looking for job opportunities.
http://jovianlin.com (or query “jovian lin”)