User Modeling in Folksonomies: Relational Clustering and Tag Weighting
Takuya Kitazawa*, Masahide Sugiyama
School of Computer Science and Engineering, The University of Aizu, Fukushima, Japan
* Current affiliation: Graduate School of Information Science and Technology, The University of Tokyo, Japan
Related work
[Niwa et al. 2006] S. Niwa et al. Web page recommender system based on folksonomy mining. In Proc. of ITNG2006, pages 388–393, Apr. 2006.
Approach: Tag-weight-based user modeling
✦ Relational matrix: users × contents
✦ Tag frequencies for every content
✦ “User model” = group structures and weighted tags
Group structures: Infinite Relational Model (IRM)
✦ Simultaneous relational clustering
✦ Find group structures with strength = η
(Figure: relational matrix with clusters assigned and then sorted)
C. Kemp et al. Learning systems of concepts with an infinite relational model. In Proc. of AAAI2006, pp. 381–388, July 2006.
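The "assign clusters and then sort" step can be illustrated with a small Python sketch. It does not perform IRM inference: the cluster assignments `row_z` and `col_z` are assumed to be given already (for example, by an IRM sampler), and the block strength is approximated here by the fraction of ones in each block, which is an assumption for illustration rather than the estimator used in the slides. The function names are hypothetical.

```python
def sort_by_cluster(R, row_z, col_z):
    """Reorder rows and columns of a 0/1 matrix so that members of the
    same cluster sit next to each other (stable sort by cluster id)."""
    rows = sorted(range(len(row_z)), key=lambda i: row_z[i])
    cols = sorted(range(len(col_z)), key=lambda j: col_z[j])
    return [[R[i][j] for j in cols] for i in rows]


def block_strengths(R, row_z, col_z):
    """Assumed stand-in for eta: density of ones in each
    (row cluster, column cluster) block."""
    ones, cells = {}, {}
    for i, row in enumerate(R):
        for j, value in enumerate(row):
            key = (row_z[i], col_z[j])
            ones[key] = ones.get(key, 0) + value
            cells[key] = cells.get(key, 0) + 1
    return {key: ones[key] / cells[key] for key in cells}


# Toy relational matrix: 3 users x 3 contents.
R = [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1]]
row_z = [0, 1, 0]  # user cluster of each row (assumed given)
col_z = [0, 1, 0]  # content cluster of each column (assumed given)

sorted_R = sort_by_cluster(R, row_z, col_z)
eta = block_strengths(R, row_z, col_z)
```

After sorting, dense blocks become visible as contiguous rectangles, and `eta` holds one strength value per pair of user cluster and content cluster.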
Tag weights: TF-IDF-like weighting technique (1/2)
TF-IDF weighting in information retrieval
✦ Term Frequency (TF): terms that appear many times → characteristic
✦ Inverse Document Frequency (IDF): terms that appear in many different documents → irrelevant (e.g. a, the)
→ Use a similar idea for tags
Tag weights: TF-IDF-like weighting technique (2/2)
TF-IDF-like tag weighting
✦ Term Frequency (TF): tags that appear many times → characteristic
✦ Inverse Document Frequency (IDF): tags that appear in many different content clusters → irrelevant
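A minimal Python sketch of the TF-IDF-like idea, treating each content cluster as a "document" and each tag as a "term". The slides do not give the exact formula, so the textbook form tf × log(N / df) is assumed here, where N is the number of content clusters and df is the number of content clusters in which the tag appears. The function name `tag_weights` and the input layout are hypothetical.

```python
import math
from collections import Counter


def tag_weights(cluster_tags):
    """cluster_tags maps a content-cluster id to the list of tags attached
    to its contents (repeats allowed). Returns {cluster: {tag: weight}}."""
    n_clusters = len(cluster_tags)
    counts = {c: Counter(tags) for c, tags in cluster_tags.items()}

    # df[tag] = number of content clusters in which the tag appears.
    df = Counter()
    for tag_counts in counts.values():
        df.update(tag_counts.keys())

    # Assumed weighting: raw frequency times log inverse cluster frequency.
    return {
        c: {tag: tf * math.log(n_clusters / df[tag])
            for tag, tf in tag_counts.items()}
        for c, tag_counts in counts.items()
    }


weights = tag_weights({
    0: ["python", "python", "web"],
    1: ["web", "news"],
})
```

In this toy input, "web" appears in every content cluster, so its weight is zero, while "python" is frequent and specific to cluster 0 and receives the largest weight.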
Tag weights → user models (overall weights)
✦ Overall tag weights for a single user cluster: strength η × tag weights
✦ Tags are ranked by overall weight (e.g. tech, general, …)
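One way to read "strength η × tag weights" is as a weighted sum over content clusters. The sketch below assumes that the overall weight of a tag for one user cluster is the sum, over content clusters, of η (between that user cluster and the content cluster) times the tag's weight in that content cluster. This aggregation rule, the function names, and the toy numbers are assumptions for illustration.

```python
from collections import defaultdict


def overall_weights(eta_row, cluster_weights):
    """eta_row: {content_cluster: strength} for a single user cluster.
    cluster_weights: {content_cluster: {tag: weight}}.
    Returns {tag: overall weight} for that user cluster."""
    overall = defaultdict(float)
    for cluster, strength in eta_row.items():
        for tag, weight in cluster_weights.get(cluster, {}).items():
            overall[tag] += strength * weight
    return dict(overall)


def rank_tags(overall):
    """Tags sorted from highest to lowest overall weight."""
    return sorted(overall, key=overall.get, reverse=True)


eta_row = {0: 0.9, 1: 0.1}  # toy strengths for one user cluster
cluster_weights = {
    0: {"tech": 2.0, "general": 0.5},
    1: {"general": 1.0},
}
user_model = overall_weights(eta_row, cluster_weights)
ranking = rank_tags(user_model)
```

Under these assumptions the resulting dictionary plays the role of the weighted-tag part of the user model for that user cluster.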
User-model-based recommendation
“Can the new page be preferred by these users?”
✦ Input: the new page’s tags and the user models
✦ By summing up, compute the page’s prediction degree “P” for every cluster
✦ Threshold P by θ for every cluster
✦ P > θ: recommend to every user in the cluster
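The thresholding step can be sketched in Python as follows. It is assumed here that P is the sum of the user cluster's overall weights over the new page's tags, with unknown tags contributing zero; the slides only say "by summing up", so the exact summand is an assumption. The names `prediction_degree` and `recommend` and the toy user models are hypothetical.

```python
def prediction_degree(page_tags, user_model):
    """Assumed P: sum of the cluster's overall weights over the page's tags."""
    return sum(user_model.get(tag, 0.0) for tag in page_tags)


def recommend(page_tags, user_models, theta):
    """Return the user clusters whose prediction degree exceeds theta.
    Every user in a returned cluster would receive the recommendation."""
    return [
        cluster
        for cluster, model in user_models.items()
        if prediction_degree(page_tags, model) > theta
    ]


user_models = {
    "A": {"tech": 1.8, "general": 0.55},
    "B": {"cooking": 2.0},
}
targets = recommend(["tech", "python"], user_models, theta=1.0)
```

Because the decision is made once per cluster rather than once per user, the cost of scoring a new page grows with the number of user clusters, not the number of users.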
Evaluation setting
✦ Used the same 1017-by-7000 dataset (matrix) from Hatena bookmark
✦ 172,365 tuples in total
✦ 5-fold cross validation with F-measure
✦ User modeling by using learning data
✦ Thresholding all test tuples for every user cluster
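For reference, a short Python sketch of the F-measure on one fold, assuming the balanced form F = 2PR / (P + R) over sets of tuples. The slides do not state which variant of the F-measure or which tuple layout is used, so both are assumptions, and the (user, page) pairs below are toy data.

```python
def f_measure(predicted, actual):
    """Balanced F-measure between predicted and actual sets of tuples."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Toy (user, page) tuples for a single test fold.
predicted = {("u1", "p1"), ("u1", "p2"), ("u2", "p1")}
actual = {("u1", "p1"), ("u2", "p1"), ("u2", "p3")}
score = f_measure(predicted, actual)
```

In a 5-fold setting, this score would be computed on each held-out fold and then averaged.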
Accuracy and running time
✦ Better accuracy than the worst case and faster running time → achieved sketchy user modeling
(Figure: accuracy for worst, base, and proposed; higher is better. Running time includes IRM-based clustering.)
Summary
1. What are folksonomies? (Problem formulation)
2. How to tackle problems
3. Recommender system-based evaluation
✦ User modeling with faster, sketchy data mining
✦ Combine relational clustering and tag weighting
✦ Achieved faster, sketchy recommendation
Conclusion
How can I roughly obtain users’ group structures and their preferences on web services?
✦ Consider more competitors
✦ Improve accuracy
✦ Take incremental/online approaches
User Modeling in Folksonomies: Relational Clustering and Tag Weighting
Takuya Kitazawa
Email: [email protected]
Implementations and datasets: github.com/takuti/wims-2015