User Modeling in Folksonomies

Presented at the 5th International Conference on Web Intelligence, Mining and Semantics (WIMS 2015)

ACM Digital Library: http://dl.acm.org/citation.cfm?id=2797129

GitHub Repository: https://github.com/takuti/wims-2015

Takuya Kitazawa

July 14, 2015

Transcript

  1. User Modeling in Folksonomies: Relational Clustering and Tag Weighting. Takuya Kitazawa*, Masahide Sugiyama. School of Computer Science and Engineering, The University of Aizu, Fukushima, Japan. (* Current affiliation: Graduate School of Information Science and Technology, The University of Tokyo, Japan.)
  2. User Modeling in Folksonomies, outline: (1) What are folksonomies? Problem formulation; (2) How to tackle the problems; (3) Recommender system-based evaluation
  3. Conventional: compute vectors one by one. user1 [1 0 0 1 0 1 1 0 0 0 0 0], user2 [0 1 0 1 0 0 1 0 0 1 0 0], …, userN [1 0 0 0 0 0 0 0 0 1 0 0]. “Create, normalize, and then compute…” Accurate, but time-consuming.
  4. Related work: [Niwa et al. 2006] S. Niwa et al. Web page recommender system based on folksonomy mining. In Proc. of ITNG2006, pages 388–393, Apr. 2006.
  5. Roughly obtain preferences/characteristics: use matrices with a stochastic model. user1 [1 0 0 1 0 1 1 0 0 0 0 0], user2 [0 1 0 1 0 0 1 0 0 1 0 0], …, userN [1 0 0 0 0 0 0 0 0 1 0 0] ✦ Low accuracy ⇔ serendipity ✦ Short running time for future applications
  6. Approach: tag weights-based user modeling. Relational matrix (users, contents); tag frequencies for every content. “User model” = group structures and weighted tags.
  7. Group structures: Infinite Relational Model (IRM) ✦ Simultaneous relational clustering ✦ Find group structures with strength = η. Clusters are assigned and then sorted. C. Kemp et al. Learning systems of concepts with an infinite relational model. In Proc. of AAAI2006, pp. 381–388, July 2006.
  8. Apply IRM (1/2). Data set: Hatena bookmark (social bookmarking), 1,017 users, 7,000 web pages. IRM-based relational clustering.
  9. Tag weights: TF-IDF-like weighting technique (1/2). TF-IDF weighting in information retrieval. Term Frequency (TF): terms that appear many times → characteristic. Inverse Document Frequency (IDF): terms that appear in many different documents → irrelevant (e.g. a, the). Use a similar idea.
  10. Tag weights: TF-IDF-like weighting technique (2/2). TF-IDF-like tag weighting. Term Frequency (TF): tags that appear many times → characteristic. Inverse Document Frequency (IDF): tags that appear in many different content clusters → irrelevant.
  11. Results: top-20 tags. [Figure: tag weight vs. rank of tags; labels: topical news, technical topics, 1st page cluster, 3rd page cluster]
  12. Tag weights → user models (overall weights). Overall tag weights for a single user cluster: strength η × tag weights. [Figure: overall weight vs. rank of tags; cluster labels: tech, general, general, tech, …]
  13. User-model-based recommendation. Inputs: new page’s tags and user models. By summing up, compute the page’s prediction degree “P”, then threshold P by θ for every cluster. “Can the new page be preferred by these users?” P > θ: recommend to every user in the cluster.
  14. Evaluation setting. Matrix: used the same 1,017-by-7,000 dataset from Hatena bookmark. Tuples: 172,365 in total ✦ 5-fold cross validation with F-measure ✦ User modeling by using learning data ✦ Thresholding all test tuples for every user cluster
  15. Accuracy and running time. Better accuracy than worst and faster running time → achieved sketchy user modeling. [Figure: accuracy (higher is better) and running time for worst, base, and proposed; annotation: including IRM-based clustering]
  16. Summary. User modeling with faster, sketchy data mining. Combine relational clustering and tag weighting. Achieved faster, sketchy recommendation. Outline recap: (1) What are folksonomies? Problem formulation; (2) How to tackle the problems; (3) Recommender system-based evaluation
  17. Conclusion. How can I roughly obtain users’ group structures and their preferences on web services? ✦ Consider more competitors ✦ Improve accuracy ✦ Take incremental/online approaches
  18. User Modeling in Folksonomies: Relational Clustering and Tag Weighting. Takuya Kitazawa. Email: [email protected] Implementations and datasets: github.com/takuti/wims-2015
  19. Running time of IRM-based clustering (1,017 users, 7,000 web pages). [Figure: clustering results at Iteration 0, Iteration 1, and Iteration 2; timings shown: 5 sec, 13 sec; annotation: more accurate?]
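
The tag weighting, overall-weight, and thresholding steps described in the transcript can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slides do not give exact formulas, so the normalized-frequency TF, the logarithmic IDF over content clusters, and all function names and toy data here are assumptions modeled on standard TF-IDF.

```python
import math
from collections import Counter, defaultdict


def tag_weights(cluster_tag_counts):
    """TF-IDF-like weights per content cluster (assumed formula).

    cluster_tag_counts: {content_cluster: Counter(tag -> frequency)}
    """
    n_clusters = len(cluster_tag_counts)
    # In how many content clusters does each tag appear?
    cluster_freq = Counter()
    for counts in cluster_tag_counts.values():
        cluster_freq.update(counts.keys())
    weights = {}
    for cluster, counts in cluster_tag_counts.items():
        total = sum(counts.values())
        weights[cluster] = {
            # Frequent in this cluster -> characteristic;
            # present in many clusters -> down-weighted.
            tag: (n / total) * math.log(n_clusters / cluster_freq[tag])
            for tag, n in counts.items()
        }
    return weights


def user_model(eta_row, weights):
    """Overall tag weights for one user cluster: sum of eta * tag weight.

    eta_row: {content_cluster: strength eta between this user cluster
              and that content cluster}
    """
    model = defaultdict(float)
    for cluster, eta in eta_row.items():
        for tag, w in weights[cluster].items():
            model[tag] += eta * w
    return dict(model)


def prediction_degree(model, page_tags):
    """P: sum of the user model's weights over a new page's tags."""
    return sum(model.get(tag, 0.0) for tag in page_tags)


def recommend(model, page_tags, theta):
    """Recommend to every user in the cluster when P > theta."""
    return prediction_degree(model, page_tags) > theta


# Toy data (hypothetical): two content clusters and one user cluster.
counts = {
    "news": Counter({"politics": 3, "web": 1}),
    "tech": Counter({"python": 2, "web": 2}),
}
weights = tag_weights(counts)
model = user_model({"news": 0.1, "tech": 0.9}, weights)
print(recommend(model, ["python", "web"], theta=0.2))
print(recommend(model, ["politics"], theta=0.2))
```

In this toy setup the tag "web" appears in both content clusters, so its weight is zero and it contributes nothing to P, which mirrors the "appears in many different content clusters → irrelevant" idea. The threshold θ is a free parameter; the value 0.2 is chosen only for illustration.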