
User Modeling in Folksonomies



Presented at the 5th International Conference on Web Intelligence, Mining and Semantics (WIMS 2015)

ACM Digital Library: http://dl.acm.org/citation.cfm?id=2797129

GitHub Repository: https://github.com/takuti/wims-2015


Takuya Kitazawa

July 14, 2015


Transcript

  1. User Modeling in Folksonomies: Relational Clustering and Tag Weighting. Takuya Kitazawa* and Masahide Sugiyama, School of Computer Science and Engineering, The University of Aizu, Fukushima, Japan. (* Current affiliation: Graduate School of Information Science and Technology, The University of Tokyo, Japan.)
  2. User Modeling in Folksonomies. Outline: (1) What are folksonomies? Problem formulation. (2) How to tackle the problems. (3) Recommender system-based evaluation.
  3. Conventional: compute vectors one by one. “Create, normalize, and then compute…” Accurate, but time-consuming.

    user1 [1 0 0 1 0 1 1 0 0 0 0 0]
    user2 [0 1 0 1 0 0 1 0 0 1 0 0]
    …
    userN [1 0 0 0 0 0 0 0 0 1 0 0]
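The "create, normalize, and then compute" routine above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slide does not name the similarity measure, so cosine similarity is an assumption here, and the function names are hypothetical. Only the three vectors come from the slide.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(u, v):
    """Cosine similarity, computed as the dot product of the normalized vectors.

    Assumption: the slide does not say which measure the conventional
    approach uses; cosine similarity is chosen only for illustration.
    """
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


# Vectors copied from the slide.
user1 = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0]
user2 = [0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
user_n = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]

# Every pair has to be handled separately, which is why this gets slow
# as the number of users grows.
sim_12 = cosine_similarity(user1, user2)
sim_1n = cosine_similarity(user1, user_n)
```

Repeating this for every pair of users costs time quadratic in the number of users, which is the "time-consuming" part the slide refers to.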
  4. Related work [Niwa et al. 2006]: S. Niwa et al. Web page recommender system based on folksonomy mining. In Proc. of ITNG2006, pages 388–393, Apr. 2006.
  5. Roughly obtain preferences/characteristics: use matrices with a stochastic model. ✦ Low accuracy ⇔ serendipity ✦ Short running time for future application

    user1 1 0 0 1 0 1 1 0 0 0 0 0
    user2 0 1 0 1 0 0 1 0 0 1 0 0
    …
    userN 1 0 0 0 0 0 0 0 0 1 0 0
  6. Approach: tag weights-based user modeling. Inputs: a relational matrix (users × contents) and tag frequencies for every content. “User model” = group structures and weighted tags.
  7. Group structures: Infinite Relational Model (IRM). ✦ Simultaneous relational clustering ✦ Find group structures with strength = η. [Figure: relational matrix after clusters are assigned and then sorted.] Reference: C. Kemp et al. Learning systems of concepts with an infinite relational model. In Proc. of AAAI2006, pp. 381–388, July 2006.
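To make "assign clusters and then sort" and the strength η more concrete, here is a small sketch. It does not perform IRM inference, which is not shown on the slide. It assumes cluster assignments are already given, and it uses the empirical density of ones in each block as a simple stand-in for η. The function names and toy data are hypothetical.

```python
def sort_by_cluster(matrix, row_clusters, col_clusters):
    """Reorder rows and columns so that members of the same cluster are adjacent."""
    row_order = sorted(range(len(row_clusters)), key=lambda i: row_clusters[i])
    col_order = sorted(range(len(col_clusters)), key=lambda j: col_clusters[j])
    return [[matrix[i][j] for j in col_order] for i in row_order]


def block_strengths(matrix, row_clusters, col_clusters):
    """Fraction of ones in every (row cluster, column cluster) block.

    Assumption: this empirical density only approximates the role of eta;
    the IRM itself treats eta as a model parameter with a prior.
    """
    ones, cells = {}, {}
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            key = (row_clusters[i], col_clusters[j])
            ones[key] = ones.get(key, 0) + value
            cells[key] = cells.get(key, 0) + 1
    return {key: ones[key] / cells[key] for key in cells}


# Hypothetical toy data: 3 users x 4 pages with given cluster assignments.
relation = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
user_clusters = [0, 1, 0]
page_clusters = [0, 1, 0, 1]

sorted_relation = sort_by_cluster(relation, user_clusters, page_clusters)
eta = block_strengths(relation, user_clusters, page_clusters)
```

After sorting, the ones gather into dense blocks, and `eta` tells how strongly each user cluster is tied to each page cluster.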
  8. Apply IRM (1/2). Data set: Hatena bookmark (social bookmarking), 1,017 users and 7,000 web pages. IRM-based relational clustering.
  9. Tag weights: TF-IDF-like weighting technique (1/2). TF-IDF weighting in information retrieval: ✦ Term Frequency (TF): terms that appear many times → characteristic ✦ Inverse Document Frequency (IDF): terms that appear in many different documents → irrelevant (e.g. a, the). Use a similar idea.
  10. Tag weights: TF-IDF-like weighting technique (2/2). TF-IDF-like tag weighting: ✦ Term Frequency (TF): tags that appear many times → characteristic ✦ Inverse Document Frequency (IDF): tags that appear in many different content clusters → irrelevant
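The TF-IDF-like idea can be sketched as below. The slides do not give the exact formula, so this assumes the textbook form: relative tag frequency within a content cluster, multiplied by the logarithm of (number of content clusters / number of clusters containing the tag). The function name and toy counts are hypothetical.

```python
import math


def tag_weights(cluster_tag_counts):
    """TF-IDF-like weight for every (content cluster, tag) pair.

    cluster_tag_counts maps a content cluster to {tag: non-negative count}.
    Assumption: textbook TF-IDF with content clusters playing the role of
    documents; the weighting in the paper may differ in detail.
    """
    n_clusters = len(cluster_tag_counts)

    # "Document frequency": in how many content clusters each tag appears.
    df = {}
    for counts in cluster_tag_counts.values():
        for tag, count in counts.items():
            if count > 0:
                df[tag] = df.get(tag, 0) + 1

    weights = {}
    for cluster, counts in cluster_tag_counts.items():
        total = sum(counts.values())
        weights[cluster] = {
            tag: (count / total) * math.log(n_clusters / df[tag])
            for tag, count in counts.items()
            if count > 0
        }
    return weights


# Hypothetical toy counts: "web" appears in both clusters, so it is uninformative.
counts = {
    "news": {"politics": 3, "web": 1},
    "tech": {"python": 3, "web": 1},
}
weights = tag_weights(counts)
```

In this toy case "web" gets weight 0 in both clusters because it appears everywhere, while "politics" and "python" keep positive weights as cluster-specific tags.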
  11. Results: top-20 tags. [Figure: weight against rank of tags for the 1st page cluster and the 3rd page cluster; panel labels: topical news, technical topics.]
  12. Tag weights → user models (overall weights). Overall tag weights for a single user cluster: strength η × tag weights. [Figure: overall weight against rank of tags; content clusters labeled tech, general, general, tech, …]
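One possible reading of "strength η × tag weights" is sketched below: for a single user cluster, each content cluster's tag weights are scaled by the strength η between that user cluster and the content cluster, and the results are added up per tag. The summation over content clusters is an assumption, as are the function name and the toy numbers.

```python
def overall_tag_weights(eta_row, cluster_weights):
    """Combine per-content-cluster tag weights into one user model.

    eta_row maps a content cluster to the strength between it and one
    fixed user cluster; cluster_weights maps a content cluster to
    {tag: weight}. Assumption: contributions are summed per tag.
    """
    overall = {}
    for cluster, strength in eta_row.items():
        for tag, weight in cluster_weights.get(cluster, {}).items():
            overall[tag] = overall.get(tag, 0.0) + strength * weight
    return overall


# Hypothetical strengths and tag weights for one user cluster.
eta_row = {"tech": 0.5, "general": 0.25}
cluster_weights = {
    "tech": {"python": 0.5, "web": 0.25},
    "general": {"news": 1.0, "web": 0.5},
}
user_model = overall_tag_weights(eta_row, cluster_weights)

# Tags ranked by overall weight, as on the slide's "rank of tags" axis.
ranked_tags = sorted(user_model, key=user_model.get, reverse=True)
```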
  13. User-model-based recommendation. “Would these users prefer the new page?” From the new page’s tags and the user models, compute the page’s prediction degree P by summing up. Threshold P by θ for every cluster: if P > θ, recommend the page to every user in the cluster.
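The thresholding step can be sketched as follows. This assumes that "summing up" means adding the user-model weights of the tags attached to the new page. The slide does not spell this out, and the function names, models, and θ value are hypothetical.

```python
def prediction_degree(page_tags, user_model):
    """Prediction degree P: sum of the user model's weights for the page's tags.

    Assumption: tags unknown to the user model contribute 0.
    """
    return sum(user_model.get(tag, 0.0) for tag in page_tags)


def clusters_to_recommend(page_tags, user_models, theta):
    """Return the user clusters whose P exceeds theta (P > theta: recommend)."""
    return [
        cluster
        for cluster, model in user_models.items()
        if prediction_degree(page_tags, model) > theta
    ]


# Hypothetical user models (one per user cluster) and a new page's tags.
user_models = {
    "cluster_a": {"python": 0.5, "web": 0.25},
    "cluster_b": {"news": 0.75},
}
new_page_tags = ["python", "web"]

p_a = prediction_degree(new_page_tags, user_models["cluster_a"])
targets = clusters_to_recommend(new_page_tags, user_models, theta=0.5)
```

Because the decision is made once per cluster rather than once per user, the cost of handling a new page depends on the number of user clusters, not the number of users.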
  14. Evaluation setting. Used the same 1,017-by-7,000 dataset from Hatena bookmark (matrix → tuples): 172,365 tuples in total. ✦ 5-fold cross validation with F-measure ✦ User modeling using the learning data ✦ Thresholding all test tuples for every user cluster
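For reference, the F-measure and a 5-fold split can be sketched as below. The F-measure is the standard harmonic mean of precision and recall. The slides do not describe how the tuples were split, so the deterministic striding used here is only for illustration, and all names and toy data are hypothetical.

```python
def f_measure(predicted, actual):
    """Harmonic mean of precision and recall over two sets of tuples."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


def k_fold_splits(items, k=5):
    """Yield (learning, test) pairs; every item is in the test part exactly once.

    Assumption: folds are built by striding over the list; a real
    experiment would typically shuffle the tuples first.
    """
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        learning = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield learning, folds[i]


# Hypothetical (user, page) tuples standing in for the bookmark data.
tuples = [(user, page) for user in range(5) for page in range(2)]
splits = list(k_fold_splits(tuples, k=5))

score = f_measure({(0, 0), (0, 1), (1, 0)}, {(0, 1), (1, 0), (1, 1)})
```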
  15. Accuracy and running time. Better accuracy than the worst case and a faster running time → achieved sketchy user modeling. [Figure: accuracy (higher is better) and running time for worst, base, and proposed; annotation: including IRM-based clustering.]
  16. Summary. User modeling with faster, sketchy data mining. Combined relational clustering and tag weighting. Achieved faster, sketchy recommendation. Outline recap: (1) What are folksonomies? Problem formulation. (2) How to tackle the problems. (3) Recommender system-based evaluation.
  17. Conclusion. How can I roughly obtain users’ group structures and their preferences on web services? ✦ Consider more competitors ✦ Improve accuracy ✦ Take incremental/online approaches
  18. User Modeling in Folksonomies: Relational Clustering and Tag Weighting. Takuya Kitazawa. Email: [email protected]. Implementations and datasets: github.com/takuti/wims-2015
  19. Running time of IRM-based clustering (1,017 users, 7,000 web pages). [Figure: clustering results at iteration 0, iteration 1, and iteration 2, with timing labels 5 sec and 13 sec; annotation: more accurate?]