User Modeling in Folksonomies

Presented at the 5th International Conference on Web Intelligence, Mining and Semantics (WIMS 2015)

ACM Digital Library: http://dl.acm.org/citation.cfm?id=2797129

GitHub Repository: https://github.com/takuti/wims-2015

Takuya Kitazawa

July 14, 2015

Transcript

  1. User Modeling in Folksonomies:
    Relational Clustering and Tag Weighting
    Takuya Kitazawa* and Masahide Sugiyama
    School of Computer Science and Engineering,
    The University of Aizu, Fukushima, Japan
    * Current affiliation: Graduate School of Information Science and Technology,
    The University of Tokyo, Japan

  2. User Modeling in Folksonomies
    1. What are folksonomies? (Problem formulation)
    2. How to tackle the problems
    3. Recommender system-based evaluation

  3. Folksonomies = social networking services
    e.g. Flickr, Delicious
    Goal: extract users’ preferences / characteristics

  4. Conventional: compute vectors one-by-one
    user1 [1 0 0 1 0 1 1 0 0 0 0 0]
    user2 [0 1 0 1 0 0 1 0 0 1 0 0]
    ...
    userN [1 0 0 0 0 0 0 0 0 1 0 0]
    “Create, normalize, and then compute…”
    Accurate, but time-consuming

  5. Related work [Niwa et al. 2006]
    S. Niwa et al. Web page recommender system based on folksonomy mining.
    In Proc. of ITNG 2006, pp. 388–393, Apr. 2006.

  6. Roughly obtain preferences/characteristics
    Use matrices with a stochastic model
    user1 1 0 0 1 0 1 1 0 0 0 0 0
    user2 0 1 0 1 0 0 1 0 0 1 0 0
    ...
    userN 1 0 0 0 0 0 0 0 0 1 0 0
    ✦ Low accuracy ↔ serendipity
    ✦ Short running time for future applications

  7. Approach: Tag weight-based user modeling
    Relational matrix: users × contents
    Tag frequencies for every content
    “User model” = group structures and weighted tags

  8. Group structures: Infinite Relational Model (IRM)
    ✦ Simultaneous relational clustering
    ✦ Find group structures with strength = η
    Assign clusters, then sort the matrix by cluster
    C. Kemp et al. Learning systems of concepts with an infinite relational model.
    In Proc. of AAAI 2006, pp. 381–388, July 2006.

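The "strength η" on the slide above can be illustrated with a small sketch. It does not perform the IRM clustering itself (the cluster assignments are taken as given), and it assumes the usual IRM setup in which each block's η has a symmetric Beta(β, β) prior, so that its posterior mean is (ones + β) / (cells + 2β). The function name `block_strengths`, the toy matrix, and β = 0.5 are illustrative assumptions, not taken from the slides.

```python
def block_strengths(R, user_clusters, page_clusters, beta=0.5):
    """Posterior-mean strength eta for every (user cluster, page cluster) block.

    R is a binary user-by-page matrix given as a list of lists.
    user_clusters[i] / page_clusters[j] are 0-based cluster ids that are
    assumed to come from a separate clustering step (not shown here).
    """
    n_u = max(user_clusters) + 1
    n_p = max(page_clusters) + 1
    ones = [[0] * n_p for _ in range(n_u)]   # number of 1s per block
    cells = [[0] * n_p for _ in range(n_u)]  # number of cells per block
    for i, row in enumerate(R):
        for j, value in enumerate(row):
            k, l = user_clusters[i], page_clusters[j]
            ones[k][l] += value
            cells[k][l] += 1
    # Beta(beta, beta) prior: posterior mean of the Bernoulli parameter
    return [
        [(ones[k][l] + beta) / (cells[k][l] + 2 * beta) for l in range(n_p)]
        for k in range(n_u)
    ]


# Toy example (hypothetical data): 3 users, 4 pages
R = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
eta = block_strengths(R, user_clusters=[0, 0, 1], page_clusters=[0, 0, 1, 1])
for row in eta:
    print([round(v, 3) for v in row])
```

A dense block (many 1s) gets η close to 1 and an empty block gets η close to 0, which is what "strong relation" means in the later slides.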

  9. Apply IRM (1/2)
    Data set: Hatena bookmark (social bookmarking)
    1,017 users × 7,000 web pages
    IRM-based relational clustering

  10. Apply IRM (2/2)
    Data set: Hatena bookmark (social bookmarking), tags

  11. Tag weights: TF-IDF-like weighting technique (1/2)
    TF-IDF weighting in information retrieval
    Term Frequency (TF)
    Terms that appear many times
    → characteristic
    Inverse Document Frequency (IDF)
    Terms that appear in many different documents
    → irrelevant (e.g. a, the)
    → use a similar idea for tags

  12. Tag weights: TF-IDF-like weighting technique (2/2)
    TF-IDF-like tag weighting
    Term Frequency (TF)
    Tags that appear many times
    → characteristic
    Inverse Document Frequency (IDF)
    Tags that appear in many different content clusters
    → irrelevant

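A minimal sketch of the TF-IDF-like idea above, treating each content cluster as a "document" and each tag as a "term". The slides do not give the exact formula, so the textbook form tf × log(N / df) is assumed here; the function name `tag_weights` and the toy clusters are hypothetical.

```python
import math
from collections import Counter


def tag_weights(cluster_tags):
    """TF-IDF-like weights per content cluster.

    cluster_tags maps a cluster id to the list of tags (with repetition)
    attached to the contents in that cluster.
    """
    n_clusters = len(cluster_tags)
    counts = {c: Counter(tags) for c, tags in cluster_tags.items()}

    # df[tag] = number of clusters in which the tag appears at least once
    df = Counter()
    for counter in counts.values():
        df.update(counter.keys())

    weights = {}
    for c, counter in counts.items():
        total = sum(counter.values())
        weights[c] = {
            # tf: relative frequency inside the cluster; idf: log(N / df)
            tag: (n / total) * math.log(n_clusters / df[tag])
            for tag, n in counter.items()
        }
    return weights


# Toy example (hypothetical tags)
clusters = {
    "news": ["politics", "politics", "web", "economy"],
    "tech": ["python", "python", "web", "linux"],
}
weights = tag_weights(clusters)
for cluster, w in weights.items():
    print(cluster, sorted(w.items(), key=lambda kv: -kv[1]))
```

In this toy data "web" appears in every cluster, so its weight drops to zero, while tags concentrated in one cluster keep a positive weight.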

  13. Results: top-20 tags
    [Figure: weight vs. rank of tags for the 1st page cluster (topical news) and the 3rd page cluster (technical topics)]

  14. Connecting to user modeling (overview)
    Find strong relations → preferences

  15. Tag weights → user models (overall weights)
    Overall tag weights for a single user cluster: strength η × tag weights
    [Figure: tag weights of the “tech” and “general” page clusters combined into overall weight vs. rank of tags]

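One way to read "strength η × tag weights" is sketched below: for a single user cluster, every page cluster's tag weights are scaled by the strength η between the two clusters, and the scaled weights are added up per tag. Summing across page clusters is an assumption of this sketch, as are the name `overall_tag_weights` and the toy numbers.

```python
from collections import defaultdict


def overall_tag_weights(eta_row, tag_weights):
    """Combine per-page-cluster tag weights into one user-cluster model.

    eta_row maps page cluster -> strength eta for one user cluster.
    tag_weights maps page cluster -> {tag: weight}.
    """
    overall = defaultdict(float)
    for cluster, strength in eta_row.items():
        for tag, weight in tag_weights.get(cluster, {}).items():
            overall[tag] += strength * weight
    return dict(overall)


# Toy example (hypothetical values): a user cluster strongly tied to "tech"
eta_row = {"tech": 0.8, "general": 0.1}
page_cluster_weights = {
    "tech": {"python": 0.5, "web": 0.2},
    "general": {"news": 0.4, "web": 0.1},
}
overall = overall_tag_weights(eta_row, page_cluster_weights)
for tag, weight in sorted(overall.items(), key=lambda kv: -kv[1]):
    print(tag, round(weight, 3))
```

Tags from strongly related page clusters dominate the resulting model, so sorting by overall weight gives the ranked tag list for that user cluster.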

  16. User-model-based recommendation
    “Can a new page be preferred by these users?”
    New page’s tags + user models
    By summing up, compute the page’s prediction degree “P”
    Threshold P by θ for every cluster
    P > θ: recommend to every user in the cluster

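The thresholding rule above could look roughly like the following sketch. It assumes that P is the sum of a user cluster's overall tag weights over the new page's tags; the function names, the cluster labels, and the value of θ are all hypothetical.

```python
def prediction_degree(page_tags, user_model):
    """P: sum of the user cluster's weights over the new page's tags."""
    return sum(user_model.get(tag, 0.0) for tag in page_tags)


def recommend(page_tags, user_models, cluster_members, theta):
    """Return every user whose cluster satisfies P > theta for this page."""
    recommended = []
    for cluster, model in user_models.items():
        if prediction_degree(page_tags, model) > theta:
            recommended.extend(cluster_members.get(cluster, []))
    return recommended


# Toy example (hypothetical user models and members)
user_models = {
    "A": {"python": 0.4, "web": 0.17},
    "B": {"news": 0.5},
}
cluster_members = {"A": ["u1", "u2"], "B": ["u3"]}

users = recommend(["python", "web"], user_models, cluster_members, theta=0.5)
print(users)
```

Because the decision is made once per cluster rather than once per user, the cost of scoring a new page grows with the number of clusters, not the number of users.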

  17. Evaluation setting
    Used the same 1,017-by-7,000 dataset from Hatena bookmark
    Matrix → tuples: 172,365 tuples in total
    ✦ 5-fold cross-validation with F-measure
    ✦ User modeling using the learning data
    ✦ Thresholding all test tuples for every user cluster

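For reference, the F-measure used in the evaluation can be computed as in this sketch, assuming the balanced F1 form 2PR / (P + R) over sets of (user, page) tuples. The slides do not state which F variant was used, and the tuples below are made up for illustration.

```python
def f_measure(recommended, relevant):
    """Balanced F-measure (F1) between recommended and relevant tuples."""
    recommended, relevant = set(recommended), set(relevant)
    true_positives = len(recommended & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(recommended)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical test fold)
recommended = {("u1", "p1"), ("u1", "p2"), ("u2", "p1")}
relevant = {("u1", "p1"), ("u2", "p1"), ("u2", "p3"), ("u3", "p1")}
score = f_measure(recommended, relevant)
print(round(score, 4))
```

In a 5-fold setting this score would be computed on each held-out fold and then averaged.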

  18. Accuracy and running time
    Better accuracy than the worst case and faster running time
    → achieved sketchy user modeling
    [Figure: accuracy (higher is better) and running time for worst, base, and proposed; annotated “including IRM-based clustering”]

  19. Summary
    1. What are folksonomies? (Problem formulation)
    User modeling with faster, sketchy data mining
    2. How to tackle the problems
    Combine relational clustering and tag weighting
    3. Recommender system-based evaluation
    Achieved faster, sketchy recommendation

  20. Conclusion
    How can I roughly obtain users’ group structures and
    their preferences on web services?
    ✦ Consider more competitors
    ✦ Improve accuracy
    ✦ Take incremental/online approaches

  21. User Modeling in Folksonomies:
    Relational Clustering and Tag Weighting
    Takuya Kitazawa
    Email: [email protected]
    Implementations and datasets:
    github.com/takuti/wims-2015


  22. Running time of IRM-based clustering
    1,017 users × 7,000 web pages
    [Figure: clustering results at iterations 0, 1, and 2, annotated with running times of 5 sec and 13 sec]
    More accurate?

  23. Influence of threshold θ and IRM iteration
    Are more iterations more accurate?
    → Probably NOT
