Context-aware Image Tweet Modelling and Recommendation

Tao Chen
October 18, 2016

Presented at the 2016 ACM Multimedia Conference (MM '16) in Amsterdam, The Netherlands.

Transcript

  1. Context-aware Image Tweet
    Modelling and Recommendation
    Tao Chen, Xiangnan He, Min-Yen Kan

  2. We are visual beings!

  3. We love posting images!
    - Image tweets constitute 14% of Twitter posts and 56% of Weibo posts [Chen 2016]
    Image courtesy: Martin Parr/Magnum Photos

  4. Interpreting Microblog Images
    •  Vital for downstream applications, such as
    – User interest modelling, retrieval, event detection, summarization
    •  Image understanding
    – Low-level features (e.g., SIFT) do not work well due to the semantic gap
    – How about visual objects?

  5. [Example image for the tweet "China ends the one-child policy"; detected visual objects: little, cute, child, girl, indoor]

  6. [Example image of a street scene; detected visual objects: transportation system, people, car, asphalt, road]

  7. Interpreting Microblog Images
    •  Vital for downstream applications, e.g.,
    – User interest modelling, retrieval, event detection, summarization
    •  Image understanding
    – Low-level features (e.g., SIFT) do not work well due to the semantic gap
    – How about visual objects?
    •  Not sufficient for microblog images

  8. Context is the key to interpreting microblog images!


  12. 1. Hashtag Enhanced Text
    •  The most obvious context is the post's text
    •  We focus on conflating the variants of hashtags
    –  #icebucket, #ALSIceBucketChallenge → "ice bucket", "ALS ice bucket challenge"
    – 14.3% of image tweets have multi-word hashtags
    •  Hashtags are segmented with the Microsoft Word Breaker API [Wang et al. NAACL'10]
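This kind of hashtag segmentation can be sketched with a small dynamic program. The deck uses the proprietary Microsoft Word Breaker API; the dictionary-based scoring and the tiny `vocab` below are toy stand-ins for illustration only:

```python
# Toy sketch of hashtag segmentation (the deck uses the Microsoft Word
# Breaker API; this dictionary-based DP is an illustrative stand-in).
def segment_hashtag(tag, vocab):
    """Split a hashtag into dictionary words, preferring fewer words."""
    s = tag.lstrip("#").lower()
    n = len(s)
    best = [None] * (n + 1)   # best[i] = shortest segmentation of s[:i]
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and s[j:i] in vocab:
                cand = best[j] + [s[j:i]]
                if best[i] is None or len(cand) < len(best[i]):
                    best[i] = cand
    return best[n]            # None if no segmentation exists

vocab = {"ice", "bucket", "als", "challenge"}
print(segment_hashtag("#icebucket", vocab))              # ['ice', 'bucket']
print(segment_hashtag("#ALSIceBucketChallenge", vocab))  # ['als', 'ice', 'bucket', 'challenge']
```

A real word breaker scores candidate splits with language-model probabilities rather than word counts, but the search structure is the same.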

  13. 2. Text in the Image
    •  Apply an OCR tool (Google Tesseract) to extract text from images
    •  26.4% of the images have at least one recognized textual word
    Example tweet: "Coming soon!!! imdb.to/IGxE9f"; text recognized in the image: "Pretty much"
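A minimal sketch of this step, assuming the common `pytesseract` wrapper around Tesseract (an assumption; it is not part of the paper's released code). The `clean_ocr_output` noise filter is likewise illustrative, since raw OCR on photos is often dominated by junk characters:

```python
# Sketch of the "text in the image" extractor. The paper applies Google
# Tesseract; pytesseract (a common Python wrapper) is assumed here.
import re

def clean_ocr_output(raw):
    """Keep alphabetic tokens of length >= 2; drops typical OCR noise."""
    return [t for t in re.findall(r"[A-Za-z]+", raw) if len(t) >= 2]

def extract_overlaid_text(image_path):
    import pytesseract                 # requires the Tesseract binary installed
    from PIL import Image
    raw = pytesseract.image_to_string(Image.open(image_path))
    return clean_ocr_output(raw)

print(clean_ocr_output("Pretty much\n~ |!"))  # ['Pretty', 'much']
```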

  15. 3. External URLs
    -  22.7% of image tweets have URLs
    -  82.1% of external pages contain the image in the post
    Example tweet: "Coming soon!!! imdb.to/IGxE9f"

  16. 4. Search Engine as a Context Miner
    •  Not all images in microblogs are user generated
    – The same image may appear in other places with a similar context
    •  Google Image Search provides a best-guess label, named entities, and the pages that contain the image
    – 76.0% of the images have been indexed by Google

  20. CITING: Context-aware Image Tweet Modelling
    [Framework overview, shown for the example tweet "Coming soon!!! imdb.to/IGxE9f":]
    1. Hashtag-enhanced text (via Word Breaker)
    2. Text in the image (via OCR tool)
    3. External web page (via URLs)
    4. Search result (via Google Image Search: best guess, named entities, pages that contain the image)

  21. Overlap of the three major sources
    – 14.3% of image tweets have multiple-word hashtags
    – Coverage: text in image 26.4%, external web pages 21.8%, Google Image Search 79.3%
    – [Venn diagram with overlap regions of 2.8%, 12.5%, 22.2%, and 2.5% between the three sources]

  23. Hashtag > External pages > OCR text > Search results
    •  External pages contain richer and more relevant information than overlaid text
    •  Errors are introduced by the OCR tool
    •  Google Image Search cannot differentiate pure text-style and meme images well


  27. Rules for Fusing Contextual Text
    Text quality: Hashtag > External pages > OCR text > Search results
    •  Basic text (94.8% of tweets): text from the post + enhanced hashtags (14.3%)
    •  Decision cascade [reconstructed from the flowchart]:
    – Has URLs? Yes (14.4%) → Basic + text from external pages
    – Else, has OCR text? Yes (23.5%) → Basic + OCR text
    – Else, indexed by the search engine? Yes (48.9%) → Basic + text from search result
    – Else → Basic text only
    •  Reduces contextual-text acquisition cost by 18%
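Following the quality ranking above, the filtered-fusion cascade can be sketched as below; text from only the single best available source is added to the basic text. The dict field names are illustrative, not the paper's schema:

```python
# Minimal sketch of filtered fusion: acquire contextual text from the
# highest-quality available source only. Field names are illustrative.
def contextual_text(tweet):
    parts = [tweet["text"]] + tweet.get("segmented_hashtags", [])
    if tweet.get("external_page_text"):       # tweet links to a web page
        parts.append(tweet["external_page_text"])
    elif tweet.get("ocr_text"):               # OCR found words in the image
        parts.append(tweet["ocr_text"])
    elif tweet.get("search_result_text"):     # image indexed by Google
        parts.append(tweet["search_result_text"])
    return " ".join(parts)                    # else: basic text only

t = {"text": "Coming soon!!!", "ocr_text": "Pretty much"}
print(contextual_text(t))  # Coming soon!!! Pretty much
```

Stopping at the first hit is what yields the 18% saving in acquisition cost: the cheaper, lower-quality sources are never queried when a better one is present.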

  28. Outline
    1.  Introduction
    2.  Motivation
    3.  CITING framework
    4.  Personalized image tweet recommendation (we are the first to study this!)
    5.  Conclusion

  29. Personalized Image Tweet Recommendation
    Matrix Factorization (MF)
    -  The state-of-the-art collaborative filtering algorithm
    -  Learns a vector representation (latent factor) for each user and item in a latent space
    -  Predicts whether user U4 will retweet item I4 from the user's and the item's latent factors
    [User-item retweet matrix over users U1-U4 and items I1-I4; the entry (U4, I4) is unknown]
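The MF prediction can be sketched in a few lines; the sizes are taken from the toy 4x4 matrix on the slide, and the random factors stand in for learned ones:

```python
# Minimal MF sketch: the predicted preference of user u for item i is the
# inner product of their latent factors. Values here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 4, 8                  # k = latent dimensionality
P = rng.normal(scale=0.1, size=(n_users, k))   # user latent factors
Q = rng.normal(scale=0.1, size=(n_items, k))   # item latent factors

def predict(u, i):
    """Score for 'will user u retweet item i'."""
    return float(P[u] @ Q[i])

score = predict(3, 3)   # U4 vs. I4 (0-indexed)
```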

  30. Standard MF does not work for image tweets
    -  Cold start: new items (I5, I6) have no observed interactions, so their latent factors cannot be learned
    -  Take the features of image tweets into consideration
    -  Decompose the user-item interaction into user-feature interactions

  31. Feature-aware Matrix Factorization (FAMF)
    •  A generic model that incorporates various types of features into users' interest modelling
    •  Not susceptible to cold start
    •  An item's latent factor is composed from the latent factors of its N types of features (e.g., CITING text, visual objects)
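The decomposition on this slide can be sketched as follows, assuming (for illustration) that an item's latent factor is the sum of its features' factors; the names, sizes, and random values are all illustrative:

```python
# Sketch of feature-aware MF: instead of a free latent factor per item,
# an item's factor is composed from the factors of its features, so a
# brand-new tweet is scored through features seen during training.
import numpy as np

rng = np.random.default_rng(1)
k = 8
user_factors = {"u1": rng.normal(scale=0.1, size=k)}
feature_factors = {w: rng.normal(scale=0.1, size=k)
                   for w in ["ice", "bucket", "challenge", "movie"]}

def item_factor(features):
    """Compose an item's latent factor from its features' latent factors."""
    q = np.zeros(k)
    for f in features:
        if f in feature_factors:
            q += feature_factors[f]
    return q

def score(user, features):
    """Preference of a user for a (possibly unseen) item via its features."""
    return float(user_factors[user] @ item_factor(features))

s = score("u1", ["ice", "bucket", "challenge"])  # no item id needed
```

This is why cold start is alleviated: a tweet never seen in training still shares words (and other features) with tweets that were.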

  32. Model Learning
    •  Pair-wise learning to rank
    – Positive tweets (retweets) should rank higher than negative ones (non-retweets)
    – Bayesian Personalized Ranking [Rendle et al. 2009]
    – Minimize a pairwise loss with a regularization term
    •  Infer the parameters via stochastic gradient descent (SGD)
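One BPR-style SGD step can be sketched as below for plain MF (the paper applies the same criterion to FAMF); the learning rate and regularization weight are illustrative:

```python
# Hedged sketch of one BPR SGD step: push retweeted item i above
# non-retweeted item j for user u. Hyperparameters are illustrative.
import numpy as np

def bpr_step(P, Q, u, i, j, lr=0.05, reg=0.01):
    x_uij = float(P[u] @ (Q[i] - Q[j]))      # current pairwise margin
    g = 1.0 / (1.0 + np.exp(x_uij))          # = sigmoid(-x_uij)
    pu = P[u].copy()
    P[u] += lr * (g * (Q[i] - Q[j]) - reg * P[u])
    Q[i] += lr * (g * pu - reg * Q[i])
    Q[j] += lr * (-g * pu - reg * Q[j])

# demo: after repeated updates, the positive item ranks higher
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(2, 4))
Q = rng.normal(scale=0.1, size=(3, 4))
for _ in range(300):
    bpr_step(P, Q, u=0, i=1, j=2)
```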

  33. Time-aware Negative Sampling
    •  Retweets are positive instances
    •  We sample negative instances based on the time of retweets
    – [Timeline with a retweet and two non-retweets: non-retweet 2 is more likely to be a real negative instance]
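A hedged sketch of such a sampler, assuming (this is an interpretation of the slide's timeline, not stated explicitly) that a non-retweeted tweet posted close to one of the user's retweet times was probably seen by the user, and is therefore a more reliable negative than one far from any retweet. The weighting function is illustrative:

```python
# Sketch of time-aware negative sampling: weight each non-retweeted
# candidate by inverse distance to the user's nearest retweet time.
# The assumption that "closer = more likely seen" is illustrative.
import random

def sample_negative(non_retweets, retweet_times, rng=random.Random(0)):
    """non_retweets: list of (tweet_id, timestamp) pairs."""
    weights = []
    for _, ts in non_retweets:
        d = min(abs(ts - rt) for rt in retweet_times)
        weights.append(1.0 / (1.0 + d))     # nearer => higher weight
    return rng.choices(non_retweets, weights=weights, k=1)[0]
```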

  34. Experimental Setting
    •  Keep each user's 10 most recent retweets as the test set
    •  Evaluation metrics
    – Mean Average Precision (MAP)
    – Average precision at top ranks (P@k)

    Split     | Users | Retweets | All Tweets | Ratings
    Training  | 926   | 174,765  | 1,316,645  | 1,592,837
    Test      |       |   9,021  |    77,061  |    82,743
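The metrics above can be sketched directly; MAP is the mean of this per-user average precision:

```python
# Average precision (AP) of one user's ranked list; MAP averages this
# over all users. Toy inputs below are illustrative.
def average_precision(ranked_ids, relevant_ids):
    relevant = set(relevant_ids)
    hits, score = 0, 0.0
    for rank, tid in enumerate(ranked_ids, start=1):
        if tid in relevant:
            hits += 1
            score += hits / rank        # precision at this hit's rank
    return score / len(relevant) if relevant else 0.0

# hits at ranks 1 and 3, with 2 relevant items: (1/1 + 2/3) / 2
ap = average_precision(["a", "x", "b"], ["a", "b"])
```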

  35. Effectiveness of CITING Framework

    #  Method     Feature              P@1      P@3    P@5    MAP
    1  Random     -                    0.114**  0.115  0.115  0.156**
    2  Length     Post's text          0.176**  0.158  0.150  0.173**
    3  Profiling  Post's text          0.336**  0.227  0.197  0.202**
    4  FAMF       Visual Objects (VO)  0.211**  0.205  0.192  0.211**
    5  FAMF       Post's text          0.359*   0.325  0.287  0.275**
    6  FAMF       CITING               0.419    0.355  0.319  0.298
    **: p<0.01, *: p<0.05

    -  Visual objects are not sufficient to model the semantics of microblog images
    -  Our proposal significantly outperforms the other approaches

  40. Effectiveness of CITING Framework (cont.)

    #  Method  Feature               P@1    P@3    P@5    MAP
    6  FAMF    CITING                0.419  0.355  0.319  0.298
    7  FAMF    Non-filtered Context  0.413  0.352  0.319  0.296
    8  FAMF    CITING + VO           0.425  0.350  0.313  0.298

    -  Filtered fusion improves contextual text quality
    -  Incorporating visual objects does not consistently improve recommendation performance

  44. Case Study
    Average precision: 0.226 (visual objects) → 0.592 (CITING)

  45. Conclusion
    •  CITING framework to model image tweets
    – Hashtag enhanced text
    – OCR text
    – External pages
    – Search results
    •  Feature-aware MF to recommend image tweets
    – Decomposes user-item interaction into user-feature interaction
    – Alleviates the cold-start problem
    •  Released code and datasets: https://github.com/kite1988/famf
    •  Future work
    – Other contexts: geo-location, time, author
    – Other fusion approaches, e.g., learning the weight of each contextual source