Should We Care About Content? Recommending by Proxy with Big Metadata

Ben Fields
October 01, 2012

When constructing a music recommender system, which is more important: a musicological understanding of the catalog of music in the system, or the number of times two particular songs were played one after the other and were ‘liked’? Even better, if a system knows the latter, does the former even matter? Do machines that predict behavior need to learn to listen? Or is observing behavior enough?
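
The behavioral signal described above, how often one song was played directly after another and liked, can be counted without any knowledge of the audio. The following is a minimal sketch of that idea, not the method of any system discussed in the talk; the session format (lists of `(track, liked)` pairs) and the function names `transition_counts` and `recommend_next` are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def transition_counts(sessions):
    """Count how often track b was played right after track a and liked.

    `sessions` is a hypothetical format: each session is a list of
    (track, liked) pairs in play order.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair every play with the play that immediately follows it.
        for (a, _), (b, liked_b) in zip(session, session[1:]):
            if liked_b:
                counts[a][b] += 1
    return counts


def recommend_next(counts, track, k=3):
    """Return up to k tracks most often liked directly after `track`."""
    return [t for t, _ in counts.get(track, Counter()).most_common(k)]


# Toy data only; these sessions are invented for the example.
sessions = [
    [("Meat Loaf", True), ("Journey", True)],
    [("Meat Loaf", True), ("Journey", True), ("Beethoven", False)],
    [("Meat Loaf", True), ("Beethoven", True)],
]
counts = transition_counts(sessions)
print(recommend_next(counts, "Meat Loaf"))
```

Nothing in this sketch inspects the music itself: the recommendation is driven entirely by observed behavior, which is the "proxy" the title refers to.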

Transcript

  1. Should we care about content? Recommending by proxy with big metadata

    Ben Fields @alsothings
  2. Goal:

  3. how to make a recommender

  4. not so much how

  5. Rather: ‘what’

  6. what data should be used to make a recommender?

  7. First, Examples

  8. Image

  9. http://www.flickr.com/photos/dexxus/5652914929/

  10. http://www.flickr.com/photos/msojka/5285298402/

  11. http://www.flickr.com/photos/pinkertons/7730547912/

  12. http://www.flickr.com/photos/nicholas_t/626009491/

  13. http://www.flickr.com/photos/dexxus/5652914929/

  14. http://www.flickr.com/photos/dexxus/5652914929/ http://www.flickr.com/photos/msojka/5285298402/ http://www.flickr.com/photos/pinkertons/7730547912/ http://www.flickr.com/photos/nicholas_t/626009491/

  15. http://www.flickr.com/photos/dexxus/5652914929/ http://www.flickr.com/photos/msojka/5285298402/ http://www.flickr.com/photos/pinkertons/7730547912/ http://www.flickr.com/photos/nicholas_t/626009491/

    Same Camera Model; Colour Palette Distance; Location
  16. Music

  17. A small playlist

  18. What should follow this?

    Meat Loaf - I'd Do Anything for Love
  19. maybe?

    Beethoven’s Piano Sonata No. 14 in C Sharp Minor, Op. 27, No. 2
  20. how about? Journey - Don’t Stop Believin’

  21. A small playlist

    Meat Loaf - I'd Do Anything for Love; Beethoven; Journey
  22. Recommending?

  23. prediction of opinions

  24. prediction of opinions about things

  25. prediction of Actions on things

  26. Metadata?

  27. Data About Data

  28. Content descriptors are metadata

  29. Ratings are metadata

  30. But it isn’t always neat and Tidy

  31. human curated Content descriptors are metadata

  32. In the wild

  33. @alsothings 33

  34. @alsothings 34

  35. @alsothings 35

  36. @alsothings 36

  37. @alsothings 37

  38. @alsothings 38

  39. None
  40. Contests

  41. Netflix Prize

  42. Netflix Prize

    500K people’s ratings of 18K movies. Take these and predict the ratings of movies that haven’t been rated.
  43. BellKor

  44. gradient boosted decision trees

  45. Any Value in Content?

  46. No.

  47. Million Song Dataset (kaggle)

  48. Million Song Dataset challenge (kaggle)

    The partial listening history of 1M people. Predict the tracks that are missing.
  49. Collaborative filtering

  50. Any Value in Content?

  51. No.

  52. so?

  53. moar data > deeper Analysis

  54. Simple use of metadata from far and wide gets you further than deep understanding of core data
  55. But it doesn’t help you explain things to the user.

  56. http://www.netflixprize.com/community/viewtopic.php?id=1537 http://www.kaggle.com/c/msdchallenge http://mir-in-action.blogspot.co.uk/2012/09/collaborative-filtering-still-rules.html http://www.dtic.upf.edu/~ocelma/PhD/ http://www.slideshare.net/plamere/music-recommendation-and-discovery Ben Fields @alsothings

    Questions?
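
The task on slides 48 and 49 (given partial listening histories, predict the missing tracks with collaborative filtering) can be illustrated with a small item-based sketch. This is one common form of collaborative filtering, chosen here for brevity; the slides do not say which variant was used in the challenge. The data layout (a dict of user to set of track IDs) and the names `item_similarities` and `predict_missing` are assumptions for illustration.

```python
import math
from collections import defaultdict


def item_similarities(histories):
    """Cosine similarity between tracks, based only on who listened to them."""
    listeners = defaultdict(set)
    for user, tracks in histories.items():
        for track in tracks:
            listeners[track].add(user)

    sims = defaultdict(dict)
    for i in listeners:
        for j in listeners:
            if i == j:
                continue
            overlap = len(listeners[i] & listeners[j])
            if overlap:
                # Binary (listened / not listened) cosine similarity.
                sims[i][j] = overlap / math.sqrt(
                    len(listeners[i]) * len(listeners[j])
                )
    return sims


def predict_missing(histories, user, sims, k=2):
    """Rank unheard tracks by summed similarity to the user's known tracks."""
    seen = histories[user]
    scores = defaultdict(float)
    for track in seen:
        for other, sim in sims.get(track, {}).items():
            if other not in seen:
                scores[other] += sim
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy listening histories, invented for the example.
histories = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b"},
    "u3": {"b", "c", "d"},
    "u4": {"a"},
}
sims = item_similarities(histories)
print(predict_missing(histories, "u4", sims))
```

As on slides 50 and 51, no content descriptor enters the computation: track identifiers and listening events are the only inputs.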