Personalised Recommendations

Edward Tsech

August 09, 2014

Transcript

  1. About me • Ed Tsech • Clojure, JavaScript developer • @edtsech on Twitter, GitHub • Saturday 9 August 14
  2. Content • Collaborative filtering (user-based, item-based) • Content-based / knowledge-based recommendations • Mahout • Movie recommender example
  3. Collaborative Filtering • “Collaborative filtering is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating).”
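
To make the definition concrete, here is a minimal sketch of a user-based recommender in plain Python. It is not Mahout's or Reca's implementation: the ratings, the user and movie names, and the choice of cosine similarity over co-rated items are all assumptions made for illustration.

```python
from math import sqrt

# Toy ratings (user -> {item: rating}); all names and numbers are made up.
RATINGS = {
    "ann": {"alien": 5.0, "brazil": 4.0, "casablanca": 1.0},
    "bob": {"alien": 4.0, "brazil": 5.0, "dune": 4.0},
    "cat": {"alien": 1.0, "casablanca": 5.0, "dune": 2.0},
}


def cosine(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def recommend_user_based(user, ratings, top_n=3):
    """Score unseen items by a similarity-weighted average of other users' ratings."""
    own = ratings[user]
    totals, weights = {}, {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(own, theirs)
        if sim <= 0:
            continue
        for item, rating in theirs.items():
            if item not in own:
                totals[item] = totals.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    scored = [(item, totals[item] / weights[item]) for item in totals]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

With this data, `recommend_user_based("ann", RATINGS)` returns a single candidate, `dune`, with an estimate that sits closer to bob's rating than to cat's because bob's tastes overlap more with ann's.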
  4. Collaborative Filtering • Last.fm, Twitter, Amazon • Pros: relatively precise, ability to recommend items from different categories • Cons: cold start problem
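
The cold start problem can be seen in a small item-based sketch. This is an illustrative toy under stated assumptions (made-up ratings, cosine similarity between item rating vectors, a hypothetical `CATALOGUE` list), not how Mahout computes item similarity.

```python
from math import sqrt

# Toy data; "eraserhead" is in the catalogue but nobody has rated it yet.
RATINGS = {
    "ann": {"alien": 5.0, "brazil": 4.0},
    "bob": {"alien": 4.0, "brazil": 5.0, "dune": 4.0},
    "cat": {"alien": 1.0, "dune": 2.0},
}
CATALOGUE = ["alien", "brazil", "dune", "eraserhead"]


def item_vector(item, ratings):
    """Ratings for one item, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[k] * b[k] for k in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend_item_based(user, ratings, catalogue):
    """Score each unseen item by its similarity to the items the user rated."""
    own = ratings.get(user, {})
    scores = {}
    for candidate in catalogue:
        if candidate in own:
            continue
        cand_vec = item_vector(candidate, ratings)
        score = sum(
            cosine(cand_vec, item_vector(seen, ratings)) * rating
            for seen, rating in own.items()
        )
        if score > 0:
            scores[candidate] = score
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
```

Here ann is offered `dune` but never `eraserhead`, because an unrated item has no similarity to anything, and a brand-new user such as `"dan"` gets an empty list because there is nothing to compare against.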
  5. Other Algorithms • Log-likelihood • Slope One • Singular value decomposition • K nearest neighbors • Cluster-based
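
Of the algorithms listed, Slope One is small enough to sketch in full. The version below is the weighted variant written from the general description of the technique; the data and function names are assumptions and it is not taken from Mahout.

```python
from collections import defaultdict

# Toy ratings (user -> {item: rating}); values are made up.
RATINGS = {
    "ann": {"alien": 5.0, "brazil": 3.0, "casablanca": 2.0},
    "bob": {"alien": 3.0, "brazil": 4.0},
    "cat": {"brazil": 2.0, "casablanca": 5.0},
}


def build_deviations(ratings):
    """Return (devs, counts): devs[i][j] is the mean of r_i - r_j over users who rated both."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    sums[i][j] += r_i - r_j
                    counts[i][j] += 1
    devs = {i: {j: sums[i][j] / counts[i][j] for j in sums[i]} for i in sums}
    plain_counts = {i: dict(row) for i, row in counts.items()}
    return devs, plain_counts


def predict(user, item, ratings, devs, counts):
    """Weighted Slope One: average (deviation + known rating), weighted by co-rating count."""
    numerator = denominator = 0.0
    for seen, rating in ratings[user].items():
        n = counts.get(item, {}).get(seen, 0)
        if n == 0 or seen == item:
            continue
        numerator += (devs[item][seen] + rating) * n
        denominator += n
    return numerator / denominator if denominator else None
```

For example, after `devs, counts = build_deviations(RATINGS)`, calling `predict("cat", "alien", RATINGS, devs, counts)` combines the two deviations that involve `alien` and yields 13/3, about 4.33.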
  6. Content Based • Prismatic • Pros: no cold start problem, ability to recommend new items • Cons: harder to implement, less precise, sometimes stupid
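
A content-based recommender can be sketched with nothing more than item descriptions. The tags, titles, and the use of Jaccard similarity below are assumptions chosen to keep the example short; real systems typically use richer features.

```python
# Hypothetical item descriptions (item -> set of tags).
ITEM_TAGS = {
    "alien": {"sci-fi", "horror"},
    "blade runner": {"sci-fi", "noir"},
    "casablanca": {"romance", "drama"},
    "new release": {"sci-fi", "horror", "thriller"},  # nobody has rated it yet
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_content_based(liked, item_tags, top_n=3):
    """Rank unseen items by tag overlap with everything the user liked."""
    profile = set().union(*(item_tags[item] for item in liked))
    scored = [
        (item, jaccard(profile, tags))
        for item, tags in item_tags.items()
        if item not in liked
    ]
    scored = [pair for pair in scored if pair[1] > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

`recommend_content_based(["alien"], ITEM_TAGS)` ranks `new release` first even though it has no ratings at all, which is the "no cold start problem" advantage from the slide; the flip side is that the result is only as good as the tags.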
  7. Hybrid Systems • Netflix • Mix collaborative filtering & content-based recommendations • Knowledge-based: add domain information
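
One simple way to mix the two approaches is a weighted blend of their scores. The sketch below is only one possible design: the min-max normalisation, the `cf_weight` parameter, and the input dictionaries are all assumptions, and the slides do not say how any particular production system combines its signals.

```python
def normalise(scores):
    """Min-max scale scores to [0, 1] so the two sources are comparable."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {item: 1.0 for item in scores}
    return {item: (s - low) / (high - low) for item, s in scores.items()}


def blend(cf_scores, content_scores, cf_weight=0.7):
    """Weighted sum of normalised collaborative and content-based scores."""
    cf = normalise(cf_scores)
    content = normalise(content_scores)
    blended = {
        item: cf_weight * cf.get(item, 0.0) + (1 - cf_weight) * content.get(item, 0.0)
        for item in set(cf) | set(content)
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(blended.items(), key=lambda pair: (-pair[1], pair[0]))
```

For instance, `blend({"dune": 4.5, "brazil": 3.0}, {"dune": 0.2, "new release": 0.9})` keeps `dune` on top thanks to its collaborative score while still surfacing `new release`, which only the content-based side knows about.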
  8. Mahout • Scalable machine learning library • User-based recommenders • Item-based recommenders • Various algorithms • Evaluation & rescoring features • Hadoop integration
  9. Reca • Thin Clojure wrapper for Mahout’s single-machine recommendation algorithms • https://github.com/edtsech/reca
  10. Movie App Demo • 8 400 000 ratings • 1.7 GB database • 162 037 users • 82 715 movies
  11. Rescoring • Add application logic to the recommender • Add domain-specific information • Helps to make a hybrid recommender
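
The idea of rescoring can be sketched as a small function applied to the recommender's output. This is plain Python with hypothetical names (`make_rescorer`, `apply_rescorer`, the genre data), not Mahout's or Reca's rescoring API; it only illustrates boosting and filtering with domain knowledge.

```python
def make_rescorer(item_genres, preferred_genre, boost=1.2, banned=frozenset()):
    """Build a rescoring function: drop banned items, boost a preferred genre."""

    def rescore(item, score):
        if item in banned:
            return None  # None means "filter this item out"
        if preferred_genre in item_genres.get(item, ()):
            return score * boost
        return score

    return rescore


def apply_rescorer(recommendations, rescore):
    """Apply the rescorer to (item, score) pairs and re-sort by the new score."""
    rescored = []
    for item, score in recommendations:
        new_score = rescore(item, score)
        if new_score is not None:
            rescored.append((item, new_score))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

As a usage example, with `recs = [("casablanca", 4.0), ("saw", 3.9), ("alien", 3.5)]` and a rescorer built from `{"alien": {"sci-fi"}}` that prefers `"sci-fi"` and bans `"saw"`, the list comes back as `alien` then `casablanca`: application logic has reordered and filtered what the collaborative step produced.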
  12. Evaluation • User-based algorithm evaluated on 3% of all ratings (y axis: average difference)
  13. Evaluation • Item-based algorithm evaluated on 33% of all ratings (y axis: average difference)
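
The "average difference" on the y axis can be read as the mean absolute gap between predicted and held-out ratings. The sketch below shows that measurement under assumptions: the slides do not describe how the ratings were split, so the random hold-out and the item-mean baseline predictor here are illustrative stand-ins.

```python
import random


def split_ratings(ratings, test_fraction, seed=0):
    """Hold out roughly `test_fraction` of (user, item, rating) triples for testing."""
    rng = random.Random(seed)
    train, test = {}, []
    for user, items in ratings.items():
        for item, rating in items.items():
            if rng.random() < test_fraction:
                test.append((user, item, rating))
            else:
                train.setdefault(user, {})[item] = rating
    return train, test


def item_mean_predictor(train):
    """Baseline predictor: the mean training rating of the item, or None if unseen."""
    sums, counts = {}, {}
    for items in train.values():
        for item, rating in items.items():
            sums[item] = sums.get(item, 0.0) + rating
            counts[item] = counts.get(item, 0) + 1

    def predict(user, item):
        return sums[item] / counts[item] if item in counts else None

    return predict


def average_absolute_difference(predict, test):
    """Mean of |predicted - actual| over held-out ratings that could be predicted."""
    errors = []
    for user, item, actual in test:
        estimate = predict(user, item)
        if estimate is not None:
            errors.append(abs(estimate - actual))
    return sum(errors) / len(errors) if errors else float("nan")
```

A lower value means the recommender's estimates are closer to what users actually rated; any predictor with the same `predict(user, item)` shape can be plugged in and compared.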
  14. Performance • 1.5 GB of memory • 250 ms for the user-based recommender • 60-90 s for the item-based recommender • 0.1 ms after caching
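
The drop to sub-millisecond responses is what memoising results typically gives. The slides do not say how the demo cached its recommendations, so the following is only a generic sketch using Python's standard `functools.lru_cache`; `slow_recommend` is a hypothetical stand-in for the expensive call.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the expensive path actually runs


def slow_recommend(user_id):
    """Stand-in for an expensive recommender call (hypothetical)."""
    CALLS["count"] += 1
    return tuple(f"movie-{user_id}-{n}" for n in range(3))


@lru_cache(maxsize=10_000)
def cached_recommend(user_id):
    """Return cached recommendations; compute them only on the first request."""
    return slow_recommend(user_id)
```

The trade-off is staleness: cached results ignore ratings added later, so a real system would need to invalidate entries (for example with `cached_recommend.cache_clear()`) when the underlying data changes.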
  15. Links • Mahout in Action [book] • Collective intelligence [book] • http://mahout.apache.org/ • http://blog.comsysto.com/2013/04/03/background-of-collaborative-filtering-with-mahout/