Personalised Recommendations

Edward Tsech

August 09, 2014

Transcript

  1. About me
     • Ed Tsech
     • Clojure, JavaScript developer
     • @edtsech on Twitter, GitHub
     Saturday 9 August 14
  2. Content
     • Collaborative filtering
       • User based
       • Item based
     • Content based / knowledge based recommendations
     • Mahout
     • Movie Recommender Example
  3. Collaborative Filtering
     • “Collaborative filtering is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating).”
  4. Collaborative Filtering
     • Last.fm, Twitter, Amazon
     • Pros: relatively precise, ability to recommend items from different categories
     • Cons: cold start problem
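
The user-based variant of collaborative filtering can be sketched in a few lines of Python. This is a minimal illustration, not the Mahout implementation: the toy `RATINGS` data, the function names, and the choice of cosine similarity are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
RATINGS = {
    "ann": {"alien": 5.0, "brazil": 3.0, "casablanca": 4.0},
    "bob": {"alien": 4.0, "brazil": 3.0, "casablanca": 5.0, "dune": 5.0},
    "cat": {"alien": 1.0, "brazil": 5.0, "dune": 2.0},
}


def cosine_similarity(a, b):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend_user_based(ratings, user, top_n=3):
    """Score unseen items by a similarity-weighted average of other users' ratings."""
    own = ratings[user]
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(own, other_ratings)
        if sim <= 0.0:
            continue  # no overlap with this user, so nothing to learn from them
        for item, rating in other_ratings.items():
            if item in own:
                continue  # only recommend items the user has not rated yet
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    scored = [(item, totals[item] / weights[item]) for item in totals]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Calling `recommend_user_based(RATINGS, "ann")` scores "dune" from the ratings of the two other users. A user with no ratings at all gets an empty list back, which is the cold start problem in miniature.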
  5. Other Algorithms
     • Log-likelihood
     • Slope one
     • Singular value decomposition
     • K nearest neighbors
     • Cluster-based
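
Of the algorithms listed, Slope One is compact enough to sketch directly. The version below is the weighted variant, written from the general description of the algorithm rather than from Mahout's code; the function name and data layout (`user -> {item: rating}`) are assumptions for illustration.

```python
from collections import defaultdict


def slope_one_predict(ratings, user, target):
    """Weighted Slope One estimate of `user`'s rating for `target`.

    Returns None when no other user has co-rated `target` with any item
    that `user` has rated.
    """
    diff_sum = defaultdict(float)  # item -> sum of (rating of target - rating of item)
    count = defaultdict(int)       # item -> number of users who rated both
    for other_ratings in ratings.values():
        if target not in other_ratings:
            continue
        for item, rating in other_ratings.items():
            if item != target:
                diff_sum[item] += other_ratings[target] - rating
                count[item] += 1

    numerator = 0.0
    denominator = 0
    for item, rating in ratings[user].items():
        n = count.get(item, 0)
        if item == target or n == 0:
            continue
        average_diff = diff_sum[item] / n
        # Each item votes with its own estimate, weighted by its support.
        numerator += (rating + average_diff) * n
        denominator += n
    return numerator / denominator if denominator else None
```

The prediction is an average of "my rating for item j plus the typical gap between the target and j", so items co-rated by more users count for more.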
  6. Content Based
     • Prismatic
     • Pros: no cold start problem, ability to recommend new items
     • Cons: harder to implement, less precise, sometimes gives stupid results
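
A content-based recommender compares item descriptions instead of ratings, which is why it can suggest an item nobody has rated yet. The sketch below assumes items are described by tag sets and uses Jaccard similarity; the tags, item names, and function names are hypothetical.

```python
# Hypothetical item descriptions: item -> set of tags
ITEM_TAGS = {
    "alien": {"sci-fi", "horror"},
    "dune": {"sci-fi", "adventure"},
    "casablanca": {"romance", "drama"},
    "new_release": {"sci-fi", "horror", "thriller"},  # no ratings exist for it yet
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_content_based(item_tags, liked, top_n=2):
    """Rank items the user has not liked yet by tag overlap with the liked ones."""
    # The user profile is simply the union of tags of everything they liked.
    profile = set().union(*(item_tags[item] for item in liked))
    scored = [
        (item, jaccard(profile, tags))
        for item, tags in item_tags.items()
        if item not in liked
    ]
    scored = [pair for pair in scored if pair[1] > 0.0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

For a user who liked only "alien", `recommend_content_based(ITEM_TAGS, ["alien"])` ranks "new_release" first even though it has no ratings.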
  7. Hybrid Systems
     • Netflix
     • Mix collaborative filtering & content-based recommendations
     • Knowledge-based: add domain information
  8. Mahout
     • Scalable machine learning library
     • User based recommenders
     • Item based recommenders
     • Various algorithms
     • Evaluation & rescoring features
     • Hadoop integration
  9. Reca
     • Thin Clojure wrapper for Mahout’s single-machine recommendation algorithms
     • https://github.com/edtsech/reca
  10. Movie App Demo
      • 8 400 000 ratings
      • 1.7 GB database
      • 162 037 users
      • 82 715 movies
  11. Rescoring
      • Add application logic to the recommender
      • Add domain specific information
      • Helps to make a hybrid recommender
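
The rescoring idea can be illustrated as a post-processing step over the recommender's output. This is a sketch of the concept only: `rescore`, `boost`, and `is_filtered` are hypothetical names chosen for the example and are not Mahout's or Reca's API.

```python
def rescore(candidates, boost=None, is_filtered=None):
    """Apply application logic to (item, score) pairs and re-rank them.

    boost(item, score) returns an adjusted score;
    is_filtered(item) returns True for items that must never be shown.
    """
    boost = boost or (lambda item, score: score)
    is_filtered = is_filtered or (lambda item: False)
    rescored = [
        (item, boost(item, score))
        for item, score in candidates
        if not is_filtered(item)
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical domain information the collaborative filter knows nothing about.
GENRES = {"alien": "horror", "dune": "sci-fi", "saw": "horror"}
ALREADY_WATCHED = {"saw"}

candidates = [("alien", 4.2), ("dune", 4.0), ("saw", 3.9)]
ranked = rescore(
    candidates,
    boost=lambda item, score: score * 1.2 if GENRES.get(item) == "sci-fi" else score,
    is_filtered=lambda item: item in ALREADY_WATCHED,
)
```

Here the genre boost lifts "dune" above "alien" and the watched film is dropped, which is how domain knowledge turns a pure collaborative filter into a hybrid.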
  12. Evaluation
      • Evaluation of the user-based algorithm on 3% of all ratings (y axis: average difference)
  13. Evaluation
      • Evaluation of the item-based algorithm on 33% of all ratings (y axis: average difference)
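
The "average difference" on the y axis is commonly computed by hiding some known ratings, predicting them, and averaging the absolute errors. The sketch below shows that procedure under assumptions: `test_share`, `seed`, and the `predict(train, user, item)` callback are hypothetical, and the slides do not say how their percentages map onto such parameters.

```python
import random


def average_absolute_difference(ratings, predict, test_share=0.3, seed=42):
    """Hold out a share of ratings, predict them, return the mean absolute error.

    `predict(train, user, item)` returns an estimate or None if it cannot predict.
    Returns None when nothing could be evaluated.
    """
    rng = random.Random(seed)
    train = {user: dict(user_ratings) for user, user_ratings in ratings.items()}
    held_out = []
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            if rng.random() < test_share:
                held_out.append((user, item, rating))
                del train[user][item]  # the recommender must not see this rating

    errors = []
    for user, item, actual in held_out:
        estimate = predict(train, user, item)
        if estimate is not None:
            errors.append(abs(estimate - actual))
    return sum(errors) / len(errors) if errors else None
```

A lower value means predictions are closer to the ratings users actually gave; comparing the value across algorithms or neighborhood sizes gives curves like the ones on the slides.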
  14. Performance
      • 1.5 GB of memory
      • 250 ms for the user-based recommender
      • 60-90 s for the item-based recommender
      • 0.1 ms after caching
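
The drop to 0.1 ms comes from not recomputing recommendations on every request. How the demo caches is not shown in the slides; the sketch below only illustrates the general idea with Python's `functools.lru_cache`, and `slow_recommend` is a hypothetical stand-in for the real recommender call.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the expensive path actually runs


def slow_recommend(user_id):
    """Stand-in for an expensive recommender call (hypothetical)."""
    CALLS["count"] += 1
    # Return a tuple so the cached value cannot be mutated by callers.
    return ("item-%d" % (user_id + 1), "item-%d" % (user_id + 2))


@lru_cache(maxsize=10_000)
def cached_recommend(user_id):
    """Compute recommendations once per user, then serve them from memory."""
    return slow_recommend(user_id)
```

Repeated calls with the same `user_id` hit the cache. The trade-off is staleness: after new ratings arrive, `cached_recommend.cache_clear()` (or a shorter-lived cache) is needed to pick them up.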
  15. Links
      • Mahout in Action [book]
      • Collective intelligence [book]
      • http://mahout.apache.org/
      • http://blog.comsysto.com/2013/04/03/background-of-collaborative-filtering-with-mahout/