Trends in Real-world Recommender Systems

Takuya Kitazawa

November 21, 2017

Transcript

  1. Trends in Real-world Recommender Systems
    Your “fancy” algorithm doesn’t scale in production
    Takuya Kitazawa
    @takuti

  2. $ whoami
    Treasure Data, Inc.
    Data Science Engineer
    Apache Hivemall
    Committer
    * All content reflects the speaker's own opinions and does NOT reflect the views of any of his previous or current affiliations.

  3. takuti.me

  6. Trend
    Beyond rating
    Realistic scenario
    Me
    Persistent cold-start
    Online algorithm
    Future
    New application
    Production scale

  7. Messages
    Recommendation ≠ Machine Learning
    Keep Things Simple, Be Data-Driven
    Get Outside of Your Lab

  8. User Modeling in Folksonomies
    Persistently Cold-Starting
    Online Item Recommendation
    Users
    Web pages + Tag
    Master’s thesis (2016)
    Bachelor’s thesis (2014)
    Internship

  9. Master’s thesis (2016)
    Users
    Web pages + Tag
    Batch Online
    User Modeling in Folksonomies
    Persistently Cold-Starting
    Online Item Recommendation
    Bachelor’s thesis (2014)
    Internship

  10. Master’s thesis (2016)
    Users
    Web pages + Tag
    Batch Online
    User Modeling in Folksonomies
    Persistently Cold-Starting
    Online Item Recommendation
    Bachelor’s thesis (2014)
    Trend?
    Internship

  11. ACM RecSys Conference 2014-2017
    https://takuti.me/note/recsys-wordcloud/
    (Word clouds for 2014, 2015, 2016, 2017)

  12. ACM RecSys Conference 2014-2017
    https://takuti.me/note/recsys-wordcloud/
    (Word clouds for 2014, 2015, 2016, 2017)
    Beyond collaborative filtering on rating

  13. “Netflix never implemented that solution itself”
    https://digit.hbs.org/submission/the-netflix-prize-crowdsourcing-to-improve-dvd-recommendations/
    https://www.techdirt.com/blog/innovation/articles/20120409/03412518422/why-netflix-never-implemented-algorithm-that-won-netflix-1-million-challenge.shtml

  14. https://digit.hbs.org/submission/the-netflix-prize-crowdsourcing-to-improve-dvd-recommendations/
    https://www.techdirt.com/blog/innovation/articles/20120409/03412518422/why-netflix-never-implemented-algorithm-that-won-netflix-1-million-challenge.shtml
    Change from US DVDs to global streaming
    Did not scale against dynamic growth of users and items
    Use more blended technique

  15. https://www.slideshare.net/optimaltransformation/a-collection-of-quotes-from-albert-einstein

  16. System requirements
    Wide-ranging applications and data
    “Practices”
    Scalability
    Batch vs streaming
    Social networks
    Product review (EC)
    Group recommendation

  17. Recommendation is
    Predicting users’ unforeseen behavior from data
    Users’ history
    Item attributes
    Context …

  18. Recommendation is
    Predicting users’ unforeseen behavior from data
    But,

  19. Recommendation ≠ Machine Learning

  20. Practice: Golf package recommendation at Rakuten
    Course + Price + Options (e.g. caddy, lunch)
    ML as a tool
    Simple
    Interpretable
    R. Swezey and Y. Chung. Recommending Short-Lived Dynamic Packages for Golf Booking Services. CIKM 2015.

  21. Theory: My new recommender

  22. Factorization Machines
    S. Rendle. Factorization Machines with libFM. ACM Transactions on Intelligent Systems and Technology, 3(3).
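
The second-order Factorization Machines model in the cited paper combines a global bias, per-feature weights, and pairwise feature interactions whose weights are dot products of latent vectors. Below is a minimal sketch in plain Python, assuming dense feature lists. The function names and toy parameters are made up for illustration and are not taken from the slides or from libFM.

```python
from itertools import combinations


def fm_predict_naive(x, w0, w, V):
    """Second-order FM: w0 + sum_i w_i x_i + sum_{i<j} <V_i, V_j> x_i x_j."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(vi * vj for vi, vj in zip(V[i], V[j]))
        pairwise += dot * x[i] * x[j]
    return w0 + linear + pairwise


def fm_predict_fast(x, w0, w, V):
    """Same prediction in O(k * n) via the sum-of-squares identity."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    k = len(V[0])
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Toy parameters (hypothetical): 3 features, 2 latent factors per feature.
x = [1.0, 0.0, 2.0]
w0 = 0.1
w = [0.2, -0.3, 0.5]
V = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

y_naive = fm_predict_naive(x, w0, w, V)
y_fast = fm_predict_fast(x, w0, w, V)
```

Both functions compute the same value. The second form is the reason FM prediction stays linear in the number of features, which matters once the feature vector gets wide.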

  23. Practice: My new “fancy” recommender on real data
    Poor accuracy
    Many hyper-params
    Inefficient
    Worse than Matrix Factorization
    Don’t squeeze everything into a single method

  24. Keep Things Simple, Be Data-Driven

  25. # of data = # of solutions
    Whew! My new algorithm beats well-known methods!

  26. # of data = # of solutions
    Always recommend “most popular” items
    ML-ish techniques
    Whew! My new algorithm beats well-known methods!
    Accuracy
    High
    Low

  27. Simplest: Non-personalized recommendation
    Most popular / Average rating / Random
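
The three non-personalized baselines named on this slide fit in a few lines each, which is part of why they make a useful first comparison point. A minimal sketch, assuming interactions arrive as (user, item, rating) tuples; the function names and the toy log are hypothetical.

```python
import random
from collections import Counter, defaultdict


def most_popular(interactions, n=3):
    """Rank items by the number of interactions they received."""
    counts = Counter(item for _, item, _ in interactions)
    return [item for item, _ in counts.most_common(n)]


def top_average_rating(interactions, n=3, min_ratings=1):
    """Rank items by mean rating, ignoring items with too few ratings."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _, item, rating in interactions:
        sums[item] += rating
        counts[item] += 1
    avg = {i: sums[i] / counts[i] for i in sums if counts[i] >= min_ratings}
    # Sort by descending average, breaking ties by item ID for determinism.
    return sorted(avg, key=lambda i: (-avg[i], i))[:n]


def random_items(interactions, n=3, seed=None):
    """Pick n distinct items uniformly at random."""
    items = sorted({item for _, item, _ in interactions})
    return random.Random(seed).sample(items, min(n, len(items)))


# Made-up interaction log: (user, item, rating).
log = [
    ("u1", "a", 5), ("u2", "a", 3), ("u3", "a", 4),
    ("u1", "b", 5), ("u2", "b", 5),
    ("u3", "c", 2),
]
popular = most_popular(log, n=2)
best_rated = top_average_rating(log, n=2)
picked = random_items(log, n=2, seed=0)
```

Every user gets the same list from the first two functions, so any personalized method should at least beat them on the same data before it is worth the added complexity.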

  28. Do the “minimum” math
    https://takuti.me/note/the-amazon-way-on-iot/

  29. Q. Which technique should I use?

  30. Q. Which technique should I use?
    A. It depends on your data and application

  31. Get Outside of Your Lab

  32. Persistent cold-start problem
    at Rakuten Institute of Technology

  33. Golf package recommendation at Rakuten
    Course + Price + Options (caddy, lunch, …)
    Q. What happens with dynamic trends (e.g., changing prices and/or users’ tastes)?

  34. Persistent cold-start on ad data (Yahoo! Lab; 2013)

  35. Persistent cold-start on real web service (Booking.com; 2015)

  36. Problem: Persistent cold-start
    Effective approach: Online update + Rich auxiliary data
    Incremental Factorization Machines
    Persistently Cold-Starting Online Item Recommendation
    RecProfile 2016
    Master’s thesis

  37. Production-level algorithms should be “usable”
    at Treasure Data

  38. Implement anomaly detection algorithms
    Test on real system metrics
    https://takuti.me/note/td-intern-2016/

  39. Outlier and change-point in time-series data
    Time-series data (e.g., syslog):
    time        value
    …           …
    1508966854  290
    1508966853  294
    1508966852  38
    1508966852  290
    1508966851  294
    1508958753  301
    1508955307  38
    1508954422  38
    1508948503  38
    …           …
    STEP 1: Find patterns from past observations
    STEP 2: Compute score at each point in time (“How far from past pattern”)
    Outlier: Spiky “local” data point
    Change-Point: Wide-scale “global” change
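
The two steps on this slide can be illustrated with a deliberately simple scorer: summarize the recent past by its mean and standard deviation (STEP 1), then score each new point by how many standard deviations it sits from that summary (STEP 2). This is only a sliding-window z-score chosen for illustration, not ChangeFinder or Singular Spectrum Transformation; the function name, window size, and input series are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def anomaly_scores(values, window=5, eps=1e-9):
    """Score each point by its distance from the recent past (z-score)."""
    past = deque(maxlen=window)  # STEP 1: the "pattern" is the last `window` points
    scores = []
    for v in values:
        if len(past) < 2:
            scores.append(0.0)  # not enough history to judge yet
        else:
            mu = mean(past)
            sigma = pstdev(past)
            # STEP 2: how far is this point from the past pattern?
            scores.append(abs(v - mu) / (sigma + eps))
        past.append(v)
    return scores


# Made-up series with a single spike at index 5.
series = [38, 40, 39, 41, 40, 300, 39, 40]
scores = anomaly_scores(series)
spike_index = scores.index(max(scores))
```

A scorer like this reacts to spiky outliers. Detecting a sustained change-point needs a second stage or a different method, which is where the two algorithms on the next slide come in.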

  40. ChangeFinder
    ‣ Probabilistic approach
    ‣ Many hyper-parameters and sensitive result
    Singular Spectrum Transformation
    ‣ Mathematically tractable, numerical algebraic approach
    ‣ Minimum # of hyper-params with robust result
    ‣ Efficient approximation scheme
    Easy-to-use, Interpretable

  41. Similarities between anomaly detection and recommendation
    Feature-expressiveness: Rich vector representation
    Online-updating: Finding similar/dissimilar samples in real-time
    Usability: Simple hyper-params, interpretable result
    Scalability: Production-level efficient back-end system
    Implicit feedback: Binary feedback (buy or not, anomaly or not)

  42. Apply a “usable” anomaly detection method to recommendation

  43. RecSys 2016 tutorial by Quora
    Implicit >>> Explicit
    https://www.slideshare.net/xamat/recsys-2016-tutorial-lessons-learned-from-building-reallife-recommender-systems

  44. Don’t be algorithm-driven
    at Silver Egg Technology

  45. Real e-commerce data
    ‣ 1M+ purchase log
    ‣ Attributes
    - Customer’s session ID
    - Item ID
    - Timestamp
    Started from algorithm
    Lack of features

  46. Understanding data
    Small number of daily purchases
    Customers × Items matrix: 0.0086% nonzero
    Need to take advantage of sparsity in terms of both algorithm and implementation
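
On the implementation side, taking advantage of sparsity mostly means never materializing the full customer × item matrix. A minimal sketch, assuming purchases arrive as (customer, item) pairs: it stores only the observed cells and computes the nonzero fraction from them. The function names and toy log are hypothetical.

```python
from collections import defaultdict


def build_sparse(purchases):
    """Store only observed cells: customer -> set of purchased items."""
    rows = defaultdict(set)
    for customer, item in purchases:
        rows[customer].add(item)
    return rows


def density(rows):
    """Fraction of nonzero cells in the implicit customer x item matrix."""
    items = set()
    for purchased in rows.values():
        items |= purchased
    if not rows or not items:
        return 0.0
    nnz = sum(len(purchased) for purchased in rows.values())
    return nnz / (len(rows) * len(items))


# Made-up purchase log; the repeated ("c1", "i1") pair counts once.
purchases = [("c1", "i1"), ("c1", "i2"), ("c2", "i2"), ("c3", "i3"), ("c1", "i1")]
rows = build_sparse(purchases)
frac = density(rows)
```

Memory here grows with the number of observed purchases rather than with customers times items, which is the property that matters when only a tiny fraction of cells is nonzero.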

  47. Understanding data
    Rapidly increasing # of customers and items
    High dimensionality
    Customers Items

  48. Understanding data
    Small % of customers/items contribute many purchases
    Massive “useless” customers and items
    Customers Items

  49. Understanding data
    Timestamp represents seasonality

  50. Assumption
    My algorithm might NOT be effective on this data…

  51. Anyway, let me try as much as I can…
    Dimensionality reduction by hashing
    Store item candidates with time window
    ‣ Only use most-recently observed 100 items for recommendation
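
Both ideas on this slide can be sketched with the standard library. The hashing trick maps an unbounded set of IDs into a fixed number of buckets, and a bounded, recency-ordered container keeps only the latest items as candidates. The slides do not say which hash function or data structure was used, so `zlib.crc32`, the `RecentItems` class, and the bucket count are assumptions for illustration.

```python
import zlib
from collections import OrderedDict


def hashed_index(feature, n_buckets=2 ** 10):
    """Hashing trick: map an arbitrary feature string to a fixed index range."""
    # crc32 is deterministic across runs, unlike Python's salted built-in hash().
    return zlib.crc32(feature.encode("utf-8")) % n_buckets


class RecentItems:
    """Keep only the most recently observed `capacity` distinct items."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self._items = OrderedDict()

    def observe(self, item_id):
        self._items.pop(item_id, None)  # a re-observed item moves to the newest end
        self._items[item_id] = True
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest item

    def candidates(self):
        """Return candidate item IDs, oldest first."""
        return list(self._items)


window = RecentItems(capacity=3)
for item in ["a", "b", "c", "a", "d"]:
    window.observe(item)
current = window.candidates()
bucket = hashed_index("customer=42")
```

Distinct features can collide in the same bucket. That loss of precision is the price paid for a model whose size no longer grows with the number of customers and items.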

  52. Lessons
    Start from data
    Understanding data leads to an appropriate algorithm
    Think of a hybrid approach

  53. Messages
    Recommendation ≠ Machine Learning
    Keep Things Simple, Be Data-Driven
    Get Outside of Your Lab

  54. Future: Scaling recommender in production
    Personalization is everywhere in various ways
    as Netflix said “Everything is recommendation”
    https://www.slideshare.net/justinbasilico/past-present-future-of-recommender-systems-an-industry-perspective

  55. Listen to the podcast episode with Dr. Joseph Konstan
    ‣ “I hate Amazon’s first page”
    ‣ Recommendation for education
    ‣ Context-aware recommender
    ‣ Cross validation is NOT realistic
    ‣ Serendipity ≠ Just “BAD”
    - = like & didn’t know
    ‣ …

  56. First step
    Online course
    https://takuti.me/note/coursera-recommender-systems/

  57. First step
    Pre-programmed (mostly static) algorithms and metrics
    ‣ Surprise (Python) http://surpriselib.com/
    ‣ fastFM (Python) http://ibayer.github.io/fastFM/
    ‣ Implicit (Python) http://implicit.readthedocs.io/en/latest/
    ‣ MyMediaLite (C#) http://www.mymedialite.net/
    ‣ LibRec (Java) https://www.librec.net/
    ‣ LensKit (Java) http://lenskit.org/
    On Apache Hadoop, Hive, Spark:
    ‣ Apache Mahout http://mahout.apache.org/
    ‣ Apache Hivemall https://hivemall.incubator.apache.org/
    ‣ Spark MLlib https://spark.apache.org/mllib/

  58. And, FluRS :)

  59. Trends in Real-world Recommender Systems
    Your “fancy” algorithm doesn’t scale in production
    Takuya Kitazawa
    @takuti
