Recommendation.jl: Modeling User-Item Interactions in Julia

Presentation at JuliaCon 2022 @ Online https://pretalx.com/juliacon-2022/talk/VWVY9S/

Repository: https://github.com/takuti/Recommendation.jl

Previous JuliaCon 2019 presentation: https://www.youtube.com/watch?v=kC8LKQ_YjyM

Takuya Kitazawa

July 28, 2022

Transcript

  1. Recommender Systems: Input data undergoes a series of vector/matrix computations

    Event (user, item, value) → Matrix → Recommender → Top-k recommendations
    Value: rating, purchase, click, watch, …
    Item: price, category, …
    User: demographics, …
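
The flow on this slide (events, then a user-item matrix, then top-k recommendations) can be sketched in plain Julia. This is only an illustration under stated assumptions, not how Recommendation.jl implements it: the `events` tuples, the `topk_items` helper, and the popularity-style scoring are all made up for the example.

```julia
# Hypothetical events as (user, item, value) tuples; the values are made up.
events = [(1, 1, 5.0), (1, 2, 3.0), (2, 2, 4.0), (3, 2, 2.0), (3, 3, 1.0)]

n_users = maximum(e[1] for e in events)
n_items = maximum(e[2] for e in events)

# Event list -> user-item matrix R (0.0 marks "no interaction" in this sketch).
R = zeros(n_users, n_items)
for (u, i, v) in events
    R[u, i] = v
end

# Score every item by its column sum, then return the k best items
# that the given user has not interacted with yet, as (item, score) pairs.
function topk_items(R::AbstractMatrix, user::Integer, k::Integer)
    scores = vec(sum(R, dims=1))
    candidates = [i for i in axes(R, 2) if R[user, i] == 0]
    ranked = sort(candidates, by=i -> scores[i], rev=true)
    return [(i, scores[i]) for i in ranked[1:min(k, length(ranked))]]
end

recs = topk_items(R, 2, 2)  # user 2 has only interacted with item 2
```

Any real recommender would replace the column-sum scoring with a learned model; the point here is only that each stage is a small vector/matrix operation.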
  2. Recommender Systems: Lifecycle that Recommendation.jl makes easier to follow

    Data → Pre-process → Build → Evaluate → Post-process
    How to build recommender; Intro to recommender algorithms
    Why Julia
    • Vector/matrix-friendly syntax
    • High simplicity and modularity
    • Efficient numerical computing
    • Recommender is not only about ML
    • Ecosystem for math tools matters
  3. Infinite = Union{AbstractFloat, Integer}

    mutable struct Event
        user::Integer
        item::Integer
        value::Infinite # e.g. rating, 0/1
    end

    struct DataAccessor
        events::Array{Event,1}
        R::AbstractMatrix
        user_attributes::Dict{Integer,Any} # user => attributes
        item_attributes::Dict{Integer,Any} # item => attributes
        # ...
    end

    data = DataAccessor(...)
    recommender = MostPopular(data)
    fit!(recommender)
    recommend(...)
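
To make the construct, fit, recommend flow concrete, here is a self-contained sketch of a most-popular recommender. It deliberately does not use Recommendation.jl: `ToyEvent`, `ToyMostPopular`, `toy_fit!`, and `toy_recommend` are hypothetical names invented for this example, and the package's own types and signatures may differ.

```julia
# Hypothetical stand-ins for illustration; these are NOT the package's types.
struct ToyEvent
    user::Int
    item::Int
    value::Float64
end

struct ToyMostPopular
    events::Vector{ToyEvent}
    scores::Dict{Int,Int}  # item => number of events, filled by toy_fit!
end

# Convenience constructor: start with an empty score table.
ToyMostPopular(events::Vector{ToyEvent}) = ToyMostPopular(events, Dict{Int,Int}())

# "Fitting" here just counts how often each item appears in the events.
function toy_fit!(rec::ToyMostPopular)
    empty!(rec.scores)
    for e in rec.events
        rec.scores[e.item] = get(rec.scores, e.item, 0) + 1
    end
    return rec
end

# Return the k most frequent candidate items as (item, score) pairs.
function toy_recommend(rec::ToyMostPopular, k::Integer, candidates::Vector{Int})
    ranked = sort(candidates, by=i -> get(rec.scores, i, 0), rev=true)
    top = ranked[1:min(k, length(ranked))]
    return [(i, get(rec.scores, i, 0)) for i in top]
end

events = [ToyEvent(1, 10, 1.0), ToyEvent(2, 10, 1.0), ToyEvent(2, 20, 1.0),
          ToyEvent(3, 10, 1.0), ToyEvent(3, 30, 1.0), ToyEvent(1, 30, 1.0)]
rec = toy_fit!(ToyMostPopular(events))
top2 = toy_recommend(rec, 2, [10, 20, 30])
```

Keeping the data container separate from the fitted state mirrors the split between data and recommender on the slide, which is what lets one dataset feed several algorithms.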
  4. 2019 → 2022: Improving usability toward “v1.0.0”

    Data → Pre-process → Build → Evaluate → Post-process
    • Change in interface & variable type, e.g., build!(…) → fit!(…)
    • New algorithms: Factorization machines, Matrix factorization with BPR loss
    • Additional logic for data validation & missing data handling
    • Data downloader: MovieLens, LastFM, Amazon review
    • Synthetic data generator
    • Support non-accuracy metrics: Serendipity, novelty, diversity, coverage
  5. using Recommendation

    data = load_movielens_100k() # Data downloader
    all_items = collect(keys(data.item_attributes))

    fm = FactorizationMachines(data) # New algorithm & interface
    fit!(fm, learning_rate=0.3, max_iter=100)
    recommendations = recommend(fm, user, topk, all_items) # => list of (item, score)

    measure(Coverage(), map(first, recommendations), catalog=all_items) # Non-accuracy evaluation
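
As a rough idea of what a coverage-style measurement computes, the sketch below uses one common definition of catalog coverage: the share of catalog items that appear among the recommendations. The `catalog_coverage` helper and the sample data are assumptions for illustration; the package's `Coverage` metric may be defined or normalized differently.

```julia
# Hypothetical helper: fraction of distinct catalog items that were recommended.
function catalog_coverage(recommended_items, catalog)
    catalog_set = Set(catalog)
    isempty(catalog_set) && return 0.0
    covered = intersect(Set(recommended_items), catalog_set)
    return length(covered) / length(catalog_set)
end

recommendations = [(3, 0.9), (1, 0.7), (3, 0.4)]  # made-up (item, score) pairs
all_items = [1, 2, 3, 4]                          # made-up catalog

cov = catalog_coverage(map(first, recommendations), all_items)
```

A recommender that keeps suggesting the same few items can score well on accuracy and still leave most of the catalog uncovered, which is why this kind of metric is reported separately.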
  6. Accuracy vs. Non-Accuracy Metrics: What is the definition of a “good” recommender?

    Non-accuracy metrics: Coverage, AggregatedDiversity, GiniIndex, IntraListSimilarity, Novelty, Serendipity, ShannonEntropy
    Accuracy metrics: AUC, MAE, MAP, MPR, NDCG, Precision, RMSE, Recall, ReciprocalRank
    Functions: split_data, cross_validation, measure, evaluate
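
For contrast with the non-accuracy side, accuracy metrics compare a recommender's output against held-out ground truth. The sketch below gives textbook versions of three of them; the helper names `precision_at_k`, `recall_at_k`, and `rmse`, and the sample data, are assumptions for this example and are not the package's API.

```julia
# Hypothetical helpers; names and signatures are not taken from the package.

# Number of ground-truth items found among the first k predictions.
hits_at_k(truth, pred, k) = count(in(Set(truth)), pred[1:min(k, length(pred))])

precision_at_k(truth, pred, k) = hits_at_k(truth, pred, k) / k
recall_at_k(truth, pred, k) = hits_at_k(truth, pred, k) / length(truth)

# Root mean squared error between true and predicted values (e.g. ratings).
rmse(truth, pred) = sqrt(sum((truth .- pred) .^ 2) / length(truth))

truth_items = [1, 2, 3]       # items the user actually liked (made up)
ranked_items = [2, 5, 1, 7]   # recommender output, best first (made up)

p = precision_at_k(truth_items, ranked_items, 2)  # 1 hit in the top 2
r = recall_at_k(truth_items, ranked_items, 4)     # 2 of the 3 liked items found
e = rmse([4.0, 2.0], [3.0, 3.0])                  # each prediction is off by 1
```

None of these numbers say anything about how varied or surprising the list is, which is the gap the non-accuracy metrics on this slide are meant to fill.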
  7. Other Tools: How OSS communities design recommender systems

    Recommender | Language | Data modeling and linear algebra | Characteristics
    MyMediaLite | C# | Built-in arithmetic operators with file IOs | Simplicity and transparency of basic recommendation techniques
    LibRec | Java | Custom interfaces (e.g., dense/sparse matrices) & built-in data structures | Wide range of algorithms implemented from scratch
    LensKit / LightFM | Python | NumPy/SciPy & Cython | Rapid development and wider use cases in the Python community
    Recommendation.jl | Julia | Built-in vector/matrix representations | Take full advantage of built-in offering for simplicity and efficiency

    Data model, Algorithm, Interface, Metrics, Utils
  8. More details to be available in proceedings

    julia> using Pkg; Pkg.add("Recommendation")
    github.com/takuti/Recommendation.jl