Recommender System and Next-Gen. Embedding Representations of Users and Items Using Ray and DeepSpeed

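"Vector Forge" is a highly efficient model that uses a content model as its base technology and limits performance degradation over time. It builds on our experience with LINE NEWS recommendation, improves on it, and extends it across services. The talk covers the importance of linking the content model with the sequential recommendation model, and how we built the model evaluation system.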

Transcript

  1. Introduction to the Vector Forge Project: Recommender System and Next-Gen. Embedding Representations of Users and Items Using Ray and DeepSpeed

    LY Corporation, Data Science Group, Katsuya Iida
  2. Agenda

    - Who are we?
    - Current recsys for LINE NEWS
    - Vector Forge - New vector features
    - Vector Forge model architecture
    - Vector Forge Benchmark
    - Summary and future plans
    Ref: Rayleigh: A Management Platform for Ray clusters https://tech-verse.lycorp.co.jp/2025/ja/session/1022/
  3. Who are we?

    • Joint team of ML teams from LINE and Yahoo Japan
    • One of our missions is to deliver company-wide user/item vectors for ML tasks
    • ML engineers experienced with recommender systems
    • I also work on LINE NEWS recommendation
    Introduction of Vector Forge Project
  4. Current recsys for LINE NEWS

    • Who are we?
    • Current recsys for LINE NEWS
    • Vector Forge - New vector features
    • Vector Forge model architecture
    • Vector Forge Benchmark
    • Summary and future plans
  5. Recommendation pipeline

    • Recommendation is based on user logs and items
    • User log data and item data are processed through several stages
    Diagram: User log data, Item data, Feature extraction, Retrieval model, Ranker model, Heuristic post-process, end users. Edge labels: raw features, raw + processed features, item candidates.
    Vector Forge focuses on the feature extraction area
  6. Feature extraction

    Diagram: User log data, Item data, Feature extraction, Retrieval model, Ranker model, Heuristic post-process, end users
  7. Feature extraction

    • Consuming raw features directly is often inefficient
    • Raw features can be high-dimensional and complex
    • Unstructured image/text data and a large number of items
    • We produce user/item embeddings (vectors)
    user_1 : [-6.7, 4.9, ..., 3.8]
    user_2 : [ 4.8, 9.9, ..., -9.7]
    user_3 : [-8.4, -3.8, ..., -4.7]
    user_4 : [ 7.5, 8. , ..., 3.4]
    item_1 : [-9.2, 9.8, ..., 2.9]
    item_2 : [-3.6, 9.2, ..., -1.8]
    item_3 : [-4.8, -7.6, ..., 6.4]
  8. Feature extraction models

    • We have 71 feature extraction models that are used for various tasks including LINE NEWS recommendation
    • Each feature extraction model produces a different vector for each user.
    Training frequency: Every week
    Prediction frequency: Every day
    User/Item data for LINE Gift, LINE Gift model: user_1 : [-6.7, 4.9, ..., 3.8] ...
    User/Item data for LINE Sticker, LINE Sticker model: user_1 : [ 4.8, 9.9, ..., -9.7] ...
    User demography data, User demography model: user_1 : [-8.4, -3.8, ..., -4.7] ...
  9. Retrieval model

    Diagram: User log data, Item data, Feature extraction, Retrieval model, Ranker model, Heuristic post-process, end users
  10. Retrieval model

    • For LINE NEWS, we use an ID-based embedding two-tower DNN architecture with added user vector features, but without item vector features.
    • User/item vectors from the retrieval model are different from the user/item vectors from feature extraction
    Diagram: User log data for LINE NEWS, Item data for LINE NEWS, List of viewed items, Target items, Item embedding, User features (User vector for LINE Official Account, User vector for LINE Ads), User tower, Item tower, User vector, Item vector, Contrastive loss
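
The slide names a contrastive loss between the two towers but does not say which variant is used. The sketch below assumes an in-batch softmax loss on cosine similarity, where every other item in the batch acts as a negative; the function name and the `temperature` parameter are illustrative, not taken from the talk.

```python
import numpy as np


def in_batch_contrastive_loss(user_vecs, item_vecs, temperature=0.1):
    """In-batch softmax contrastive loss (an assumed variant, see lead-in).

    Row i of user_vecs is the user-tower output and row i of item_vecs is
    the item-tower output for the matching target item. All other rows in
    the batch are treated as negatives.
    """
    # L2-normalize so the dot product below is cosine similarity.
    u = user_vecs / np.linalg.norm(user_vecs, axis=1, keepdims=True)
    v = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)

    # Similarity of every user against every item in the batch.
    logits = (u @ v.T) / temperature

    # Numerically stable log-softmax over the items for each user.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive pair for user i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))
```

The loss is small when each user vector is closest to its own target item and grows when the pairs are mismatched.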
  11. Cold item problem

    • We want to recommend the latest news articles, but new articles have too few user interactions.
    • CB2CF: A Neural Multiview Content-to-Collaborative Filtering Model for Completely Cold Item Recommendations [1] but...
    • Text BERT model produces an item vector from text content
    Diagram: Item data (text data), Item text data, Text BERT, Predicted item vector, Item vector from CF model
    [1] https://arxiv.org/pdf/1611.00384
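
The idea of predicting a collaborative-filtering item vector from content can be sketched as a regression problem. The example below is only an illustration: it swaps the Text BERT model for a ridge-regularized linear map over precomputed content features, and the function names and the `l2` parameter are hypothetical.

```python
import numpy as np


def fit_content_to_cf(content_feats, cf_vecs, l2=1e-3):
    """Fit a linear map from content features to CF item vectors.

    A stand-in for the Text BERT model on the slide: warm items supply
    (content feature, CF vector) training pairs.
    """
    dim = content_feats.shape[1]
    gram = content_feats.T @ content_feats + l2 * np.eye(dim)
    # Closed-form ridge regression solution.
    return np.linalg.solve(gram, content_feats.T @ cf_vecs)


def predict_item_vecs(content_feats, weights):
    """Predict CF-space vectors for items that have content but no logs."""
    return content_feats @ weights
```

A cold article has no interactions, so it has no CF vector of its own, but its text still yields content features that can be mapped into the same space.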
  12. Mixing ID-based and CB2CF

    • In the retrieval stage, candidate items from the ID-based model and the CB2CF model are mixed and sent to the ranker stage.
    • The ID-based model is fact-based. It is good at old items
    • The CB2CF model uses text data. It is good at new items
    Diagram: ID-based model, CB2CF model, User vector (ID-based), Item vector (ID-based), Item vector (CB2CF), ANN lookup (ID-based), ANN lookup (CB2CF), Mixed candidates, Ranker
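
The talk does not describe how the two candidate lists are mixed. One simple possibility, shown here purely as an assumption, is to interleave them and drop duplicates. The brute-force `top_k` below stands in for the ANN lookup, and all names are hypothetical.

```python
from itertools import zip_longest

import numpy as np


def top_k(user_vec, item_vecs, item_ids, k):
    """Exact dot-product top-k, used here in place of an ANN index."""
    scores = item_vecs @ user_vec
    order = np.argsort(-scores)[:k]
    return [item_ids[i] for i in order]


def mix_candidates(id_based, cb2cf, limit):
    """Interleave two ranked candidate lists, keeping first occurrences."""
    mixed, seen = [], set()
    for pair in zip_longest(id_based, cb2cf):
        for item in pair:
            if item is not None and item not in seen:
                seen.add(item)
                mixed.append(item)
    return mixed[:limit]
```

For example, `mix_candidates(["a", "b", "c"], ["b", "d"], 4)` alternates between the two sources and keeps `"b"` only once before the result is handed to the ranker.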
  13. Vector Forge - New vector features

    • Who are we?
    • Current recsys for LINE NEWS
    • Vector Forge - New vector features
    • Vector Forge model architecture
    • Vector Forge Benchmark
    • Summary and future plans
  14. Why a new project?

    • Want to reduce training cost
    • Want to use various data
    • Modernize ML framework
  15. ID-based + CB2CF is expensive

    • For accurate recommendation, the ID-based model needs frequent training
    • CB2CF needs training after ID-based
        • CB2CF item vectors depend on ID-based item vectors
        • BERT (CB2CF) is slower than the ID-based MLP-mixer
    Diagram: User log data, Item data, ID-based model, CB2CF model, User vector (ID-based), Item vector (ID-based), Item vector (CB2CF). Annotations: Expensive to train, ID-based → CB2CF training dependency, Needs frequent training, Updated frequently
  16. Reverse the dependency

    Content model inherently needs less frequent training because the meanings of text/images don't change overnight.
    As is: User log data, Item data, ID-based model, CB2CF model, User vector, Item vector (ID-based), Item vector (CB2CF), ID-based → CB2CF training dependency
    New: Vector Forge: Item data, Content model, Item vector, User log data, User model, User vector
  17. Want to use various data

    • CB2CF and ID-based work on a single vector space.
    • It is difficult to use other kinds of vectors (SLM/LLM/BERT/CLIP, etc.).
    • It is easier if the dependency is reversed.
    Diagram: single vector space, text vector space, vision vector space, User model (ID-based), Item model (ID-based), Item model (CB2CF), Item model (SLM/LLM/BERT), Item model (CLIP), User model (Vector Forge)
  18. Modernize ML framework

    • We used to use an in-house ML framework [1]
    • It works well with traditional batch DNN training/prediction
    • Simple runtime preprocessing is possible
    • The old framework won't go away soon, but we are trying something new...
    [1] https://linedevday.linecorp.com/2020/en/sessions/9750/
  19. Modernize ML framework

    • Ray + DeepSpeed
        • Supports distributed training natively
        • Supports libraries for SLM/LLM training like DeepSpeed
        • Complicated preprocessing can be written easily
            • Splitting time series into input/target pair
            • Unstructured text processing
            • LM masking of BERT
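
As an illustration of the "splitting time series into input/target pair" step, here is a minimal plain-Python sketch. The event layout `(timestamp, item_id)` and the choice to hold out the last `num_targets` events are assumptions made for the example; a function like this could be applied per user record inside a distributed preprocessing job, but that wiring is not shown.

```python
def split_input_target(events, num_targets):
    """Split one user's events into an input history and target items.

    events: iterable of (timestamp, item_id) tuples in any order.
    Returns (input_items, target_items), both in time order, where the
    targets are the num_targets most recent events.
    """
    ordered = sorted(events, key=lambda event: event[0])
    if not 0 < num_targets < len(ordered):
        raise ValueError("num_targets must leave at least one input event")
    items = [item_id for _, item_id in ordered]
    return items[:-num_targets], items[-num_targets:]
```

Sorting inside the function keeps it safe to call on log records that arrive out of order.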
  20. Vector Forge model architecture

    • Who are we?
    • Current recsys for LINE NEWS
    • Vector Forge - New vector features
    • Vector Forge model architecture
    • Vector Forge Benchmark
    • Summary and future plans
  21. Less frequent CB2CF training

    • We train the ID-based and CB2CF models once with longer user context
        • Get all user vectors from the ID-based model
        • Get all item vectors from the CB2CF model
    • The user model doesn't use item vectors from the ID-based model
    Diagram: ID-based model, CB2CF model, User vector (ID-based), Item vector (ID-based), Item vector (CB2CF), User model
  22. UIECF user model

    • What is UIECF (User and Item Embedding models to Collaborative Filtering)?
    • Input is a sequence of item vectors, output is a user vector.
    Diagram: Item vector (CB2CF) inputs I1 I2 I3 I4 I5 I6, Transformer encoder, outputs O1 O2 O3 O4 O5 O6, Pooling layer, U, Regression loss, User vector (ID-based)
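
The shape of this computation can be sketched in NumPy. This is not the UIECF implementation: a single unmasked self-attention layer stands in for the Transformer encoder, positional information is omitted, and mean pooling and a mean-squared-error loss are assumptions, since the slide only says "Pooling layer" and "Regression loss".

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def encode_sequence(item_vecs, w_q, w_k, w_v):
    """One self-attention layer with no causal mask (bi-directional).

    item_vecs: (seq_len, dim) content-based item vectors I1..In.
    Returns (seq_len, dim_out) outputs O1..On.
    """
    q, k, v = item_vecs @ w_q, item_vecs @ w_k, item_vecs @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)
    return attn @ v


def user_vector(item_vecs, w_q, w_k, w_v):
    """Pool the per-step outputs into a single user vector U."""
    return encode_sequence(item_vecs, w_q, w_k, w_v).mean(axis=0)


def regression_loss(predicted_user_vec, target_user_vec):
    """MSE between U and the target user vector from the ID-based model."""
    return float(np.mean((predicted_user_vec - target_user_vec) ** 2))
```

Because every position attends to every other position and the outputs are pooled, the model emits one vector per sequence rather than one per step.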
  23. Related works

    • [SASRec] Self-Attentive Sequential Recommendation [1]
        • SASRec is a causal model (like GPT). One output per step
    • [BERT4Rec] Sequential Recommendation with Bidirectional Encoder Representations from Transformer [2]
        • BERT4Rec is a masked model. One output per step
    • [PinnerFormer] Sequence Modeling for User Representation at Pinterest [3]
        • PinnerFormer uses a large window (28 days) as target
    • [Our method: UIECF]
        • UIECF is bi-directional. One output per sequence
        • UIECF is not end-to-end.
    [1] https://arxiv.org/pdf/1808.09781
    [2] https://arxiv.org/pdf/1904.06690
    [3] https://arxiv.org/pdf/2205.04507
  24. Spark, Ray and DeepSpeed

    • Very complicated preprocessing → Spark
    • Complicated preprocessing → Ray
    • Larger model size → DeepSpeed
    • Trained on 2 nodes with A100 PCIe or 1 node with 8 A100 SXM4
    Diagram: User log data, Item data, Apache Spark [1] (Feature extraction), Ray [2] (Last-mile preprocessing), DeepSpeed [3] (Distributed training), Trained model
    [1] https://spark.apache.org/
    [2] https://github.com/ray-project/ray
    [3] https://github.com/deepspeedai/DeepSpeed
  25. DeepSpeed performance

    Columns: Model | Baseline (80GB PCIe x 2) | ZeRO Stage 2 (80GB PCIe x 2) | ZeRO Stage 3 (80GB PCIe x 2) | Baseline (40GB SXM4 x 8) | ZeRO Stage 2 (40GB SXM4 x 8) | ZeRO Stage 3 (40GB SXM4 x 8)
    GPTX Neo 1B: 12.49s/it, OOM, 3.10s/it, 3.09s/it
    GPTX Neo 3B: 27.70s/it, OOM, 6.62s/it, 6.36s/it
    GPTX Neo 7B: OOM, SEGV, SEGV, OOM, 13.80s/it, 13.62s/it
    Models are trained with a contrastive task
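
For readers unfamiliar with the ZeRO stages compared above, the stage is selected in the DeepSpeed configuration. The sketch below builds such a configuration as a Python dict. The batch size, accumulation steps, and bf16 setting are placeholder assumptions, not the values used in the talk, and `make_zero_config` is a hypothetical helper.

```python
import json


def make_zero_config(stage, micro_batch_size=8):
    """Build a minimal DeepSpeed-style config dict for a given ZeRO stage.

    Stage 2 partitions optimizer state and gradients across GPUs.
    Stage 3 additionally partitions the model parameters.
    All numeric values here are illustrative placeholders.
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("ZeRO stage must be 0, 1, 2, or 3")
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": 1,
        "bf16": {"enabled": True},
        "zero_optimization": {"stage": stage},
    }


if __name__ == "__main__":
    print(json.dumps(make_zero_config(3), indent=2))
```

A dict like this is typically saved as JSON or passed as the training configuration when the DeepSpeed engine is created.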
  26. Vector Forge Benchmark

    • Who are we?
    • Current recsys for LINE NEWS
    • Vector Forge - New vector features
    • Vector Forge model architecture
    • Vector Forge Benchmark
    • Summary and future plans
  27. Vector Forge Benchmark

    • We want to evaluate our vectors with downstream tasks.
    • Reproducible evaluation with train/test dataset
    • Train/test dataset ETL
    Diagram: User log data, Item data, Train/test ETL, Benchmark train data, Benchmark test data, Split date (e.g. 2024110100)
    For Vector Forge train/test:
        • LINE NEWS retrieval/i2i
        • LINE GIFT (JP)
        • LINE Ads
        ...
    For eval task train/test:
        • LINE NEWS ranker
        • Yahoo! JAPAN Shopping recommendation
        • User demography prediction (age, gender, ...)
        ...
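
A split date such as `2024110100` suggests a time-based train/test split. The sketch below assumes the value is formatted as year, month, day, hour, that each record carries a timestamp in the same format under a hypothetical `"ts"` key, and that records at or after the split date go to the test set.

```python
from datetime import datetime

SPLIT_FORMAT = "%Y%m%d%H"  # assumed layout of values like 2024110100


def split_by_date(records, split_date):
    """Split records into train (before split_date) and test (at or after).

    records: iterable of dicts with a "ts" string in SPLIT_FORMAT.
    """
    boundary = datetime.strptime(split_date, SPLIT_FORMAT)
    train, test = [], []
    for record in records:
        ts = datetime.strptime(record["ts"], SPLIT_FORMAT)
        (train if ts < boundary else test).append(record)
    return train, test
```

Fixing the split date is what makes the evaluation reproducible: rerunning the ETL with the same date yields the same train and test sets.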
  28. Benchmark evaluators

    • Predictions from downstream tasks are evaluated
    • Evaluators output metrics
    Diagram: Vector Forge model, Benchmark train data, User vector, Item vector, Downstream tasks, Task prediction, Benchmark test data, Benchmark Evaluators
    Benchmark Evaluators:
        • ndcg
        • entropy
        • accuracy
        • f1_macro
        • ...
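
As a reference for the `ndcg` metric listed above, here is a minimal sketch of NDCG@k using the standard log2 discount. It assumes the ideal ranking is computed from the same list of relevances, so relevant items missing from the list are not counted; the benchmark's own evaluator may differ.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k positions."""
    return sum(
        rel / math.log2(rank + 2)  # rank is 0-based, so position 1 divides by log2(2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(ranked_relevances, k):
    """NDCG@k for relevances listed in the order the model ranked the items."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal
```

A ranking that places the only relevant item first scores 1.0, and the score decays as that item moves down the list.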
  29. Evaluation of aging degradation

    • An old version of UIECF was used for this comparison
    • Trained with old data but less degradation
    Plot: CB2CF full ndcg@5 and UIECF full ndcg@5, y-axis from 0 to 0.025, x-axis from 20230624 to 20240624
  30. LINE NEWS A/B

    • LINE NEWS A/B test is planned in Jun...
  31. Where we are

    • UIECF was evaluated with Vector Forge Benchmark
    • Training data
        • User log of LINE NEWS (Not yet multi-domain)
        • Text and image of items (Multi-modal)
    • Test data
        • LINE NEWS recommendation
        • And other downstream tasks (Multi-domain)
    • But models with image data have not been studied much (Not yet multi-modal)
    • A/B test with LINE NEWS is ongoing
  32. Summary and future plans

    • New model for feature vectors by Vector Forge
        • Use content features (text/vision) for item vectors
        • Reduced training cost by reversed dependency
    • Future plans
        • Better model for multi-domain recommendation
        • Further reduce the cost of model training
        • Evaluate multi-modal / multi-modal item model