
Recent Advances in Candidate Matching

wing.nus
March 13, 2023

Abstract: Candidate matching, i.e., retrieving potentially relevant candidate items for users for further ranking, is highly important: even the cleverest housewife cannot bake bread without flour. While many research efforts are devoted to the ranking stage, few works focus on the matching stage, where the room for more sophisticated models seems limited by the requirement of low computation cost, especially in an industrial system. In this talk, we will first introduce the background knowledge for this task. Then, we will present an overview of recent efforts along this line. Finally, we will discuss possible directions for the near future.

Speaker Bio: Dr. Chenliang Li is a Professor at the School of Cyber Science and Engineering, Wuhan University. His research interests include information retrieval, natural language processing, and social media analysis. He has published over 90 research papers in leading academic conferences and journals such as SIGIR, ACL, WWW, IJCAI, AAAI, TKDE, and TOIS. He has served as Associate Editor / Editorial Board Member for ACM TOIS, ACM TALLIP, IPM, and JASIST. His research won the SIGIR 2016 Best Student Paper Honorable Mention and a TKDE Featured Spotlight Paper.



Transcript

1. Introduction (Part 1, slide 2): We are now living in an age of INFORMATION EXPLOSION. Search, recommendation, chatbot: information seeking happens every day, for everyone, almost everywhere.

2. Introduction (Part 1, slide 3): The two-stage pipeline is widely deployed in real-world systems: candidate matching (Phase 1) narrows all items (millions) down to hundreds with matching models, then ranking (Phase 2) narrows those down to tens with ranking models. Data sources: user history and contexts, plus all other side info. "Candidate matching", "retrieval", and "generation" are used interchangeably.
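The two-tower split behind this pipeline can be sketched in a few lines: item embeddings are precomputed offline by the item encoder, so serving only runs the user/query encoder plus a fast inner-product search. A minimal NumPy sketch (the random-projection "towers", dimensions, and item counts are illustrative assumptions, not details from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "towers": real systems use DNNs; fixed random projections stand in here.
W_user = rng.normal(size=(8, 4))   # user/query tower weights (illustrative)
W_item = rng.normal(size=(8, 4))   # item tower weights (illustrative)

def encode(x, W):
    """Project raw features into the shared space and L2-normalize."""
    v = x @ W
    return v / np.linalg.norm(v)

# Item embeddings are computed OFFLINE, which is what keeps serving latency low.
items = rng.normal(size=(1000, 8))                  # 1000 candidate items
item_emb = np.stack([encode(i, W_item) for i in items])

# At serving time only the user tower runs, then a fast inner-product search.
user = rng.normal(size=8)
scores = item_emb @ encode(user, W_user)
top_k = np.argsort(-scores)[:10]                    # candidates passed to ranking
```

In production the exhaustive argsort would be replaced by an approximate nearest-neighbor index (e.g., IVF or HNSW) built over the same item embeddings.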
3. Introduction (Part 1, slide 4): The same two-stage pipeline, with one extra requirement highlighted: matching models must run at LOW LATENCY. A query encoder and an item encoder (e.g., for the query "starbucks drinks") each produce a vector, and the match score S is a cheap similarity between them.
4. Introduction (Part 1, slide 5): Two open questions for the matching stage: how to meet the LOW LATENCY requirement, and how to use context and side information (user history and contexts, all other side info).
5. Introduction (Part 1, slide 6): The HISTORY of candidate matching: item-based CF, then representation learning, then interaction-based learning. Before 2013: Pearson-based CF, SLIM (i.e., MF). Now: DSSM, YoutubeDNN, BST, DPR, ESAM, Condenser, MADR, MIND, KEMI (representation learning) and UMI, PDN, AGREE (interaction-based learning).

6. Introduction (Part 1, slide 7): The HISTORY of candidate matching: item-based CF. Items A, B, and C with high correlation; item-based filtering (Amazon, 2001) relies on the Pearson coefficient, while SLIM (i.e., MF) relies on latent vectors.
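Pearson-based item CF, the pre-2013 baseline named here, scores item pairs by the correlation of their rating columns. A toy sketch (the rating matrix is made up for illustration):

```python
import numpy as np

# Toy user-item rating matrix: rows = users, columns = items A, B, C.
R = np.array([[5.0, 4.0, 1.0],
              [4.0, 5.0, 2.0],
              [1.0, 2.0, 5.0],
              [2.0, 1.0, 4.0]])

# Pearson correlation between item columns; items that are rated alike
# (A and B here) get a high coefficient and are matched to each other.
corr = np.corrcoef(R.T)
# corr[0, 1] = 0.8 (A and B move together); corr[0, 2] = -1.0 (A and C oppose)
```

Items highly correlated with what a user already liked become that user's candidates, which is exactly the item-based filtering idea from the slide.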
7. Introduction (Part 1, slide 8): The HISTORY of candidate matching: representation learning with DNN/Transformer models (DSSM, YoutubeDNN, BST, DPR, ESAM, Condenser, MADR, MIND, KEMI) and multi-interest learning.

8. Introduction (Part 1, slide 9): The HISTORY of candidate matching: interaction-based learning (UMI, PDN, AGREE). How to enable efficient interaction-based feature learning?
9. Overview - Representation Learning - Deep DNN (Part 2, slide 10): Computation parallelization and speedup are a must for dense retrieval. (Deep Neural Networks for YouTube Recommendations, RecSys 2016; Learning Deep Structured Semantic Models for Web Search using Clickthrough Data, CIKM 2013)

10. Overview - Representation Learning - Multi-Interest Learning (Part 2, slide 11): To model the multi-aspect / multi-interest nature of the world. (Controllable Multi-Interest Framework for Recommendation, KDD 2020; Multi-Aspect Dense Retrieval, KDD 2022)
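The multi-interest idea can be illustrated as: cluster the user's history into K interest vectors and let a candidate item match the best single interest rather than one averaged profile. A simplified sketch with naive clustering (real systems such as MIND/ComiRec use capsule routing or self-attention; all numbers here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# User history as behavior embeddings.
history = rng.normal(size=(50, 16))

# Extract K interest vectors by naive clustering of the history.
K = 4
centers = history[rng.choice(len(history), size=K, replace=False)].copy()
for _ in range(10):
    assign = np.argmax(history @ centers.T, axis=1)  # nearest interest by dot product
    for k in range(K):
        members = history[assign == k]
        if len(members):
            centers[k] = members.mean(axis=0)

# Score each candidate against ALL interests and keep the best match, so an
# item only needs to fit one interest to be retrieved.
items = rng.normal(size=(200, 16))
scores = (items @ centers.T).max(axis=1)
top = np.argsort(-scores)[:20]
```

The max-over-interests scoring is what lets a user who likes both, say, cameras and cooking retrieve strong candidates from either interest.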
11. Overview - Representation Learning - GNN (Part 2, slide 12): To exploit high-order semantics and correlations. (Neural Graph Collaborative Filtering, SIGIR 2019; LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation, SIGIR 2020)
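LightGCN's propagation, cited above, drops the feature transforms and nonlinearities of a standard GCN: each layer just multiplies the embeddings by the symmetrically normalized adjacency, and the layer outputs are averaged. A small NumPy sketch with a toy interaction matrix (sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy bipartite interactions: 4 users x 5 items.
R = (rng.random((4, 5)) > 0.5).astype(float)
n_u, n_i = R.shape

# Build the full (users + items) adjacency and normalize it: D^-1/2 A D^-1/2.
A = np.zeros((n_u + n_i, n_u + n_i))
A[:n_u, n_u:] = R
A[n_u:, :n_u] = R.T
d = A.sum(axis=1)
d_inv_sqrt = np.zeros_like(d)
nz = d > 0
d_inv_sqrt[nz] = d[nz] ** -0.5
A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

# LightGCN: no weight matrices, no activations -- propagate, then average layers.
E0 = rng.normal(size=(n_u + n_i, 8))   # learnable ID embeddings in practice
layers = [E0]
for _ in range(3):
    layers.append(A_hat @ layers[-1])
E = np.mean(layers, axis=0)
user_emb, item_emb = E[:n_u], E[n_u:]  # matching score = user_emb @ item_emb.T
```

Each extra propagation layer mixes in higher-order neighbors (items of similar users, users of similar items), which is the "high-order semantics" the slide refers to.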
12. Overview - Representation Learning - GNN + Multi-Interest (Part 2, slide 13): To exploit the benefits of both GNN and multi-interest learning: (1) graph convolution and aggregation refine user preferences based on multi-level correlations between historical items; (2) multi-interest learning extracts different interests by clustering historical items. (When Multi-Level Meets Multi-Interest: A Multi-Grained Neural Model for Sequential Recommendation, SIGIR 2022)

13. Overview - Representation Learning - GNN + Multi-Interest (Part 2, slide 14): (When Multi-Level Meets Multi-Interest: A Multi-Grained Neural Model for Sequential Recommendation, SIGIR 2022)
14. Overview - Representation Learning - Long-tail Problem (Part 2, slide 15): Non-displayed items cause exposure bias: models are trained only with displayed (labeled) items, yet must retrieve items from the entire space, displayed and non-displayed alike. (ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2021)

15. Overview - Representation Learning - Long-tail Problem (Part 2, slide 16): Why is long-tail performance poor? Cause: domain shift, and representation learning that is not robust and consistent. How to solve it: unsupervised domain adaptation to reduce the domain shift. (ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2021)
16. Overview - Representation Learning - Long-tail Problem (Part 2, slide 17): Motivation: architectural solutions may not generalize well, which highlights the importance of learning good feature representations for non-displayed items. ESAM takes a ranking model as its backbone and applies unsupervised domain adaptation, treating displayed items as the source domain and non-displayed items as the target domain, to reduce domain shift (formal definition on the slide). (ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2021)

17. Overview - Representation Learning - Long-tail Problem (Part 2, slide 18): Domain shift is handled by an attribute correlation alignment loss L_DA between the source domain (displayed, labeled items) and the target domain (non-displayed items), based on a high-level attribute distribution definition and a verification of the distribution gap. (ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2021)
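A correlation alignment loss like L_DA can be sketched, CORAL-style, as the squared Frobenius distance between the feature covariance matrices of the displayed (source) and non-displayed (target) items. This is an illustrative formulation, not necessarily ESAM's exact loss:

```python
import numpy as np

rng = np.random.default_rng(3)

def correlation_alignment(src, tgt):
    """CORAL-style alignment: squared Frobenius distance between covariances."""
    def cov(X):
        Xc = X - X.mean(axis=0, keepdims=True)
        return (Xc.T @ Xc) / (len(X) - 1)
    diff = cov(src) - cov(tgt)
    return float((diff ** 2).sum()) / src.shape[1] ** 2

displayed = rng.normal(size=(200, 16))            # source: displayed items
non_displayed = 2.0 * rng.normal(size=(200, 16))  # target: shifted statistics
l_da = correlation_alignment(displayed, non_displayed)  # > 0: domains differ
```

Minimizing such a term during training pushes the two domains' second-order feature statistics together, which is the "reduce domain shift" step on the slide.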
18. Overview - Representation Learning - Long-tail Problem (Part 2, slide 19): Center-wise clustering for the source domain, loss L_DC^c. Combined with attribute correlation alignment, L_DC^c makes similar items cohere while dissimilar items separate from each other; items with the same feedback (click) are treated as similar. (ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2021)
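A center-wise clustering term in the spirit of L_DC^c pulls source items toward the center of their feedback class (clicked vs. not clicked) and pushes different centers apart; the hinge margin below is an illustrative choice, not the paper's exact formulation:

```python
import numpy as np

def center_wise_clustering_loss(emb, labels, margin=1.0):
    """Pull items toward their feedback-class center; push centers apart."""
    centers = {c: emb[labels == c].mean(axis=0) for c in np.unique(labels)}
    # Intra-class term: items with the same feedback should cohere.
    intra = sum(float(((emb[labels == c] - mu) ** 2).sum(axis=1).mean())
                for c, mu in centers.items())
    # Inter-class term: hinge separating different centers by at least `margin`.
    cs = list(centers.values())
    inter = sum(max(0.0, margin - float(np.linalg.norm(cs[i] - cs[j]))) ** 2
                for i in range(len(cs)) for j in range(i + 1, len(cs)))
    return intra + inter

rng = np.random.default_rng(4)
emb = rng.normal(size=(20, 8))
labels = np.array([1] * 10 + [0] * 10)   # clicked vs. not clicked
loss = center_wise_clustering_loss(emb, labels)
```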
19. Overview - Representation Learning - Long-tail Problem (Part 2, slide 20): Self-training for target clustering, loss L_DC^p, to suppress negative transfer via an easy-to-hard strategy. Why: alignment alone ignores target label information. Implemented as entropy regularization. (ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2021)
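Entropy regularization with an easy-to-hard filter can be sketched as: only confident target predictions contribute, and their entropy is minimized so the model sharpens them, which is one way to suppress negative transfer (the threshold value is an illustrative assumption):

```python
import numpy as np

def entropy_regularizer(probs, threshold=0.7, eps=1e-12):
    """Mean prediction entropy over confident target samples only.

    Samples whose max probability is below `threshold` are skipped
    (easy-to-hard), so uncertain predictions cannot drive negative transfer.
    """
    probs = np.asarray(probs, dtype=float)
    mask = probs.max(axis=1) >= threshold
    if not mask.any():
        return 0.0
    p = probs[mask]
    return float(-(p * np.log(p + eps)).sum(axis=1).mean())

probs = np.array([[0.9, 0.1],     # confident -> its entropy is penalized
                  [0.55, 0.45]])  # uncertain -> ignored for now
reg = entropy_regularizer(probs)  # entropy of [0.9, 0.1] only, about 0.325
```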
20. Overview - Representation Learning - Auxiliary Knowledge (Part 2, slide 21): Linking items with their attributes, with knowledge-aware attention for information propagation. (KGAT: Knowledge Graph Attention Network for Recommendation, KDD 2019)

21. Overview - Representation Learning - Auxiliary Knowledge - MMoE (Part 2, slide 22): The sharing app in Taobao (UV ~19M, GMV ~400M). Target: more sharing between users on the platform, more interactions, and more friends. Features: many different sharing scenarios, with both correlations and discrepancies. Questions: (1) How to exploit scenario-dependent knowledge? (2) How to handle the low-resource nature of long-tail scenarios? (3) How to accommodate the social relations between users? (Heterogeneous Graph Augmented Multi-Scenario Sharing Recommendation with Tree-Guided Expert Networks, WSDM 2021)
22. Overview - Representation Learning - Auxiliary Knowledge - MMoE (Part 2, slide 23): TreeMMoE: we can build a tree structure to model the hierarchical relations between different scenarios, e.g., "C2C → Sharing → Entity → Product → Makeups". Fine-grained scenarios share common ancestor scenarios, so knowledge can be transferred between them. (Heterogeneous Graph Augmented Multi-Scenario Sharing Recommendation with Tree-Guided Expert Networks, WSDM 2021)

23. Overview - Representation Learning - Auxiliary Knowledge - MMoE (Part 2, slide 24): TreeMMoE: each layer of the tree has a gate for the corresponding expert network. (Heterogeneous Graph Augmented Multi-Scenario Sharing Recommendation with Tree-Guided Expert Networks, WSDM 2021)
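The "one gate per tree layer" idea can be sketched as an MMoE-style mixture applied once per layer of the scenario tree; the tiny linear experts and the three-layer path below are illustrative stand-ins, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(5)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

D, H, n_experts = 16, 8, 4
# Shared experts (tiny linear maps here; MLPs in the real model).
experts = [rng.normal(size=(D, H)) for _ in range(n_experts)]
# One gate per tree layer, e.g. the path "Sharing -> Entity -> Product":
# shallow layers carry shared knowledge, deep layers specialize.
gates = [rng.normal(size=(D, n_experts)) for _ in range(3)]

x = rng.normal(size=D)        # features for one request
h = np.zeros(H)
for gate in gates:            # one expert mixture per tree layer
    w = softmax(x @ gate)     # gate weights over experts, sum to 1
    h += sum(wi * (x @ E) for wi, E in zip(w, experts))
# h now aggregates layer-wise expert mixtures along the scenario path.
```

Because ancestor layers are shared by all descendant scenarios, a long-tail scenario still benefits from gates trained on its better-resourced siblings.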
24. Overview - Representation Learning - Sum Up (Part 2, slide 25): Summing up representation learning: two-tower DNN, multi-interest learning, the long-tail problem, and auxiliary knowledge.

25. Overview - Interaction-based Learning - Attention Mechanism (Part 2, slide 26): The attention mechanism (target attention) highlights the features relevant to the candidate item. (Deep Interest Network for Click-Through Rate Prediction, KDD 2018)
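Target attention, as in DIN-style models, weights each historical behavior by its relevance to the candidate item before pooling. A minimal sketch (the softmax pooling here is a simplification of the paper's attention unit):

```python
import numpy as np

rng = np.random.default_rng(6)

def target_attention(history, target):
    """Weight each historical behavior by its relevance to the candidate."""
    logits = history @ target            # behavior-vs-target relevance
    w = np.exp(logits - logits.max())
    w /= w.sum()                         # softmax attention weights
    return w @ history                   # weighted pooling -> user vector

history = rng.normal(size=(30, 16))      # 30 behavior embeddings
target = rng.normal(size=16)             # candidate item embedding
user_vec = target_attention(history, target)
score = float(user_vec @ target)         # fed to the downstream predictor
```

Note the cost: the user vector depends on the target item, so it cannot be precomputed per user, which is exactly why the following slides ask how to make interaction-based learning efficient.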
26. Overview - Interaction-based Learning - User Side (Part 2, slide 27): Identify important user features / behaviors for a better user representation. (User-Aware Multi-Interest Learning for Candidate Matching in Recommenders, SIGIR 2022)

27. Overview - Interaction-based Learning - Item Side (Part 2, slide 28): Identify important item features / behaviors for precise item-item relevance. (Path-based Deep Network for Candidate Item Matching in Recommenders, SIGIR 2021)

28. Overview - Interaction-based Learning - Optimization (Part 2, slide 29): Enable target attention in an efficient way. (Fast Semantic Matching via Flexible Contextualized Interaction, WSDM 2022)
29. Overview - Interaction-based Learning - Sum Up (Part 2, slide 30): Summing up interaction-based learning: the user side, the item side, and reducing computation cost.
30. Future Trends - One Model Serves ALL (Part 3, slide 31): Automatic network sharing and optimization across the Alipay app. Scenarios: category search, insurance search, hot trends, item search, live commerce. 5M+ content matrices (livestreams/videos); complicated entity relations (X00+ insurance products, 8K+ funds, 80K+ stocks across US/HK/A-share markets, 4K+ agents, 100+ financial organizations, XXX sections); diverse queries and intents over category search, section/concept search, fund search (price), hot trends, and QA (e.g., liquor/defense/materials sectors, fund brands such as 富/广发/鹏华, life/auto/property/health insurance, newly launched/popular/featured picks). (Automatic Expert Selection for Multi-Scenario and Multi-Task Search, SIGIR 2022)

31. Future Trends - One Model Serves ALL (Part 3, slide 32): One model serves ALL: automatic network sharing and optimization over the same Alipay scenario landscape as the previous slide. (Automatic Expert Selection for Multi-Scenario and Multi-Task Search, SIGIR 2022)
32. Future Trends - One Model Serves ALL (Part 3, slide 33): One model serves ALL via automatic network sharing and optimization: personalization for each instance; modeling complex scenarios and tasks; flexible and scalable; end-to-end and low cost. (Automatic Expert Selection for Multi-Scenario and Multi-Task Search, SIGIR 2022)
33. Future Trends - Go Beyond Two-Tower (Part 3, slide 34): REAL interaction-based learning via coupling and decoupling. Full query-item interaction (e.g., for the query "starbucks drinks" with score S) is expensive but effective. (Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023)

34. Future Trends (Part 3, slide 35): REAL interaction-based learning, coupling and decoupling: expensive but effective. Can we transfer the capacity of interaction-based learning to the inference phase? (Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023)

35. Future Trends (Part 3, slide 36): REAL interaction-based learning: coupling for training, decoupling for inference, which is cheap yet effective. (Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023)
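One way to picture coupling-for-training / decoupling-for-inference: during training the item representation may interact with the query (expensive), while for serving each item is frozen into a single query-independent embedding so retrieval stays a plain inner product. The gating and averaging below are hypothetical illustrations, not AGREE's actual mechanism:

```python
import numpy as np

rng = np.random.default_rng(7)
D = 16

# --- Training (coupled): the item side may look at the query (expensive). ---
def coupled_item_repr(item, query):
    gate = 1.0 / (1.0 + np.exp(-(item @ query)))  # scalar interaction gate
    return gate * item + (1.0 - gate) * query     # query-aware item vector

# --- Inference (decoupled): freeze one query-independent item embedding. ---
# Here we crudely average coupled representations over training queries;
# the open question is how to train so this decoupling loses little accuracy.
item = rng.normal(size=D)
train_queries = rng.normal(size=(100, D))
static_item = np.mean([coupled_item_repr(item, q) for q in train_queries], axis=0)

# Serving is again a plain two-tower inner product against the static vector.
query = rng.normal(size=D)
score = float(query @ static_item)
```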
36. Future Trends (Part 3, slide 37): Coupling for training, decoupling for inference, cheap yet effective. How do we design an appropriate coupling mechanism that supports effective representation learning and easy decoupling afterwards for the inference phase? (Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023)

37. Future Trends (Part 3, slide 38): REAL interaction-based learning, coupling and decoupling: an attribute fusion layer with attribute-aware learning. (Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023)

38. Future Trends (Part 3, slide 39): The most relevant attributes are more diverse for AGREE than for a vanilla two-tower solution without attribute-aware learning. (Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023)

39. Future Trends (Part 3, slide 40): The most relevant attributes are more diverse for AGREE than for a vanilla two-tower solution without attribute-aware learning. More sophisticated coupling and decoupling mechanisms deserve investigation. (Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023)
40. We need YOU (Part 4, slide 42): The most beautiful campus in China. Open positions: National Excellent Young Scientists Fund (Overseas) posts, tenured Professor, tenured Associate Professor, Distinguished Research Fellow, and Distinguished Associate Research Fellow. Faculty positions at every level are available!
41. Let us QA! [email protected] http://lichenliang.net/ (slide 43) The End. A SIMPLE overview of the current progress on candidate matching. LOW LATENCY is a MUST (we need ARTs for both effectiveness and efficiency, because we human beings are GREEDY). Some insights on the future trends. Hope some of you can join us!