Recent Advances in Candidate Matching

wing.nus
March 13, 2023

Abstract: Candidate matching, i.e., retrieving potentially relevant candidate items for users before further ranking, is highly important, since even the cleverest housewife cannot bake bread without flour. While many research efforts are devoted to the ranking stage, few works focus on the matching stage, where the room for more sophisticated models seems limited by the requirement of low computation cost, especially in an industrial system. In this talk, we will first introduce the background knowledge for this task. Then we will present an overview of recent efforts in this line of work. Finally, we will discuss possible directions for the near future.

Speaker Bio: Dr. Chenliang Li is a Professor at the School of Cyber Science and Engineering, Wuhan University. His research interests include information retrieval, natural language processing and social media analysis. He has published over 90 research papers in leading academic conferences and journals such as SIGIR, ACL, WWW, IJCAI, AAAI, TKDE and TOIS. He has served as Associate Editor / Editorial Board Member for ACM TOIS, ACM TALLIP, IPM and JASIST. His research won the SIGIR 2016 Best Student Paper Honorable Mention and TKDE Featured Spotlight Paper.

Transcript

  1. Recent Advances on Candidate Matching
    Chenliang LI
    Wuhan University
    2023.03.02

  2. Introduction
    We are now living in an age of INFORMATION EXPLOSION
    Search, Recommendation, Chatbot
    Information seeking happens every day, for everyone, almost everywhere

  3. Introduction
    The two-stage pipeline is widely deployed in real-world systems
    All Items (Millions) → Phase 1: Candidate Matching (Matching Models) → Hundreds → Phase 2: Ranking (Ranking Models) → Tens
    Data Sources: User History and Contexts; All other Side Info
    Candidate Matching / Retrieval / Generation are equivalent to each other

  4. Introduction
    The two-stage pipeline is widely deployed in real-world systems (same pipeline diagram as slide 3)
    Matching Models: LOW LATENCY
    Diagram labels: Query Encoder, Item Encoder, query "starbucks drinks", S
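
The funnel on this slide can be sketched in a few lines. This is a minimal illustration, not the speaker's system: the function names `match` and `rank`, the toy 2-d vectors, the made-up ranking scores, and the use of a plain inner product between precomputed query and item embeddings are all assumptions made for the example.

```python
import heapq
from typing import Callable, Dict, List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def match(query_vec: Sequence[float],
          item_vecs: Dict[str, Sequence[float]],
          k: int) -> List[str]:
    """Phase 1 (hypothetical): score every item with a cheap inner product, keep the top k."""
    scored = ((dot(query_vec, vec), item_id) for item_id, vec in item_vecs.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


def rank(candidates: List[str],
         expensive_score: Callable[[str], float],
         n: int) -> List[str]:
    """Phase 2 (hypothetical): run a costlier scorer on the few surviving candidates only."""
    return sorted(candidates, key=expensive_score, reverse=True)[:n]


# Toy data: item vectors stand in for the output of an item encoder.
item_vecs = {"latte": [1.0, 0.0], "mocha": [0.9, 0.1], "laptop": [0.0, 1.0]}
query_vec = [1.0, 0.0]  # stands in for the encoded query "starbucks drinks"

candidates = match(query_vec, item_vecs, k=2)
ranker_scores = {"latte": 0.2, "mocha": 0.7}  # made-up ranking-model outputs
final = rank(candidates, lambda item_id: ranker_scores[item_id], n=1)
```

In practice the linear scan in `match` is usually replaced by an approximate nearest-neighbour index over the item vectors, which is one reason the inner-product form suits a tight latency budget.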

  5. Introduction
    The two-stage pipeline is widely deployed in real-world systems (same pipeline diagram as slide 4)
    Matching Models: LOW LATENCY (Query Encoder, Item Encoder)
    Data Sources: User History and Contexts; All other Side Info
    LOW LATENCY: How to Use Context and Side Info

  6. Introduction
    The HISTORY for candidate matching (timeline from "Before 2013" to "Now")
    • Item-based CF: Pearson-based CF, SLIM (i.e., MF)
    • Representation Learning: DSSM, YoutubeDNN, BST, DPR, ESAM, Condenser, MADR, MIND, KEMI
    • Interaction-based Learning: UMI, PDN, AGREE

  7. Introduction
    The HISTORY for candidate matching
    Item-based CF: Pearson-based CF, SLIM (i.e., MF)
    Item-based filtering (Amazon, 2001): items A, B, C; High Correlation
    Pearson Coefficient; Latent Vector
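
As a concrete reading of "Pearson-based CF", the sketch below computes the Pearson coefficient between two items over the users who rated both, then picks the most correlated neighbour. The rating data, the `Ratings` layout (item to user to rating) and the function names are hypothetical, chosen only to illustrate the idea.

```python
import math
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]  # item -> {user: rating} (assumed layout)


def pearson(ratings: Ratings, item_a: str, item_b: str) -> float:
    """Pearson coefficient of two items over their co-rating users."""
    common = sorted(set(ratings[item_a]) & set(ratings[item_b]))
    if len(common) < 2:
        return 0.0  # not enough overlap to estimate a correlation
    xs = [ratings[item_a][u] for u in common]
    ys = [ratings[item_b][u] for u in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant rating vector has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def most_similar(ratings: Ratings, item: str, k: int) -> List[str]:
    """Items most correlated with `item`, best first."""
    others = [other for other in ratings if other != item]
    return sorted(others, key=lambda o: pearson(ratings, item, o), reverse=True)[:k]


ratings: Ratings = {
    "A": {"u1": 5.0, "u2": 3.0, "u3": 1.0},
    "B": {"u1": 4.0, "u2": 2.0, "u3": 0.0},  # moves with A
    "C": {"u1": 1.0, "u2": 3.0, "u3": 5.0},  # moves against A
}
neighbours_of_a = most_similar(ratings, "A", k=1)
```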

  8. Introduction
    The HISTORY for candidate matching
    Representation Learning
    DNN/Transformer
    DSSM, YoutubeDNN, BST, DPR, ESAM, Condenser, MADR, MIND, KEMI
    Multi-Interest Learning

  9. Introduction
    The HISTORY for candidate matching
    Interaction-based Learning: UMI, PDN, AGREE
    How to enable efficient interaction-based feature learning?

  10. Overview - Representation Learning – Deep DNN
    • Computation parallelization and speedup are a must for dense retrieval
    Deep Neural Networks for YouTube Recommendations, RecSys 2016
    Learning Deep Structured Semantic Models for Web Search using Clickthrough Data, CIKM 2013

  11. Overview - Representation Learning - Multi-Interest Learning
    • To model the multi-aspect / multi-interest nature of the world
    Controllable Multi-Interest Framework for Recommendation, KDD 2020
    Multi-Aspect Dense Retrieval, KDD 2022
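
A small sketch of why several interest vectors can help at retrieval time. It assumes, purely for illustration, that an item is scored by its best-matching interest vector (max over inner products); the cited papers use their own extraction and aggregation modules, which are not reproduced here. All names and numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def multi_interest_score(interests: Sequence[Sequence[float]],
                         item_vec: Sequence[float]) -> float:
    """Score an item by the interest vector that matches it best (assumed rule)."""
    return max(dot(interest, item_vec) for interest in interests)


def retrieve(interests: Sequence[Sequence[float]],
             item_vecs: Dict[str, Sequence[float]],
             k: int) -> List[str]:
    """Top-k items under the multi-interest score, best first."""
    ranked = sorted(item_vecs,
                    key=lambda item_id: multi_interest_score(interests, item_vecs[item_id]),
                    reverse=True)
    return ranked[:k]


item_vecs = {"coffee": [0.9, 0.0], "tea": [0.5, 0.5], "laptop": [0.0, 0.8]}

# A user with two distinct interests versus the same user squashed into one vector.
two_interests = [[1.0, 0.0], [0.0, 1.0]]
one_vector = [[0.5, 0.5]]

multi_top = retrieve(two_interests, item_vecs, k=2)
single_top = retrieve(one_vector, item_vecs, k=2)
```

With two interest vectors the items that match either interest strongly come first; the single averaged vector instead favours the item that sits between the two interests.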

  12. Overview - Representation Learning - GNN
    • To exploit high-order semantics and correlations
    Neural Graph Collaborative Filtering, SIGIR 2019
    LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation, SIGIR 2020
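
To make "high-order correlations" concrete, here is a minimal pure-Python sketch of LightGCN-style propagation on a user-item graph: each layer replaces a node's embedding with the symmetrically normalized sum of its neighbours' embeddings, and the final embedding is the uniform average over all layers. The toy graph, the dictionary layout and the function name are assumptions for the example; user and item ids are assumed not to collide, and every node is assumed to have at least one edge.

```python
import math
from typing import Dict, List


def lightgcn_propagate(user_items: Dict[str, List[str]],
                       emb: Dict[str, List[float]],
                       num_layers: int) -> Dict[str, List[float]]:
    """Layer-averaged embeddings after `num_layers` rounds of neighbour aggregation."""
    # Build an undirected adjacency list from the interaction data.
    neighbours: Dict[str, List[str]] = {}
    for user, items in user_items.items():
        for item in items:
            neighbours.setdefault(user, []).append(item)
            neighbours.setdefault(item, []).append(user)

    dim = len(next(iter(emb.values())))
    layers = [emb]  # layer 0 is the raw embedding table
    for _ in range(num_layers):
        prev = layers[-1]
        nxt: Dict[str, List[float]] = {}
        for node, nbrs in neighbours.items():
            vec = [0.0] * dim
            for nb in nbrs:
                # Symmetric normalization: 1 / sqrt(deg(node) * deg(neighbour)).
                weight = 1.0 / math.sqrt(len(nbrs) * len(neighbours[nb]))
                for d in range(dim):
                    vec[d] += weight * prev[nb][d]
            nxt[node] = vec
        layers.append(nxt)

    # Uniform average over layers 0..num_layers.
    return {node: [sum(layer[node][d] for layer in layers) / len(layers)
                   for d in range(dim)]
            for node in neighbours}


user_items = {"u1": ["i1"]}
emb = {"u1": [1.0, 0.0], "i1": [0.0, 1.0]}
out = lightgcn_propagate(user_items, emb, num_layers=1)
```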

  13. Overview - Representation Learning – GNN + Multi-Interest
    • To exploit the benefits of both GNN and multi-interest learning
      1. Graph convolution aggregation: refine user preferences based on multi-level correlations between historical items.
      2. Multi-interest learning: focus on extracting different interests by performing historical item clustering.
    When Multi-Level Meets Multi-Interest: A Multi-Grained Neural Model for Sequential Recommendation, SIGIR 2022

  14. Overview - Representation Learning – GNN + Multi-Interest
    When Multi-Level Meets Multi-Interest: A Multi-Grained Neural Model for Sequential Recommendation, SIGIR 2022

  15. Overview - Representation Learning – Long-tail Problem
    • Non-displayed items cause exposure bias
      • Trained only with displayed items
      • Retrieve items in the entire space
    Figure labels: Exposure Bias; Displayed, Non-displayed, Label
    ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2020

  16. Overview - Representation Learning – Long-tail Problem
    • Why poor long-tail performance
      Cause:
      • Domain Shift
      • Representation learning is not robust and consistent
      How to solve:
      • Unsupervised Domain Adaptation (reduce domain shift)
    ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2020

  17. Overview - Representation Learning – Long-tail Problem
    ESAM
    • Ranking model backbone
      • Unsupervised domain adaptation to reduce domain shift
    • Formula definition
    • Motivation
      • Architectural solutions may not generalize well.
      • Highlight the importance of learning good feature representations for non-displayed items.
    Figure labels: Non-displayed Items, Displayed Items
    ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2020

  18. Overview - Representation Learning – Long-tail Problem
    • Domain Shift
      • Attribute correlation alignment $L_{DA}$
      • High-Level Attribute Distribution definition
      • Distribution verification
    Figure labels: Displayed, Non-displayed, Label; Source Domain, Target Domain
    ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2020
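
One way to read "attribute correlation alignment" is as a penalty on the gap between the feature covariance of two item sets. The sketch below is a correlation-alignment style loss written under that reading; it assumes displayed items form the source domain and non-displayed items the target domain, and its normalization (population covariance, division by the squared feature dimension) is an assumption that may differ from the paper's exact definition of $L_{DA}$.

```python
from typing import List, Sequence


def covariance(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Population covariance matrix of row vectors (one row per item)."""
    n = len(rows)
    dim = len(rows[0])
    means = [sum(row[d] for row in rows) / n for d in range(dim)]
    return [[sum((row[a] - means[a]) * (row[b] - means[b]) for row in rows) / n
             for b in range(dim)]
            for a in range(dim)]


def correlation_alignment_loss(source: Sequence[Sequence[float]],
                               target: Sequence[Sequence[float]]) -> float:
    """Squared Frobenius distance between the two covariance matrices, scaled by dim^2."""
    cov_s = covariance(source)
    cov_t = covariance(target)
    dim = len(cov_s)
    gap = sum((cov_s[a][b] - cov_t[a][b]) ** 2
              for a in range(dim) for b in range(dim))
    return gap / dim ** 2


# Hypothetical high-level attribute vectors for a handful of items.
displayed = [[0.0, 0.0], [2.0, 2.0]]      # attributes move together
non_displayed = [[0.0, 2.0], [2.0, 0.0]]  # attributes move in opposite directions

loss = correlation_alignment_loss(displayed, non_displayed)
```

The loss is zero when both sets share the same attribute correlations and grows as the correlation structure of the non-displayed items drifts away from that of the displayed ones.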

  19. Overview - Representation Learning – Long-tail Problem
    • Center-wise clustering for source domain $L_{DC}^{c}$
      • Attribute correlation alignment
      • $L_{DC}^{c}$ makes similar items cohere together while dissimilar items separate from each other
    Items with the same feedback (click) are similar
    Item click? ✔ × ✔ × × × ✔
    ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2020

  20. Overview - Representation Learning – Long-tail Problem
    • Self-training for target clustering $L_{DC}^{p}$
      • To suppress negative transfer & easy-to-hard strategy
      • Why: target label information is ignored when aligning
      • Entropy regularization
    Figure label: Negative Transfer
    ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance, SIGIR 2020
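
The "entropy regularization" bullet can be illustrated with a generic binary-entropy penalty on predicted click probabilities for unlabeled items: minimizing it pushes predictions away from 0.5 and toward confident values. This is a textbook form offered as a sketch; the function names are hypothetical and the paper's $L_{DC}^{p}$ may add thresholds or weighting that are not shown here.

```python
import math
from typing import Sequence


def binary_entropy(p: float, eps: float = 1e-12) -> float:
    """Entropy (in nats) of a Bernoulli prediction, clipped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def entropy_regularizer(probs: Sequence[float]) -> float:
    """Mean prediction entropy over a batch of unlabeled (e.g. non-displayed) items."""
    return sum(binary_entropy(p) for p in probs) / len(probs)


confident = entropy_regularizer([0.99, 0.01])  # predictions near 0 or 1
uncertain = entropy_regularizer([0.6, 0.4])    # predictions near 0.5
```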

  21. Overview – Representation Learning - Auxiliary Knowledge
    Knowledge-aware Attention for Information Propagation
    • Linking items with their attributes
    KGAT: Knowledge Graph Attention Network for Recommendation, KDD 2019

  22. Overview - Representation Learning – Auxiliary Knowledge - MMoE
    • Sharing App in Taobao
      • UV ~ 19M, GMV ~ 400M
    • Target
      • More sharing between users on the platform
      • More interactions and more friends
    • Features
      • Many different sharing scenarios
      • Correlations and discrepancies
    Questions:
    1. How to exploit scenario-dependent knowledge?
    2. How to handle the low-resource nature of long-tail scenarios?
    3. How to accommodate social relations between users?
    Heterogeneous Graph Augmented Multi-Scenario Sharing Recommendation with Tree-Guided Expert Networks, WSDM 2021

  23. Overview - Representation Learning – Auxiliary Knowledge - MMoE
    • TreeMMoE
      • We can build a tree structure to model the hierarchical relations between different scenarios, e.g., "C2C→Sharing→Entity→Product→Makeups"
      • Some fine-grained scenarios share common ancestor scenarios
      • Knowledge can be transferred
    Heterogeneous Graph Augmented Multi-Scenario Sharing Recommendation with Tree-Guided Expert Networks, WSDM 2021

  24. Overview - Representation Learning – Auxiliary Knowledge - MMoE
    • TreeMMoE
      • Each layer of the tree has a gate for the corresponding expert network
    Heterogeneous Graph Augmented Multi-Scenario Sharing Recommendation with Tree-Guided Expert Networks, WSDM 2021
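
For readers unfamiliar with gated experts, the sketch below shows the generic building block: a softmax gate turns logits into weights, and the output is the weighted sum of the expert outputs. It does not reproduce the tree-guided structure of TreeMMoE; the function names, the two toy experts and the gate logits are all hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_mixture(expert_outputs: Sequence[Sequence[float]],
                  gate_logits: Sequence[float]) -> List[float]:
    """Combine expert output vectors with softmax gate weights."""
    weights = softmax(gate_logits)
    dim = len(expert_outputs[0])
    return [sum(w * out[d] for w, out in zip(weights, expert_outputs))
            for d in range(dim)]


experts = [[1.0, 0.0], [0.0, 1.0]]             # outputs of two toy experts
balanced = gated_mixture(experts, [0.0, 0.0])  # gate has no preference
skewed = gated_mixture(experts, [10.0, 0.0])   # gate strongly prefers expert 0
```

In a multi-scenario setting the gate logits would themselves be produced from the scenario or instance features, so that related scenarios can share experts while distinct ones weight them differently.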

  25. Overview - Representation Learning
    • Summary of representation learning: Two-Tower DNN, Multi-Interest Learning, Long-tail Problem, Auxiliary Knowledge

  26. Overview - Interaction-based Learning – Attention Mechanism
    • Attention Mechanism (Target Attention) for relevant feature highlighting
    Deep Interest Network for Click-Through Rate Prediction, KDD 2018
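
Target attention can be sketched as pooling the user's behaviour vectors with weights that depend on the candidate item. The version below uses an inner product followed by a softmax as a simplifying assumption; Deep Interest Network learns its weights with a small activation network instead, which is not reproduced here. Names and vectors are hypothetical.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def target_attention(history: Sequence[Sequence[float]],
                     target: Sequence[float]) -> List[float]:
    """User vector for one candidate: history pooled by relevance to that candidate."""
    weights = softmax([dot(behaviour, target) for behaviour in history])
    dim = len(target)
    return [sum(w * behaviour[d] for w, behaviour in zip(weights, history))
            for d in range(dim)]


history = [[1.0, 0.0], [0.0, 1.0]]  # two past behaviours in different directions
user_for_first = target_attention(history, [5.0, 0.0])
user_for_second = target_attention(history, [0.0, 5.0])
```

Because the user vector changes with every candidate, it cannot be precomputed once per user, which is the efficiency problem the interaction-based slides return to.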

  27. Overview - Interaction-based Learning – User Side
    • Identify important user features / behaviors for better representation
    User-Aware Multi-Interest Learning for Candidate Matching in Recommenders, SIGIR 2022

  28. Overview - Interaction-based Learning – Item Side
    • Identify important item features / behaviors for precise item-item relevance
    Path-based Deep Network for Candidate Item Matching in Recommenders, SIGIR 2021

  29. Overview - Interaction-based Learning – Optimization
    • Enable target attention in an efficient way
    Fast Semantic Matching via Flexible Contextualized Interaction, WSDM 2022

  30. Overview - Interaction-based Learning
    • Summary of interaction-based learning: User Side, Item Side, Reduce Computation Cost

  31. Future Trends – One Model Serves ALL
    Automatic Network Sharing and Optimization
    Alipay app search scenarios: Category Search, Insurance Search, Hot Trends, Item Search, Section/Concept Search, Fund Search (Price), QA
    Live Commerce: 5M Content Matrices (live streams / videos)
    Complicated Entity Relations: X00+ Insurances, 8K+ Funds, 80K+ Stocks (US / Hong Kong / A-shares), 4K+ Agents, 100+ Financial Organ., XXX Sections
    Diverse Queries & Intents: liquor / defense industry / materials; 富 / 广发 / 鹏华; life insurance / auto insurance, property insurance / health insurance; newly launched / popular / 金选 (gold picks)
    Automatic Expert Selection for Multi-Scenario and Multi-Task Search, SIGIR 2022

  32. Future Trends – One Model Serves ALL
    One model serves ALL – Automatic Network Sharing and Optimization
    (Same Alipay app scenario diagram as slide 31)
    Automatic Expert Selection for Multi-Scenario and Multi-Task Search, SIGIR 2022

  33. Future Trends – One Model Serves ALL
    One model serves ALL – Automatic Network Sharing and Optimization
    • Personalization for each instance
    • Modeling complex scenarios & tasks
    • Flexible & Scalable
    • End-to-End, Low Cost
    Automatic Expert Selection for Multi-Scenario and Multi-Task Search, SIGIR 2022

  34. Future Trends – Go Beyond Two-Tower
    REAL Interaction-based Learning – Coupling and Decoupling
    Expensive but Effective
    Diagram labels: query "starbucks drinks", S
    Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023

  35. Future Trends
    REAL Interaction-based Learning – Coupling and Decoupling
    Expensive but Effective
    Diagram labels: query "starbucks drinks", S
    Can we transfer the capacity of interaction-based learning to the inference phase?
    Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023

  36. Future Trends
    REAL Interaction-based Learning – Coupling and Decoupling
    Coupling for Training
    Decoupling for Inference: Cheap yet Effective
    Diagram labels (both panels): query "starbucks drinks", S
    Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023

  37. Future Trends
    REAL Interaction-based Learning – Coupling and Decoupling
    Coupling for Training
    Decoupling for Inference: Cheap yet Effective
    Diagram labels (both panels): query "starbucks drinks", S
    How to design an appropriate coupling mechanism that supports effective representation learning and easy decoupling afterwards for the inference phase?
    Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023

  38. Future Trends
    REAL Interaction-based Learning – Coupling and Decoupling
    Attribute Fusion Layer
    Attribute-Aware Learning
    Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023

  39. Future Trends
    REAL Interaction-based Learning – Coupling and Decoupling
    The most relevant attributes are more diverse for AGREE than for a vanilla two-tower solution
    Without Attribute-Aware Learning
    Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023

  40. Future Trends
    REAL Interaction-based Learning – Coupling and Decoupling
    The most relevant attributes are more diverse for AGREE than for a vanilla two-tower solution
    Without Attribute-Aware Learning
    Beyond Two-Tower: Attribute Guided Representation Learning for Candidate Retrieval, WWW 2023
    More sophisticated coupling and decoupling mechanisms deserve investigation.

  41. We need YOU
    The Most Beautiful Campus in China

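  42. We need YOU
    The Most Beautiful Campus in China
    • National Excellent Young Scientists Fund (Overseas) positions
    • Tenured Professor
    • Tenured Associate Professor
    • Distinguished Research Fellow
    • Distinguished Associate Research Fellow
    Faculty positions at every level are available!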

  43. Let us QA!
    [email protected]
    http://lichenliang.net/
    The End
    • A SIMPLE overview of the current progress on candidate matching
    • LOW LATENCY is a MUST (we need ARTs for both effectiveness and efficiency)
      • We human beings are GREEDY
    • Some insights into future trends
    • Hope some of you can join us!