
Utilizing Embeddings 
 In Learning To Rank 
 For Search

BY Shawn Tsai @LINE TECHPULSE 2019 https://techpulse.line.me/

LINE Developers Taiwan

December 04, 2019

Transcript

  1. None
  2. Utilizing Embeddings 
 In Learning To Rank 
 For Search

    > Shawn TSAI / LINE Taiwan Data Dev
  3. Agenda
     > Search Everywhere
     > Search Result Relevance
     > Embeddings
     > Learning To Rank
     > Search Workflow
  4. Search Everywhere Life on LINE

  5. Search Result Relevance
     > The main goal is to reduce the semantic gap between the user query and the documents.
     > The key points are semantic features and the ranking function.
     > Search is a ranking problem: the ordering is more important than the predicted probability of a single instance.
  6. Search Scoring & Limitation
     > Example user queries: "My chat history is stuck on 'compressing data'", "No matter how I tap backup, the chat history won't back up", "Why does LINE sometimes not notify me", "Why won't my messages show up"
     > Limitation: different descriptions of the same issue
     > Limitation: no shared keywords between query and document
     > Standard similarity function: TF-IDF, with $\mathrm{tf}(t,d)=\sqrt{\mathrm{freq}(t,d)}$, $\mathrm{idf}(t)=1+\log\frac{N}{n_t+1}$, and $\mathrm{tf\_idf}(t,d)=\mathrm{tf}(t,d)\cdot\mathrm{idf}(t)$
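To make the keyword limitation concrete, here is a minimal Python sketch of the TF-IDF scoring above (the tokenized corpus and query are made-up examples): when the user describes the problem with words that never appear in the document, every term score is zero, no matter how close the meaning is.

```python
import math

def tf(term, doc_terms):
    # Lucene-classic term frequency: square root of the raw count in the document
    return math.sqrt(doc_terms.count(term))

def idf(term, docs):
    # 1 + log(N / (n_t + 1)), where n_t is the number of documents containing the term
    n_t = sum(1 for d in docs if term in d)
    return 1.0 + math.log(len(docs) / (n_t + 1))

def tf_idf_score(query_terms, doc_terms, docs):
    # Sum of tf * idf over the query terms, evaluated against one document
    return sum(tf(t, doc_terms) * idf(t, docs) for t in query_terms)

docs = [["backup", "chat", "history", "failed"],
        ["notification", "not", "received"]]

# The query shares no keyword with either document, so the score is 0.0
print(tf_idf_score(["messages", "missing"], docs[0], docs))
```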
  7. Embeddings

  8. Word Embedding
     > Vector representation of words
     > Captures the context of a word in a document, semantic/syntactic similarity, and relations to other words
     Source: Efficient Estimation of Word Representations in Vector Space
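As a small illustration of the idea, the sketch below trains word vectors with gensim's Word2Vec (one common implementation of the cited paper) on a toy corpus; the corpus and hyperparameters are made up for demonstration only.

```python
from gensim.models import Word2Vec

# Tiny toy corpus; a real model would be trained on a large amount of text
sentences = [["backup", "chat", "history"],
             ["restore", "chat", "history"],
             ["notification", "sound", "settings"],
             ["mute", "notification", "sound"]]

model = Word2Vec(sentences, vector_size=32, window=2, min_count=1, epochs=50)

# Each word is now a dense vector; words sharing context end up close together
print(model.wv["backup"].shape)             # (32,)
print(model.wv.most_similar("backup", topn=3))
```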
  9. BERT (Bidirectional Encoder Representations from Transformers)
     > BERT is a new method of pre-training language representations which obtains state-of-the-art results on a wide array of NLP tasks.
     Source: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
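As a rough illustration of how a sentence encoding can be obtained from a pre-trained BERT model, here is a sketch using the Hugging Face transformers library with mean pooling over the last hidden states; the checkpoint name and pooling strategy are assumptions, not necessarily what the talk used.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed checkpoint; the talk does not say which pre-trained BERT model is used
MODEL_NAME = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def encode(sentences):
    """Return one fixed-size vector per sentence via mean pooling."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state            # (batch, tokens, dim)
    mask = batch["attention_mask"].unsqueeze(-1).float()     # ignore padding tokens
    summed = (hidden * mask).sum(dim=1)
    return (summed / mask.sum(dim=1)).numpy()

vecs = encode(["backup chat history failed", "messages are not showing up"])
print(vecs.shape)  # (2, 768) for a BERT-base model
```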
  10. Querying By Vector Representation
     > Offline: encode documents into document vectors with a pre-trained BERT sentence-encoding model and build a nearest-neighbor (NN) index over them
     > Online: encode the incoming query into a query vector and run nearest neighbor search against the NN index
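The slide splits the work into an offline path (encode all documents, build the NN index) and an online path (encode the query, search the index). A minimal sketch of that split with plain NumPy and cosine similarity might look like the following; a real deployment would use an ANN library rather than brute force, and the encode() helper is the hypothetical one from the previous example.

```python
import numpy as np

# --- Offline: encode documents and build the NN "index" (here just a matrix) ---
documents = ["how to back up chat history",
             "why notifications are not delivered",
             "change the app theme"]
doc_vecs = encode(documents)                                   # (n_docs, dim)
doc_vecs = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)

# --- Online: encode the query and run nearest neighbor search ---
def search(query, top_k=2):
    q = encode([query])[0]
    q = q / np.linalg.norm(q)
    scores = doc_vecs @ q                                      # cosine similarity
    best = np.argsort(-scores)[:top_k]
    return [(documents[i], float(scores[i])) for i in best]

print(search("my messages are stuck"))
```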
  11. Learning To Rank

  12. Learning To Rank
     > Applying machine learning to construct ranking models for information retrieval systems
     > Caring more about ranking than about rating prediction
     > Scoring by machine learning:
       • Creating the document index with Elasticsearch
       • Using embeddings to train ranking models
       • Serving search queries with Elasticsearch and the ranking models
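One way to realize the "scoring by machine learning" step is a gradient-boosted ranking model. The sketch below uses XGBoost's XGBRanker on made-up features (for example, a BM25 score plus a query-document embedding cosine similarity), grouped by query; the feature set, labels, and hyperparameters are illustrative assumptions, not the talk's actual setup.

```python
import numpy as np
from xgboost import XGBRanker

# Toy training data: each row is one (query, document) pair.
# Features could be e.g. [BM25 score, embedding cosine similarity].
X = np.array([[12.3, 0.91], [4.1, 0.42], [0.7, 0.15],   # candidates for query 1
              [ 8.8, 0.77], [9.5, 0.30], [1.2, 0.05]])  # candidates for query 2
y = np.array([2, 1, 0,      # graded relevance labels for query 1
              2, 0, 1])     # graded relevance labels for query 2
group = [3, 3]              # 3 candidate documents per query

ranker = XGBRanker(objective="rank:pairwise", n_estimators=50, max_depth=3)
ranker.fit(X, y, group=group)

# At serving time, score the candidates of a new query and sort by the score
candidates = np.array([[5.0, 0.80], [6.2, 0.20]])
print(np.argsort(-ranker.predict(candidates)))
```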
  13. Search Architecture
     > Documents → Index / Scoring Index (Elasticsearch)
     > Query → Filters (NER, …) → ES matching → Matches
     > Re-ranking with BERT embeddings and the Ranking Models → Ranked Results
  14. Custom Scoring Function
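A custom scoring function can combine the lexical match score with the embedding similarity. One common way to do this in Elasticsearch 7.x is a script_score query over a dense_vector field; the index name, field names, weighting, and the encode() helper below are assumptions for illustration, not the deck's actual configuration.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
query_text = "backup chat history"
query_vec = encode([query_text])[0].tolist()   # hypothetical encoder from earlier

body = {
    "query": {
        "script_score": {
            # Lexical retrieval first, then re-score the matched documents
            "query": {"match": {"text": query_text}},
            "script": {
                # cosineSimilarity is built into ES 7.x Painless scripting;
                # +1.0 keeps the script score non-negative
                "source": "_score + params.weight * (cosineSimilarity(params.query_vector, 'embedding') + 1.0)",
                "params": {"query_vector": query_vec, "weight": 2.0},
            },
        }
    }
}
results = es.search(index="documents", body=body)
```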

  15. Search Workflow With Learning To Rank
     > Data: user's needs, measure relevance
     > Build Index: pre-process, inverted index
     > Learning To Rank: feature selection, ranking models, scoring function
     > Evaluation: NDCG, MAP, Precision@k
     > Serve: deploy, monitoring, feedback
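Since the evaluation stage relies on ranking metrics, here is a small sketch of NDCG@k computed from graded relevance labels; the example labels are made up.

```python
import math

def dcg_at_k(relevances, k):
    # Discounted cumulative gain with the common 2^rel - 1 gain
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # Normalize by the DCG of the ideal (descending-relevance) ordering
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance labels of the results in the order the ranker returned them
print(ndcg_at_k([3, 2, 3, 0, 1, 2], k=5))
```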
  16. More Considerations
     > Good judgment lists that match user needs for search quality
     > Good metrics for measuring search results
     > Incorporating embeddings into the scoring function
     > Synchronizing versions between the indexing and serving layers
     > A/B testing
  17. Thank you