Slide 1

Slide 1 text

No content

Slide 2

Slide 2 text

Utilizing Embeddings in Learning To Rank for Search
Shawn TSAI / LINE Taiwan Data Dev

Slide 3

Slide 3 text

Agenda
> Search Everywhere
> Search Result Relevance
> Embeddings
> Learning To Rank
> Search Workflow

Slide 4

Slide 4 text

Search Everywhere: Life on LINE

Slide 5

Slide 5 text

Search Result Relevance
> The main goal is to reduce the semantic gap between user queries and documents.
> The key points: semantic features and the ranking function.
> Search is a ranking problem. The ordering of results matters more than the predicted probability of any single instance.

Slide 6

Slide 6 text

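Search Scoring & Limitation
Standard similarity function: TF-IDF
\mathrm{tf}(t, d) = \sqrt{f_{t,d}}
\mathrm{idf}(t) = 1 + \log\frac{N + 1}{\mathrm{df}_t + 1}
\mathrm{tf\_idf}(t, d) = \mathrm{tf}(t, d) \ast \mathrm{idf}(t)
where f_{t,d} is the frequency of term t in document d, N is the number of documents, and \mathrm{df}_t is the number of documents containing t.
Limitation: Different description
> Chat history is stuck on "compressing data" (聊天記錄一直在資料壓縮中)
> No matter how I tap backup, the chat history cannot be backed up (不管怎麼按備份聊天紀錄都不能備份)
Limitation: No shared keywords
> Why does LINE sometimes not send notifications (為什麼有時候賴都不會通知)
> Why won't messages show up (訊息都跑不出來是怎樣)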
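The "no shared keywords" limitation can be illustrated with a small sketch. It assumes a Lucene-style TF-IDF (square-root term frequency and a smoothed idf), naive whitespace tokenization, and a made-up English corpus; none of these come from the talk itself.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption); Chinese text needs a real segmenter.
    return text.lower().split()


def tf_idf_score(query, doc, corpus):
    """Sum sqrt(tf) * idf over the query terms that also occur in doc."""
    n_docs = len(corpus)
    doc_tf = Counter(tokenize(doc))
    score = 0.0
    for term in set(tokenize(query)):
        if term not in doc_tf:
            continue  # a term missing from the document contributes nothing
        df = sum(1 for d in corpus if term in tokenize(d))
        idf = 1.0 + math.log((n_docs + 1) / (df + 1))
        score += math.sqrt(doc_tf[term]) * idf
    return score


# Hypothetical documents, for illustration only.
corpus = [
    "chat history backup keeps failing",
    "messages never show up",
    "how to change profile photo",
]

# Same intent as the second document, but no term in common with any document.
query = "why are notifications missing"
scores = [tf_idf_score(query, d, corpus) for d in corpus]
print(scores)
```

Every document scores zero for this query, even though the second one describes a closely related problem. A query that shares terms, such as "chat backup", does get a positive score on the first document.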

Slide 7

Slide 7 text

Embeddings

Slide 8

Slide 8 text

Word Embedding
> Vector representation
> Capturing context of a word in a document, semantic/syntactic similarity, relation with other words
Source: Efficient Estimation of Word Representations in Vector Space
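As a rough illustration of how similarity is read off vector representations, the sketch below compares words by cosine similarity. The three-dimensional vectors are hand-made for the example and are not trained embeddings; real models such as word2vec learn hundreds of dimensions from a corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hand-made toy vectors (assumption), not real trained embeddings.
vectors = {
    "message": [0.9, 0.1, 0.0],
    "chat": [0.8, 0.2, 0.1],
    "backup": [0.1, 0.9, 0.2],
}


def most_similar(word):
    """Return the other word whose vector is closest to the given word."""
    others = [w for w in vectors if w != word]
    return max(others, key=lambda w: cosine(vectors[word], vectors[w]))


print(most_similar("message"))
```

With these toy values, "message" lands closer to "chat" than to "backup", which is the kind of relation a keyword match cannot express.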

Slide 9

Slide 9 text

BERT
Bidirectional Encoder Representations from Transformers
BERT is a new method of pre-training language representations which obtains state-of-the-art results on a wide array of NLP tasks.
Source: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Slide 10

Slide 10 text

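Querying By Vector Representation
[Diagram: Offline, documents go through sentence encoding by a pre-trained BERT model to produce document vecs, which are used to build the NN index. Online, the query is encoded into a query vec, and nearest neighbor search over the document vecs index returns documents.]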
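The offline/online split in this slide can be sketched as follows. This is a minimal brute-force version under stated assumptions: `toy_encode` and its `CONCEPTS` table are hypothetical stand-ins for the pre-trained BERT sentence encoder, and `VectorIndex` is an illustrative class, not the index used in the talk. A production system would use an approximate nearest neighbor library instead of a linear scan.

```python
import math

# Hypothetical concept lexicon standing in for a learned sentence encoder.
CONCEPTS = {
    "notify": 0, "notification": 0, "alert": 0,
    "message": 1, "messages": 1, "chat": 1,
    "backup": 2, "restore": 2,
}


def toy_encode(text):
    """Placeholder encoder (assumption): counts concept hits per dimension."""
    vec = [0.0, 0.0, 0.0]
    for tok in text.lower().split():
        if tok in CONCEPTS:
            vec[CONCEPTS[tok]] += 1.0
    return vec


def normalize(vec):
    # Scale to unit length so a dot product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class VectorIndex:
    """Brute-force nearest neighbor index over unit-length vectors."""

    def __init__(self, encode):
        self.encode = encode
        self.docs = []
        self.vecs = []

    def build(self, documents):
        # Offline: encode every document once and keep the vectors.
        self.docs = list(documents)
        self.vecs = [normalize(self.encode(d)) for d in self.docs]

    def search(self, query, k=1):
        # Online: encode the query, then rank documents by cosine similarity.
        q = normalize(self.encode(query))
        scored = [
            (sum(a * b for a, b in zip(q, v)), d)
            for v, d in zip(self.vecs, self.docs)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


index = VectorIndex(toy_encode)
index.build([
    "chat backup keeps failing",
    "no alert for new messages",
    "restore from backup",
])
top = index.search("why does line not notify", k=1)
print(top)
```

The query shares no keyword with the document it retrieves; the match comes from the encoder placing "notify" and "alert" on the same dimension.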

Slide 11

Slide 11 text

Learning To Rank

Slide 12

Slide 12 text

Learning To Rank
> Applying machine learning to construct ranking models for information retrieval systems
> Caring more about ranking than about rating prediction
> Scoring by machine learning
  • Creating the document index with Elasticsearch
  • Using embeddings to train ranking models
  • Serving search queries from Elasticsearch with ranking models
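To make "caring about ranking rather than rating" concrete, here is a minimal pairwise sketch: a linear model trained with a logistic loss on preference pairs, so that the preferred document scores higher than the other one. The two features (a lexical score and an embedding similarity), the training pairs, and the hyperparameters are all hypothetical; the talk does not say which ranking model was used.

```python
import math


def score(w, features):
    """Linear ranking score."""
    return sum(wi * fi for wi, fi in zip(w, features))


def train_pairwise(pairs, n_features, epochs=200, lr=0.1):
    """Fit weights so each 'better' document outscores its 'worse' partner.

    pairs: list of (better_features, worse_features) for the same query.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = score(w, diff)
            # Gradient factor of log(1 + exp(-margin)) with respect to margin.
            g = -1.0 / (1.0 + math.exp(margin))
            w = [wi - lr * g * di for wi, di in zip(w, diff)]
    return w


def rank(w, candidates):
    """Order candidates by descending model score."""
    return sorted(candidates, key=lambda f: score(w, f), reverse=True)


# Features: [lexical_score, embedding_similarity] (made-up values).
pairs = [
    ([0.2, 0.9], [0.8, 0.1]),
    ([0.1, 0.8], [0.9, 0.2]),
    ([0.5, 0.7], [0.5, 0.3]),
]
w = train_pairwise(pairs, n_features=2)
ranking = rank(w, [[0.9, 0.1], [0.3, 0.95]])
print(w, ranking)
```

Only the relative order inside each pair drives the updates, so the model never has to predict an absolute relevance value.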

Slide 13

Slide 13 text

Search Architecture
[Architecture diagram. Labels: Documents, Query, Filters, Index, ES + Re-ranking, BERT, NER, …, Scoring, Ranking Models, Matches, Ranked Results]

Slide 14

Slide 14 text

Custom Scoring Function
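The slide does not show its scoring function, so the following is only one possible shape for it: a request body for Elasticsearch's `script_score` query that adds a weighted cosine similarity on a `dense_vector` field to the keyword score. The field names `content` and `content_vector`, the `weight` parameter, and the helper name are hypothetical. The script uses the field-name form of `cosineSimilarity`, which to my knowledge applies to Elasticsearch 7.6 and later; check the documentation for the version in use.

```python
import json


def build_custom_score_query(text, query_vector,
                             text_field="content",
                             vector_field="content_vector",
                             weight=1.0):
    """Build a search body mixing the lexical score with vector similarity."""
    # cosineSimilarity lies in [-1, 1]; adding 1.0 keeps the final score
    # non-negative, which script_score requires.
    source = (
        "_score + params.weight * "
        "(cosineSimilarity(params.query_vector, '" + vector_field + "') + 1.0)"
    )
    return {
        "query": {
            "script_score": {
                # The keyword query selects candidates and provides _score.
                "query": {"match": {text_field: text}},
                "script": {
                    "source": source,
                    "params": {
                        "query_vector": list(query_vector),
                        "weight": weight,
                    },
                },
            }
        }
    }


body = build_custom_score_query("chat backup", [0.1, 0.7, 0.2])
print(json.dumps(body, indent=2))
```

Because the inner `match` query still decides which documents are candidates, this form only re-orders keyword matches. Documents with no shared keywords would need a broader candidate query or a separate vector search.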

Slide 15

Slide 15 text

Search Workflow With Learning To Rank
[Workflow diagram. Stages: Data, Build Index, Learning To Rank, Evaluation, Serve. Labels: User's Needs, Measure Relevance, Pre-process, Inverted-index, Features Selection, Ranking Models, Scoring Function, NDCG, MAP, Precision@k, Deploy, Monitoring, Feedback]
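The evaluation metrics named in this workflow (NDCG, MAP, Precision@k) can be sketched in a few lines. The input is a list of graded relevance labels in the order the system ranked the results; the labels below are made up. Two assumptions to note: the DCG uses the common exponential gain `2^rel - 1`, and average precision divides by the number of relevant results in the list, which presumes every relevant document was retrieved.

```python
import math


def precision_at_k(relevances, k):
    """Fraction of the top k results that are relevant (label > 0)."""
    return sum(1 for r in relevances[:k] if r > 0) / k


def average_precision(relevances):
    """Mean of precision values taken at each relevant position."""
    hits, total = 0, 0.0
    for position, r in enumerate(relevances, start=1):
        if r > 0:
            hits += 1
            total += hits / position
    return total / hits if hits else 0.0


def mean_average_precision(relevance_lists):
    """MAP: average precision averaged over queries."""
    return sum(average_precision(r) for r in relevance_lists) / len(relevance_lists)


def dcg_at_k(relevances, k):
    """Discounted cumulative gain with exponential gain 2^rel - 1."""
    return sum(
        (2 ** r - 1) / math.log2(position + 1)
        for position, r in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal else 0.0


# Hypothetical graded labels for one query, in ranked order.
ranked = [3, 0, 2, 1]
print(precision_at_k(ranked, 2), average_precision(ranked), ndcg_at_k(ranked, 4))
```

All three depend only on the order of the results, not on the raw scores, which is why they fit a ranking problem better than a per-document accuracy measure.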

Slide 16

Slide 16 text

More Considerations
> Good judgment lists that match users' search-quality needs
> Good metrics for measuring search results
> Incorporating embeddings into the scoring function
> Synchronizing versions between the indexing and serving layers
> A/B testing

Slide 17

Slide 17 text

Thank you