Utilizing Embeddings in Learning To Rank for Search
> Shawn TSAI / LINE Taiwan Data Dev
Agenda
> Search Everywhere
> Search Result Relevance
> Embeddings
> Learning To Rank
> Search Workflow
Search Everywhere
Life on LINE
Search Result Relevance
> The main goal is to reduce the semantic gap between the user's query and the documents.
> Two things matter most: semantic features and the ranking function.
> Search is a ranking problem: the order of the results matters more than the predicted probability for any single instance.
Word Embedding
> Vector representation
> Captures the context of a word in a document, semantic and syntactic similarity, and relations with other words
Source: Efficient Estimation of Word Representations in Vector Space
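A minimal sketch of how similarity between word vectors is usually measured, using cosine similarity. The three-dimensional vectors and the helper names below are invented for illustration; real word embeddings have hundreds of dimensions and come from a trained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors, made up for this example (not taken from any trained model).
toy_vectors = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "banana": [0.1, 0.0, 0.9],
}


def most_similar(word, vectors):
    """Return (other_word, similarity) for the closest other word."""
    target = vectors[word]
    candidates = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return max(candidates, key=lambda pair: pair[1])


nearest_word, similarity = most_similar("king", toy_vectors)
```

With these toy values, "queen" ends up closer to "king" than "banana" does, which is the kind of semantic neighbourhood the slide describes.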
BERT: Bidirectional Encoder Representations from Transformers
BERT is a new method of pre-training language representations which obtains state-of-the-art results on a wide array of NLP tasks.
Source: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Querying By Vector Representation
> Offline: Documents -> sentence encoding by a pre-trained BERT model -> document vecs -> build NN index
> Online: Query -> sentence encoding by a pre-trained BERT model -> query vec -> nearest neighbor search over the document vecs index
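The offline and online halves of this flow can be sketched in a few lines. Everything here is an assumption for illustration: `encode` is a hypothetical stand-in (a hashed bag of words) for the pre-trained BERT sentence encoder, and the linear scan in `search` stands in for a real nearest neighbor index.

```python
import hashlib
import math

DIM = 64  # toy dimensionality; an assumption, not a value from the talk


def encode(text):
    """Hypothetical stand-in for a sentence encoder: hashed bag of words,
    L2-normalised so that a dot product equals cosine similarity."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def build_index(documents):
    """Offline step: encode every document once and keep the vectors."""
    return {doc_id: encode(text) for doc_id, text in documents.items()}


def search(query, index, k=2):
    """Online step: encode the query, then brute-force nearest neighbor search."""
    query_vec = encode(query)
    scored = [
        (doc_id, sum(a * b for a, b in zip(query_vec, doc_vec)))
        for doc_id, doc_vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


documents = {
    "d1": "cheap flights to tokyo",
    "d2": "ramen restaurants near taipei station",
    "d3": "weather forecast for the weekend",
}
index = build_index(documents)
results = search("ramen restaurants near taipei station", index, k=3)
```

At production scale the linear scan would normally be replaced by an approximate nearest neighbor index, which is what the "Build NN Index" box suggests.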
Learning To Rank
> Applies machine learning to construct ranking models for information retrieval systems
> Cares more about the relative order of results than about predicting ratings
> Scoring by machine learning
• Creating the document index with Elasticsearch
• Using embeddings to train ranking models
• Serving search queries from Elasticsearch with the ranking models
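To make "ranking rather than rating prediction" concrete, here is a minimal pairwise sketch: a linear scoring function trained with a logistic loss on score differences (RankNet-style). This is an illustrative choice, not necessarily the model used in the talk. The two features (a text match score and an embedding similarity) and the training pairs are hypothetical.

```python
import math


def score(weights, features):
    """Linear scoring function: weighted sum of the features."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(pairs, n_features, epochs=200, lr=0.1):
    """Fit weights so the preferred document of each pair scores higher.

    Minimises log(1 + exp(-margin)), where margin is the score difference
    between the preferred and the other document.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - w for b, w in zip(better, worse)]
            margin = score(weights, diff)
            # d/dw log(1 + exp(-margin)) = -diff / (1 + exp(margin))
            step = lr / (1.0 + math.exp(margin))
            weights = [w + step * d for w, d in zip(weights, diff)]
    return weights


# Hypothetical features: [text_match_score, embedding_similarity].
# Each pair is (preferred document, less relevant document) for one query.
pairs = [
    ([0.9, 0.8], [0.9, 0.2]),
    ([0.4, 0.9], [0.7, 0.1]),
    ([0.6, 0.7], [0.5, 0.3]),
]
weights = train_pairwise(pairs, n_features=2)

candidates = {"doc_a": [0.7, 0.1], "doc_b": [0.4, 0.9], "doc_c": [0.5, 0.3]}
ranking = sorted(candidates, key=lambda d: score(weights, candidates[d]), reverse=True)
```

Only the order produced by `score` matters here; the absolute values are never interpreted as ratings or probabilities.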
Search Architecture
> Documents -> Index
> Query -> Filters -> ES + Re-ranking (Matches -> Ranked Results)
> Scoring: BERT, NER, …
> Index, Ranking Models
Custom Scoring Function
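One way a custom scoring function could blend the text match score with embedding similarity is Elasticsearch's `script_score` query with the built-in `cosineSimilarity` function on a `dense_vector` field. The sketch below only builds the request body and is an assumption about the approach, not the talk's actual function: the field names, weights, and helper name are hypothetical, and the exact script syntax depends on the Elasticsearch version.

```python
def build_script_score_query(query_text, query_vector,
                             text_field="title",
                             vector_field="title_vector",
                             text_weight=1.0,
                             vector_weight=2.0):
    """Build a search request body that re-scores text matches by
    combining _score with cosine similarity to the query vector.

    Field names and weights are placeholders for illustration.
    """
    source = (
        "params.text_weight * _score + params.vector_weight * "
        f"(cosineSimilarity(params.query_vector, '{vector_field}') + 1.0)"
    )
    return {
        "query": {
            "script_score": {
                # The wrapped query selects the candidates and supplies _score.
                "query": {"match": {text_field: query_text}},
                "script": {
                    "source": source,
                    "params": {
                        "query_vector": query_vector,
                        "text_weight": text_weight,
                        "vector_weight": vector_weight,
                    },
                },
            }
        }
    }


body = build_script_score_query("ramen taipei", [0.1, 0.2, 0.3])
```

Adding 1.0 to the cosine similarity keeps the final score non-negative, which Elasticsearch requires of script scores.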
Search Workflow With Learning To Rank
> Inputs: User’s Needs, Measure Relevance, Data
> Build Index: Pre-process, Inverted-index
> Learning To Rank: Feature Selection, Ranking Models, Scoring Function, Evaluation (NDCG, MAP, Precision@k)
> Serve: Deploy, Monitoring, Feedback
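The three evaluation metrics named in the workflow can be computed from a judgment list as follows. This is a minimal sketch using the common textbook definitions (exponential gain for DCG, and average precision normalised by the relevant documents present in the list); the graded relevance labels in the example are made up.

```python
import math


def precision_at_k(relevances, k):
    """Fraction of the top-k results that are relevant (relevance > 0)."""
    return sum(1 for r in relevances[:k] if r > 0) / k


def average_precision(relevances):
    """Mean of precision values at each rank holding a relevant result."""
    hits, total = 0, 0.0
    for rank, r in enumerate(relevances, start=1):
        if r > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(relevance_lists):
    """MAP: average precision averaged over several queries."""
    return sum(average_precision(r) for r in relevance_lists) / len(relevance_lists)


def dcg_at_k(relevances, k):
    """Discounted cumulative gain with gain 2^rel - 1 and log2 discount."""
    return sum(
        (2 ** r - 1) / math.log2(rank + 1)
        for rank, r in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Graded relevance of the results for one query, in ranked order (made up).
ranked_relevance = [3, 0, 2, 1]
p3 = precision_at_k(ranked_relevance, 3)
ap = average_precision(ranked_relevance)
ndcg4 = ndcg_at_k(ranked_relevance, 4)
```

NDCG rewards putting the most relevant results first, so it fits the point that the ordering matters more than any single predicted value.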
More Considerations
> Good judgment lists that reflect what users need from search quality
> Good metrics for measuring search results
> Incorporating embeddings into the scoring function
> Keeping versions synchronized between the indexing and serving layers
> A/B testing