Jointly End-To-End Embedding Learning for Large-Scale Sticker Similarity Search

2019 DevDay Jointly End-To-End Embedding Learning for Large-Scale Sticker Similarity
Search > Vinhhiep Le > LINE Fukuoka Data Labs Machine Learning Engineer

Agenda > Sticker similarity definition > Common approach and problem
> Solution and experiment > Summary

Sticker Similarity Search > Content-based image retrieval: Given a sticker,
retrieve a list of similar stickers.

Common Approach and Problem > Represent images by embeddings and
retrieve similar images by ranking similarity scores under a distance metric > Problem: With hundreds of millions of stickers, exhaustively searching the whole database is computationally infeasible
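
To make the cost concrete, here is a minimal sketch of the exhaustive approach, assuming cosine similarity as the metric (the slides do not name one). The sticker IDs, the toy vectors, and the function names are hypothetical. Every query touches all N stored embeddings, which is the part that stops scaling.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exhaustive_search(query, database, top_k):
    """Score the query against every stored embedding: O(N * d) work per query."""
    scored = [(cosine_similarity(query, emb), sticker_id)
              for sticker_id, emb in database.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [sticker_id for _, sticker_id in scored[:top_k]]

# Toy database of hypothetical sticker embeddings.
database = {
    "cat_happy": [0.9, 0.1, 0.0],
    "cat_sleepy": [0.8, 0.3, 0.1],
    "dog_angry": [0.0, 0.2, 0.9],
}
result = exhaustive_search([1.0, 0.2, 0.0], database, top_k=2)
```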

Solution > Multiple levels of filters:

Cheap Representations > Filter out hopeless candidates through multiple filter
layers • Learning: binary code • Non-learning: Bag of Visual Words (BOVW)
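
As an illustration of the non-learning option, the sketch below builds a Bag of Visual Words histogram by assigning each local descriptor to its nearest visual word and counting. The two-dimensional descriptors and the hand-written codebook are assumptions made for readability; a real pipeline would use a local feature extractor and a much larger codebook, neither of which the slides specify.

```python
def nearest_word(descriptor, codebook):
    """Index of the visual word closest to the descriptor (squared Euclidean)."""
    def sq_dist(word):
        return sum((d - w) ** 2 for d, w in zip(descriptor, word))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))

def bovw_histogram(descriptors, codebook):
    """Count how many local descriptors fall on each visual word."""
    hist = [0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_word(descriptor, codebook)] += 1
    return hist

# Hypothetical codebook of three visual words and four local descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [0.2, 0.1], [0.1, 0.9]]
histogram = bovw_histogram(descriptors, codebook)
```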

Cheap Representations > Why binary code? • Can be represented
as a document • Fast distance calculation using Hamming distance
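
The speed argument can be sketched as follows: when each binary code is packed into an integer, the Hamming distance is one XOR followed by a bit count. The 8-bit codes, sticker IDs, and the `max_distance` threshold below are hypothetical; the slides do not give the code length.

```python
def hamming_distance(code_a, code_b):
    """Number of differing bits between two binary codes stored as ints."""
    return bin(code_a ^ code_b).count("1")

def hamming_shortlist(query_code, codes, max_distance):
    """Keep only stickers whose code is within max_distance bits of the query."""
    return sorted(
        sticker_id
        for sticker_id, code in codes.items()
        if hamming_distance(query_code, code) <= max_distance
    )

# Hypothetical 8-bit codes; real codes would be longer.
codes = {
    "cat_happy": 0b10110010,
    "cat_sleepy": 0b10110110,
    "dog_angry": 0b01001101,
}
shortlist = hamming_shortlist(0b10110011, codes, max_distance=2)
```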

Our Approach > Three-level Filters: > We jointly learn binary
code and float embedding in one network end-to-end!
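
The slide's diagram is not in the transcript, so the following is only a sketch of what a three-level cascade could look like. It assumes, without confirmation from the source, that the levels are a shared-visual-word filter, a Hamming-distance filter on the binary code, and a final re-ranking with the float embedding. All field names, thresholds, and data are hypothetical.

```python
import math

def hamming(a, b):
    """Differing bits between two integer-packed binary codes."""
    return bin(a ^ b).count("1")

def cosine(a, b):
    """Cosine similarity between two float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def cascade_search(query, index, max_hamming, top_k):
    # Level 1 (assumed): keep stickers sharing at least one visual word with the query.
    level1 = [s for s in index if query["words"] & s["words"]]
    # Level 2 (assumed): keep stickers whose binary code is close in Hamming distance.
    level2 = [s for s in level1 if hamming(query["code"], s["code"]) <= max_hamming]
    # Level 3: rank the few survivors with the more expensive float embedding.
    level2.sort(key=lambda s: cosine(query["emb"], s["emb"]), reverse=True)
    return [s["id"] for s in level2[:top_k]]

index = [
    {"id": "cat_happy", "words": {1, 4}, "code": 0b1011, "emb": [0.9, 0.1]},
    {"id": "cat_sleepy", "words": {1, 7}, "code": 0b1010, "emb": [0.7, 0.4]},
    {"id": "dog_angry", "words": {4, 9}, "code": 0b0100, "emb": [0.1, 0.9]},
    {"id": "bird_sing", "words": {2, 3}, "code": 0b1011, "emb": [0.9, 0.2]},
]
query = {"words": {1, 4}, "code": 0b1011, "emb": [1.0, 0.1]}
ranked = cascade_search(query, index, max_hamming=1, top_k=2)
```

In this toy run, `bird_sing` is dropped at level 1, `dog_angry` at level 2, and only two candidates reach the float comparison.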

Experimental Results > Test set: 14,440 stickers, 480 packages, each
package has 30 similar stickers. > Evaluation metric: Precision@TopK
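
A minimal sketch of the metric, assuming a retrieved sticker counts as relevant when it belongs to the same package as the query (the test-set description suggests this, but the slides do not state it). The sticker IDs and the `package_of` mapping are hypothetical.

```python
def precision_at_k(retrieved_ids, query_package, package_of, k):
    """Fraction of the top-k retrieved stickers that share the query's package."""
    top = retrieved_ids[:k]
    hits = sum(1 for sticker_id in top if package_of[sticker_id] == query_package)
    return hits / k

# Hypothetical mapping from sticker ID to package ID.
package_of = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
retrieved = ["a1", "a2", "b1", "a3", "b2"]  # ranked results for a query from package "A"

p_at_2 = precision_at_k(retrieved, "A", package_of, k=2)
p_at_4 = precision_at_k(retrieved, "A", package_of, k=4)
```

Averaging this value over all query stickers gives a single score per K.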

Summary > Multi-layer search copes well with large-scale
search > Representing images with a learned method works better than with a non-learning one

Thank You

LINE DevDay 2019
