Jointly End-To-End Embedding Learning for Large-Scale Sticker Similarity Search

Vinhhiep Le
LINE Fukuoka Data Labs Machine Learning Engineer
https://linedevday.linecorp.com/jp/2019/sessions/S1-03

LINE DevDay 2019

November 20, 2019

Transcript

  1. 2019 DevDay Jointly End-To-End Embedding Learning for Large-Scale Sticker Similarity Search > Vinhhiep Le > LINE Fukuoka Data Labs Machine Learning Engineer
  2. Common Approach and Problem > Represent images by embeddings and retrieve similar images by ranking similarity scores computed with a distance metric > Problem: With hundreds of millions of stickers, exhaustively searching the whole database is computationally infeasible
  3. Cheap Representations > Filter out hopeless candidates through multiple filter layers • Learning: binary code • Non-learning: Bag of Visual Words (BOVW)
  4. Cheap Representations > Why binary code • Could be represented as a document • Fast distance calculation using Hamming distance
  5. Our Approach > Three-level Filters: > We jointly learn binary code and float embedding in one network end-to-end!
  6. Experimental Results > Test set: 14,440 stickers, 480 packages, each package has 30 similar stickers. > Evaluation metric: Precision@TopK
  7. Summary > Multi-layer search is well suited to large-scale search > Representing images with a learning method works better than with a non-learning method
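
The slides describe filtering candidates cheaply with binary codes and Hamming distance, ranking the survivors with float embeddings, and scoring results with Precision@TopK. The following is a minimal sketch of that idea, not the speaker's implementation. It assumes binary codes are stored as Python integers and uses cosine similarity for the float stage; the transcript names neither choice. It shows only two stages, because the transcript does not say what the three filter levels are. All function names and the toy data are hypothetical.

```python
import math


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def cosine(u, v) -> float:
    """Cosine similarity between two float embeddings (assumed metric)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def search(query_code, query_vec, index, shortlist_size, top_k):
    """Two-stage search over a list of (sticker_id, binary_code, embedding)."""
    # Stage 1: cheap filter. Keep the candidates closest in Hamming distance.
    shortlist = sorted(index, key=lambda item: hamming(query_code, item[1]))
    shortlist = shortlist[:shortlist_size]
    # Stage 2: costlier re-ranking of the shortlist by float-embedding similarity.
    ranked = sorted(shortlist, key=lambda item: cosine(query_vec, item[2]),
                    reverse=True)
    return [item[0] for item in ranked[:top_k]]


def precision_at_k(retrieved, relevant, k) -> float:
    """Fraction of the top-k retrieved ids that are in the relevant set."""
    if k <= 0:
        return 0.0
    hits = sum(1 for sticker_id in retrieved[:k] if sticker_id in relevant)
    return hits / k


# Hypothetical toy index: 4-bit codes and 2-dimensional embeddings.
toy_index = [
    ("a", 0b1010, [1.0, 0.0]),
    ("b", 0b1011, [0.9, 0.1]),
    ("c", 0b0101, [0.0, 1.0]),
    ("d", 0b1000, [0.5, 0.5]),
]

results = search(0b1010, [1.0, 0.0], toy_index, shortlist_size=3, top_k=2)
score = precision_at_k(results, relevant={"a", "d"}, k=2)
```

The sort in stage 1 is used only for brevity. It still visits every item, so at the scale the slides mention the binary codes would need an index structure that avoids a full scan.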