Jointly End-To-End Embedding Learning for Large-Scale Sticker Similarity Search

Vinhhiep Le
LINE Fukuoka Data Labs Machine Learning Engineer
https://linedevday.linecorp.com/jp/2019/sessions/S1-03

LINE DevDay 2019

November 20, 2019

Transcript

  1. 2019 DevDay Jointly End-To-End Embedding Learning for Large-Scale Sticker Similarity Search > Vinhhiep Le > LINE Fukuoka Data Labs Machine Learning Engineer
  2. Agenda > Sticker similarity definition > Common approach and problem > Solution and experiment > Summary
  3. Sticker Similarity Search > Content-based image retrieval: Given a sticker, retrieve a list of similar stickers.
  4. Common Approach and Problem > Represent images by embeddings and retrieve similar images by ranking similarity scores using a distance metric > Problem: When it comes to hundreds of millions of stickers, exhaustively searching the whole database is computationally infeasible
  5. Solution > Multiple levels of filters:
  6. Cheap Representations > Filter out hopeless candidates through multiple filter layers • Learning: binary code • Non-learning: Bag of Visual Words (BOVW)
  7. Cheap Representations > Why binary code • Could be represented as a document • Fast distance calculation using Hamming distance
  8. Our Approach > Three-level Filters: > We jointly learn binary code and float embedding in one network end-to-end!
  9. Experimental Results > Test set: 14,440 stickers, 480 packages, each package has 30 similar stickers. > Evaluation metric: Precision@TopK
  10. Summary > Multi-layer search is well suited to large-scale search > Representing images with a learning method is better than with a non-learning method
  11. Thank You
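
The slides describe filtering candidates with cheap binary codes (Hamming distance) before ranking with float embeddings, and evaluating with Precision@TopK. The following is only a minimal sketch of that idea, not the speaker's implementation: it uses two stages rather than the three-level filter from the talk, cosine similarity is an assumed choice of metric, and the function names (`hamming`, `cosine`, `search`, `precision_at_k`), parameters, and toy data are all hypothetical.

```python
import math


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def cosine(u, v):
    """Cosine similarity between two float vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def search(query_code, query_vec, codes, vecs, shortlist=3, top_k=2):
    """Coarse-to-fine search: Hamming shortlist, then cosine re-ranking."""
    # Stage 1 (cheap): keep the `shortlist` items closest in Hamming distance.
    by_hamming = sorted(range(len(codes)),
                        key=lambda i: hamming(query_code, codes[i]))
    candidates = by_hamming[:shortlist]
    # Stage 2 (costlier): re-rank only the survivors by float similarity.
    candidates.sort(key=lambda i: cosine(query_vec, vecs[i]), reverse=True)
    return candidates[:top_k]


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved ids that belong to the relevant set."""
    return sum(1 for i in retrieved[:k] if i in relevant) / k


# Toy database with made-up values: 8-bit codes and 2-D float embeddings.
codes = [0b11110000, 0b11110001, 0b00001111, 0b11100000, 0b00000000]
vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]

result = search(0b11110000, [1.0, 0.05], codes, vecs)
print(result)
print(precision_at_k(result, relevant={0, 1}, k=2))
```

In this sketch the float similarity is computed only for the shortlist, so the expensive comparison never touches the whole database; how large each stage's shortlist should be is a tuning choice the slides do not specify.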