Finding beans in burgers: paper reading notes

Notes from my reading of the CVPR 2018 paper: "Finding beans in burgers:
Deep semantic-visual embedding with localization"

Leszek Rybicki

July 07, 2018

Transcript

  1. Finding beans in burgers: Deep semantic-visual embedding with localization • @lunardog • Kanto Computer Vision Study Group, 2018.07.07
  2. Self-introduction • Leszek • Polish • 2005~ machine learning researcher • 2010~ came to Japan • 2016~ joined Cookpad • github: @lunardog twitter: @_lunardog_
  3. CVPR 2018 SIGIR 2018 MsCOCO Recipe1M

  4. CVPR 2017

  5. Learning Cross-modal Embeddings for Cooking Recipes and Food Images • CVPR 2017 • joint embedding of images and recipes
  6. None
  7. None
  8. CVPR 2018

  9. None
  10. MsCOCO -> MsCOCO MsCOCO -> Flickr30K

  11. triplet loss WELDON pooling

  12. Triplet Loss

  13. FaceNet: A Unified Embedding for Face Recognition and Clustering • Florian Schroff, Dmitry Kalenichenko, James Philbin
  14. FaceNet: A Unified Embedding for Face Recognition and Clustering • Florian Schroff, Dmitry Kalenichenko, James Philbin
  15. y, z, z′ • 1 − ⟨y, z⟩ • 1 − ⟨y, z′⟩ • α

  16. (figure: six margins, each labelled ≥ α)

  17. None
  18. (figure: six margins, each labelled ≥ α)

  19. triplet loss WELDON pooling

  20. 1 − ⟨y, z⟩ • 1 − ⟨y, z′⟩ • α

  21. Instance Loss (figure: six margins, each labelled ≥ α)

  22. Semantic Loss (figure: six margins, each labelled ≥ α)

  23. WELDON Pooling

  24. Global Average Pooling Linear Typical Image Classifier

  25. WELDON

  26. Global MAX Pooling • Global Average Pooling (figure: grids of activation values between 0.0 and 1.0)
  27. min + max Pooling • bottom m • top k (figure: grids of activation values between 0.0 and 1.0)
  28. None
  29. None
  30. https://tokyo-ml.github.io/hotdog-tf-js/ http://techlife.cookpad.com/entry/2018/04/06/124455

  31. The END
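
Reader's sketch of the two building blocks named on slides 11 and 19 (triplet loss and WELDON pooling). This is not code from the paper or the talk. It assumes embeddings are L2-normalised so that 1 − ⟨y, z⟩ is a cosine distance, as on slides 15 and 20. The function names, the default margin `alpha=0.2`, the choice of `k` and `m`, and the use of means (rather than sums) in the pooling are all illustrative assumptions.

```python
# Pure-Python sketch; all names and default values are hypothetical.

def dot(u, v):
    """Inner product <u, v> of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def triplet_loss(y, z, z_neg, alpha=0.2):
    """Hinge loss on the distance gap (slides 15 and 20).

    y is the anchor, z the matching item, z_neg a non-matching item.
    The loss is zero once the negative is at least alpha farther
    from the anchor than the positive.
    """
    d_pos = 1.0 - dot(y, z)       # distance anchor -> positive
    d_neg = 1.0 - dot(y, z_neg)   # distance anchor -> negative
    return max(0.0, alpha + d_pos - d_neg)


def global_max_pool(score_map):
    """Largest activation in a 2-D map (slide 26)."""
    return max(v for row in score_map for v in row)


def global_average_pool(score_map):
    """Mean activation of a 2-D map (slide 26)."""
    flat = [v for row in score_map for v in row]
    return sum(flat) / len(flat)


def weldon_pool(score_map, k=3, m=3):
    """'min + max' pooling (slide 27): top-k mean plus bottom-m mean.

    Averaging each group is an assumption of this sketch; a sum-based
    variant differs only by constant factors k and m.
    """
    flat = sorted(v for row in score_map for v in row)
    if not (1 <= k <= len(flat) and 1 <= m <= len(flat)):
        raise ValueError("k and m must be between 1 and the map size")
    top = flat[-k:]      # k highest activations
    bottom = flat[:m]    # m lowest activations
    return sum(top) / k + sum(bottom) / m
```

With `k = m = 1` the pooling reduces to max plus min, while larger `k` and `m` move it toward twice the global average, which is one way to read the contrast drawn between slides 26 and 27.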