Slide 1
Slide 1 text
Finding beans in burgers: Deep semantic-visual embedding with localization @lunardog Kanto Computer Vision Study Group (関東コンピュータービジョン勉強会) 2018.07.07
Slide 2
Slide 2 text
Self-introduction ● Leszek ● Polish ● 2005~ machine learning researcher ● 2010~ came to Japan ● 2016~ joined Cookpad ● github: @lunardog twitter: @_lunardog_
Slide 3
Slide 3 text
CVPR 2018 SIGIR 2018 MsCOCO Recipe1M
Slide 4
Slide 4 text
CVPR 2017
Slide 5
Slide 5 text
Learning Cross-modal Embeddings for Cooking Recipes and Food Images ● CVPR 2017 ● joint embedding of images and recipes
Slide 6
Slide 6 text
No content
Slide 7
Slide 7 text
No content
Slide 8
Slide 8 text
CVPR 2018
Slide 9
Slide 9 text
No content
Slide 10
Slide 10 text
MsCOCO -> MsCOCO MsCOCO -> Flickr30K
Slide 11
Slide 11 text
triplet loss WELDON pooling
Slide 12
Slide 12 text
Triplet Loss
Slide 13
Slide 13 text
FaceNet: A Unified Embedding for Face Recognition and Clustering Florian Schroff, Dmitry Kalenichenko, James Philbin
Slide 14
Slide 14 text
FaceNet: A Unified Embedding for Face Recognition and Clustering Florian Schroff, Dmitry Kalenichenko, James Philbin
Slide 15
Slide 15 text
[Figure: triplet loss diagram; labels: y, z, z’, α]
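A minimal sketch of the triplet loss this slide illustrates, assuming the distance is 1 - cosine similarity. Reading y as the anchor, z as the matching item and z’ as the non-matching item is an assumption, and so are the function names and the margin value alpha=0.2.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def triplet_loss(y, z, z_neg, alpha=0.2):
    """Hinge-style triplet loss on cosine distances (1 - cosine similarity).

    y: anchor embedding, z: matching embedding, z_neg: non-matching embedding.
    The loss is zero once z_neg is at least `alpha` farther from y than z is.
    alpha=0.2 is an arbitrary illustrative margin, not a value from the slides.
    """
    d_pos = 1.0 - cosine_similarity(y, z)
    d_neg = 1.0 - cosine_similarity(y, z_neg)
    return max(0.0, alpha + d_pos - d_neg)

y = np.array([1.0, 0.0])
z = np.array([1.0, 0.0])      # same direction as y: distance 0
z_far = np.array([0.0, 1.0])  # orthogonal to y: distance 1

print(triplet_loss(y, z, z_far))  # margin satisfied, loss is 0.0
print(triplet_loss(y, z, z))      # negative as close as the positive, loss is alpha
```

The loss only pushes on triplets that violate the margin; triplets that already satisfy it contribute nothing.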
Slide 16
Slide 16 text
[Figure: six margins, each labeled ≥ α]
Slide 17
Slide 17 text
No content
Slide 18
Slide 18 text
[Figure: six margins, each labeled ≥ α]
Slide 19
Slide 19 text
triplet loss WELDON pooling
Slide 20
Slide 20 text
[Figure: loss diagram; label: α]
Slide 21
Slide 21 text
Instance Loss [Figure: six margins, each labeled ≥ α]
Slide 22
Slide 22 text
Semantic Loss [Figure: six margins, each labeled ≥ α]
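One possible reading of the two losses, sketched under assumptions: the instance loss uses the exact matching item as the positive, while the semantic loss uses any item of the same class as the positive and items of other classes as negatives. The toy embeddings, class labels, function names and alpha=0.3 are all hypothetical and not taken from the slides.

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity between two 1-D vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def hinge(d_pos, d_neg, alpha):
    """Zero once the negative is at least `alpha` farther than the positive."""
    return max(0.0, alpha + d_pos - d_neg)

# Toy data (hypothetical): image i and recipe i form a matching pair.
images = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
recipes = images.copy()
classes = ["burger", "burger", "salad"]

def instance_loss(q, alpha=0.3):
    """Positive: the recipe paired with image q. Negatives: every other recipe."""
    d_pos = cosine_distance(images[q], recipes[q])
    return sum(hinge(d_pos, cosine_distance(images[q], recipes[n]), alpha)
               for n in range(len(recipes)) if n != q)

def semantic_loss(q, alpha=0.3):
    """Positives: other recipes of the same class. Negatives: other classes."""
    same = [p for p in range(len(recipes)) if p != q and classes[p] == classes[q]]
    other = [n for n in range(len(recipes)) if classes[n] != classes[q]]
    return sum(hinge(cosine_distance(images[q], recipes[p]),
                     cosine_distance(images[q], recipes[n]), alpha)
               for p in same for n in other)

print(instance_loss(0), semantic_loss(0))
```

In this toy set the second burger recipe is penalized by the instance loss for sitting close to the first burger image, while the semantic loss treats it as a positive.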
Slide 23
Slide 23 text
WELDON Pooling
Slide 24
Slide 24 text
Global Average Pooling Linear Typical Image Classifier
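A rough sketch of the classifier head named on this slide: global average pooling over the spatial dimensions, followed by a linear layer. The tensor sizes, the random stand-in features and the variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
channels, height, width, num_classes = 8, 7, 7, 3

# Stand-in for the output of a convolutional backbone: shape (C, H, W).
feature_maps = rng.normal(size=(channels, height, width))

# Global average pooling: one number per channel.
pooled = feature_maps.mean(axis=(1, 2))   # shape (C,)

# Linear layer mapping pooled features to class scores.
weights = rng.normal(size=(num_classes, channels))
bias = np.zeros(num_classes)
logits = weights @ pooled + bias          # shape (num_classes,)

print(pooled.shape, logits.shape)
```

Averaging discards where in the image the evidence came from, which is the limitation the pooling slides that follow address.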
Slide 25
Slide 25 text
WELDON
Slide 26
Slide 26 text
Feature map (7×7, repeated three times on the slide):
0.0   0.0   0.0   0.0   0.0   0.0   0.0
0.0   0.0   0.0   0.0   0.0   0.0   0.0
0.05  0.3   0.1   0.0   0.0   0.0   0.0
0.5   1.0   1.0   0.3   0.0   0.0   0.0
0.5   1.0   1.0   1.0   0.01  0.0   0.0
0.2   1.0   1.0   1.0   0.0   0.0   0.0
0.0   0.05  0.4   0.2   0.0   0.0   0.0
Global MAX Pooling
Global Average Pooling
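The two poolings on this slide can be checked numerically. The sketch below assumes the numbers form a 7×7 feature map (49 values per copy) and that the "00" entry is 0.0.

```python
import numpy as np

# The slide's activation values, arranged as a 7x7 map (an assumed layout).
feature_map = np.array([
    [0.0,  0.0,  0.0, 0.0, 0.0,  0.0, 0.0],
    [0.0,  0.0,  0.0, 0.0, 0.0,  0.0, 0.0],
    [0.05, 0.3,  0.1, 0.0, 0.0,  0.0, 0.0],
    [0.5,  1.0,  1.0, 0.3, 0.0,  0.0, 0.0],
    [0.5,  1.0,  1.0, 1.0, 0.01, 0.0, 0.0],
    [0.2,  1.0,  1.0, 1.0, 0.0,  0.0, 0.0],
    [0.0,  0.05, 0.4, 0.2, 0.0,  0.0, 0.0],
])

global_max = feature_map.max()    # keeps only the single strongest activation
global_avg = feature_map.mean()   # spreads the response over all 49 cells

print(global_max)  # 1.0
print(global_avg)  # about 0.217
```

Max pooling reports a confident 1.0 from one cell alone, while average pooling dilutes the same blob to roughly a fifth of that because most of the map is zero.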
Slide 27
Slide 27 text
[Figure: the same 7×7 feature map as on Slide 26, shown twice] min + max Pooling: bottom m, top k
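A minimal sketch of min + max pooling as labeled on this slide: combine the top k and the bottom m activations of a feature map. Averaging each group and adding the two means is an assumption (variants sum or weight the groups differently), and the toy map, the function name and the k, m values are hypothetical.

```python
import numpy as np

def min_max_pool(feature_map, k=2, m=2):
    """Mean of the k largest activations plus mean of the m smallest."""
    flat = np.sort(feature_map.ravel())  # ascending order
    top_k = flat[-k:]
    bottom_m = flat[:m]
    return float(top_k.mean() + bottom_m.mean())

# Toy map (hypothetical values) with both positive and negative evidence.
toy_map = np.array([
    [-0.4, 0.0, 0.1],
    [ 0.0, 0.9, 1.0],
    [-0.2, 0.0, 0.8],
])

score = min_max_pool(toy_map, k=2, m=2)
print(score)  # top mean 0.95 plus bottom mean -0.3, so about 0.65
```

Compared with global max pooling, using several top regions makes the score less dependent on one cell, and the bottom regions let strong negative evidence lower the score.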
Slide 28
Slide 28 text
No content
Slide 29
Slide 29 text
No content
Slide 30
Slide 30 text
https://tokyo-ml.github.io/hotdog-tf-js/ http://techlife.cookpad.com/entry/2018/04/06/124455
Slide 31
Slide 31 text
The END