Paper Review 2018-04-20: CSN | Learning Type-Aware Embeddings for Fashion Compatibility

Paper review: CSN | Learning Type-Aware Embeddings for Fashion Compatibility
Authors: Vasileva, Mariya I. et al. 2018

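Abstract
- Proposes a model for outfit recommendation
- Proposes a new, stricter dataset for evaluation
- Learns similarity and compatibility jointly
- Resolves the improper triangle problem by learning compatibility separately for each pair of categories
- Achieves SOTA on both tasks on the new, harder datasets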

Table of Contents
- Introduction
- Method
- Experiments & Results
- Conclusion

Introduction
- Problems with prior work
  - improper triangle
  - the test dataset is too easy (explained in the Experiments section)

Introduction - Related Work
- [(A. Veit et al. 2015) SiameseNet | Learning Visual Clothing Style with Heterogeneous Dyadic Co-occurrences.](https://arxiv.org/pdf/1509.07473.pdf)

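Introduction - Improper Triangle
Figure source: [(K. Yamaguchi et al. 2015) Mix and Match: Joint Model for Clothing and Attribute Recognition.](http://vision.is.tohoku.ac.jp/~kyamagu)
- For compatibility, the following triangle-inequality-style implication does not necessarily hold:
  - "tops A and bottoms B are compatible" and "bottoms B and shoes C are compatible" → "tops A and shoes C are compatible"
- In a single embedding space, however, the triangle inequality d(A, C) ≤ d(A, B) + d(B, C) forces A and C to be close whenever both are close to B.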
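The constraint behind the improper triangle can be checked numerically. The sketch below is illustrative only: the 2-D coordinates, the `threshold` value, and the rule "distance below threshold means compatible" are assumptions made for the example, not taken from the paper.

```python
import math

# Hypothetical 2-D embeddings in ONE shared space (illustrative values only).
top_a = (0.0, 0.0)
bottom_b = (0.3, 0.0)
shoe_c = (0.6, 0.0)
threshold = 0.4  # assumed rule: distance < threshold means "compatible"

d_ab = math.dist(top_a, bottom_b)
d_bc = math.dist(bottom_b, shoe_c)
d_ac = math.dist(top_a, shoe_c)

# Triangle inequality: A and C can never be farther apart than d_ab + d_bc,
# so two compatible pairs bound the third distance by 2 * threshold.
bound_holds = d_ac <= d_ab + d_bc + 1e-12
print(d_ab < threshold, d_bc < threshold, bound_holds)
```

However incompatible A and C may be, a single shared space cannot place them farther apart than this bound, which motivates learning compatibility in separate per-pair spaces.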

Introduction - Related Work
- [(X. Han et al. 2017) Bi-LSTM | Learning Fashion Compatibility with Bidirectional LSTMs.](https://arxiv.org/pdf/1707.05691.pdf)

Method - Data
- The original source is Polyvore
- outfit = item image sequence
- text
- The following 3 variants were used:
  - Maryland Polyvore (X. Han et al. 2017)
    - the test data is too easy
  - Polyvore Outfits-D (proposed in this paper)
  - Polyvore Outfits (proposed in this paper)

Method - Data
- Maryland Polyvore:
  - Its test data is unsuitable for quantitative evaluation because it is too easy (explained in the Experiments section).
  - Its text information is sparse.

Method - Model: CSN
- Based on Veit, A., Belongie, S., Karaletsos, T.: Conditional similarity networks. In: CVPR. (2017)

Method - Inputs and outputs of CSN
(Figure labels: image x; category u: bottoms, v: tops; text t; compatible; image-text/text-text distance; image-image distance)

Method - Model: CSN
- Consists of 3 modules
  - similarity
    - VSE = Visual Semantic Embedding: embeds text and images so that semantically close ones are close in distance
    - Sim: a SiameseNet that embeds text/images of the same category close together
  - compatibility
    - Type-Specific Embed
      - projects from the Sim embedded space into a space for each category pair,
      - and embeds compatible images close together

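Method - VSE: Visual Semantic Embedding
(Figure labels: image x, compatible)
- An item's image is pulled toward its own text and pushed away from the text of other items.
- Supplying text information in addition to images makes similar items embed closer together.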
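The "pull own text closer, push other text away" objective can be sketched as a hinge-style triplet loss. This is a minimal illustration assuming Euclidean distance and a margin; the function name `triplet_hinge`, the margin value, and the 2-D embeddings are hypothetical and not taken from the slides.

```python
import math

def triplet_hinge(anchor, positive, negative, margin=0.2):
    """Hinge triplet loss: max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D embeddings of one image, its own text, and another item's text.
image = (0.0, 0.0)
own_text = (0.1, 0.0)
other_text = (1.0, 0.0)

# Own text is already closer than the other text by more than the margin: no penalty.
satisfied = triplet_hinge(image, own_text, other_text)
# Roles swapped (other text treated as the positive): the loss becomes large.
violated = triplet_hinge(image, other_text, own_text)
print(satisfied, violated)
```

The loss is zero once the own text is closer than any other text by at least the margin, so training only moves embeddings that still violate the ordering.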

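Method - Sim
(Figure labels: image x, compatible)
- Images/text of the same category are pulled together, and those of different categories are pushed apart.
- Prior work applied the compatibility triplet loss at this stage.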

Method - Type-Specific Projection
(Figure labels: image x, compatible)
- A projection first splits the embedding into pair-wise category spaces; compatibility similarity learning is then done in each of those spaces.
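A minimal sketch of how per-pair spaces avoid the improper triangle, assuming the projection is an element-wise mask per category pair (in the style of conditional similarity networks). The `MASKS` table, the helper names, and all embedding values are hypothetical; in a real model the masks would be learned.

```python
import math

# Hypothetical fixed masks: one weight vector per unordered category pair.
MASKS = {
    frozenset({"tops", "bottoms"}): (1.0, 0.0, 0.0),
    frozenset({"bottoms", "shoes"}): (0.0, 1.0, 0.0),
    frozenset({"tops", "shoes"}): (0.0, 0.0, 1.0),
}

def project(embedding, cat_a, cat_b):
    """Project a general embedding into the space of the (cat_a, cat_b) pair."""
    mask = MASKS[frozenset({cat_a, cat_b})]
    return tuple(e * m for e, m in zip(embedding, mask))

def pair_distance(emb_a, cat_a, emb_b, cat_b):
    """Distance between two items, measured in their category-pair space."""
    return math.dist(project(emb_a, cat_a, cat_b), project(emb_b, cat_a, cat_b))

# Hypothetical general embeddings (illustrative values only).
top_a = (0.1, 0.5, 0.0)
bottom_b = (0.1, 0.9, 0.4)
shoe_c = (0.7, 0.9, 1.0)

d_ab = pair_distance(top_a, "tops", bottom_b, "bottoms")
d_bc = pair_distance(bottom_b, "bottoms", shoe_c, "shoes")
d_ac = pair_distance(top_a, "tops", shoe_c, "shoes")
print(d_ab, d_bc, d_ac)
```

Because each pair is compared in its own space, A-B and B-C can both be close while A-C stays far apart, which a single shared space does not allow.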

Method - Type-Specific Projection

Method - Model: CSN
- Consists of 3 modules (recap)
  - similarity
    - VSE = Visual Semantic Embedding: embeds text and images so that semantically close ones are close in distance
    - Sim: a SiameseNet that embeds text/images of the same category close together
  - compatibility
    - Type-Specific Embed
      - projects from the Sim embedded space into a space for each category pair,
      - and embeds compatible images close together

Experiments - Evaluation - task & metric
- 2 tasks
  - FITB = Fill in the Blank
  - Compatibility Prediction
- 5 datasets
  - Maryland (All Negatives)
  - Maryland (Composition Filtering)
  - Maryland (Category-Aware Negative) (same as the one above?)
  - Polyvore Outfits
  - Polyvore Outfits-D

Experiments - task (1/2) - FITB
- Choose the compatible (correct) item from 4 candidates: 1 correct, 3 wrong
- metric: Accuracy
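A minimal sketch of how FITB accuracy could be computed, assuming each candidate is scored by its summed distance to the remaining items of the outfit and the smallest total wins. The scoring rule, the question format, and the 1-D toy "embeddings" are assumptions for illustration, not the paper's exact procedure.

```python
def answer_fitb(context, candidates, distance):
    """Return the index of the candidate with the smallest summed distance to the context items."""
    totals = [sum(distance(c, item) for item in context) for c in candidates]
    return totals.index(min(totals))

def fitb_accuracy(questions, distance):
    """Fraction of questions where the chosen candidate is the correct one."""
    correct = sum(
        answer_fitb(q["context"], q["candidates"], distance) == q["answer"]
        for q in questions
    )
    return correct / len(questions)

def abs_distance(a, b):
    # Toy distance on 1-D embeddings.
    return abs(a - b)

# Hypothetical questions: 1 correct and 3 wrong candidates each.
questions = [
    {"context": [1.0, 1.2], "candidates": [5.0, 1.1, 9.0, -4.0], "answer": 1},
    {"context": [0.0], "candidates": [0.1, 3.0, 2.0, 0.5], "answer": 3},
]
accuracy = fitb_accuracy(questions, abs_distance)
print(accuracy)
```

With 4 candidates per question, random guessing gives an accuracy of 0.25, which is the baseline to compare against.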

Experiments - task (2/2) - Compatibility Prediction
- Binary classification of outfits as compatible/incompatible
  - compatible (positive sample)
    - Every outfit on Polyvore is treated as compatible.
  - incompatible (negative sample)
    - The sampling method differs by dataset.
- metric: AUC
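For reference, AUC can be computed directly as the probability that a randomly chosen positive outfit scores higher than a randomly chosen negative one (ties count as half). The sketch below is a plain-Python illustration with made-up labels and scores; it is not the evaluation code used in the paper.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical compatibility scores: label 1 = compatible outfit, 0 = sampled negative.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
result = auc(labels, scores)
print(result)
```

This pairwise form is quadratic in the number of samples, which is fine for illustration; library implementations use sorting instead.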

Experiments (1/3) - Maryland
- Maryland (All Negatives)
- Maryland (Composition Filtering): in the Maryland test data,
  - FITB: the candidate items are of obviously different categories → predicting the correct item is easy.
  - Compat. Pred.: negative outfits have duplicated or missing categories → predicting them as incompatible is easy.
  - These easy cases are removed.

Results

Experiments (2/3) - Category-Aware Negative
- Maryland (Category-Aware Negative): in the Maryland test data,
  - FITB: the candidate items are of obviously different categories → predicting the correct item is easy.
  - Compat. Pred.: outfits have duplicated or missing categories → predicting them as incompatible is easy.
  - Besides removing the easy cases, negatives are sampled with the category specified.
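Category-aware negative sampling can be sketched as follows, assuming a negative outfit is built by swapping each item of a positive outfit for a random different item of the same category. The function name, the data layout, and the item ids are hypothetical.

```python
import random

def sample_negative(outfit, items_by_category, rng):
    """Replace every (item_id, category) with a different random item of the same category."""
    negative = []
    for item_id, category in outfit:
        pool = [i for i in items_by_category[category] if i != item_id]
        negative.append((rng.choice(pool), category))
    return negative

# Hypothetical item catalogue and one positive outfit.
items_by_category = {
    "tops": ["t1", "t2", "t3"],
    "bottoms": ["b1", "b2"],
    "shoes": ["s1", "s2", "s3"],
}
positive = [("t1", "tops"), ("b1", "bottoms"), ("s1", "shoes")]
negative = sample_negative(positive, items_by_category, random.Random(0))
print(negative)
```

Because the category composition of the negative matches the positive, a model can no longer detect negatives from duplicated or missing categories alone.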

Results

Experiments (3/3) - Polyvore Outfits(-D)
- More items per outfit.
- More text information as well.
- Negative sampling uses the category-aware method.
- D: in addition, no items are shared between train and test.
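The -D condition (no item shared between train and test) can be verified with a set intersection. This is a small sketch in which each outfit is a list of hypothetical item ids; it only checks a given split and does not reproduce how the paper constructs one.

```python
def shared_items(train_outfits, test_outfits):
    """Return the set of item ids that appear in both the train and the test outfits."""
    train_items = {item for outfit in train_outfits for item in outfit}
    test_items = {item for outfit in test_outfits for item in outfit}
    return train_items & test_items

# Hypothetical splits: each outfit is a list of item ids.
train = [["t1", "b1", "s1"], ["t2", "b2", "s2"]]
test_disjoint = [["t3", "b3", "s3"]]
test_leaky = [["t1", "b3", "s3"]]

print(shared_items(train, test_disjoint))
print(shared_items(train, test_leaky))
```

An empty result means the split satisfies the item-disjoint condition; any returned id marks an item that leaks from train into test.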

Results

Results - Similar

Results - Compatible

Results - Outfit Generation

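Conclusion
- Proposes a model for outfit recommendation
- Proposes a new, stricter dataset for evaluation
- Learns similarity and compatibility jointly
- Resolves the improper triangle problem by learning compatibility separately for each pair of categories
- Achieves SOTA on both tasks on the new, harder datasets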

Slides by hrsma2i
