Slide 1

Slide 1 text

Deep Relative Attributes
Yaser Souri, Erfan Noury, Ehsan Adeli-Mosabbeb

Slide 2

Slide 2 text

Visual Recognition: Typical Recognition
Image labels: Dog, Panda, Bear, Dog

Slide 3

Slide 3 text

Visual Recognition: Attribute Recognition
Image attributes: 4 legs, white; 4 legs, black & white; 4 legs, brown; 4 legs, black & white

Slide 4

Slide 4 text

Attributes
● Mid-level semantic properties, shared by objects
● Human-understandable and machine-detectable
Ferrari, Vittorio, and Andrew Zisserman. "Learning visual attributes." NIPS 2007.
Farhadi, Ali, et al. "Describing objects by their attributes." CVPR 2009.

Slide 5

Slide 5 text

Attributes
Applications:
● Zero-shot learning
● Visual search
● Interactive recognition
Lampert, Christoph H., et al. "Learning to detect unseen object classes by between-class attribute transfer." CVPR 2009.
Kovashka, Adriana, et al. "Whittlesearch: Image search with relative attribute feedback." CVPR 2012.
Branson, Steve, et al. "Visual recognition with humans in the loop." ECCV 2010.

Slide 6

Slide 6 text

Relative Attributes
The problem with binary/categorical attributes
Image labels: Smiling, Not Smiling, ?

Slide 7

Slide 7 text

Relative Attributes
The problem with binary/categorical attributes
Image labels: Smiling, Not Smiling
Relative ordering of the images: > > (less than, more than)
Parikh, Devi, and Kristen Grauman. "Relative attributes." ICCV 2011. Marr Prize

Slide 8

Slide 8 text

Relative Attributes
Learn a ranking function from a set of pairwise supervisions
Parikh, Devi, and Kristen Grauman. "Relative attributes." ICCV 2011.
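As a rough illustration of the pairwise supervision on this slide, the sketch below stores each supervision as an ordered pair of image indices and measures how many pairs a set of scores respects. The names `pairwise_accuracy`, `scores`, and `ordered_pairs`, and the toy numbers, are hypothetical and not taken from the slides.

```python
def pairwise_accuracy(scores, ordered_pairs):
    """Fraction of pairs (i, j) for which scores[i] > scores[j].

    Each pair (i, j) is one supervision: image i shows MORE of the
    attribute than image j.
    """
    if not ordered_pairs:
        raise ValueError("need at least one ordered pair")
    correct = sum(1 for i, j in ordered_pairs if scores[i] > scores[j])
    return correct / len(ordered_pairs)


# Toy data (made up): four images scored for one attribute, e.g. "smiling".
scores = [0.1, 0.7, 0.4, 0.9]
# The last pair deliberately disagrees with the scores.
ordered_pairs = [(1, 0), (3, 2), (2, 1)]
accuracy = pairwise_accuracy(scores, ordered_pairs)
```

A learned ranking function would produce `scores`; the goal of training is to make as many of the supervised pairs as possible come out in the right order.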

Slide 9

Slide 9 text

Relative Attributes: Related Work
Limitations of previous works:
● Use hand-engineered features (GIST, HOG, …)
● Use complicated and often special-purpose models
We propose:
● Using Convolutional Neural Networks and a simple ranking model
● Learning features and ranking end-to-end
Parikh, Devi, and Kristen Grauman. "Relative attributes." ICCV 2011.
Li, Shaoxin, et al. "Relative forest for attribute prediction." ACCV 2012.
Yu, Aron, and Kristen Grauman. "Fine-grained visual comparisons with local learning." CVPR 2014.
Sandeep, Ramachandruni N., et al. "Relative Parts: Distinctive Parts for Learning Relative Attributes." CVPR 2014.

Slide 10

Slide 10 text

Deep Relative Attributes: Test time
Figure labels: ConvNet; linear and global ranking model
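One way to picture the test-time pipeline is a linear layer that maps ConvNet features to a single score, followed by a sort to obtain a global ranking. This is a minimal sketch under that assumption; the feature vectors are made-up placeholders rather than real ConvNet activations, and `rank_score`, `global_ranking`, and `feature_table` are hypothetical names.

```python
def rank_score(features, weights, bias=0.0):
    """Linear ranking layer: one scalar score per image."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def global_ranking(feature_table, weights, bias=0.0):
    """Return image ids sorted from most to least of the attribute."""
    scored = {name: rank_score(feats, weights, bias)
              for name, feats in feature_table.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Placeholder features standing in for ConvNet activations (made up).
feature_table = {
    "img_a": [0.2, 1.0, 0.0],
    "img_b": [0.9, 0.1, 0.3],
    "img_c": [0.5, 0.5, 0.5],
}
weights = [1.0, -0.5, 0.2]  # hypothetical learned ranking weights
ranking = global_ranking(feature_table, weights)
```

Because every image gets an absolute score, any two images can be compared without re-running a pairwise model.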

Slide 11

Slide 11 text

Deep Relative Attributes: Train time
Burges, Chris, et al. "Learning to rank using gradient descent." ICML 2005.

Slide 12

Slide 12 text

Deep Relative Attributes: Train time
Burges, Chris, et al. "Learning to rank using gradient descent." ICML 2005.

Slide 13

Slide 13 text

Deep Relative Attributes: Train time
Burges, Chris, et al. "Learning to rank using gradient descent." ICML 2005.
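The train-time slides cite the RankNet paper by Burges et al., so a plausible reading is that each image pair is trained with a RankNet-style loss: the score difference is passed through a sigmoid and compared to the pair label with cross-entropy. The sketch below assumes that formulation; the function name and the target encoding (1.0, 0.0, 0.5) are illustrative choices, not details taken from the slides.

```python
import math


def ranknet_loss(score_i, score_j, target):
    """Cross-entropy loss on P_ij = sigmoid(score_i - score_j).

    target is 1.0 if image i should rank above image j, 0.0 if it
    should rank below, and 0.5 if the pair is labeled as similar.
    Returns (loss, d_loss/d_score_i); d_loss/d_score_j is its negative.
    """
    d = score_i - score_j
    # Numerically stable log(1 + exp(-d)).
    softplus_neg_d = max(-d, 0.0) + math.log1p(math.exp(-abs(d)))
    loss = (1.0 - target) * d + softplus_neg_d
    # Numerically stable sigmoid(d).
    if d >= 0.0:
        p = 1.0 / (1.0 + math.exp(-d))
    else:
        e = math.exp(d)
        p = e / (1.0 + e)
    return loss, p - target


# A correctly ordered pair costs less than a wrongly ordered one.
loss_good, _ = ranknet_loss(2.0, 0.0, target=1.0)
loss_bad, _ = ranknet_loss(0.0, 2.0, target=1.0)
```

In an end-to-end setup, the gradient with respect to the two scores would be backpropagated through the ranking layer and the ConvNet for both images of the pair.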

Slide 14

Slide 14 text

Deep Relative Attributes: Details
● Each attribute is learned separately
● ConvNet: VGG16 (conv1 - fc7)
● Initial weights of the ConvNet: pre-trained on ILSVRC2014
● Optimization:
  ○ SGD with minibatch of size 32 (16 x 2)
  ○ RMSProp + weight decay
  ○ Gradient clipping
  ○ Random shuffling of data for each epoch + flip augmentation for small datasets
● Code: Theano + Lasagne
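To make the optimization bullets concrete, here is a minimal pure-Python sketch of one RMSProp step combined with weight decay and gradient clipping. It is not the authors' Theano/Lasagne code. The slide does not say how gradients are clipped or how weight decay is applied, so clipping by global L2 norm and an L2 penalty added to the gradient are assumptions, as are all hyperparameter values and function names.

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale the gradient vector if its L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


def rmsprop_step(weights, grads, cache, lr=0.01, rho=0.9, eps=1e-8,
                 weight_decay=0.0, max_norm=1.0):
    """One RMSProp update with L2 weight decay and gradient clipping."""
    # Assumption: weight decay as an L2 penalty added to the gradient.
    grads = [g + weight_decay * w for g, w in zip(grads, weights)]
    grads = clip_by_norm(grads, max_norm)
    # Running average of squared gradients, one entry per weight.
    new_cache = [rho * c + (1.0 - rho) * g * g for c, g in zip(cache, grads)]
    new_weights = [w - lr * g / (math.sqrt(c) + eps)
                   for w, g, c in zip(weights, grads, new_cache)]
    return new_weights, new_cache


# Toy usage: reduce f(w) = sum(w_k ** 2), whose gradient is 2 * w.
weights, cache = [3.0, -4.0], [0.0, 0.0]
for _ in range(200):
    grads = [2.0 * w for w in weights]
    weights, cache = rmsprop_step(weights, grads, cache)
```

In the setting of the slide, `grads` would come from backpropagating the ranking loss over a minibatch of 16 image pairs (32 images).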

Slide 15

Slide 15 text

Results Global Ranking

Slide 16

Slide 16 text

Results Relative Attribute Prediction

Slide 17

Slide 17 text

Results Relative Attribute Prediction

Slide 18

Slide 18 text

Results Relative Attribute Prediction

Slide 19

Slide 19 text

Results Relative Attribute Prediction

Slide 20

Slide 20 text

Results: Saliency Maps and Localization
Simonyan, Karen, et al. "Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps." ICLR Workshop 2014.
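The saliency maps cited here (Simonyan et al.) are based on the gradient of a score with respect to the input pixels, reduced to one value per pixel by taking the maximum absolute value over channels. The sketch below illustrates that idea on a toy score function. To stay self-contained it swaps in central finite differences for backpropagation, which a real network would use; `saliency_map` and `toy_score` are hypothetical names.

```python
def saliency_map(score_fn, image, eps=1e-4):
    """Approximate |d score / d pixel| and take the max over channels.

    image is a nested list indexed as image[row][col][channel].
    """
    saliency = []
    for row in image:
        sal_row = []
        for pixel in row:
            best = 0.0
            for ch in range(len(pixel)):
                original = pixel[ch]
                pixel[ch] = original + eps
                up = score_fn(image)
                pixel[ch] = original - eps
                down = score_fn(image)
                pixel[ch] = original  # restore the pixel value
                best = max(best, abs(up - down) / (2.0 * eps))
            sal_row.append(best)
        saliency.append(sal_row)
    return saliency


# Toy score (hypothetical): depends only on the top-left pixel.
def toy_score(img):
    return 2.0 * img[0][0][0] - 3.0 * img[0][0][1]


image = [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
         [[0.7, 0.8, 0.9], [0.0, 0.1, 0.2]]]
saliency = saliency_map(toy_score, image)
```

Pixels with high saliency are the ones that most influence the ranking score, which is what allows the attribute to be localized in the image.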

Slide 21

Slide 21 text

Results Saliency Maps and Localization

Slide 22

Slide 22 text

Analysis: Feature learning effect
t-SNE embedding of pretrained features

Slide 23

Slide 23 text

Analysis: Feature learning effect
t-SNE embedding of learned features

Slide 24

Slide 24 text

Conclusion
● A method for end-to-end feature learning and ranking of images
● State-of-the-art relative attribute prediction results
● Our model is able to localize attributes
● Code and models will be released soon
● Paper at arXiv: http://arxiv.org/abs/1512.04103

Slide 25

Slide 25 text

Thank you
Questions?