Good Old Pretraining for Landmark Recognition


Cookpad Bristol

June 16, 2019


CVPR'19 Landmark Recognition Workshop 5th Place: Good Old Pretraining (with New Tricks)

Daniel Fernandez, Andrey Ponikar and Mikhail Fain (Cookpad Ltd.)

Pipeline

  1. Landmark-Not-Landmark classifier
  2. Pretraining SE-ResNeXt-101
  3. Training and inference for 140K common classes

1. Landmark-Not-Landmark Classifier

  Manually tag 100 images as landmarks of X different types, or not landmarks, using active learning techniques.
  Retrain the top layer of ResNet-50 with cross-entropy loss.
  Repeat 10 times.

2. Pretraining SE-ResNeXt-101

  4MM training images, 200K classes.
  Filter out rare classes, with <25 images.
  Filter non-landmarks and cluster the images using ImageNet-based features. For each class, keep only images belonging to the largest cluster.
  600K training images, 10K classes.
  Train SE-ResNeXt-101 with softmax cross-entropy.
  Feature extractor (the layer before last).

3. Training and Inference for 140K Common Classes

  Diagram blocks: Image; Feature extractors (fixed) (size 4000x1); FC (4000x1); FC (375x64); MatMul (375x375); FC (64x375); Reshape (140Kx1); FC (4000x1); Softmax; Landmark classifier (fixed); Top-1; Multiply probabilities; Class and probability.
  Factorized version of FC (140Kx1), to regularize and allow larger batch sizes.
  Training: for the Stage 3 model we used a logarithmic sampling scheme with softmax cross-entropy as the loss function.

Main Takeaways

  Splitting stage 2 and 3 allowed rapid experimentation since features are precomputed.
  A bit of manual effort for landmark labeling goes a long way, helping with data cleanup and calibrating the probabilities for the GAP metric.
  With appropriate regularization, softmax cross-entropy on top of precomputed features is still competitive on this task.
  The final model is a blend of models with the same pipeline as described but minor tweaks (different feature extractors, clustering etc.).
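
The data cleanup before SE-ResNeXt-101 pretraining (dropping classes with <25 images, then keeping only the largest cluster of each class) could be sketched roughly as follows. This is a minimal illustration and not the authors' code: the `(image_id, class_id, cluster_id)` record layout and the function name `clean_training_set` are hypothetical, and the cluster ids are assumed to have been computed beforehand from ImageNet-based features. The non-landmark filter is left out.

```python
from collections import Counter, defaultdict

MIN_IMAGES_PER_CLASS = 25  # poster: filter out rare classes, with <25 images


def clean_training_set(records, min_images=MIN_IMAGES_PER_CLASS):
    """Return the image ids that survive both filters.

    records: iterable of (image_id, class_id, cluster_id) tuples.
    This input layout is an assumption made for the sketch; cluster ids
    are taken as given and are only compared within a class.
    """
    records = list(records)

    # Step 1: drop rare classes (fewer than `min_images` images).
    class_sizes = Counter(class_id for _, class_id, _ in records)
    kept = [r for r in records if class_sizes[r[1]] >= min_images]

    # Step 2: count images per cluster inside each remaining class.
    cluster_sizes = defaultdict(Counter)
    for _, class_id, cluster_id in kept:
        cluster_sizes[class_id][cluster_id] += 1

    # Largest cluster per class; ties go to the cluster encountered first.
    largest = {
        class_id: counts.most_common(1)[0][0]
        for class_id, counts in cluster_sizes.items()
    }

    # Step 3: keep only images that belong to their class's largest cluster.
    return [
        image_id
        for image_id, class_id, cluster_id in kept
        if largest[class_id] == cluster_id
    ]
```

Calling `clean_training_set(records)` on a list of such tuples returns the surviving image ids in their original order. Whether the rare-class threshold is applied before or after clustering is not stated on the poster; the sketch applies it first, following the order in which the steps are listed.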
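
One possible reading of the factorized version of FC (140Kx1) is sketched below. It assumes that two FC layers map the 4000-dimensional precomputed feature vector to a 375x64 and a 64x375 matrix, that their product (375x375 = 140,625 values) is reshaped into the class logits, and that at inference the Top-1 softmax probability is multiplied by the Landmark-Not-Landmark probability. The wiring is an assumption inferred from the block labels, and all function and parameter names are hypothetical. Biases and training are omitted.

```python
import numpy as np


def init_params(rng, feature_dim, side, rank):
    """Random weights for the two FC layers (hypothetical initialization)."""
    scale = 1.0 / np.sqrt(feature_dim)
    return {
        "w_left": rng.normal(0.0, scale, (feature_dim, side * rank)),
        "w_right": rng.normal(0.0, scale, (feature_dim, rank * side)),
    }


def factorized_logits(features, params, side, rank):
    """Map features (batch, feature_dim) to logits (batch, side * side)."""
    batch = features.shape[0]
    left = (features @ params["w_left"]).reshape(batch, side, rank)
    right = (features @ params["w_right"]).reshape(batch, rank, side)
    # Batched matrix product: (batch, side, rank) @ (batch, rank, side).
    return (left @ right).reshape(batch, side * side)


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def predict(features, landmark_prob, params, side, rank):
    """Return the Top-1 class and its probability times the landmark probability."""
    probs = softmax(factorized_logits(features, params, side, rank))
    top1 = probs.argmax(axis=-1)
    top1_prob = probs[np.arange(len(top1)), top1]
    return top1, top1_prob * landmark_prob


# Toy sizes for illustration; the poster's sizes would be
# feature_dim=4000, side=375, rank=64.
rng = np.random.default_rng(0)
params = init_params(rng, feature_dim=16, side=5, rank=3)
features = rng.normal(size=(2, 16))
landmark_prob = np.array([0.9, 0.1])  # output of the fixed landmark classifier
classes, scores = predict(features, landmark_prob, params, side=5, rank=3)
```

Under this reading, the two layers at the poster's sizes hold 2 x 4000 x 24,000 = 192M weights, compared with 4000 x 140,625 = 562.5M for a single full FC layer, which would be consistent with the stated aim of regularizing and allowing larger batch sizes.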