
(Recognition) Large-scale Landmark Retrieval/Recognition
under a Noisy and Diverse Dataset

@smly
June 16, 2019

Transcript

  1. Team smlyaka:
    Kohei Ozaki * (Recruit Technologies)
    Shuhei Yokoo * (University of Tsukuba)

    * Equal contribution.
    Large-scale Landmark Retrieval/Recognition
    under a Noisy and Diverse Dataset (arXiv:1906.04087)
    (3rd place solution, recognition)


  2. Solution summary

    Our solution is a combination of basic techniques:
    soft-voting, the inlier-count method, ensembling, and post-processing.
    In this talk we present the (⭐1) "soft-voting" step and the
    (⭐2) "post-processing" step.

    [Pipeline figure, private LB scores:
      ⭐1 Best single model (512d) + soft-voting: PrivLB=0.2079
      + Inlier-count term (RANSAC + DELF, pre-trained v2): PrivLB=0.3373
      Ensemble of 6 models (dim=3072): PrivLB=0.3513
      ⭐2 Post-processing, remove TopFreq (>=30): PrivLB=0.3630]

    (※ Note: our dataset cleaning method for image representation learning
    also plays an important role in our solution. We will present it shortly
    in the retrieval session.)

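    As a rough sketch of the "Ensemble of 6 models (dim=3072)" step above,
    the snippet below assumes the ensemble descriptor is built by
    concatenating L2-normalized 512-d embeddings from six models
    (6 x 512 = 3072). The function and inputs are hypothetical, not the
    authors' exact code.

    import numpy as np

    def ensemble_descriptor(per_model_embeddings):
        # L2-normalize each model's embedding, then concatenate:
        # six 512-d vectors -> one 3072-d ensemble descriptor.
        normed = [np.asarray(e, dtype=np.float32) for e in per_model_embeddings]
        normed = [e / (np.linalg.norm(e) + 1e-12) for e in normed]
        desc = np.concatenate(normed)
        # Re-normalize so that Euclidean search on the concatenation
        # behaves like an average of the per-model cosine similarities.
        return desc / (np.linalg.norm(desc) + 1e-12)

    # Usage: six hypothetical 512-d backbone embeddings for one image.
    embeddings = [np.random.randn(512) for _ in range(6)]
    descriptor = ensemble_descriptor(embeddings)  # shape (3072,)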

  3. Soft-voting with spatial verification

    Our recognition method is based on accumulating the top-K
    nearest neighbors of the query in the train set.

    [Figure: a query image and its top-k (k=3) nearest neighbors in the
    train set found by Euclidean search, each annotated with a similarity
    term and an inlier-count term (values 0.85, 1.00, 0.75, 1.00, 0.60,
    0.50); the neighbors carry the labels "Hamburg City Hall" and
    "The New Town Hall in Hanover".]

    Confidence scoring: for each label l, sum the similarity term and the
    inlier-count term over the set of q's top-k (k=3) neighbors whose
    members are assigned to l:

      s_l = Σ_{t ∈ topk(q), label(t) = l} ( sim(q, t) + inlier(q, t) )

      ŷ = argmax_l s_l = "Hamburg City Hall"

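    A minimal sketch of the confidence scoring above, assuming each of the
    query's top-k neighbors is given as a (label, similarity, inlier-term)
    triple; the variable names and the pairing of the figure's numbers with
    the two labels are illustrative.

    from collections import defaultdict

    def soft_vote(neighbors):
        # neighbors: (label, similarity, inlier_term) triples for the
        # query's top-k nearest train images.
        scores = defaultdict(float)
        for label, sim, inlier in neighbors:
            # Each neighbor assigned to label l adds its similarity term
            # plus its inlier-count term to s_l.
            scores[label] += sim + inlier
        best = max(scores, key=scores.get)
        return best, scores[best]

    # The k=3 neighbors from the figure on this slide.
    neighbors = [
        ("Hamburg City Hall", 0.85, 1.00),
        ("Hamburg City Hall", 0.75, 1.00),
        ("The New Town Hall in Hanover", 0.60, 0.50),
    ]
    label, confidence = soft_vote(neighbors)  # -> ("Hamburg City Hall", 3.6)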

  4. Soft-voting with spatial verification

    [Figure: the same query and its top-k (k=3) nearest neighbors in the
    train set under Euclidean search, with the similarity and inlier-count
    terms and the labels "Hamburg City Hall" and "The New Town Hall in
    Hanover".]

    Our recognition method is based on accumulating the top-K nearest
    neighbors of the query in the train set. The similarity term and the
    inlier-count term are summed over the set of q's top-k (k=3) neighbors
    assigned to label l:

      s_l = Σ_{t ∈ topk(q), label(t) = l} ( sim(q, t) + inlier(q, t) )

      ŷ = argmax_l s_l = "Hamburg City Hall"

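    The inlier-count term comes from spatially verifying each (query,
    neighbor) pair with RANSAC over DELF local-feature matches. The sketch
    below only approximates that idea with OpenCV's RANSAC homography
    fitting on pre-matched keypoint coordinates and an assumed
    normalization constant; it is not the authors' DELF pipeline.

    import numpy as np
    import cv2

    def inlier_term(query_pts, neighbor_pts, max_inliers=70.0):
        # query_pts / neighbor_pts: (N, 2) arrays of matched keypoint
        # coordinates (e.g. from nearest-neighbor matching of local
        # descriptors such as DELF).
        if len(query_pts) < 4:  # RANSAC homography needs >= 4 matches
            return 0.0
        src = np.asarray(query_pts, dtype=np.float32).reshape(-1, 1, 2)
        dst = np.asarray(neighbor_pts, dtype=np.float32).reshape(-1, 1, 2)
        _, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        if mask is None or mask.size == 0:
            return 0.0
        inliers = int(mask.sum())
        # Clip and rescale to [0, 1]; the constant is an illustrative
        # choice, not the value used in the original solution.
        return min(inliers, max_inliers) / max_inliers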

  5. Post-processing for distractors

    We treat categories that appear more than 30 times in the test set
    as non-landmark categories.

    [Figure: example distractor categories, landmark_id=129232 (freq=91)
    and landmark_id=179959 (freq=1144).]

    This idea is related to "stop words" in natural language processing.

    Using Open Images might have a similar effect for removing distractors.

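    A minimal sketch of this frequency filter, assuming a submission
    DataFrame with hypothetical "id", "landmark_id", and "confidence"
    columns, one predicted landmark per test image.

    import pandas as pd

    def remove_top_freq(submission: pd.DataFrame, threshold: int = 30) -> pd.DataFrame:
        freq = submission["landmark_id"].value_counts()
        # Labels predicted for >= threshold test images are treated as
        # non-landmark "stop word" categories (TopFreq >= 30 in the figure).
        stop_ids = set(freq[freq >= threshold].index)
        out = submission.copy()
        mask = out["landmark_id"].isin(stop_ids)
        out.loc[mask, ["landmark_id", "confidence"]] = None  # blank the prediction
        return out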

  6. Takeaways (Summary)

    Our solution is a combination of basic techniques:
    soft-voting, the inlier-count method, ensembling, and post-processing.
    The inlier-count term significantly improves the GAP score.

    [Pipeline figure, private LB scores:
      Best single model (512d) + soft-voting: PrivLB=0.2079
      + Inlier-count term (RANSAC + DELF, pre-trained v2): PrivLB=0.3373
      Ensemble of 6 models (dim=3072): PrivLB=0.3513
      Post-processing, remove TopFreq (>=30): PrivLB=0.3630]

    (※ Note: our dataset cleaning method for image representation learning
    also plays an important role in our solution. We will present it shortly
    in the retrieval session.)
