Intelligent Image Posters

These were done for an exhibition.

Max Jaderberg

June 17, 2012
Transcript

  1. THE INTELLIGENT IMAGE

    Max Jaderberg. Supervisor: Andrew Zisserman.

    LARGE SCALE OBJECT RECOGNITION

    The goal: automatically recognise and tag objects in images.

    Implementation: a website where users can upload a photo to be made into an intelligent image.

    [Figure: a normal image and the corresponding intelligent image]

    Each article on Wikipedia defines an object. All the images from an object's article are downloaded to a database and define the model of that object.
  2. HOW IT WORKS

    1. EXTRACT VISUAL WORDS (Scale Invariant Feature Transform)

    The features of an image are detected and described by a 128-dimensional vector. The feature space is then quantized, creating “visual words”.

    2. GOOGLE STYLE SEARCH

    matches = search(Database for Words);

    [Diagram labels: features, words, histogram, query, database (A, B, C, D)]

    The tf-idf weighted histograms are then used to search the database for similar histograms. This is the same method as Google uses for text search, replacing words with “visual words”.

    3. SPATIALLY VERIFY

    The match from the database and the query image must depict the same object, so there should be a spatial correspondence between the visual words of the two images.
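The Google-style search step can be sketched in a few lines. This is only a minimal illustration, not the poster's MATLAB engine: the function names (`idf_weights`, `tfidf_vector`, `search`), the integer word IDs, and the choice of cosine similarity for comparing histograms are all assumptions made for the example.

```python
import math
from collections import Counter


def idf_weights(database):
    """Inverse document frequency of each visual word.

    database: dict mapping an image name to its list of visual word IDs.
    """
    n_images = len(database)
    doc_freq = Counter()
    for words in database.values():
        doc_freq.update(set(words))  # count each word once per image
    return {w: math.log(n_images / df) for w, df in doc_freq.items()}


def tfidf_vector(words, idf):
    """L2-normalised tf-idf histogram, stored sparsely as a dict."""
    counts = Counter(words)
    total = sum(counts.values())
    vec = {w: (c / total) * idf[w] for w, c in counts.items() if w in idf}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {w: v / norm for w, v in vec.items()} if norm > 0 else {}


def search(query_words, database):
    """Rank database images by cosine similarity to the query histogram."""
    idf = idf_weights(database)
    q = tfidf_vector(query_words, idf)
    scores = {}
    for name, words in database.items():
        d = tfidf_vector(words, idf)
        scores[name] = sum(q[w] * d.get(w, 0.0) for w in q)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

With a toy database such as `{"A": [1, 1, 2, 3], "B": [4, 5, 6], "C": [1, 7, 8], "D": [9, 9, 10]}`, the query `[1, 1, 2]` ranks image "A" first because it shares both query words, while "B" and "D" share none and score zero. A real system would use an inverted index rather than scoring every image.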
  3. IMPROVEMENTS

    RANSAC

    Initially RANSAC was used to estimate the affine transformation relating the visual words of two images.

    NOSAC

    This was replaced by a method called NOSAC, which uses the scale of each visual word correspondence to improve the estimation of the transformation.

    [Figure: two match visualisations for a "Marlborough House" image, one with 6 inliers and one with 20 inliers]

    AFFINE INVARIANT

    An affine invariant feature detector and descriptor was used. This means that features that are skewed due to changes in viewpoint can still be matched together. Each feature is now represented by an ellipse, and a pair of matched features is enough to estimate a full affine transformation.

    TURBO-BOOSTING

    A novel method was developed to increase the information content of the database of Wikipedia images (called model images) by augmenting them with the visual words of images from Microsoft Bing (called turbo images).

    1. Download images from Bing for each object.
    2. Compare each model image with each turbo image.
    3. If they match, project the words from the turbo image on to the model image.

    Figure 6.2: The turboboosting process. (a) Model images (b) Turbo images (c) Model image words (d) Turbo image words (e) The turbo image is spatially verified with the model image (inliers: 17) (f) The words in the spatially verified region are used to augment the model image
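The projection step of turbo-boosting could look roughly like the sketch below. It is a hedged illustration only: the poster does not say how the spatially verified region is defined, so the bounding box of the inlier positions is an assumption, as are the function names (`apply_affine`, `bounding_box`, `augment_model`), the `(word_id, (x, y))` representation of a visual word, and the `min_inliers` threshold.

```python
def apply_affine(A, point):
    """Apply a 2x3 affine matrix [[a, b, tx], [c, d, ty]] to a point (x, y)."""
    (a, b, tx), (c, d, ty) = A
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def bounding_box(points):
    """Axis-aligned bounding box (x_min, y_min, x_max, y_max) of the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def augment_model(model_words, turbo_words, inlier_points, A, min_inliers=4):
    """Add turbo-image words from the verified region to the model image.

    model_words, turbo_words: lists of (word_id, (x, y)).
    inlier_points: turbo-image positions of the spatially verified matches.
    A: affine matrix mapping turbo-image coordinates to model-image coordinates.
    min_inliers: hypothetical threshold for accepting the match.
    """
    if len(inlier_points) < min_inliers:
        return list(model_words)  # not verified: leave the model unchanged

    # Assumption: the verified region is the bounding box of the inliers.
    x0, y0, x1, y1 = bounding_box(inlier_points)
    augmented = list(model_words)
    for word_id, (x, y) in turbo_words:
        if x0 <= x <= x1 and y0 <= y <= y1:
            augmented.append((word_id, apply_affine(A, (x, y))))
    return augmented
```

Words outside the verified region are skipped, so background clutter in a turbo image is not copied into the model.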
  4. IMPLEMENTATION

    The object recognition and tagging engine was built in MATLAB. This is connected to a webserver written in Python via a MATLAB-Python bridge. The webserver delivers an HTML and JavaScript app to the browser.

    RESULTS

    Performance is measured by yield: the percentage of test images successfully recognized.

    Baseline system:       18.1%
    + NOSAC:               20.8%
    + Affine invariant:    22.5%
    + RootSIFT:            24.5%
    + Turbo-boosting:      31.7%
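The results list RootSIFT as one of the improvements without describing it. As it is commonly defined, RootSIFT L1-normalises a SIFT descriptor and takes the element-wise square root, so that Euclidean distance between the transformed descriptors corresponds to the Hellinger kernel on the originals. The sketch below assumes a descriptor is a plain list of non-negative numbers; the function name `root_sift` and the `eps` guard are illustrative choices, not taken from the poster.

```python
import math


def root_sift(descriptor, eps=1e-12):
    """Convert a SIFT descriptor to RootSIFT.

    descriptor: sequence of non-negative values (128 for standard SIFT).
    eps: small constant that avoids division by zero for an all-zero input.
    """
    total = sum(abs(v) for v in descriptor)  # L1 norm
    return [math.sqrt(abs(v) / (total + eps)) for v in descriptor]
```

Because the input is L1-normalised before the square root, the output has approximately unit L2 norm, so it can be dropped into a pipeline that already compares descriptors with Euclidean distance.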