Building a Fine Grained Image Classification System for Nature Images

The story of our journey with image classification on the Fieldguide project.
Presented at PyConIE 2018 in Dublin.

Fergal Walsh

November 10, 2018

Transcript

  1. Fergal Walsh PyConIE 2018 Outline

    • What is image classification? Why do we need it?
    • Our Computer Vision journey
    • The technology
    • The challenges
    • Our solutions
    • Open problems
  2. Image Classification

    Assigning a class label to an image
    Bird > Swallow
    Butterfly > Whites
  3. Image Classification

    Assigning a class label to an image
    Bird > Swallow > African Swallow
    Butterfly > Whites > The Orangetip
  4. Image Classification

    Assigning a class label to an image
    Bird > Hirundinidae > Petrochelidon > spilodera
    Butterfly > Pieridae > Anthocharis > cardamines
  5. It was great,

    It enabled great demos
    It was pretty fast
    It could help us start building a catalogue
  6. But,

    Will it learn? Can we teach it about Moths or Shells or ..?
    >> No, we have no way of incrementally training this neural net classifier.
    Can we retrain a new net regularly?
    >> We could train every month or so, if we had millions of training samples.
    >> We didn’t.
  7. Solutions

    What if we reframe this as a search problem?
    Can we find the most similar images to this image?
    If so, we can use the classes of those as our suggestions.
  8. Deep Learning: Convolutional Neural Nets

    CaffeNet Architecture
    [Diagram labels: Classification, Feature Extraction, High Level Features]
  9. Solutions: Image Search

    • Compute and store feature vectors for all images
    • Compute feature vector for query image
    • Calculate distance from query vector to every other vector
    • Pick the top 20
    • Suggest the classes of these images to the user
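
The search steps on this slide can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function name `suggest_classes`, the toy 3-dimensional vectors and the use of Euclidean distance are not from the talk, which does not say which distance metric or data structures were used.

```python
import heapq
import math
from collections import Counter


def suggest_classes(query, index, k=20):
    """Return the class labels of the k nearest stored vectors, most frequent first.

    `index` is a list of (feature_vector, class_label) pairs (hypothetical layout).
    """
    # Brute force: measure the distance from the query to every stored vector.
    nearest = heapq.nsmallest(k, index, key=lambda item: math.dist(query, item[0]))
    # Count how often each class appears among the nearest neighbours.
    votes = Counter(label for _, label in nearest)
    return [label for label, _ in votes.most_common()]


# Toy "feature vectors"; real CNN features have hundreds or thousands of dimensions.
index = [
    ([0.9, 0.1, 0.0], "Swallow"),
    ([0.8, 0.2, 0.1], "Swallow"),
    ([0.1, 0.9, 0.2], "Orange-tip"),
    ([0.0, 0.8, 0.3], "Orange-tip"),
    ([0.1, 0.2, 0.9], "Moth"),
]
suggestions = suggest_classes([0.85, 0.15, 0.05], index, k=3)
print(suggestions)
```

Because every query touches every stored vector, memory and query time grow linearly with the number of indexed images.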
  10. It was great,

    It was fast
    Sometimes, it was like magic with spot-on suggestions
    Images were immediately indexed and ‘learned’ from
    Even a single example image could help classify new images
  11. But,

    Memory usage & query time grew linearly; we had to keep increasing RAM
    Sometimes, it seemed like it had learned nothing at all
    It had two related problems:
    • It didn’t know where to look
      ◦ It didn’t differentiate between subject & background
    • It didn’t know what was important
      ◦ It was too sensitive to differences in size or rotation
  12. Solutions

    We moved to a Postgres-based storage & query system
    • Two-stage query process to scale better
      ◦ Hamming distance query across binary hashes of vectors (pg_similarity)
      ◦ Euclidean distance across first N Hamming results (Cube)
    We researched image segmentation to remove background interference
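
The two-stage idea can be illustrated outside the database with a small pure-Python sketch. The talk ran these queries inside Postgres; here the thresholding hash in `binary_hash`, the function names and the toy data are all assumptions made for illustration, since the slides do not say how the binary hashes were derived.

```python
import heapq
import math


def binary_hash(vector, threshold=0.5):
    """Pack one bit per dimension into an int (bit is 1 where value > threshold).

    Assumption: this thresholding scheme is illustrative, not the talk's method.
    """
    bits = 0
    for value in vector:
        bits = (bits << 1) | int(value > threshold)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def two_stage_search(query, index, shortlist=100, k=20):
    """`index` is a list of (hash, vector, label) rows (hypothetical layout)."""
    query_hash = binary_hash(query)
    # Stage 1: cheap Hamming distance over every stored hash.
    candidates = heapq.nsmallest(
        shortlist, index, key=lambda row: hamming(query_hash, row[0])
    )
    # Stage 2: exact Euclidean distance, but only over the shortlist.
    return heapq.nsmallest(k, candidates, key=lambda row: math.dist(query, row[1]))


images = [
    ([0.9, 0.1, 0.8, 0.2], "Swallow"),
    ([0.8, 0.2, 0.9, 0.1], "Swallow"),
    ([0.6, 0.4, 0.6, 0.4], "Martin"),
    ([0.1, 0.9, 0.2, 0.8], "Orange-tip"),
    ([0.9, 0.9, 0.1, 0.1], "Moth"),
]
index = [(binary_hash(vector), vector, label) for vector, label in images]
results = two_stage_search([0.85, 0.15, 0.85, 0.15], index, shortlist=3, k=2)
print([label for _, _, label in results])
```

The expensive floating-point comparison only runs on the shortlist, so its cost no longer depends on the size of the whole collection.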
  13. Solutions: Image segmentation

    Separating subject from background
    GrabCut (from cv2 import grabCut)
    By segmenting the subject we could auto-crop and auto-rotate to reduce unimportant differences.
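
Once a segmentation step has produced a foreground mask, auto-cropping reduces to finding the mask's bounding box. The sketch below assumes the mask is a plain list of rows with truthy values on subject pixels; that layout and the helper names are hypothetical, and the segmentation itself (GrabCut in the talk) is not shown.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right) around the truthy cells, or None if empty.

    `bottom` and `right` are exclusive, so they can be used directly in slices.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def auto_crop(image, mask):
    """Crop `image` (a list of rows) to the subject marked in `mask`."""
    box = bounding_box(mask)
    if box is None:
        return image  # nothing segmented, so leave the image untouched
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
# Each pixel value encodes its own position (row * 10 + column) for easy checking.
image = [[r * 10 + c for c in range(5)] for r in range(4)]
cropped = auto_crop(image, mask)
print(cropped)
```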
  14. It was great,

    It was stable
    It was scalable
    It was less confused by background patterns
  15. But,

    • Dataset growing, queries getting slower
    • Auto-crop not perfect
    • Increasing user expectations
  16. Solutions: import tensorflow as tf

    “The solution to all Machine Learning problems” - half of Hacker News
    “TensorFlow™ is an open source software library for high performance numerical computation. … allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. … it comes with strong support for machine learning and deep learning ...” - tensorflow.org
  17. Solutions: import tensorflow as tf

    “TensorFlow™ is an open source software library for high performance numerical computation. … allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. … it comes with strong support for machine learning and deep learning ...” - tensorflow.org
  18. Solutions: Fine-Tuning

    We rewrote our system to use TensorFlow and Google’s Inception net.
    We finally started fine-tuning a neural net of our own.
    We created a specialised net just for Butterflies & Moths (Lepidoptera, or ‘leps’)
    • ~ 3000 classes, 20-1000 images per class
    • ~ 24 hours training time with a GPU
    • Original images with original crops. No segmentation.
    • ~ 70% accuracy
  19. It was great,

    It was fast
    It was accurate
    Its accuracy was measurable
  20. But,

    Lack of similarity ranking made it difficult to verify correctness of suggestions
  21. Solution

    We implemented similarity ranking within the categories of the classification results, so that each category suggestion is displayed with its most similar image
    • Using feature vectors from the same Inception net
    But,
    This caused a further challenge when we needed to update the Inception net to add more classes - the feature vectors were no longer comparable
    We ended up with two Inception nets: one for classification that was updated regularly and one for feature extraction that remained static
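
A rough sketch of ranking within the predicted categories: for each class the classifier suggests, pick the stored image whose feature vector is closest to the query. The `gallery` layout, the function name and the Euclidean metric are assumptions for illustration only.

```python
import math


def rank_within_categories(query_vec, predicted_classes, gallery):
    """For each predicted class, find the stored image most similar to the query.

    `gallery` maps a class label to a list of (image_id, feature_vector) pairs
    (hypothetical layout). Classifier order is preserved in the result.
    """
    results = []
    for label in predicted_classes:
        candidates = gallery.get(label, [])
        if not candidates:
            continue  # no reference images stored for this class
        image_id, vec = min(candidates, key=lambda item: math.dist(query_vec, item[1]))
        results.append((label, image_id, math.dist(query_vec, vec)))
    return results


gallery = {
    "Orange-tip": [("ot_1", [0.1, 0.9]), ("ot_2", [0.3, 0.7])],
    "Green-veined White": [("gv_1", [0.5, 0.5])],
}
ranked = rank_within_categories(
    [0.25, 0.75], ["Orange-tip", "Green-veined White", "Unknown"], gallery
)
for label, image_id, distance in ranked:
    print(label, image_id, round(distance, 3))
```

This only works while the query vector and the stored vectors come from the same feature extractor, which is why retraining the classifier made older vectors unusable for comparison.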
  22. It was great,

    It combined the best of classification & search/ranking
    Results were more useful
  23. But,

    Latency increased now that two neural net passes were required
    Users could only benefit from the specialised net when using the Leps app.
    When uploading a moth image on main Fieldguide we still had to use the slower, less accurate, less satisfying search method.
    We were now maintaining two parallel systems: one using Caffe, one using TensorFlow.
  24. Solutions

    • We trained another high-level Inception network to classify into broad Nature categories. This then directed the request towards the speciality Leps net or a narrowed similarity search.
    • We used this same Inception net to extract features instead of the generic CaffeNet.
      ◦ Now everything was under one TensorFlow framework.
    New challenges:
    • Latency increased now that up to three neural net passes were required!
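
The routing described on this slide might look roughly like the following. Every callable here is a hypothetical stand-in for a real model or service; the sketch only shows the dispatch logic, not the neural nets themselves.

```python
def route_request(image, top_level_classifier, specialists, narrowed_search):
    """Send an image to a speciality model if one exists for its broad category.

    All three callables are hypothetical stand-ins for real models or services.
    """
    category = top_level_classifier(image)        # pass 1: broad nature category
    specialist = specialists.get(category)
    if specialist is not None:
        return specialist(image)                  # pass 2: speciality net
    return narrowed_search(image, category)       # otherwise search within the category


# Stand-ins: a real system would run neural nets and a similarity search here.
def fake_top_level(image):
    return image["category"]


def fake_leps_net(image):
    return {"method": "leps_net", "category": "leps"}


def fake_search(image, category):
    return {"method": "similarity_search", "category": category}


specialists = {"leps": fake_leps_net}
moth_result = route_request({"category": "leps"}, fake_top_level, specialists, fake_search)
bird_result = route_request({"category": "birds"}, fake_top_level, specialists, fake_search)
print(moth_result, bird_result)
```

Each stage adds a model pass, which is where the extra latency comes from.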
  25. It was great,

    Everything under one framework again.
    Speed and usefulness of similarity search improved.
  26. But,

    Users loved the accuracy and relative speed of the Leps net. They wanted more!
    We got more data, lots more data!
    Could we scale this fine-tuned approach to a much wider set of categories?
  27. Solutions?

    • Train a single net for everything with >100,000 classes?
      ◦ Seems near impossible. No published research beyond 1000-10,000 classes.
    • Train many specialised Inception nets?
      ◦ Possible for a limited number of nets but very costly in terms of training time + resources and inference resource requirements.
  28. Solutions?

    Take a step back - Why is the solution to every problem another neural net!?
    Maybe everything looks like a nail because all we’ve got are hammers!
    What if we go traditional and split the feature extraction step from the classification step?
    A CNN like Inception is mostly a feature extractor with a small classifier layer added on top.
    Can we train an Inception feature extractor specialised for nature?
    Can we train thousands of ‘tiny’ classifiers to classify parts of the category tree using these feature vectors?
  29. Solutions

    Can we train an Inception feature extractor specialised for nature?
    > Yes we can. Trained as a classifier for ~5k Genus categories from across nature.
    Can we train a hierarchical ensemble of thousands of ‘tiny’ classifiers to classify parts of the category tree using these feature vectors?
    > Yes, from sklearn.linear_model import SGDClassifier
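
One way to picture a hierarchical ensemble is a walk down the category tree with one small classifier per node. In the sketch below, plain functions stand in for the trained per-node classifiers (the talk names scikit-learn's SGDClassifier for that role), and the confidence threshold and stopping rule are assumptions, not details from the talk.

```python
def classify_hierarchically(features, tree, root="Nature", min_confidence=0.6):
    """Walk down the category tree, consulting one small classifier per node.

    `tree` maps a node name to a callable returning {child_name: probability}.
    The callables are stand-ins for trained per-node classifiers.
    Stops at a leaf, or early when the best child is not confident enough,
    and returns the path reached so far.
    """
    path = [root]
    node = root
    while node in tree:
        scores = tree[node](features)
        child, confidence = max(scores.items(), key=lambda kv: kv[1])
        if confidence < min_confidence:
            break  # return the partial path rather than guess
        path.append(child)
        node = child
    return path


# Toy classifiers driven directly by two made-up feature values.
tree = {
    "Nature": lambda f: {"Birds": f[0], "Butterflies": 1 - f[0]},
    "Butterflies": lambda f: {"Pieridae": f[1], "Nymphalidae": 1 - f[1]},
}
confident = classify_hierarchically([0.1, 0.9], tree)
uncertain = classify_hierarchically([0.1, 0.55], tree)
print(confident, uncertain)
```

Returning the partial path when confidence drops is one possible reading of best-effort hierarchical results: a coarse but correct category can be more useful than a wrong species.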
  30. It was great,

    We didn’t need to retrain any neural networks.
    Training smaller classifiers was fast and ‘easy’.
    It was very fast.
    It used a fraction of the memory.
    It gave best effort hierarchical results.
  31. But,

    It didn’t work as well as we hoped.
    Accuracy on Leps was significantly lower than with the speciality Inception net.
    This approach is less robust to noise and mislabeled data.
  32. Solutions

    We are working on manual and automatic curation tools to help improve our training sets.
    We currently have a hybrid approach with a hierarchical ensemble of classifiers and a few speciality Inception nets for special areas of interest.
    As the data quality & our accuracy improve we hope to move away from the speciality approach.
  33. Lessons & Conclusions

    The computer vision & machine learning fields are doing amazing open work.
    The best libraries & frameworks are all in Python :)
    CV may start out as ‘plug+play’ but you quickly end up deep down the rabbit hole.
    A lot of ‘glue’ and pipeline code is required between the various components
    Scaling CV is hard, and expensive.
    Neural net orchestration becomes a thing :(
    Training still needs human supervision & manual intervention
  34. Thanks

    Andre @ Fieldguide for the constant challenges
    Alp Güler & Grant Van Horn for computer vision & machine learning advice & guidance
    Taylan & my colleagues on the Fieldguide team @ Hipo