Building a Fine Grained Image Classification System for Nature Images

Fergal Walsh @hipolabs

Fergal Walsh PyConIE 2018

Fieldguide

Outline
• What is image classification?
• Why do we need it?
• Our Computer Vision journey
• The technology
• The challenges
• Our solutions
• Open problems

Image Classification
Assigning a class label to an image

Bird
Butterfly

Bird > Swallow
Butterfly > Whites

Bird > Swallow > African Swallow
Butterfly > Whites > The Orangetip

Bird > Hirundinidae > Petrochelidon > spilodera
Butterfly > Pieridae > Anthocharis > cardamines

Traditional Image Classification

CaffeNet Demo

Deep Learning: Convolutional Neural Nets
[Figure: CaffeNet architecture, split into Feature Extraction and Classification stages]

Deep Learning

CaffeNet Demo

It was great,
It enabled great demos
It was pretty fast
It could help us start building a catalogue

But,
Will it learn? Can we teach it about Moths or Shells or ..?
>> No, we have no way of incrementally training this neural net classifier.
Can we retrain a new net regularly?
>> We could train every month or so, if we had millions of training samples.
>> We didn’t.

Solutions
What if we reframe this as a search problem?
Can we find the most similar images to this image?
If so, we can use the classes of those as our suggestions.

Deep Learning: Convolutional Neural Nets
[Figure: CaffeNet architecture, with Feature Extraction (High Level Features) and Classification stages]

Solutions: Image Search
• Compute and store feature vectors for all images
• Compute feature vector for query image
• Calculate distance from query vector to every other vector
• Pick the top 20
• Suggest the classes of these images to the user
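
The steps above can be sketched as a brute-force nearest-neighbour search. This is only an illustration with NumPy: the function name, the toy vectors and labels, and the choice of Euclidean distance at this stage are assumptions, not details from the talk.

```python
from collections import Counter

import numpy as np


def suggest_classes(query_vec, index_vecs, index_labels, k=20):
    """Return the class labels of the k nearest indexed images, most frequent first."""
    # Distance from the query vector to every stored vector (brute force).
    dists = np.linalg.norm(index_vecs - query_vec, axis=1)
    # Indices of the k closest images.
    nearest = np.argsort(dists)[:k]
    # Count how often each class appears among those neighbours.
    counts = Counter(index_labels[i] for i in nearest)
    return [label for label, _ in counts.most_common()]


# Toy index: four 3-dimensional "feature vectors" with made-up labels.
index_vecs = np.array(
    [[0.0, 0.0, 1.0], [0.1, 0.0, 0.9], [1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
)
index_labels = ["moth", "moth", "shell", "shell"]
suggestions = suggest_classes(np.array([0.05, 0.0, 1.0]), index_vecs, index_labels, k=3)
```

Every query touches every stored vector, so memory use and query time grow with the size of the index.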

It was great,
It was fast
Sometimes, it was like magic with spot-on suggestions
Images were immediately indexed and ‘learned’ from
Even a single example image could help classify new images

But,
Memory usage & query time grew linearly; we had to keep increasing RAM
Sometimes, it seemed like it had learned nothing at all
It had two related problems:
• It didn’t know where to look
◦ It didn’t differentiate between subject & background
• It didn’t know what was important
◦ It was too sensitive to differences in size or rotation

Solutions
We moved to a Postgres-based storage & query system
• Two-stage query process to scale better
◦ Hamming distance query across binary hashes of vectors (pg_similarity)
◦ Euclidean distance across first N Hamming results (cube)
We researched image segmentation to remove background interference
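
The two-stage query idea can be sketched in memory with NumPy. This is not the Postgres implementation from the talk: the hashing rule (one bit per dimension, thresholded at zero), the function names and the parameter values are assumptions made for illustration.

```python
import numpy as np


def binary_hash(vecs):
    """Assumed hashing rule: one bit per dimension, set when the value is positive."""
    return vecs > 0


def two_stage_search(query, vecs, hashes, n_candidates=100, k=20):
    """Cheap Hamming filter first, exact Euclidean ranking on the survivors."""
    # Stage 1: Hamming distance between the query hash and every stored hash.
    hamming = np.count_nonzero(hashes != binary_hash(query), axis=1)
    candidates = np.argsort(hamming, kind="stable")[:n_candidates]
    # Stage 2: Euclidean distance, computed only for the first N candidates.
    dists = np.linalg.norm(vecs[candidates] - query, axis=1)
    return candidates[np.argsort(dists)[:k]]


# Toy data: 500 random 32-dimensional vectors standing in for image features.
rng = np.random.default_rng(0)
vecs = rng.normal(size=(500, 32))
hashes = binary_hash(vecs)
top = two_stage_search(vecs[7], vecs, hashes, n_candidates=50, k=5)
```

The expensive distance is computed for only `n_candidates` rows, at the cost of possibly missing a true neighbour that the coarse hash ranks poorly.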

Solutions: Image segmentation
Separating subject from background
GrabCut (from cv2 import grabCut)
By segmenting the subject we could auto-crop and auto-rotate to reduce unimportant differences.

It was great,
It was stable
It was scalable
It was less confused by background patterns

But,
• Dataset growing, queries getting slower
• Auto-crop not perfect
• Increasing user expectations

Solutions: import tensorflow as tf
“The solution to all Machine Learning problems” - half of Hacker News
“TensorFlow™ is an open source software library for high performance numerical computation. … allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. … it comes with strong support for machine learning and deep learning ...” - tensorflow.org

Solutions: Fine-Tuning
We rewrote our system to use TensorFlow and Google’s Inception net.
We finally started training (fine-tuning) a neural net of our own.
We created a specialised net just for Butterflies & Moths (leps)
• ~ 3000 classes, 20-1000 images per class
• ~ 24 hours training time with a GPU
• Original images with original crops. No segmentation.
• ~ 70% accuracy
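
As a rough illustration of fine-tuning, the sketch below puts a new classification layer on top of a pretrained Inception V3 using `tf.keras`. The talk does not say which TensorFlow API was used, so the Keras approach, the function name, the frozen base and the optimizer are all assumptions.

```python
import tensorflow as tf


def build_finetune_model(num_classes, weights="imagenet"):
    """Inception V3 feature extractor with a new classification layer on top."""
    base = tf.keras.applications.InceptionV3(
        include_top=False,          # drop the original 1000-class ImageNet layer
        weights=weights,            # pretrained weights (downloaded on first use)
        input_shape=(299, 299, 3),
        pooling="avg",              # one feature vector per image
    )
    base.trainable = False          # assumption: train only the new layer at first
    inputs = tf.keras.Input(shape=(299, 299, 3))
    features = base(inputs, training=False)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(features)
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_finetune_model(3000)` would give a net sized for roughly the number of leps classes mentioned above. Input images are expected to be scaled with `tf.keras.applications.inception_v3.preprocess_input` before training.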

It was great,
It was fast
It was accurate
Its accuracy was measurable

But,
Lack of similarity ranking made it difficult to verify correctness of suggestions

Solution
We implemented similarity ranking within the categories of the classification results to display category suggestions with the most similar image
• Using feature vectors from the same Inception net
But,
This caused a further challenge when we needed to update the Inception net to add more classes: the feature vectors were no longer comparable
We ended up with two Inception nets: one for classification that was updated regularly and one for feature extraction that remained static
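
Ranking within the suggested categories might look like the sketch below: for each category the classifier proposes, pick the stored image whose feature vector is closest to the query. The function name, data layout and toy values are hypothetical, and Euclidean distance is assumed.

```python
import numpy as np


def best_example_per_category(query_vec, categories, vecs, labels, image_ids):
    """For each suggested category, find its stored image most similar to the query."""
    results = []
    for category in categories:
        rows = np.flatnonzero(labels == category)
        if rows.size == 0:
            continue  # no reference images stored for this category
        dists = np.linalg.norm(vecs[rows] - query_vec, axis=1)
        best = rows[np.argmin(dists)]
        results.append((category, image_ids[best], float(dists.min())))
    return results


# Toy reference set: feature vectors, their categories and image ids (all made up).
vecs = np.array([[0.0, 1.0], [0.2, 0.8], [1.0, 0.0], [0.7, 0.3]])
labels = np.array(["orangetip", "orangetip", "brimstone", "brimstone"])
image_ids = ["img_a", "img_b", "img_c", "img_d"]
ranked = best_example_per_category(
    np.array([0.25, 0.75]), ["orangetip", "brimstone"], vecs, labels, image_ids
)
```

The comparison is only meaningful while the query vector and the stored vectors come from the same version of the feature extractor.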

It was great,
It combined the best of classification & search/ranking
Results were more useful

But,
Latency increased now that two neural net passes were required
Users could only benefit from the specialised net when using the Leps app. When uploading a moth image on the main Fieldguide app we still had to use the slower, less accurate, less satisfying search method.
We were now maintaining two parallel systems: one using Caffe, one using TensorFlow.

Solutions
• We trained another high-level Inception network to classify into broad Nature categories. This then directed the request towards the speciality Leps net or a narrowed similarity search.
• We used this same Inception net to extract features instead of the generic CaffeNet.
◦ Now everything was under one TensorFlow framework.
New challenges:
• Latency increased now that up to three neural net passes were required!
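
The routing step can be sketched in plain Python. The callables here are stand-ins for the real networks and search backend, and every name and return value is hypothetical.

```python
def route(image, broad_classifier, specialists, narrowed_search):
    """Send an image to a speciality net if one exists, else to a narrowed search."""
    # First pass: broad nature category (e.g. "leps", "shells").
    category = broad_classifier(image)
    specialist = specialists.get(category)
    if specialist is not None:
        # Second pass: a fine-tuned net for this category.
        return category, specialist(image)
    # Otherwise: similarity search restricted to the broad category.
    return category, narrowed_search(image, category)


# Stub components standing in for the real models.
def toy_broad_classifier(image):
    return "leps" if image["wings"] else "shells"


def toy_leps_net(image):
    return ["Anthocharis cardamines"]


def toy_search(image, category):
    return [f"nearest {category} images"]


specialists = {"leps": toy_leps_net}
moth_result = route({"wings": True}, toy_broad_classifier, specialists, toy_search)
shell_result = route({"wings": False}, toy_broad_classifier, specialists, toy_search)
```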

It was great,
Everything under one framework again.
Speed and usefulness of similarity search improved.

But,
Users loved the accuracy and relative speed of the Leps net.
They wanted more!
We got more data, lots more data!
Could we scale this fine-tuned approach to a much wider set of categories?

Solutions?
• Train a single net for everything with >100,000 classes?
◦ Seems near impossible. No published research beyond 1000-10,000 classes.
• Train many specialised Inception nets?
◦ Possible for a limited number of nets but very costly in terms of training time + resources and inference resource requirements.

Solutions?
Take a step back - Why is the solution to every problem another neural net!?
Maybe everything looks like a nail because all we’ve got are hammers!
What if we go traditional and split the feature extraction step from the classification step?
A CNN like Inception is mostly a feature extractor with a small classifier layer added on top.
Can we train an Inception feature extractor specialised for nature?
Can we train thousands of ‘tiny’ classifiers to classify parts of the category tree using these feature vectors?

[Figure: Inception V3 architecture, split into Feature Extraction and Classification stages]

Solutions
Can we train an Inception feature extractor specialised for nature?
> Yes we can. Trained as a classifier for ~5k Genus categories from across nature.
Can we train a hierarchical ensemble of thousands of ‘tiny’ classifiers to classify parts of the category tree using these feature vectors?
> Yes, from sklearn.linear_model import SGDClassifier
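
A hierarchical ensemble of small classifiers can be sketched with one `SGDClassifier` per internal node of the category tree. The toy tree, the synthetic 8-dimensional "feature vectors", the `modified_huber` loss and the confidence threshold are assumptions chosen to keep the example small, not settings from the talk.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Hypothetical toy category tree: parent -> children.
TREE = {
    "root": ["bird", "butterfly"],
    "bird": ["swallow", "finch"],
    "butterfly": ["whites", "blues"],
}


def leaves_under(node):
    """All leaf categories below a node (a leaf is its own only leaf)."""
    if node not in TREE:
        return [node]
    return [leaf for child in TREE[node] for leaf in leaves_under(child)]


def train_ensemble(features, leaf_labels):
    """Train one small linear classifier per internal node of the tree."""
    leaf_labels = np.asarray(leaf_labels)
    ensemble = {}
    for node, children in TREE.items():
        # Map each leaf below this node to the child branch it belongs to.
        branch_of = {leaf: c for c in children for leaf in leaves_under(c)}
        rows = np.flatnonzero(np.isin(leaf_labels, list(branch_of)))
        targets = [branch_of[leaf] for leaf in leaf_labels[rows]]
        # modified_huber is used here only because it supports predict_proba.
        clf = SGDClassifier(loss="modified_huber", random_state=0)
        ensemble[node] = clf.fit(features[rows], targets)
    return ensemble


def classify(ensemble, vec, min_confidence=0.5):
    """Walk down the tree, stopping early when a classifier is unsure."""
    path, node = [], "root"
    while node in ensemble:
        clf = ensemble[node]
        probs = clf.predict_proba(vec.reshape(1, -1))[0]
        best = int(np.argmax(probs))
        if probs[best] < min_confidence:
            break  # best effort: return the path found so far
        node = str(clf.classes_[best])
        path.append(node)
    return path


# Synthetic features: each leaf category is a well-separated cluster.
rng = np.random.default_rng(0)
CENTRES = {"swallow": (2, 2), "finch": (2, -2), "whites": (-2, 2), "blues": (-2, -2)}
blocks, labels = [], []
for leaf, (a, b) in CENTRES.items():
    block = rng.normal(scale=0.3, size=(50, 8))
    block[:, 0] += a
    block[:, 1] += b
    blocks.append(block)
    labels += [leaf] * 50
ensemble = train_ensemble(np.vstack(blocks), labels)

query = np.zeros(8)
query[:2] = CENTRES["whites"]
path = classify(ensemble, query)
```

Stopping when a node's classifier is unsure is one way to return a partial path such as a family without a species, in the spirit of best-effort hierarchical results.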

It was great,
We didn’t need to retrain any neural networks.
Training smaller classifiers was fast and ‘easy’.
It was very fast.
It used a fraction of the memory.
It gave best effort hierarchical results.

But,
It didn’t work as well as we hoped.
Accuracy on Leps was significantly lower than with the speciality Inception net.
This approach is less robust to noise and mislabeled data.

Solutions
We are working on manual and automatic curation tools to help improve our training sets.
We currently have a hybrid approach with a hierarchical ensemble of classifiers and a few speciality Inception nets for special areas of interest.
As the data quality improves & our accuracy improves we hope to move away from the speciality approach.

Lessons & Conclusions
The computer vision & machine learning fields are doing amazing open work.
The best libraries & frameworks are all in Python :)
CV may start out as ‘plug+play’ but you quickly end up deep down the rabbit hole.
A lot of ‘glue’ and pipeline code is required between the various components
Scaling CV is hard, and expensive.
Neural net orchestration becomes a thing :(
Training still needs human supervision & manual intervention

Thanks
Andre @ Fieldguide for the constant challenges
Alp Güler & Grant Van Horn for computer vision & machine learning advice & guidance
Taylan & my colleagues on the Fieldguide team @ Hipo

Questions?
Me: @fergalwalsh
Hipo: @hipolabs
Fieldguide: fieldguide.ai
