Introduction to Image Classification in Python: from API calls to Neural Networks

Introduction to Image Classification in Python: From API calls to Neural Networks
@josmasflores

José Domínguez @josmasflores
This publication has emanated from research supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289.

Agenda
– Problem Statement
– Defining a Baseline
– Using APIs
– The Local Setup
– Datasets for training and testing
– Bag of Features
– Neural Networks for Image classification

Problem Statement
Why do I need image recognition?

My research
– Focus on management of type II diabetes
– Keep track of Activity
– Plenty of off-the-shelf solutions
– Keep track of Food Intake Episodes
– Try to recognise the food in front of you (it’s more complicated than that)

Not always easy…

Really, not always easy…

Defining a Baseline
Image recognition from a services point of view

Baseline
– I am a developer, I can make API calls

Baseline II
– I am a developer, I can make API calls
– I am not a Data Scientist

(Research) Question
As an engineer, should I use a commercial API, or should I instead spend some time and effort to build the same service locally (if I can)?

Using APIs
Simple and effective

Some Companies with free tier API services

Code for Clarifai

Code for Indico.io

Code for Imagga
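The code on these three slides was shown as images and is not part of the transcript. As a rough idea of what such a call involves, here is a minimal sketch using only the standard library. The endpoint, the API key, the `image_url` parameter, and the JSON shape are all hypothetical placeholders, not the real interface of Clarifai, Indico.io, or Imagga; each vendor documents its own URL, authentication, and response format.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint and key: placeholders, not any vendor's real API.
API_URL = "https://api.example.com/v1/tags"
API_KEY = "YOUR_API_KEY"


def build_request(image_url):
    """Build (but do not send) a GET request asking the service to tag an image."""
    query = urllib.parse.urlencode({"image_url": image_url})
    request = urllib.request.Request(API_URL + "?" + query)
    request.add_header("Authorization", "Bearer " + API_KEY)
    return request


def parse_tags(response_text):
    """Turn an assumed JSON payload into (label, confidence) pairs, best first."""
    payload = json.loads(response_text)
    tags = [(t["label"], float(t["confidence"])) for t in payload["tags"]]
    return sorted(tags, key=lambda pair: pair[1], reverse=True)


def tag_image(image_url):
    """Send the request and parse the reply (needs network and a real service)."""
    with urllib.request.urlopen(build_request(image_url), timeout=10) as response:
        return parse_tags(response.read().decode("utf-8"))


# Offline demonstration with a made-up response body.
sample = '{"tags": [{"label": "salad", "confidence": 0.41}, {"label": "food", "confidence": 0.93}]}'
print(parse_tags(sample))
```

Whatever the vendor, the pattern is the same: send the image (or its URL) with your credentials, then sort or filter the returned labels by confidence.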

My Testing Dataset
Collected during the summer of 2016

104 sets of 3 pictures (Top, Chest, and Eye positions)

Imagga

Imagga Clarifai

Indico.io Imagga Clarifai

Indico.io Imagga Clarifai
APIs accessed in July 2016. Since then, Clarifai has made a food-specific model available.

The Local Setup
Training, Testing, and Inference

How does it all work?
Classification exercise: Supervised Learning.

Labelled accelerometer data sample

A bunch of TEST data
Some validation of the model, depending on the expected and real labels
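The supervised learning loop on these slides (train on labelled data, then compare predicted and expected labels on held-out test data) can be sketched in plain Python. This is only an illustration: the nearest-centroid classifier, the two made-up features, and the "rest"/"walk" labels are assumptions chosen to keep the example short, not the method used in the talk.

```python
import random


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle labelled samples and hold out a fraction for testing."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """'Training': average the feature vectors of each label."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """'Inference': pick the label whose centroid is closest."""
    def sq_dist(centre):
        return sum((a - b) ** 2 for a, b in zip(features, centre))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, test):
    """Fraction of test samples whose predicted label matches the real one."""
    hits = sum(1 for features, label in test if predict(centroids, features) == label)
    return hits / len(test)


# Made-up two-feature data: two well-separated classes.
rng = random.Random(42)
data = [([rng.gauss(0, 0.3), rng.gauss(0, 0.3)], "rest") for _ in range(20)]
data += [([rng.gauss(3, 0.3), rng.gauss(3, 0.3)], "walk") for _ in range(20)]

train, test = train_test_split(data)
model = fit_centroids(train)
print("test accuracy:", accuracy(model, test))
```

The key point is that the test samples never take part in training, so the accuracy is an estimate of how the model behaves on data it has not seen.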

The Food-101 Dataset
@inproceedings{bossard14,
  title = {Food-101 -- Mining Discriminative Components with Random Forests},
  author = {Bossard, Lukas and Guillaumin, Matthieu and Van Gool, Luc},
  booktitle = {European Conference on Computer Vision},
  year = {2014}
}

Bag of Features Bag of Words

In Python
https://github.com/bikz05/bag-of-words
Uses SIFT + K-means + StandardScaler + LinearSVC

In Python
https://github.com/josmas/bag-of-words
Uses SIFT/ORB + K-means + StandardScaler + LogisticRegression

A proper explanation can be found at: http://www.robots.ox.ac.uk/~az/icvss08_az_bow.pdf
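The central step of the bag-of-features pipeline is turning the local descriptors of one image into a histogram of "visual words". The sketch below shows only that step, in plain Python with toy numbers. It assumes the codebook has already been produced (for example by K-means over descriptors from the training images) and uses 2-D vectors where SIFT would give 128-D ones. It is not the code from either repository.

```python
def nearest_word(descriptor, codebook):
    """Index of the codebook centre closest to one descriptor."""
    def sq_dist(centre):
        return sum((a - b) ** 2 for a, b in zip(descriptor, centre))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def bow_histogram(descriptors, codebook):
    """Normalised count of how often each visual word occurs in one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts) or 1  # avoid dividing by zero for an image with no descriptors
    return [c / total for c in counts]


# Toy stand-ins: a 3-word codebook and four descriptors from one image.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
image_descriptors = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.7, 5.2]]
print(bow_histogram(image_descriptors, codebook))
```

Each image becomes one fixed-length vector regardless of how many keypoints it has, which is what allows a scaler and a linear classifier to be applied afterwards.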

Results
– Pretty bad, to be honest
– Food pictures are not well suited to this type of classification
– A classifier on 96 classes:
  – Took four and a half days to run on a 6th Gen i7 with 32GB of RAM
  – Resulted in an accuracy of 8%, which is better than random choice, but not by much
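To put the 8% in context, the chance level can be worked out directly. This small calculation assumes the 96 classes are roughly balanced, which the slides do not state.

```python
n_classes = 96
reported_accuracy = 0.08  # figure from the slide

# A uniform random guess over equally likely classes is right 1 time in n_classes.
chance = 1 / n_classes
lift = reported_accuracy / chance
print(f"chance: {chance:.2%}, lift over chance: {lift:.1f}x")
```

Under that assumption, random guessing would be right about 1% of the time, so the classifier is several times better than chance while still being wrong on more than nine pictures out of ten.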

https://github.com/bikz05/bag-of-words

Transfer Learning with neural networks

A Beginner’s Guide to Deep Learning by Irene Chen

TensorFlow for Poets
https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/

Training your own ConvNet
You certainly can, if…

Training your own ConvNet if you have…
– Millions of images in hundreds of categories
– Access to multiple GPUs
– A few weeks (2-3 for ImageNet) to spare

Sharing is caring…
It is common for people to release their final ConvNet checkpoints that can be fine-tuned by others. Check out the Caffe Model Zoo.

Transfer Learning

How good is Transfer Learning?
http://cs231n.github.io/transfer-learning/

The Inception Model
Inception is a huge image classification model with millions of parameters that can differentiate a large number of kinds of images.

How to use Inception?
# python tensorflow/examples/image_retraining/retrain.py \
    --bottleneck_dir=/tf_files/bottlenecks \
    --how_many_training_steps 500 \
    --model_dir=/tf_files/inception \
    --output_graph=/tf_files/retrained_graph.pb \
    --output_labels=/tf_files/retrained_labels.txt \
    --image_dir /tf_files/flower_photos
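The retraining command keeps the pre-trained network fixed, caches its penultimate-layer outputs (the "bottlenecks") for each image, and trains only a new final layer on top of them. The idea can be sketched in plain Python as a small softmax classifier trained by gradient descent on fixed feature vectors. Everything here is a toy assumption: the 4-D features are made up (Inception's bottlenecks are much longer), and the function names are illustrative, not taken from retrain.py.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores(weights, biases, x):
    """Raw class scores of the new final layer for one bottleneck vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def train_final_layer(bottlenecks, labels, n_classes, steps=200, lr=0.5):
    """Fit one softmax layer by batch gradient descent on fixed features."""
    dim, n = len(bottlenecks[0]), len(bottlenecks)
    weights = [[0.0] * dim for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(steps):
        grad_w = [[0.0] * dim for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, y in zip(bottlenecks, labels):
            probs = softmax(scores(weights, biases, x))
            for c in range(n_classes):
                # Gradient of the cross entropy with respect to score c.
                err = probs[c] - (1.0 if c == y else 0.0)
                grad_b[c] += err
                for i in range(dim):
                    grad_w[c][i] += err * x[i]
        for c in range(n_classes):
            biases[c] -= lr * grad_b[c] / n
            for i in range(dim):
                weights[c][i] -= lr * grad_w[c][i] / n
    return weights, biases


def classify(weights, biases, x):
    """Index of the class with the highest score."""
    s = scores(weights, biases, x)
    return max(range(len(s)), key=s.__getitem__)


# Made-up 4-D "bottlenecks" for 3 classes; real ones come from the frozen network.
rng = random.Random(0)
features, labels = [], []
for label in range(3):
    for _ in range(15):
        features.append([(1.0 if i == label else 0.0) + rng.gauss(0, 0.1) for i in range(4)])
        labels.append(label)

weights, biases = train_final_layer(features, labels, n_classes=3)
correct = sum(classify(weights, biases, x) == y for x, y in zip(features, labels))
print("training accuracy:", correct / len(labels))
```

Because only this small layer is trained, retraining takes minutes on a laptop rather than weeks on multiple GPUs.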

Output of the re-training phase:

Querying the new classifier

Results

Wasn’t it Food you were talking about?

Training: 60 classes from Food-101
2016-07-02 17:35:31.167942: Step 3999: Train accuracy = 64.0%
2016-07-02 17:35:31.168007: Step 3999: Cross entropy = 1.487842
2016-07-02 17:35:31.205460: Step 3999: Validation accuracy = 52.0%
Final test accuracy = 63.6%

Training: 30 classes from Food-101
2016-07-02 17:52:57.361022: Step 3999: Train accuracy = 74.0%
2016-07-02 17:52:57.361091: Step 3999: Cross entropy = 0.890878
2016-07-02 17:52:57.397736: Step 3999: Validation accuracy = 78.0%
Final test accuracy = 74.0%
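The cross entropy values in these logs are easier to read once converted back into a probability. For one example, the loss is the negative natural log of the probability the model gave to the correct class, so exp(-loss) recovers a "typical" probability for the right answer. The sketch below assumes the logged figure is a mean over a training batch, in which case exp(-loss) is the geometric mean of those probabilities.

```python
import math


def cross_entropy(probs, true_index):
    """Loss for one example: negative log of the probability given to the true class."""
    return -math.log(probs[true_index])


# Loss values copied from the two training logs.
for name, loss in [("60 classes", 1.487842), ("30 classes", 0.890878)]:
    typical_p = math.exp(-loss)
    print(f"{name}: cross entropy {loss:.3f} -> typical true-class probability {typical_p:.2f}")
```

Read this way, halving the number of classes roughly doubles the confidence the model places on the correct label, which matches the jump in accuracy between the two runs.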

CNN Food classifier

Indico.io Imagga Clarifai General Unfair Comparison on APIs

CNN Food classifier over 0.5

Comparing to Indico
– Indico provides 174 results over 50%, covering 79 classes.
– The local setup provides 65 results over 50%, covering 20 classes.
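Figures like "results over 50%, covering N classes" come from filtering predictions by confidence and counting the distinct labels that remain. A minimal sketch follows, assuming each prediction is a (label, confidence) pair with confidence between 0 and 1. The sample values are made up, and the talk's own counting script is not shown in the slides.

```python
def summarise(predictions, threshold=0.5):
    """Count confident results and the distinct classes they cover."""
    confident = [(label, conf) for label, conf in predictions if conf > threshold]
    return len(confident), len({label for label, _ in confident})


# Made-up predictions for five pictures.
top_guesses = [("pizza", 0.91), ("pizza", 0.62), ("sushi", 0.55), ("salad", 0.48), ("soup", 0.30)]
n_results, n_classes = summarise(top_guesses)
print(n_results, "results over 50%, covering", n_classes, "classes")
```

Neither number says whether the confident predictions were correct, so a comparison of this kind measures how often each system commits to an answer, not how often it is right.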

Comparing APIs and Local Solutions

Use APIs if…
– You cannot deal with the burden of creating and fine-tuning your own setup.
– Most APIs are free to a certain limit.

Use a local classifier…
– Based on a pre-trained model if you have some time to spare, and you want to get into the whole machine learning craze.
– You’ll need to do some image processing if you want better accuracy.

Can I do both?
– Sure, it will depend on your use case.
– Indico allows you to fine-tune their models to your specific needs. I guess it is a type of transfer learning process, but I don’t know how it works.

Picture credits
– Dog or Muffin: http://imgur.com/gallery/TTpIGvo
– Supervised Learning: http://image.slidesharecdn.com/nltkscikit-learnpyconfr2010ogrisel-100828080703-phpapp02/95/statistical-learning-and-text-classification-with-nltk-and-scikitlearn-4-728.jpg
– Bag of Words: http://image.slidesharecdn.com/9bow-140722092155-phpapp02/95/bagofwords-models-2-638.jpg
– Word histogram: http://cs.brown.edu/courses/cs143/2011/results/proj3/senewman/wordhistogram.jpg
– Visual Words overview: http://www.mathworks.com/help/vision/ug/bagoffeatures_visualwordsoverview.png
– Bag of features: https://raw.githubusercontent.com/bikz05/bag-of-words/master/docs/images/bog.png
– Impact in computer vision: http://www.slideshare.net/roelofp/deep-learning-a-birdseye-view
– Magic: http://www.saasoft.com/blog/wp-content/uploads/2011/07/Magic.jpg
– Barney: http://cartoonbros.com/wp-content/uploads/2016/08/barney-11.jpg
– Layers: http://www.kdnuggets.com/wp-content/uploads/dnn-layers.jpg
– Neural Net: https://e-lab.github.io/data/img/2013-09-neuralnet.png

Questions?
