Introduction to Image Classification in Python: from API calls to Neural Networks

by Jos

Slide 1

Slide 1 text

Introduction to Image Classification in Python From API calls to Neural Networks @josmasflores

Slide 2

Slide 2 text

José Domínguez @josmasflores
This publication has emanated from research supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289.

Slide 3

Slide 3 text

Agenda
– Problem Statement
– Defining a Baseline
– Using APIs
– The Local Setup
– Datasets for training and testing
– Bag of Features
– Neural Networks for Image classification

Slide 4

Slide 4 text

Problem Statement Why do I need Image recognition?

Slide 5

Slide 5 text

My research
– Focus on the management of type 2 diabetes
– Keep track of Activity
– Plenty of off-the-shelf solutions
– Keep track of Food Intake Episodes
– Try to recognise the food in front of you (it’s more complicated than that)

Slide 6

Slide 6 text

Not always easy…

Slide 7

Slide 7 text

Really, not always easy…

Slide 8

Slide 8 text

Defining a Baseline Image recognition from a services point of view

Slide 9

Slide 9 text

Baseline
– I am a developer, I can make API calls

Slide 10

Slide 10 text

Baseline II
– I am a developer, I can make API calls
– I am not a Data Scientist

Slide 11

Slide 11 text

(Research) Question As an engineer, should I use a commercial API, or should I instead spend some time and effort to build the same service locally (if I can)?

Slide 12

Slide 12 text

Using APIs Simple and effective
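
From the developer's side, an image tagging API usually comes down to one authenticated HTTP request and a bit of JSON parsing. The sketch below shows that general shape only. The endpoint URL, the `Authorization` scheme and the response layout (`tags`, `label`, `confidence`) are all assumptions made up for illustration; they are not the real interface of Clarifai, Indico.io or Imagga, each of which documents its own.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: NOT the real URL of Clarifai, Indico.io or Imagga.
API_URL = "https://api.example.com/v1/tags"


def build_tag_request(image_url, api_key):
    """Build (but do not send) a GET request asking for tags for one image."""
    query = urllib.parse.urlencode({"image_url": image_url})
    request = urllib.request.Request(f"{API_URL}?{query}")
    # Assumed auth scheme; real services document their own.
    request.add_header("Authorization", f"Bearer {api_key}")
    return request


def parse_tags(response_text):
    """Turn an assumed JSON payload into (label, confidence) pairs, best first."""
    payload = json.loads(response_text)
    pairs = [(t["label"], float(t["confidence"])) for t in payload["tags"]]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def tag_image(image_url, api_key):
    """Send the request and return the parsed tags (needs network access)."""
    with urllib.request.urlopen(build_tag_request(image_url, api_key)) as resp:
        return parse_tags(resp.read().decode("utf-8"))
```

Keeping request building and response parsing in separate functions makes it easier to swap one provider for another, since only those two pieces change.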

Slide 13

Slide 13 text

Some Companies with free tier API services

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

Code for Clarifai

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Code for Indico.io

Slide 19

Slide 19 text

Code for Imagga

Slide 20

Slide 20 text

My Testing Dataset Collected during the summer of 2016

Slide 21

Slide 21 text

104 sets of 3 pictures (Top, Chest, and Eye positions)

Slide 22

Slide 22 text

Imagga

Slide 23

Slide 23 text

Imagga Clarifai

Slide 24

Slide 24 text

Indico.io Imagga Clarifai

Slide 25

Slide 25 text

Indico.io Imagga Clarifai
APIs accessed in July 2016. Since then, Clarifai has made a food-specific model available.

Slide 26

Slide 26 text

The Local Setup Training and Testing, and Inference

Slide 27

Slide 27 text

How does it all work? Classification exercise: Supervised Learning.
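
A minimal sketch of the supervised learning loop, assuming a deliberately simple nearest-centroid classifier and made-up two-number feature vectors (neither comes from the talk): training learns from labelled examples, and inference assigns a label to an unseen sample.

```python
from collections import defaultdict
import math


def fit_centroids(samples, labels):
    """Training: average the feature vectors of each labelled class."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Inference: return the label whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Toy, made-up feature vectors (e.g. two numbers describing each image).
train_x = [[1.0, 1.2], [0.8, 1.0], [4.0, 3.9], [4.2, 4.1]]
train_y = ["salad", "salad", "pizza", "pizza"]

model = fit_centroids(train_x, train_y)
print(predict(model, [0.9, 1.1]))  # salad
print(predict(model, [3.8, 4.0]))  # pizza
```

Real image classifiers use far richer features and models, but the split into a training step and an inference step stays the same.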

Slide 28

Slide 28 text

Labelled accelerometer data sample

Slide 29

Slide 29 text

1) 2)

Slide 30

Slide 30 text

A bunch of TEST data Some Validation of the model, depending on the expected and real labels 1) 1.5)
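
Validation compares the labels the model predicts for the test data with the labels we expected. A small sketch of that comparison, with made-up labels and hypothetical helper names:

```python
from collections import Counter


def accuracy(expected, predicted):
    """Fraction of test samples whose predicted label matches the expected one."""
    if not expected or len(expected) != len(predicted):
        raise ValueError("need two non-empty label lists of the same length")
    hits = sum(e == p for e, p in zip(expected, predicted))
    return hits / len(expected)


def confusions(expected, predicted):
    """Count each (expected, predicted) pair that disagrees."""
    return Counter((e, p) for e, p in zip(expected, predicted) if e != p)


# Made-up labels for four test images.
expected = ["pizza", "salad", "pizza", "soup"]
predicted = ["pizza", "pizza", "pizza", "soup"]

print(accuracy(expected, predicted))    # 0.75
print(confusions(expected, predicted))  # one salad mistaken for pizza
```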

Slide 31

Slide 31 text

1) 2)

Slide 32

Slide 32 text

The Food 101 Dataset
@inproceedings{bossard14,
  title = {Food-101 -- Mining Discriminative Components with Random Forests},
  author = {Bossard, Lukas and Guillaumin, Matthieu and Van Gool, Luc},
  booktitle = {European Conference on Computer Vision},
  year = {2014}
}

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

Bag of Features Bag of Words

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

In Python https://github.com/bikz05/bag-of-words Uses SIFT + K-means + StandardScaler + LinearSVC

Slide 39

Slide 39 text

In Python https://github.com/josmas/bag-of-words Uses SIFT/ORB + K-means + StandardScaler + LogisticRegression
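
The heart of the pipeline is turning each image's local descriptors into a histogram of "visual words". The sketch below shows only that step, under simplifying assumptions: the descriptors are toy 2-D vectors (real SIFT descriptors have 128 dimensions), the codebook is hard-coded rather than learned with K-means, and the function names are hypothetical rather than taken from either repository.

```python
import math


def nearest_word(descriptor, codebook):
    """Index of the visual word (cluster centre) closest to a descriptor."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], descriptor))


def bag_of_features(descriptors, codebook):
    """Histogram counting how often each visual word occurs in one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    return counts


# Toy stand-ins: in practice the codebook comes from running K-means over
# the descriptors of all training images.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
image_descriptors = [[0.1, 0.0], [0.9, 1.1], [1.0, 0.9], [0.1, 0.8]]

print(bag_of_features(image_descriptors, codebook))  # [1, 2, 1]
```

The resulting histograms, one per image, are what would then be scaled and passed to the classifier (LinearSVC or LogisticRegression in the repositories above).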

Slide 40

Slide 40 text

A proper explanation Can be found at: http://www.robots.ox.ac.uk/~az/icvss08_az_bow.pdf

Slide 41

Slide 41 text

Results
– Pretty bad, to be honest
– Food pictures are not well suited to this type of classification
– A classifier on 96 classes:
– Took four and a half days to run on a 6th Gen i7 with 32GB of RAM
– Resulted in an accuracy of 8%, which is better than a random choice, but not by much.
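
To put the 8% in context, here is the arithmetic for random guessing, assuming the 96 classes are roughly balanced (an assumption; the class sizes are not given here):

```python
def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing on balanced classes."""
    return 1 / num_classes


num_classes = 96
reported = 0.08  # accuracy reported for the bag-of-features classifier

baseline = chance_accuracy(num_classes)
print(f"chance: {baseline:.2%}")                   # chance: 1.04%
print(f"improvement: {reported / baseline:.1f}x")  # improvement: 7.7x
```

So the classifier beats chance by a factor of about 7.7, yet still gets more than nine out of ten images wrong.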

Slide 42

Slide 42 text

https://github.com/bikz05/bag-of-words

Slide 43

Slide 43 text

Transfer Learning with neural networks

Slide 44

Slide 44 text

A Beginner’s Guide to Deep Learning by Irene Chen

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

TensorFlow for Poets https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/

Slide 47

Slide 47 text

TensorFlow for Poets https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/

Slide 48

Slide 48 text

Training your own ConvNet You certainly can, if…

Slide 49

Slide 49 text

Training your own ConvNet if you have…
– Millions of images in hundreds of categories
– Access to multiple GPUs
– A few weeks (2-3 for ImageNet) to spare

Slide 50

Slide 50 text

Sharing is caring… It is common for people to release their final ConvNet checkpoints that can be fine-tuned by others. Check out the Caffe Model Zoo.

Slide 51

Slide 51 text

Transfer Learning

Slide 52

Slide 52 text

No content

Slide 53

Slide 53 text

No content

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

How good is Transfer Learning? http://cs231n.github.io/transfer-learning/

Slide 56

Slide 56 text

The Inception Model Inception is a huge image classification model with millions of parameters that can differentiate a large number of kinds of images

Slide 57

Slide 57 text

How to use Inception?
# python tensorflow/examples/image_retraining/retrain.py \
    --bottleneck_dir=/tf_files/bottlenecks \
    --how_many_training_steps 500 \
    --model_dir=/tf_files/inception \
    --output_graph=/tf_files/retrained_graph.pb \
    --output_labels=/tf_files/retrained_labels.txt \
    --image_dir /tf_files/flower_photos
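
Roughly speaking, the retraining script keeps the pre-trained network as a fixed feature extractor, caches its "bottleneck" outputs for every image, and trains only a new final layer on top of them. The sketch below illustrates that idea in plain Python with a tiny softmax layer. The bottleneck vectors are made-up 2-D numbers and the function names are hypothetical; this is not the TensorFlow code itself.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, biases, features):
    """Class probabilities from the new final layer."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


def train_final_layer(bottlenecks, labels, num_classes, steps=500, lr=0.5):
    """Fit only a new softmax layer; the feature extractor is never updated."""
    dim = len(bottlenecks[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(steps):
        for features, label in zip(bottlenecks, labels):
            probs = predict_proba(weights, biases, features)
            for c in range(num_classes):
                # Gradient of the cross entropy w.r.t. the score of class c.
                error = probs[c] - (1.0 if c == label else 0.0)
                biases[c] -= lr * error
                for j in range(dim):
                    weights[c][j] -= lr * error * features[j]
    return weights, biases


# Made-up "bottleneck" vectors standing in for cached pre-trained features.
bottlenecks = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
labels = [0, 0, 1, 1]  # two hypothetical classes

weights, biases = train_final_layer(bottlenecks, labels, num_classes=2)
probs = predict_proba(weights, biases, [0.95, 0.15])
print(probs.index(max(probs)))  # 0
```

Because only the small final layer is trained, this step is cheap compared with training a whole ConvNet from scratch.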

Slide 58

Slide 58 text

Output of the re-training phase:

Slide 59

Slide 59 text

Querying the new classifier
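
Querying the retrained model yields one score per label; the usual last step is to pair scores with label names and show the best few. A sketch of that post-processing only, with hypothetical scores and label names as they might appear in retrained_labels.txt for the flowers example:

```python
def top_predictions(labels, scores, k=5):
    """Pair each label with its score and return the k best, highest first."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical values: made-up scores for a single photo.
labels = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]
scores = [0.02, 0.01, 0.90, 0.03, 0.04]

for label, score in top_predictions(labels, scores, k=3):
    print(f"{label} (score = {score:.5f})")
```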

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

Results

Slide 62

Slide 62 text

Wasn’t it Food you were talking about?

Slide 63

Slide 63 text

Training: 60 classes from Food101
2016-07-02 17:35:31.167942: Step 3999: Train accuracy = 64.0%
2016-07-02 17:35:31.168007: Step 3999: Cross entropy = 1.487842
2016-07-02 17:35:31.205460: Step 3999: Validation accuracy = 52.0%
Final test accuracy = 63.6%

Slide 64

Slide 64 text

Training: 30 classes from Food101
2016-07-02 17:52:57.361022: Step 3999: Train accuracy = 74.0%
2016-07-02 17:52:57.361091: Step 3999: Cross entropy = 0.890878
2016-07-02 17:52:57.397736: Step 3999: Validation accuracy = 78.0%
Final test accuracy = 74.0%

Slide 65

Slide 65 text

CNN Food classifier

Slide 66

Slide 66 text

Indico.io Imagga Clarifai General Unfair Comparison on APIs

Slide 67

Slide 67 text

CNN Food classifier over 0.5

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

No content

Slide 70

Slide 70 text

Comparing to Indico
– Indico provides 174 results over 50%, covering 79 classes.
– The local setup provides 65 results over 50%, covering 20 classes.
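
Counts like these can be produced by keeping only the predictions that score over 50% and counting how many distinct classes they cover. A minimal sketch with made-up predictions and a hypothetical helper name:

```python
def summarise_confident(predictions, threshold=0.5):
    """Count predictions scoring above the threshold and the classes they cover."""
    confident = [(label, score) for label, score in predictions if score > threshold]
    classes = {label for label, _ in confident}
    return len(confident), len(classes)


# Made-up (label, score) pairs, one per returned tag.
predictions = [
    ("pizza", 0.91),
    ("pizza", 0.62),
    ("salad", 0.55),
    ("soup", 0.31),
    ("salad", 0.12),
]

results, classes = summarise_confident(predictions)
print(results, classes)  # 3 2
```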

Slide 71

Slide 71 text

No content

Slide 72

Slide 72 text

Comparing APIs and Local Solutions

Slide 73

Slide 73 text

Use APIs if…
– You cannot deal with the burden of creating and fine-tuning your own setup.
– Most APIs are free up to a certain limit.

Slide 74

Slide 74 text

Use a local classifier…
– Based on a pre-trained model, if you have some time to spare and you want to get into the whole machine learning craze.
– You’ll need to do some image processing if you want better accuracy.

Slide 75

Slide 75 text

Can I do both?
– Sure, it will depend on your use case.
– Indico lets you fine-tune their models to your specific needs. I guess it is a type of transfer learning process, but I don’t know how it works.

Slide 76

Slide 76 text

Picture credits
– Dog or Muffin: http://imgur.com/gallery/TTpIGvo
– Supervised Learning: http://image.slidesharecdn.com/nltkscikit-learnpyconfr2010ogrisel-100828080703-phpapp02/95/statistical-learning-and-text-classification-with-nltk-and-scikitlearn-4-728.jpg
– Bag of Words: http://image.slidesharecdn.com/9bow-140722092155-phpapp02/95/bagofwords-models-2-638.jpg
– Word histogram: http://cs.brown.edu/courses/cs143/2011/results/proj3/senewman/wordhistogram.jpg
– Visual Words overview: http://www.mathworks.com/help/vision/ug/bagoffeatures_visualwordsoverview.png
– Bag of features: https://raw.githubusercontent.com/bikz05/bag-of-words/master/docs/images/bog.png
– Impact in computer vision: http://www.slideshare.net/roelofp/deep-learning-a-birdseye-view
– Magic: http://www.saasoft.com/blog/wp-content/uploads/2011/07/Magic.jpg
– Barney: http://cartoonbros.com/wp-content/uploads/2016/08/barney-11.jpg
– Layers: http://www.kdnuggets.com/wp-content/uploads/dnn-layers.jpg
– Neural Net: https://e-lab.github.io/data/img/2013-09-neuralnet.png

Slide 77

Slide 77 text

Questions?