Introduction to Image Classification in Python: from API calls to Neural Networks

Jos
November 06, 2016

An introduction to image classification that starts with APIs from commercial services and then attempts to replicate those services locally using two different techniques: bag of features and transfer learning.


Transcript

  1. José Domínguez @josmasflores
    This publication has emanated from research supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289
  2. Agenda
    – Problem Statement
    – Defining a Baseline
    – Using APIs
    – The Local Setup
    – Datasets for training and testing
    – Bag of Features
    – Neural Networks for Image classification
  3. My research
    – Focus on management of type II diabetes
    – Keep track of activity
    – Plenty of off-the-shelf solutions
    – Keep track of food intake episodes
    – Try to recognise the food in front of you (it’s more complicated than that)
  4. Baseline II
    – I am a developer, I can make API calls
    – I am not a Data Scientist
  5. (Research) Question
    As an engineer, should I use a commercial API, or should I instead spend some time and effort to build the same service locally (if I can)?
  6. Indico.io, Imagga, Clarifai
    APIs accessed in July 2016. Since then, Clarifai has made a food-specific model available.
  7. A bunch of TEST data
    Some validation of the model, depending on the expected and actual labels
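One simple way to do the validation this slide describes is to compare each expected label with the label the classifier actually returned and report the fraction that match. The sketch below assumes labels are plain strings; the function name and the sample labels are invented for illustration and do not come from the talk.

```python
def accuracy(expected, predicted):
    """Fraction of test items whose predicted label equals the expected one."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("need at least one test item")
    hits = sum(1 for e, p in zip(expected, predicted) if e == p)
    return hits / len(expected)


# Hypothetical labels, for illustration only.
expected = ["pizza", "sushi", "pizza", "ramen"]
predicted = ["pizza", "pizza", "pizza", "ramen"]
print(accuracy(expected, predicted))  # 0.75
```

The same function works whether the predicted labels come from a remote API or from a local classifier, which makes it a convenient common yardstick.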
  8. The Food 101 Dataset
    @inproceedings{bossard14,
      title = {Food-101 -- Mining Discriminative Components with Random Forests},
      author = {Bossard, Lukas and Guillaumin, Matthieu and Van Gool, Luc},
      booktitle = {European Conference on Computer Vision},
      year = {2014}
    }
  9. Results
    – Pretty bad, to be honest
    – Food pictures are not suited for this type of classification
    – A classifier on 96 classes:
      – Took four and a half days to run on a 6th Gen i7 with 32GB of RAM
      – Resulted in an accuracy of 8%, which is better than a random choice, but not all that much better.
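To put the 8% figure in context, it helps to work out what random guessing would score. The sketch below assumes the 96 classes are roughly balanced, which the slides do not state, so the baseline is only approximate; the function name is illustrative.

```python
def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing on balanced classes."""
    if num_classes < 1:
        raise ValueError("num_classes must be positive")
    return 1.0 / num_classes


baseline = chance_accuracy(96)  # 96 classes, from the slide
reported = 0.08                 # 8% accuracy, from the slide
print(f"chance: {baseline:.2%}, reported: {reported:.0%}, "
      f"ratio: {reported / baseline:.1f}x")
```

Under that assumption the classifier is several times better than chance, yet still wrong on more than nine images out of ten.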
  10. Training your own ConvNet if you have…
    – Millions of images in hundreds of categories
    – Access to multiple GPUs
    – A few weeks (2-3 for ImageNet) to spare
  11. Sharing is caring…
    It is common for people to release their final ConvNet checkpoints that can be fine-tuned by others. Check out the Caffe Model Zoo.
  12. The Inception Model
    Inception is a huge image classification model with millions of parameters that can tell apart a large number of image categories.
  13. How to use Inception?
    python tensorflow/examples/image_retraining/retrain.py \
      --bottleneck_dir=/tf_files/bottlenecks \
      --how_many_training_steps 500 \
      --model_dir=/tf_files/inception \
      --output_graph=/tf_files/retrained_graph.pb \
      --output_labels=/tf_files/retrained_labels.txt \
      --image_dir /tf_files/flower_photos
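The retraining script treats every subfolder of `--image_dir` as one class and uses the folder name as the label, so it is worth checking the folder layout before starting a long run. The helper below is a small sketch of such a check; its name is made up for illustration, and it only counts files with a `.jpg` or `.jpeg` extension.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg")


def count_images_per_class(image_dir):
    """Map each class subfolder of image_dir to its number of JPEG files."""
    counts = {}
    for name in sorted(os.listdir(image_dir)):
        class_dir = os.path.join(image_dir, name)
        if not os.path.isdir(class_dir):
            continue  # loose files at the top level are not classes
        counts[name] = sum(
            1 for f in os.listdir(class_dir)
            if f.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


if __name__ == "__main__":
    image_dir = "/tf_files/flower_photos"  # path taken from the slide
    if os.path.isdir(image_dir):
        for label, n in count_images_per_class(image_dir).items():
            print(f"{label}: {n}")
```

Classes with very few images are the first thing to look at if the retrained model performs poorly on them.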
  14. Training: 60 classes from Food101
    2016-07-02 17:35:31.167942: Step 3999: Train accuracy = 64.0%
    2016-07-02 17:35:31.168007: Step 3999: Cross entropy = 1.487842
    2016-07-02 17:35:31.205460: Step 3999: Validation accuracy = 52.0%
    Final test accuracy = 63.6%
  15. Training: 30 classes from Food101
    2016-07-02 17:52:57.361022: Step 3999: Train accuracy = 74.0%
    2016-07-02 17:52:57.361091: Step 3999: Cross entropy = 0.890878
    2016-07-02 17:52:57.397736: Step 3999: Validation accuracy = 78.0%
    Final test accuracy = 74.0%
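The log lines on the two training slides follow a fixed `name = value` pattern, so the 60-class and 30-class runs can be compared programmatically. The sketch below copies the log text from the slides; the regular expression and function name are assumptions made for illustration and are not part of the retraining script.

```python
import re

LOG_60 = """\
2016-07-02 17:35:31.167942: Step 3999: Train accuracy = 64.0%
2016-07-02 17:35:31.168007: Step 3999: Cross entropy = 1.487842
2016-07-02 17:35:31.205460: Step 3999: Validation accuracy = 52.0%
Final test accuracy = 63.6%
"""

LOG_30 = """\
2016-07-02 17:52:57.361022: Step 3999: Train accuracy = 74.0%
2016-07-02 17:52:57.361091: Step 3999: Cross entropy = 0.890878
2016-07-02 17:52:57.397736: Step 3999: Validation accuracy = 78.0%
Final test accuracy = 74.0%
"""

METRIC = re.compile(
    r"(Train accuracy|Cross entropy|Validation accuracy|Final test accuracy)"
    r" = ([0-9.]+)"
)


def parse_metrics(log_text):
    """Return the last reported value of each metric found in the log."""
    metrics = {}
    for name, value in METRIC.findall(log_text):
        metrics[name] = float(value)
    return metrics


for title, log in (("60 classes", LOG_60), ("30 classes", LOG_30)):
    print(title, parse_metrics(log))
```

Halving the number of classes raises the final test accuracy from 63.6% to 74.0% and lowers the cross entropy, which is consistent with the smaller problem being easier.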
  16. Comparing to Indico
    – Indico provides 174 results over 50%, covering 79 classes.
    – The local setup provides 65 results over 50%, covering 20 classes.
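Figures like these can be produced by counting how many predictions clear a confidence threshold and how many distinct classes those predictions span. The sketch below assumes each result is a `(label, confidence)` pair with confidence between 0 and 1, and that "over 50%" means a confidence strictly above 0.5; the data shown is hypothetical and is not the talk's data.

```python
def summarize(predictions, threshold=0.5):
    """Count predictions above threshold and the distinct classes they cover."""
    confident = [(label, score) for label, score in predictions
                 if score > threshold]
    classes = {label for label, _ in confident}
    return len(confident), len(classes)


# Hypothetical (label, confidence) pairs, one per test image.
predictions = [("pizza", 0.91), ("pizza", 0.62), ("sushi", 0.55), ("ramen", 0.31)]
print(summarize(predictions))  # (3, 2)
```

Running the same summary over the API results and the local classifier's results gives a like-for-like comparison.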
  17. Use APIs if…
    – You cannot deal with the burden of creating and fine-tuning your own setup.
    – Most APIs are free up to a certain limit.
  18. Use a local classifier…
    – Based on a pre-trained model if you have some time to spare, and you want to get into the whole machine learning craze.
    – You’ll need to do some image processing if you want better accuracy.
  19. Can you do both?
    – Sure, it will depend on your use case.
    – Indico lets you fine-tune their models to your specific needs. I guess it is a type of transfer learning process, but I don’t know how it works.
  20. Picture credits
    – Dog of Muffin: http://imgur.com/gallery/TTpIGvo
    – Supervised Learning: http://image.slidesharecdn.com/nltkscikit-learnpyconfr2010ogrisel-100828080703-phpapp02/95/statistical-learning-and-text-classification-with-nltk-and-scikitlearn-4-728.jpg
    – Bag of Words: http://image.slidesharecdn.com/9bow-140722092155-phpapp02/95/bagofwords-models-2-638.jpg
    – Word histogram: http://cs.brown.edu/courses/cs143/2011/results/proj3/senewman/wordhistogram.jpg
    – Visual Words overview: http://www.mathworks.com/help/vision/ug/bagoffeatures_visualwordsoverview.png
    – Bag of features: https://raw.githubusercontent.com/bikz05/bag-of-words/master/docs/images/bog.png
    – Impact in computer vision: http://www.slideshare.net/roelofp/deep-learning-a-birdseye-view
    – Magic: http://www.saasoft.com/blog/wp-content/uploads/2011/07/Magic.jpg
    – Barney: http://cartoonbros.com/wp-content/uploads/2016/08/barney-11.jpg
    – Layers: http://www.kdnuggets.com/wp-content/uploads/dnn-layers.jpg
    – Neural Net: https://e-lab.github.io/data/img/2013-09-neuralnet.png