Python & Caffe: Getting Started with Deep Learning

1 Python & Caffe: Getting Started with Deep Learning
Saurabh Kumar
SciPy India 2016

2 Machine Learning Applications
• Cognition:
  • Face Recognition (Facebook)
  • Image Classification
  • Speech Recognition
• Anomaly Detection
• Genetics
• Weather Forecasting
• Spam Detection
• Ad placement on web pages
Teaching machines to do a task by observing how it's done rather than by being programmed for it!
Courtesy coursera.org

3 Types of Machine Learning
Supervised
• Learning a desired behavior with labeled data.
• Make sense of new data based on prior data.
• E.g. Regression and Classification
Unsupervised
• Making inferences without any labeled data.
• Discover unknown or hidden patterns.
• E.g. Clustering and Dimensionality Reduction
Reinforcement
• Act in an environment to maximize reward.
• Build autonomous agents that learn.
• E.g. Recommendation Systems, Game Playing and Robot Navigation

4 Deep Learning
More advanced tasks
• Self-driving cars
• Machine Translation
• Image Caption Generation
• Sentiment Analysis
• Text Generation
• Already better than humans in:
  • Image Recognition
  • Speech Recognition
  • Board Games
Courtesy google.com

5 What will we talk about today?
• Perceptron
• Artificial Neural Networks
• Deep Neural Networks
• Caffe
• A simple image recognition Deep Net with Caffe
• 3D shape recognition with cascaded Deep Nets

6 Perceptron: The building block
• Built at Cornell in 1960
• Inspired by the architecture of a neuron
• Multiplies each of its inputs by a corresponding weight and sums these products.
• This final sum is then passed through an activation function.
Courtesy cs.utexas.edu
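The weighted sum and activation described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the step activation, the bias term, and the example weights are all assumptions chosen for the demonstration.

```python
def step(z):
    """Step activation (an assumed choice): 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0


def perceptron(inputs, weights, bias=0.0, activation=step):
    """Multiply each input by its weight, sum the products, then activate."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


# Illustrative values only; they are not taken from the slides.
inputs = [1.0, 0.0, 1.0]
weights = [0.5, -0.6, 0.2]
fired = perceptron(inputs, weights, bias=-0.6)
silent = perceptron(inputs, weights, bias=-1.0)
```

With these values the weighted sum is 0.7, so the unit fires when the bias is -0.6 and stays silent when the bias is -1.0.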

7 Basic Logic Gates with Perceptron Courtesy inf.ed.ac.uk
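As a rough sketch of how single perceptrons can act as logic gates, the snippet below uses hand-picked weights and biases. These particular values are an assumption for illustration and are not necessarily the ones shown in the slide's figure.

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs followed by a step activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total >= 0 else 0


def AND(a, b):
    # Fires only when both inputs are 1 (1 + 1 - 1.5 >= 0).
    return perceptron((a, b), (1, 1), -1.5)


def OR(a, b):
    # Fires when at least one input is 1 (1 - 0.5 >= 0).
    return perceptron((a, b), (1, 1), -0.5)


def NOT(a):
    # A negative weight inverts the input.
    return perceptron((a,), (-1,), 0.5)


truth_table = [(a, b, AND(a, b), OR(a, b)) for a in (0, 1) for b in (0, 1)]
```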

8 Artificial Neural Networks
• A large number of perceptrons interconnected with each other
• Inspired by the architecture of the mammalian brain
• The structure is organized in the form of layers
• Has an input layer, an output layer and a few hidden layers
Courtesy codeproject.com
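To illustrate why layers matter, here is a small sketch of a network with one hidden layer computing XOR, which a single perceptron cannot represent. The weights are hand-set assumptions for the example (the hidden units behave like OR and NAND, the output unit like AND); a real network would learn its weights from data.

```python
def step(z):
    return 1 if z >= 0 else 0


def layer_forward(inputs, weights, biases):
    """Run one layer: each row of weights defines one perceptron."""
    return [
        step(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]


# Hidden layer: unit 0 acts as OR, unit 1 acts as NAND (hand-picked values).
HIDDEN_W = [[1, 1], [-1, -1]]
HIDDEN_B = [-0.5, 1.5]
# Output layer: a single unit acting as AND over the hidden activations.
OUTPUT_W = [[1, 1]]
OUTPUT_B = [-1.5]


def xor_net(a, b):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_forward([a, b], HIDDEN_W, HIDDEN_B)
    return layer_forward(hidden, OUTPUT_W, OUTPUT_B)[0]


outputs = [xor_net(a, b) for a in (0, 1) for b in (0, 1)]
```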

9 Deep Neural Networks
• Became popular as more computing power became available
• Large number of hidden layers, e.g. ResNet has ~150 layers
• Popular architectures:
  • Convolutional Neural Net (CNN)
  • Recurrent Neural Net (RNN)
  • Fully Connected Neural Net
  • Autoencoders
  • Generative Adversarial Net (GAN)
Courtesy quora.com

10 Convolutional Neural Net
ConvNets are mainly used for image recognition/classification.
No need for difficult feature engineering.
Have pushed image recognition accuracy to ~92%.
Main parts of a CNN:
• Convolutional Layer
• Fully Connected Layer
• Pooling Layer
• ReLU Layer
Courtesy wikipedia.org
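The four parts listed on this slide can be sketched with NumPy as plain functions. This is a toy, single-channel illustration under assumed shapes (a 6 x 6 input, one 3 x 3 kernel, 2 x 2 pooling, 3 output classes) with random, untrained weights; it is not how Caffe implements these layers.

```python
import numpy as np


def conv2d(image, kernel):
    """Valid 2D cross-correlation, the operation ConvNet layers call convolution."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Keep positive activations, zero out the rest."""
    return np.maximum(x, 0)


def max_pool(x, size=2):
    """Non-overlapping max pooling over size x size windows."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]
    return x.reshape(h, size, w, size).max(axis=(1, 3))


def fully_connected(x, weights, bias):
    """Flatten the feature map and apply a dense layer."""
    return weights @ x.ravel() + bias


rng = np.random.default_rng(0)
image = rng.random((6, 6))                          # toy grayscale image
kernel = np.array([[1.0, 0.0, -1.0]] * 3)           # simple vertical-edge filter
features = max_pool(relu(conv2d(image, kernel)))    # 6x6 -> 4x4 -> 2x2
scores = fully_connected(features, rng.random((3, 4)), np.zeros(3))
```

The usual ordering is convolution, ReLU, pooling, and finally one or more fully connected layers that turn the pooled features into class scores.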

11 Building Your own Deep Net!

12 Caffe
• Among the fastest of the available alternatives
• Can process over 60M images per day with a single NVIDIA K40 GPU
• That is 1 ms/image for inference and 4 ms/image for learning.
• Using CPU/GPU is as easy as switching a flag!
• Simple JSON-style definition of layers
• Pretrained models are available for use
• Completely open source
• Developed by the Berkeley Vision and Learning Center (BVLC)
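A quick back-of-the-envelope check relates the per-image timings to the images-per-day figure. The helper name below is illustrative; the only inputs are the 1 ms and 4 ms numbers quoted on the slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def images_per_day(ms_per_image):
    """Images processed in 24 hours at a fixed per-image latency."""
    return SECONDS_PER_DAY * 1000 / ms_per_image


inference_per_day = images_per_day(1)   # 86.4 million images
learning_per_day = images_per_day(4)    # 21.6 million images
```

At 1 ms/image a day of continuous inference covers about 86.4M images, which is consistent with "over 60M"; at 4 ms/image, training would cover about 21.6M images per day.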

13 Working with Caffe
• Layers are defined in prototxt files.
• Just have to write two files:
  • NetArchitecture.prototxt
  • Solver.prototxt
• Input data can be in the form of raw images or come from a database.
• Bulk image transformation is built in.
• Can generate a visualization of our net.
• Available layers: Input, Convolutional, Fully Connected, ReLU, Pooling, Softmax, Accuracy, LRN.
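To give a feel for what a layer definition in a prototxt file looks like, the sketch below renders Python dictionaries into that text format. The `to_prototxt` helper and the `Enum` marker are hypothetical utilities written for this example, not part of Caffe; the layer names and parameter values (96 filters, 11 x 11 kernel, and so on) are illustrative assumptions in the style of an AlexNet first stage.

```python
class Enum(str):
    """Marks a value that is written unquoted, such as the pooling method MAX."""


def to_prototxt(message, indent=0):
    """Render a nested dict as prototxt-style text (hypothetical helper)."""
    pad = "  " * indent
    lines = []
    for key, value in message.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                lines.append(f"{pad}{key} {{")
                lines.append(to_prototxt(item, indent + 1))
                lines.append(f"{pad}}}")
            elif isinstance(item, Enum):
                lines.append(f"{pad}{key}: {item}")
            elif isinstance(item, str):
                lines.append(f'{pad}{key}: "{item}"')
            else:
                lines.append(f"{pad}{key}: {item}")
    return "\n".join(lines)


# Illustrative layers; names and sizes are assumptions for the example.
conv1 = {"layer": {
    "name": "conv1", "type": "Convolution", "bottom": "data", "top": "conv1",
    "convolution_param": {"num_output": 96, "kernel_size": 11, "stride": 4},
}}
pool1 = {"layer": {
    "name": "pool1", "type": "Pooling", "bottom": "conv1", "top": "pool1",
    "pooling_param": {"pool": Enum("MAX"), "kernel_size": 3, "stride": 2},
}}

net_text = to_prototxt(conv1) + "\n" + to_prototxt(pool1) + "\n"
print(net_text)
```

Each layer names its input (`bottom`) and output (`top`), which is how the layers are chained into a network.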

14 Writing the prototxt files
Net Architecture
• Provide the location of the input data
• Provide a mean image file
• Design individual layers
• Provide any image transformations if necessary
Solver
• Location of the net architecture file
• Learning parameters
• Preferences
• Maximum iterations
• Saving the intermediate models
• CPU/GPU flag
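The solver items listed above map onto a handful of fields in the solver file. Below is a minimal sketch that builds such a file as text; the `make_solver` function, the file names, and every numeric value are assumptions for illustration rather than the settings used in the talk.

```python
def make_solver(net_path, max_iter, snapshot_every, use_gpu=True):
    """Build the text of a solver file (illustrative values throughout)."""
    settings = [
        ("net", f'"{net_path}"'),                   # location of the net architecture
        ("base_lr", 0.01),                          # learning parameters
        ("lr_policy", '"step"'),
        ("gamma", 0.1),
        ("stepsize", 10000),
        ("momentum", 0.9),
        ("display", 100),                           # preference: log every 100 iterations
        ("max_iter", max_iter),                     # maximum iterations
        ("snapshot", snapshot_every),               # save intermediate models
        ("snapshot_prefix", '"snapshots/net"'),
        ("solver_mode", "GPU" if use_gpu else "CPU"),  # the CPU/GPU flag
    ]
    return "\n".join(f"{key}: {value}" for key, value in settings) + "\n"


gpu_solver = make_solver("NetArchitecture.prototxt", 50000, 5000)
cpu_solver = make_solver("NetArchitecture.prototxt", 50000, 5000, use_gpu=False)
print(gpu_solver)
```

Switching between CPU and GPU changes only the last line of the file. A file like this would typically be passed to Caffe's command-line tool, for example `caffe train --solver=Solver.prototxt`.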

15 Simple Image Recognition Convolutional Neural Net
• One of the examples available in the Caffe installation
• Trained model and mean image available
• BVLC Reference CaffeNet: an AlexNet variant trained on ILSVRC 2012
• Uses 227 x 227 image data
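Before an image reaches a network like this, it has to be brought into the expected 227 x 227 layout. The NumPy sketch below shows one plausible version of that preprocessing: a center crop, RGB to BGR reordering, mean subtraction, and a switch to channels-first order. The `preprocess` function is hypothetical, and the per-channel mean values are placeholders standing in for the mean image file that ships with the model.

```python
import numpy as np

# Placeholder per-channel means in BGR order (an assumption; the real
# pipeline would use the mean image distributed with the model).
MEAN_BGR = np.array([104.0, 117.0, 123.0], dtype=np.float32)


def preprocess(image_rgb, mean_bgr=MEAN_BGR, crop=227):
    """Turn an H x W x 3 RGB image into a 3 x crop x crop BGR float array."""
    h, w, _ = image_rgb.shape
    top = (h - crop) // 2
    left = (w - crop) // 2
    patch = image_rgb[top:top + crop, left:left + crop, :].astype(np.float32)
    patch = patch[:, :, ::-1]          # RGB -> BGR channel order
    patch = patch - mean_bgr           # subtract the per-channel mean
    return patch.transpose(2, 0, 1)    # height, width, channel -> channel, height, width


# A synthetic 256 x 256 image stands in for a real photo.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
blob = preprocess(image)
```

The resulting array would then be copied into the network's input before running a forward pass.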

16 3D Object Recognition using Multi-View Convolutional Neural Networks
• The Princeton ModelNet10 dataset was used.
• 12 views were rendered of each of the mesh objects.
• The first CNN extracts the feature descriptors.
• The second CNN uses these and gives out the class labels.
• An accuracy of 88.1% was attained.
Su, Hang, et al. "Multi-view convolutional neural networks for 3d shape recognition." Proceedings of the IEEE International Conference on Computer Vision. 2015.
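The slide does not say how the 12 per-view descriptors are combined. One option, assumed here, is the element-wise max across views that the cited paper uses for view pooling. The sketch below uses random numbers in place of real CNN features, a 4096-dimensional descriptor size chosen only for illustration, and a single linear layer as a stand-in for the second network.

```python
import numpy as np

NUM_VIEWS = 12        # views rendered per mesh object
NUM_CLASSES = 10      # ModelNet10 has 10 categories
FEATURE_DIM = 4096    # assumed descriptor size, for illustration only


def view_pool(view_features):
    """Element-wise max over views: (num_views, dim) -> (dim,)."""
    return view_features.max(axis=0)


def classify(shape_descriptor, weights, bias):
    """Linear stand-in for the second network: return the top class index."""
    scores = weights @ shape_descriptor + bias
    return int(np.argmax(scores))


rng = np.random.default_rng(0)
view_features = rng.random((NUM_VIEWS, FEATURE_DIM))   # stand-in for first-CNN output
descriptor = view_pool(view_features)
weights = rng.standard_normal((NUM_CLASSES, FEATURE_DIM))
label = classify(descriptor, weights, np.zeros(NUM_CLASSES))
```

Max pooling makes the combined descriptor independent of the order in which the views are presented.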

17 This talk is based on the course project done in collaboration with Nikunj Patel, Abhinav Kumar and Chandra Mohan Sharma for the course CS725: Foundations of Machine Learning at the Computer Science Department, IIT Bombay in Spring 2016.

18 Thank You!
