Deep Learning - Overview of my work II

Slide 1

Slide 1 text

Deep Learning Deep Learning II

Slide 2

Slide 2 text

[Figure: Google Trends search interest (scale 0 to 100), January 2004 to March 2017, for "Deep learning", "Machine learning", and "neural network"]

Slide 6

Slide 6 text

- Artificial Narrow Intelligence (ANI): machine intelligence that equals or exceeds human intelligence or efficiency at a specific task.
- Artificial General Intelligence (AGI): a machine able to apply intelligence to any problem rather than to one specific problem (human-level intelligence).
- Artificial Super Intelligence (ASI): an intellect much smarter than the best human brains in practically every field, including scientific creativity, general wisdom, and social skills.

Slide 7

Slide 7 text

Deep Learning Machine Learning is a type of Artificial Intelligence that provides computers with the ability to learn Machine Learning Supervised learning Unsupervised learning

Slide 8

Slide 8 text

- Deep learning is the part of the machine learning field concerned with learning representations of data.
- It uses a hierarchy of multiple layers loosely modeled on the neural networks of the brain.
- Given a large amount of data, the system learns the patterns in it and responds in useful ways.

Slide 9

Slide 9 text

- Superintelligent devices
- Best solution for: image recognition, speech recognition, natural language processing, big data

Slide 11

Slide 11 text

- Geoffrey Hinton: University of Toronto & Google
- Yann LeCun: New York University & Facebook
- Andrew Ng: Stanford & Baidu
- Yoshua Bengio: University of Montreal

Slide 13

Slide 13 text

Today NVIDIA supports my work with an NVIDIA TITAN X ("The most advanced GPU ever built").

Slide 14

Slide 14 text

TITAN X specifications:
GPU architecture: Pascal
Standard memory configuration: 12 GB GDDR5X
Memory speed: 10 Gbps
Boost clock: 1531 MHz
NVIDIA CUDA® cores: 3584
Transistors: 12,000 million

Slide 15

Slide 15 text

TITAN X in research: deep learning, augmented reality, machine learning, image recognition, computer vision, data science.

Slide 16

Slide 16 text

Deep learning (DL) is a hierarchically structured network that simulates the structure of the human brain in order to extract features from its input data.

Slide 17

Slide 17 text

Deep Learning Large data set with good quality Measurable and describable goals Enough computing power Neural Network (Brain of Human)

Slide 18

Slide 18 text

- Deep neural networks
- Deep belief networks
- Convolutional neural networks
- Deep Boltzmann machines
- Deep stacking networks

Slide 19

Slide 19 text

[Figure: a biological neuron (dendrites, axon, terminal branches of the axon) beside an artificial neuron that sums the inputs x1, x2, x3, ..., xn weighted by w1, w2, w3, ..., wn]
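The labels in this figure (inputs x1..xn, weights w1..wn, and a summation node) describe the basic artificial neuron. A minimal Python sketch of that weighted sum follows; the bias term and the example values are assumptions added for illustration and do not come from the slides.

```python
def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs plus a bias (the summation node)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights)) + bias


# Hypothetical example values, not taken from the slides.
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
s = neuron_output(x, w)  # 1.0*0.5 + 2.0*(-1.0) + 3.0*0.25
```

In a full network this sum is passed through an activation function before it reaches the next layer.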

Slide 21

Slide 21 text

The advantages of using Rectified Linear Units (ReLU) in neural networks are:
- ReLU is far less prone to the vanishing gradient problem than the sigmoid and tanh functions, because its gradient is 1 for every positive input.
- It has been shown that deep networks can be trained efficiently using ReLU even without pre-training.
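The vanishing gradient point can be illustrated with a small calculation. The sketch below multiplies the activation derivative across 10 layers for ReLU and for sigmoid; the depth and the pre-activation value z = 2.0 are arbitrary assumptions chosen for illustration, and weights are ignored to isolate the effect of the activation.

```python
import math


def relu(z):
    return max(0.0, z)


def relu_grad(z):
    # Derivative is 1 for positive inputs and 0 otherwise.
    return 1.0 if z > 0 else 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # never larger than 0.25


# Product of per-layer activation derivatives across `depth` layers,
# all evaluated at the same hypothetical pre-activation value.
depth = 10
z = 2.0
relu_chain = relu_grad(z) ** depth
sigmoid_chain = sigmoid_grad(z) ** depth
```

Here `relu_chain` stays at 1.0 while `sigmoid_chain` shrinks toward zero. Note that ReLU passes no gradient at all for negative inputs, so it reduces the problem rather than removing every gradient issue.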

Slide 22

Slide 22 text

A convolutional neural network (CNN) is a multi-layer neural network, typically trained with supervised learning, that is designed for two-dimensional data such as images and video frames. A CNN consists of several kinds of layers:
- Convolutional layers
- Pooling layers
- Fully connected layers

Slide 25

Slide 25 text

The convolutional layer acts as a feature extractor: it extracts features of the input such as edges, corners, and endpoints.
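A minimal sketch of how a convolutional filter responds to an edge is given here. The 4x4 image and the hand-picked vertical-edge kernel are hypothetical; a trained CNN learns its kernels from data, and real layers also add a bias and an activation function.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return
    the map of weighted sums. As in most CNN libraries, the kernel is
    not flipped, so this is technically cross-correlation."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked vertical-edge kernel.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, kernel)
# feature_map == [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The response is non-zero only in the column where the dark-to-bright transition sits, which is the sense in which the filter "extracts" an edge.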

Slide 28

Slide 28 text

The pooling layer reduces the resolution of the feature map, which makes the output less sensitive to small translations (shifts) and distortions of the input.
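The effect can be sketched with 2x2 max pooling, one common pooling choice (the slides do not say which variant is used, so this is an assumption). The two hypothetical feature maps below contain the same strong activations moved by one pixel, and both pool to the same output.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


a = [
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 5],
]
# Same pattern with the strong activations moved by one pixel.
b = [
    [0, 0, 0, 0],
    [9, 0, 0, 0],
    [0, 0, 5, 0],
    [0, 0, 0, 0],
]
pooled_a = max_pool2d(a)
pooled_b = max_pool2d(b)
# Both are [[9, 0], [0, 5]]: the 4x4 maps shrink to 2x2.
```

The invariance only holds for shifts that stay inside one pooling window; larger shifts still change the output.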

Slide 31

Slide 31 text

- A fully connected layer has connections to all activations in the previous layer.
- The fully connected layers act as the classifier.
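A minimal sketch of a fully connected layer used as a classifier follows. The softmax step, the feature vector, and the weights are assumptions for illustration; the slides do not specify the output function or any values.

```python
import math


def dense(inputs, weights, biases):
    """Each output unit is connected to every input activation."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical flattened features and weights for a 3-class problem.
features = [1.0, 0.0, 2.0]
weights = [
    [0.2, 0.1, 0.0],
    [0.0, 0.3, 1.0],
    [0.5, 0.5, -0.5],
]
biases = [0.0, 0.1, 0.0]
scores = dense(features, weights, biases)
probs = softmax(scores)
predicted_class = probs.index(max(probs))
```

The predicted class is simply the output unit with the highest probability.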

Slide 32

Slide 32 text

- LeNet: the first successful application of CNNs
- AlexNet: the first work that popularized CNNs in computer vision
- ZF Net: the ILSVRC 2013 winner
- GoogLeNet: the ILSVRC 2014 winner
- VGGNet: the runner-up in ILSVRC 2014
- ResNet: the winner of ILSVRC 2015

Slide 36

Slide 36 text

Deep Learning The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) evaluates algorithms for object detection and image classification at large scale.

Slide 38

Slide 38 text

- MNIST: handwritten digits, 60,000 training + 10,000 test images
- Google House Numbers (from Street View): 600,000 digit images
- CIFAR-10: 60,000 32x32 colour images in 10 classes
- ImageNet: >150 GB
- Tiny Images: 80 million tiny images
- Flickr data: 100 million (Yahoo dataset)

Slide 39

Slide 39 text

- MNIST is a large database of handwritten digits.
- MNIST contains 60,000 training images and 10,000 testing images.

Slide 40

Slide 40 text

- The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes.
- CIFAR-10 contains 50,000 training images and 10,000 test images.

Slide 41

Slide 41 text

Overfitting problem:
- A larger network has many weights, which leads to high model complexity.
- The network performs very well on the training data but poorly on the validation data.

Slide 42

Slide 42 text

CNN optimization techniques used to reduce the overfitting problem:
1) Dropout
2) L2 regularization
3) Mini-batch
4) Gradient descent algorithm
5) Early stopping
6) Data augmentation

Slide 43

Slide 43 text

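Dropout is a technique for reducing overfitting in CNNs: during training, a random fraction of the activations is set to zero so that the network cannot rely on any single unit.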
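A minimal sketch of dropout is given here, using the common "inverted dropout" formulation in which surviving activations are rescaled during training. The drop probability of 0.5 and the fixed random seed are assumptions chosen for illustration; the slides do not state the values used.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero each activation with probability drop_prob
    during training and rescale the survivors by 1 / (1 - drop_prob).
    At test time the activations pass through unchanged."""
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 1000
dropped = dropout(activations, drop_prob=0.5, rng=rng)
kept_fraction = sum(1 for a in dropped if a != 0.0) / len(dropped)
```

Roughly half of the units survive, each scaled to 2.0, so the expected total activation is unchanged between training and testing.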

Slide 44

Slide 44 text

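L2 regularization: adding a regularization term for the weights to the loss function is one way to reduce overfitting. The regularized loss is $E_R(P) = E(P) + \lambda\,\Omega(w)$, where $E(P)$ is the loss function of the parameter vector $P$, $w$ is the weight vector, $\lambda$ is the regularization factor (coefficient), and the regularization function $\Omega(w)$, in its usual L2 form, is $\Omega(w) = \frac{1}{2} w^{T} w$.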
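A minimal sketch of the regularized loss follows, assuming the common form in which the penalty is Ω(w) = ½ wᵀw and is added to the data loss scaled by λ. The function names and the numeric values are hypothetical.

```python
def l2_penalty(weights):
    """Omega(w) = 1/2 * sum of squared weights (assumed L2 form)."""
    return 0.5 * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam):
    """Data loss plus lam * Omega(w)."""
    return data_loss + lam * l2_penalty(weights)


# Hypothetical values for illustration.
weights = [3.0, -4.0]
total = regularized_loss(data_loss=1.0, weights=weights, lam=0.01)
# Omega(w) = 0.5 * (9 + 16) = 12.5, so total = 1.0 + 0.01 * 12.5
```

Large weights raise the penalty, so minimizing the total loss pushes the network toward smaller weights and a simpler model.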

Slide 45

Slide 45 text

Mini-batch training divides the dataset into small batches of examples, computes the gradient using a single batch, makes an update, and then moves to the next batch.
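The batching step can be sketched as follows. Shuffling before each pass is a common practice assumed here, and the batch size of 4 and the 10-element stand-in dataset are hypothetical.

```python
import random


def iterate_minibatches(examples, batch_size, rng):
    """Shuffle a copy of the dataset and yield consecutive batches.
    The last batch may be smaller than batch_size."""
    shuffled = list(examples)
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]


rng = random.Random(0)
dataset = list(range(10))  # stand-in for 10 training examples
batches = list(iterate_minibatches(dataset, batch_size=4, rng=rng))
```

In a training loop, one gradient computation and one parameter update would be performed per yielded batch.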

Slide 46

Slide 46 text

The gradient descent algorithm updates the coefficients (weights and biases) so as to minimize the error function by taking small steps in the direction of the negative gradient of the loss function: $P_{i+1} = P_i - \alpha \nabla E(P_i)$, where $i$ stands for the iteration number, $\alpha > 0$ is the learning rate, $P$ is the parameter vector, and $E(P_i)$ is the loss function.
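A minimal sketch of this update rule, taken here to be P_{i+1} = P_i − α ∇E(P_i), is applied to a toy quadratic loss. The loss function, learning rate, and step count are hypothetical choices for illustration.

```python
def gradient_descent(grad, p0, alpha, steps):
    """Repeat P_{i+1} = P_i - alpha * grad(P_i) for a fixed number of steps."""
    p = list(p0)
    for _ in range(steps):
        g = grad(p)
        p = [pi - alpha * gi for pi, gi in zip(p, g)]
    return p


# Hypothetical loss E(P) = (p0 - 3)^2 + (p1 + 1)^2, minimized at (3, -1).
def grad_e(p):
    return [2.0 * (p[0] - 3.0), 2.0 * (p[1] + 1.0)]


p_final = gradient_descent(grad_e, p0=[0.0, 0.0], alpha=0.1, steps=200)
```

With α = 0.1 the distance to the minimum shrinks by a factor of 0.8 per step, so `p_final` ends up very close to (3, -1). A learning rate that is too large would make the iterates diverge instead.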

Slide 47

Slide 47 text

- Early stopping monitors the training process of the network to guard against overfitting.
- If performance on a held-out validation set stops improving, or starts to degrade, the learning process is stopped.
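One common way to implement this rule is with a "patience" counter, sketched here. The patience value and the list of held-out losses are hypothetical; the slides do not state the stopping criterion actually used.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of held-out losses.
    Training stops once the loss has not improved for `patience` epochs."""
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical held-out losses: improving, then degrading.
losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70, 0.80]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=2)
# Stops at epoch 4; the best model was the one from epoch 2.
```

In practice the weights saved at `best_epoch` are the ones kept as the final model.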

Slide 48

Slide 48 text

Data augmentation means increasing the size of the training set by adding modified copies of the existing examples, for example slightly shifted images.
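A minimal sketch of augmentation by small shifts follows. Shifts are chosen because they keep the label of a handwritten character intact; the specific transformations, the 3x3 stand-in image, and the label are assumptions, since the slides do not say which augmentations were applied.

```python
def shift_right(image, pixels=1, fill=0):
    """Shift every row right by `pixels`, padding on the left with `fill`."""
    width = len(image[0])
    return [([fill] * pixels + list(row))[:width] for row in image]


def shift_down(image, pixels=1, fill=0):
    """Shift the image down by `pixels`, padding on top with `fill`."""
    width, height = len(image[0]), len(image)
    blank = [[fill] * width for _ in range(pixels)]
    return (blank + [list(row) for row in image])[:height]


def augment(dataset):
    """Return the original (image, label) pairs plus shifted copies."""
    augmented = []
    for image, label in dataset:
        augmented.append((image, label))
        augmented.append((shift_right(image), label))
        augmented.append((shift_down(image), label))
    return augmented


# Hypothetical 3x3 "image" with label 7.
dataset = [([[1, 2, 0], [3, 4, 0], [0, 0, 0]], 7)]
augmented = augment(dataset)  # three examples from one original
```

Transformations such as horizontal flips are deliberately left out, because mirroring a digit or letter can change what it represents.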

Slide 49

Slide 49 text

- MADBase is an Arabic handwritten digit dataset composed of 70,000 digits written by 700 writers.
- MADBase is partitioned into two data sets:
  - 60,000 training images
  - 10,000 testing images

Slide 50

Slide 50 text

We built a new CNN architecture. [Figure: CNN architecture diagram]

Slide 51

Slide 51 text

[Figure: confusion matrix]

Slide 52

Slide 52 text

- We collected a dataset of 16,800 characters written by 60 participants aged 19 to 40.
- The forms were scanned at a resolution of 300 dpi. Each block was segmented automatically using Matlab 2016a to determine its coordinates.
- The database is partitioned into two sets: a training set (13,440 characters, 480 images per class) and a test set (3,360 characters, 120 images per class).

Slide 53

Slide 53 text

Each participant wrote each character (from 'alef' to 'yeh') ten times on two forms.
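The dataset sizes quoted above can be cross-checked with a few lines of arithmetic. The class count of 28 is taken from the 28 character classes ('alef' to 'yeh') listed in the results table; the variable names are only illustrative.

```python
participants = 60
classes = 28      # Arabic characters from 'alef' to 'yeh'
repetitions = 10  # times each participant wrote each character

total = participants * classes * repetitions  # 16,800 characters

train_per_class, test_per_class = 480, 120
train_total = train_per_class * classes  # 13,440 characters
test_total = test_per_class * classes    # 3,360 characters
```

The training and test sets together account for all 16,800 characters, an 80/20 split per class.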

Slide 54

Slide 54 text

We built a new CNN architecture. [Figure: CNN architecture diagram]

Slide 55

Slide 55 text

Confusion matrix results (error rate = 5.15%)

Class  Character  Correct  Wrong  Accuracy  Misclassification
1      alef       120      0      100.00%   0.00%
2      beh        116      4      96.70%    3.30%
3      teh        110      10     91.70%    8.30%
4      theh       110      10     91.70%    8.30%
5      jeem       115      5      95.80%    4.20%
6      hah        117      3      97.50%    2.50%
7      khah       112      8      93.30%    6.70%
8      dal        114      6      95.00%    5.00%
9      thal       110      10     91.70%    8.30%
10     reh        120      0      100.00%   0.00%
11     zain       105      15     87.50%    12.50%
12     seen       117      3      97.50%    2.50%
13     sheen      115      5      95.80%    4.20%
14     sad        118      2      98.30%    1.70%
15     dad        109      11     90.80%    9.20%
16     tah        116      4      96.70%    3.30%
17     zah        110      10     91.70%    8.30%
18     ain        113      7      94.20%    5.80%
19     ghain      112      8      93.30%    6.70%
20     feh        114      6      95.00%    5.00%
21     qaf        111      9      92.50%    7.50%
22     kaf        114      6      95.00%    5.00%
23     lam        119      1      99.20%    0.80%
24     meem       119      1      99.20%    0.80%
25     noon       106      14     88.30%    11.70%
26     heh        114      6      95.00%    5.00%
27     waw        115      5      95.80%    4.20%
28     yeh        116      4      96.70%    3.30%

Slide 56

Slide 56 text

In total, 173 of the 3,360 test characters were misclassified and 3,187 were classified correctly.
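The reported totals and the 5.15% error rate can be reproduced from the per-class wrong-classification counts in the results table, with 120 test images per class. This is only a consistency check on the published numbers.

```python
# Wrong classifications per class, in class order 1..28 ('alef' to 'yeh').
wrong = [0, 4, 10, 10, 5, 3, 8,
         6, 10, 0, 15, 3, 5, 2,
         11, 4, 10, 7, 8, 6, 9,
         6, 1, 1, 14, 6, 5, 4]

test_per_class = 120
total_test = test_per_class * len(wrong)    # 3,360 test characters
total_wrong = sum(wrong)                    # 173
total_correct = total_test - total_wrong    # 3,187
error_rate = 100.0 * total_wrong / total_test
```

Rounded to two decimals, `error_rate` matches the 5.15% figure reported with the confusion matrix.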

Slide 57

Slide 57 text

- Deep learning is a class of machine learning algorithms.
- Harder problems such as video understanding, image understanding, natural language processing, and big data are expected to be tackled successfully by deep learning algorithms.

Slide 58

Slide 58 text

facebook.com/mloey
[email protected]
twitter.com/mloey
linkedin.com/in/mloey
[email protected]
mloey.github.io