
An Introduction to Deep Learning and Handwriting Recognition

Magellium
February 08, 2018


We had the pleasure of giving the closing talk of the 1st ocSSImore coding challenge (ocSSImore is the digital security research and cooperation cluster of the Occitania region).

This was an opportunity to introduce deep learning, give a quick insight into handwriting recognition, and present a variety of use cases.


Transcript

  1. PUTTING KNOWLEDGE ON THE MAP Deep Learning and Handwriting Recognition

    OCSSIMORE – 18/02/08
  2. Deep Learning and Handwriting Recognition Introduction to Deep Learning A

    detailed use case Handwriting Recognition Deep Learning applications in Handwriting Recognition
  3. Traditional Programming vs Machine Learning

    Traditional Programming: data + program → output. Icons made by http://www.freepik.com from www.flaticon.com, licensed under CC BY 3.0.
  4. Traditional Programming vs Machine Learning

    Traditional Programming: data + program → output. Machine Learning: data + output → program (the program is learned from examples).
  5. Machine Learning

    Learning to reproduce a result or a behavior.
  6. Machine Learning

    Learning to reproduce a result or a behavior. Many tasks: Natural Language Processing, image classification, object detection/segmentation.
  7. Machine Learning

    Learning to reproduce a result or a behavior. Many tasks: Natural Language Processing, image classification, object detection/segmentation. Two major phases: Training (from training data, adjust internal parameters; goal: find a generalization!) and Inference (on new, unseen data, for evaluation / production).
  8. Machine Learning

    The fitting loop: the program (model) takes data and produces estimates, the estimates are evaluated against the expected output, and the model's internal parameters are adjusted accordingly.
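
    As an illustration of this fitting loop, here is a minimal sketch in Python/numpy: a toy linear model whose two parameters are adjusted from its evaluation errors (the data, model and learning rate below are illustrative, not from the talk).

import numpy as np

# Toy data: learn to reproduce y ≈ 3x + 1 from examples.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 3.0 * x + 1.0 + rng.normal(scale=0.1, size=100)

w, b = 0.0, 0.0   # internal parameters of the model
lr = 0.5          # learning rate

for step in range(200):              # the fitting loop
    estimates = w * x + b            # the program produces estimates
    error = estimates - y            # evaluation against the expected output
    w -= lr * np.mean(error * x)     # adjust the parameters (gradient step)
    b -= lr * np.mean(error)

print(round(w, 2), round(b, 2))      # close to 3 and 1
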
  9. Machine Learning Timeline

    1950: the Turing test, proposed by Alan Turing to determine whether a machine is "intelligent". 1952: Arthur Samuel creates the first machine learning program, which plays checkers and improves as games are played. 1957: invention of the perceptron, the first neural network, by Frank Rosenblatt (Cornell University), inspired by cognitive theories. 1997: Deep Blue (IBM) defeats Garry Kasparov, then world champion, at chess. 2006: the term Deep Learning appears (Hinton). 2011: Amazon and Microsoft launch their machine learning tools. 2014: development of Google Brain and neural networks for object recognition. 2015: Facebook develops DeepFace, a program that recognizes faces in photos with human-level performance. 2016: Google's AlphaGo beats the Go world champion five games in a row; Go is often called the most complex game in the world. 2017: first computing chips dedicated to neural computation (Apple A11 Bionic neural engine, NPUs, TPUs...).
  10. Machine Learning vs Deep Learning

    Machine Learning: input → feature extractor → features → ML algorithm → output. Source: https://speakerdeck.com/toulousedatascience/number-27-transfert-de-style-et-generative-adversarial-networks
  11. Machine Learning vs Deep Learning

    Machine Learning: input → feature extractor → features → ML algorithm → output. Deep Learning: input → deep learning algorithm → output. Source: https://speakerdeck.com/toulousedatascience/number-27-transfert-de-style-et-generative-adversarial-networks
  12. Machine Learning vs Deep Learning

    Machine Learning: input space → feature space (feature engineering, hand-designed) → output space (algorithm, learned). Domain dependent: requires domain experts to tune, and complex patterns are hard to extract. For images: HOG features, SIFT methods, histograms, LBP features, ... Source: https://speakerdeck.com/toulousedatascience/number-27-transfert-de-style-et-generative-adversarial-networks
  13. Machine Learning vs Deep Learning

    Machine Learning: input space → feature space (learned feature extractor) → output space (algorithm, learned). Learn a new representation of the data, e.g. PCA (Principal Component Analysis). Source: https://speakerdeck.com/toulousedatascience/number-27-transfert-de-style-et-generative-adversarial-networks
  14. Machine Learning vs Deep Learning

    Machine Learning: input space → feature space (learned feature extractor) → output space (algorithm, learned); learn a new representation of the data, e.g. PCA. Deep Learning: input space → feature space 1 → ... → feature space N → output space; learn a hierarchy of representations, which can be done with neural networks. Source: https://speakerdeck.com/toulousedatascience/number-27-transfert-de-style-et-generative-adversarial-networks
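
    A minimal scikit-learn sketch of the "learned feature extractor + learned algorithm" setup described above, using PCA as the representation and logistic regression as the algorithm (the dataset and parameter values are illustrative assumptions):

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Learned feature extractor (PCA) followed by a separately learned algorithm.
model = make_pipeline(PCA(n_components=30), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print(model.score(X_test, y_test))

    A deep network would instead learn the hierarchy of representations and the classifier jointly, end to end.
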
  15. From a linear model to a neural network

    Let's try a simple linear classifier: y = f(x) = W1·x + b1

  16. From a linear model to a neural network

    Let's try a simple linear classifier: y = f(x) = W1·x + b1. Add a non-linear activation function to the output: h = σ(W1·x + b1)

  17. From a linear model to a neural network

    Let's try a simple linear classifier: y = f(x) = W1·x + b1. Add a non-linear activation function to the output: h = σ(W1·x + b1). Then compose with a second model, and add a softmax to get probability-like outputs: y = softmax(W2·h + b2). That's it! We have a two-layer fully connected neural network.
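
    The composition above, written out as a minimal numpy sketch of the forward pass (random weights stand in for learned ones; sizes are illustrative):

import numpy as np

def sigma(z):            # non-linear activation (here a sigmoid)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):          # turns scores into probability-like outputs
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=4)                            # input vector
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)     # first layer (learned in practice)
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)     # second layer (learned in practice)

h = sigma(W1 @ x + b1)           # linear model + non-linear activation
y = softmax(W2 @ h + b2)         # second layer + softmax
print(y, y.sum())                # probability-like outputs summing to 1
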
  18. Neural network

    Biologically inspired representation: layers of neurons with weights and biases (W1, b1), (W2, b2) and activation functions such as ReLU, sigmoid, or tanh.

  19. Neural network

    Biologically inspired representation: layers of neurons with weights and biases (W1, b1), (W2, b2) and activation functions such as ReLU, sigmoid, or tanh. The weights and biases are learned!

  20. Neural network

    Biologically inspired representation: layers of neurons with weights and biases (W1, b1), (W2, b2) and activation functions such as ReLU, sigmoid, or tanh. The weights and biases are learned! Hyperparameters to tune: how many hidden layers? How many neurons per layer? Which activation? Which regularization?
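
    The same kind of fully connected network in Keras, with the hyperparameter choices made explicit (two hidden layers, 64 neurons each, ReLU activations, dropout as regularization; all values are illustrative):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(784,)),  # hidden layer 1
    layers.Dropout(0.5),                                      # regularization
    layers.Dense(64, activation="relu"),                      # hidden layer 2
    layers.Dense(10, activation="softmax"),                   # output layer
])
model.summary()   # the weights and biases listed here are what gets learned
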
  21. Convolutional Neural Networks

    Intuition: local information matters most. Convolutional filters (aka kernels) slide over the layers and extract information locally, but with an overall invariance.

  22. Convolutional Neural Networks

    Intuition: local information matters most. Convolutional filters (aka kernels) slide over the layers and extract information locally, but with an overall invariance. New hyperparameters to tune: filter size? Stride? Padding? How many filters?
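
    In Keras, those new hyperparameters appear directly as arguments of the convolutional layer (the filter count, kernel size, stride and padding below are illustrative values):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, kernel_size=(3, 3), strides=1, padding="same",
                  activation="relu", input_shape=(224, 224, 3)),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.summary()
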
  23. A deep network

    Millions of parameters! Able to model everything... if you manage to optimize it!
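
    To get a feeling for the scale, a classic deep network such as VGG16 (available in keras.applications) has on the order of a hundred million parameters:

from tensorflow.keras.applications import VGG16

model = VGG16(weights=None)                     # build the architecture only
print(f"{model.count_params():,} parameters")   # roughly 138 million
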
  24. A data revolution?

    Cause and effect, before and after: business expertise and data (before) vs. data and data expertise (after).
  25. Data management Training dataset

    Raster: BD-Ortho (PHR, 50 cm), Haute-Garonne, 328 images of 10000x10000 px, RGB. Vector: OpenStreetMap buildings, ~1M polygons. Goal: create 224x224 samples (image / building mask).

  26. Data management Training dataset

    Raster: BD-Ortho (PHR, 50 cm), Haute-Garonne, 328 images of 10000x10000 px, RGB. Vector: OpenStreetMap buildings, ~1M polygons. Goal: create 224x224 samples (image / building mask), plus checks and filtering, data augmentation, and class balancing.
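
    A hypothetical sketch of this sample-creation step, assuming the ortho image and the rasterized building mask are already loaded as numpy arrays (the function names, filtering threshold and flip-based augmentation are illustrative assumptions, not the actual production code):

import numpy as np

def extract_patches(image, mask, size=224, keep_ratio=0.05):
    """Cut aligned (image, building-mask) patches; drop patches whose mask is
    almost empty, as a crude class-balance filter."""
    h, w = mask.shape
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            m = mask[i:i + size, j:j + size]
            if m.mean() >= keep_ratio:
                yield image[i:i + size, j:j + size], m

def augment(img, m, rng=np.random.default_rng()):
    """Basic data augmentation: random horizontal and vertical flips."""
    if rng.random() < 0.5:
        img, m = img[:, ::-1], m[:, ::-1]
    if rng.random() < 0.5:
        img, m = img[::-1, :], m[::-1, :]
    return img, m
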
  27. Framework

    To do (the 1st time vs. the 2nd time): network research, training flow, data management. Many frameworks, most of them with a Python API. Our choice: … + Keras.
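
    A minimal Keras training flow, with random arrays standing in for the real 224x224 image/mask samples (everything below is illustrative):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

x_train = np.random.rand(8, 224, 224, 3).astype("float32")            # images
y_train = (np.random.rand(8, 224, 224, 1) > 0.5).astype("float32")    # masks

model = keras.Sequential([
    layers.Conv2D(8, 3, padding="same", activation="relu",
                  input_shape=(224, 224, 3)),
    layers.Conv2D(1, 1, activation="sigmoid"),   # per-pixel building probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=4, epochs=1)
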
  28. Network architecture

    Many existing structures, much ongoing research: http://www.asimovinstitute.org/neural-network-zoo/. The network structure depends on: the task to solve (segmentation, classification, etc.), the amount of training data, and GPU/CPU constraints for training and inference → operating cost.
  29. Network architecture

    Our choice: the U-Net architecture. Image → down-sampling (convolutions) → features → up-sampling → segmentation.
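
    A deliberately tiny U-Net-style sketch in Keras: convolutional down-sampling, up-sampling and a skip connection (depths and filter counts are illustrative, far smaller than a real U-Net):

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(224, 224, 3))

c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
p1 = layers.MaxPooling2D(2)(c1)                         # down-sampling

c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)    # features

u1 = layers.UpSampling2D(2)(c2)                         # up-sampling
u1 = layers.concatenate([u1, c1])                       # skip connection
c3 = layers.Conv2D(16, 3, padding="same", activation="relu")(u1)

outputs = layers.Conv2D(1, 1, activation="sigmoid")(c3)   # building mask
model = keras.Model(inputs, outputs)
model.summary()
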
  30. Summary

    Construction of neural networks: requires strong know-how and experience. Deep learning solves complicated problems, but with complicated models. Complicated models are complicated to train: start simple and increase complexity.
  31. Summary

    Construction of neural networks: requires strong know-how and experience. Deep learning solves complicated problems, but with complicated models. Complicated models are complicated to train: start simple and increase complexity. Network learning: frameworks help a lot, but many tricks remain to be implemented; it requires strong methodological know-how. The learning rate is probably the most important parameter to tune.
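
    In Keras, for instance, the learning rate is set directly on the optimizer, and a callback such as ReduceLROnPlateau can lower it automatically when the validation loss stops improving (the values are illustrative):

from tensorflow import keras

optimizer = keras.optimizers.Adam(learning_rate=1e-3)
lr_callback = keras.callbacks.ReduceLROnPlateau(monitor="val_loss",
                                                factor=0.5, patience=5)
# Used as: model.compile(optimizer=optimizer, ...) and
#          model.fit(..., callbacks=[lr_callback])
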
  32. Summary

    Construction of neural networks: requires strong know-how and experience. Deep learning solves complicated problems, but with complicated models. Complicated models are complicated to train: start simple and increase complexity. Network learning: frameworks help a lot, but many tricks remain to be implemented; it requires strong methodological know-how. The learning rate is probably the most important parameter to tune. Datasets: data management can be a mess! Mind class balance and dataset reliability. Data augmentation is a must.
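
    Two common Keras-level answers to these dataset issues, as a sketch (the weights and augmentation parameters are illustrative): per-class loss weighting and on-the-fly data augmentation.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Class balance: weight the loss so errors on the rare class cost more.
class_weight = {0: 1.0, 1: 5.0}   # passed as model.fit(..., class_weight=class_weight)

# Data augmentation: random geometric perturbations of the training images.
augmenter = ImageDataGenerator(rotation_range=10,
                               horizontal_flip=True,
                               vertical_flip=True,
                               zoom_range=0.1)
# Used as: model.fit(augmenter.flow(x_train, y_train, batch_size=16), ...)
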
  33. Activities / Building Footprint Extraction

    Internal research project. Building footprints from airborne and Pléiades imagery, using the OCS-GE and OSM databases. Robust and solid results: 98% correct classification.
  34. Activities / Ship Detection

    Airbus DS. SPOT 6/7 imagery + Pléiades from Airbus OneAtlas. Detection aiming at high recall before validation by experts. Key challenges: high size variability, haze, rough seas.
  35. Activities / Real-time Object Detection

    Ongoing work. Key challenges: computation time, bandwidth issues, network architecture optimization.
  36. Offline Handwriting Recognition

    Challenges: the input is a variable-sized two-dimensional image; the output is a variable-sized sequence of characters; the cursive nature of handwriting makes a prior segmentation into characters difficult. Source: http://www.tbluche.com/files/MeetupSaoPaulo2017.pdf
  37. Handwriting Recognition: Historical System

    Input image → preprocessing → sliding window → feature extraction → Hidden Markov Models (emission model = Gaussian mixtures, transition models = states / characters) → vocabulary → language model. Source: http://www.tbluche.com/files/MeetupSaoPaulo2017.pdf
  38. Handwriting Recognition: State of the Art

    Input line / paragraph image, preprocessing, sliding window, feature extraction, Hidden Markov Models (emission model = Gaussian mixtures, transition models = states / characters), vocabulary, language model: deep learning now replaces the core of this pipeline. Graves, A., & Schmidhuber, J. (2009). Offline handwriting recognition with multidimensional recurrent neural networks. In Advances in Neural Information Processing Systems (pp. 545-552). Source: http://www.tbluche.com/files/MeetupSaoPaulo2017.pdf
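
    A generic deep-learning recipe for this task (not the MDLSTM network of Graves & Schmidhuber, and not Magellium's model): convolutional features, a bidirectional recurrent layer over the width axis, and a character-probability output that would be trained with a CTC loss. Sizes are illustrative, and the image width is fixed here for simplicity, whereas real systems handle variable widths.

from tensorflow import keras
from tensorflow.keras import layers

n_chars = 80                                 # alphabet size (+1 for the CTC blank)
inputs = keras.Input(shape=(32, 128, 1))     # text-line image: height 32, width 128

x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D((2, 2))(x)           # -> 16 x 64
x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
x = layers.MaxPooling2D((2, 2))(x)           # -> 8 x 32

x = layers.Permute((2, 1, 3))(x)             # make the width axis the time axis
x = layers.Reshape((32, 8 * 64))(x)          # one feature vector per time step
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
outputs = layers.Dense(n_chars + 1, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.summary()
# Training would pair this output with a CTC loss (e.g. keras.backend.ctc_batch_cost),
# so no prior segmentation into characters is needed.
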
  39. Handwritten Digit Recognition

    SHOM has several tens of thousands of bathymetric charts, each with an average of several thousand depth readings. Planned completion of manual digitization: 10-20 years. The challenge was to create ergonomic production software for digitizing the survey values.
  40. Handwritten Digit Recognition > Convolutional Network

    Digit recognition in deep learning is merely a sanity-check use case: it is trivial to get >99% accuracy on common datasets (MNIST, SD19...). But this is handwriting from previous centuries, with heterogeneous handwritten digits (quill pen, lettering stencil...) and a small training set (~2000 examples). Hence: training on exogenous data and fine-tuning on SHOM's dataset.
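
    A sketch of that "train on exogenous data, then fine-tune on the small target set" idea in Keras (the architecture, the freezing strategy and the dataset names are illustrative assumptions, not the actual SHOM pipeline):

from tensorflow import keras
from tensorflow.keras import layers

# Convolutional feature extractor, first trained on a large digit dataset.
base = keras.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu"),
    layers.Flatten(),
], name="conv_base")

pretrain = keras.Sequential([base, layers.Dense(10, activation="softmax")])
pretrain.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# pretrain.fit(mnist_images, mnist_labels, ...)    # large exogenous dataset

base.trainable = False   # keep the learned features, re-train only the new head
finetune = keras.Sequential([base, layers.Dense(10, activation="softmax")])
finetune.compile(optimizer=keras.optimizers.Adam(1e-4),
                 loss="sparse_categorical_crossentropy")
# finetune.fit(shom_images, shom_labels, ...)      # ~2000 domain-specific examples
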
  41. Offline Handwritten Signature Verification

    Samples from the GPDS-960 dataset: each row contains three genuine signatures from the same user and a skilled forgery. Source: Hafemann et al., http://arxiv.org/abs/1705.05787
  42. Offline Handwritten Signature Verification

    Goal: learn to identify signature forgeries. How-to: learn signature features (CNN), then learn to distinguish genuine signatures from forgeries (carefully crafted loss function). State of the art: 1.72 - 4.64% EER (Equal Error Rate, the point where False Acceptance Rate = False Rejection Rate). Source: Hafemann et al., http://arxiv.org/abs/1705.05787
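
    A generic sketch of such a verification setup: a shared CNN embeds two signature images, and a distance-based (contrastive-style) loss pushes genuine pairs together and forgeries apart. This is one common formulation, not necessarily the loss used by Hafemann et al.; the image size and architecture are illustrative.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Shared CNN that maps a signature image to a 128-d embedding.
embed = keras.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(150, 220, 1)),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(128),
])

a = keras.Input(shape=(150, 220, 1))   # reference signature
b = keras.Input(shape=(150, 220, 1))   # questioned signature
dist = layers.Lambda(
    lambda t: tf.norm(t[0] - t[1], axis=1, keepdims=True))([embed(a), embed(b)])
model = keras.Model([a, b], dist)

def contrastive_loss(y_true, d, margin=1.0):
    # y_true = 1 for a genuine pair, 0 for a forgery pair.
    y_true = tf.cast(y_true, d.dtype)
    return tf.reduce_mean(y_true * tf.square(d) +
                          (1.0 - y_true) * tf.square(tf.maximum(margin - d, 0.0)))

model.compile(optimizer="adam", loss=contrastive_loss)
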
  43. PUTTING KNOWLEDGE ON THE MAP Sébastien Bosch Unité GEO [email protected]

    François De Vieilleville Unité EO [email protected] Thomas Ristorcelli Unité IA [email protected]