
An Introduction to Deep Learning and Handwriting Recognition

Magellium
February 08, 2018


We had the pleasure of giving the closing talk of the 1st ocSSImore coding challenge (ocSSImore is the digital security research and cooperation cluster of the Occitania region).

This was an opportunity to introduce deep learning, give a quick insight into handwriting recognition, and present a variety of use cases.


Transcript

  1. PUTTING KNOWLEDGE ON THE MAP Deep Learning and Handwriting Recognition

    OCSSIMORE – 18/02/08
  2. Deep Learning and Handwriting Recognition Introduction to Deep Learning A

    detailed use case Handwriting Recognition Deep Learning applications in Handwriting Recognition
  3. Traditional Programming vs Machine Learning

    Traditional Programming: data + program → output. Icons made by http://www.freepik.com from www.flaticon.com, licensed under CC BY 3.0.
  4. Traditional Programming vs Machine Learning

    Traditional Programming: data + program → output. Machine Learning: data + output → program (the program is learned from examples).
  5. Machine Learning

    Learning to reproduce a result or a behavior.
  6. Machine Learning

    Learning to reproduce a result or a behavior. Many tasks: Natural Language Processing, image classification, object detection/segmentation.
  7. Machine Learning

    Learning to reproduce a result or a behavior. Many tasks: Natural Language Processing, image classification, object detection/segmentation. Two major phases: Training (from training data, adjust internal parameters; goal: find a generalization!) and Inference (on new, unseen data, for evaluation / production).
  8. Machine Learning

    The fitting loop: the program (model) takes data and produces estimates, the estimates are evaluated against the expected output, and the model's internal parameters are adjusted accordingly.
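
    As an illustration of this fitting loop, here is a minimal sketch in Python/numpy: a toy linear model whose two parameters are adjusted from its evaluation errors (the data, model and learning rate below are illustrative, not from the talk).

import numpy as np

# Toy data: learn to reproduce y ≈ 3x + 1 from examples.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 3.0 * x + 1.0 + rng.normal(scale=0.1, size=100)

w, b = 0.0, 0.0   # internal parameters of the model
lr = 0.5          # learning rate

for step in range(200):              # the fitting loop
    estimates = w * x + b            # the program produces estimates
    error = estimates - y            # evaluation against the expected output
    w -= lr * np.mean(error * x)     # adjust the parameters (gradient step)
    b -= lr * np.mean(error)

print(round(w, 2), round(b, 2))      # close to 3 and 1
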
  9. Machine Learning Timeline

    1950: the Turing test, proposed by Alan Turing to determine whether a machine is "intelligent". 1952: Arthur Samuel creates the first machine learning program, which plays checkers and improves as games are played. 1957: invention of the perceptron, the first neural network, by Frank Rosenblatt (Cornell University), inspired by cognitive theories. 1997: Deep Blue (IBM) defeats Garry Kasparov, then world champion, at chess. 2006: the term Deep Learning appears (Hinton). 2011: Amazon and Microsoft launch their machine learning tools. 2014: development of Google Brain and neural networks for object recognition. 2015: Facebook develops DeepFace, a program that recognizes faces in photos with human-level performance. 2016: Google's AlphaGo beats the Go world champion five games in a row; Go is often called the most complex game in the world. 2017: first computing chips dedicated to neural computation (Apple A11 Bionic neural engine, NPUs, TPUs...).
  10. Machine Learning vs Deep Learning

    Machine Learning: input → feature extractor → features → ML algorithm → output. Source: https://speakerdeck.com/toulousedatascience/number-27-transfert-de-style-et-generative-adversarial-networks
  11. Machine Learning vs Deep Learning

    Machine Learning: input → feature extractor → features → ML algorithm → output. Deep Learning: input → deep learning algorithm → output. Source: https://speakerdeck.com/toulousedatascience/number-27-transfert-de-style-et-generative-adversarial-networks
  12. Machine Learning vs Deep Learning

    Machine Learning: input space → feature space (feature engineering, hand-designed) → output space (algorithm, learned). Domain dependent: requires domain experts to tune, and complex patterns are hard to extract. For images: HOG features, SIFT methods, histograms, LBP features, ... Source: https://speakerdeck.com/toulousedatascience/number-27-transfert-de-style-et-generative-adversarial-networks
  13. Machine Learning vs Deep Learning

    Machine Learning: input space → feature space (learned feature extractor) → output space (algorithm, learned). Learn a new representation of the data, e.g. PCA (Principal Component Analysis). Source: https://speakerdeck.com/toulousedatascience/number-27-transfert-de-style-et-generative-adversarial-networks
  14. Machine Learning vs Deep Learning

    Machine Learning: input space → feature space (learned feature extractor) → output space (algorithm, learned); learn a new representation of the data, e.g. PCA. Deep Learning: input space → feature space 1 → ... → feature space N → output space; learn a hierarchy of representations, which can be done with neural networks. Source: https://speakerdeck.com/toulousedatascience/number-27-transfert-de-style-et-generative-adversarial-networks
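
    A minimal scikit-learn sketch of the "learned feature extractor + learned algorithm" setup described above, using PCA as the representation and logistic regression as the algorithm (the dataset and parameter values are illustrative assumptions):

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Learned feature extractor (PCA) followed by a separately learned algorithm.
model = make_pipeline(PCA(n_components=30), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print(model.score(X_test, y_test))

    A deep network would instead learn the hierarchy of representations and the classifier jointly, end to end.
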
  15. From a linear model to a neural network

    Let's try a simple linear classifier: y = f(x) = W1·x + b1

  16. From a linear model to a neural network

    Let's try a simple linear classifier: y = f(x) = W1·x + b1. Add a non-linear activation function to the output: h = σ(W1·x + b1)

  17. From a linear model to a neural network

    Let's try a simple linear classifier: y = f(x) = W1·x + b1. Add a non-linear activation function to the output: h = σ(W1·x + b1). Then compose with a second model, and add a softmax to get probability-like outputs: y = softmax(W2·h + b2). That's it! We have a two-layer fully connected neural network.
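
    The composition above, written out as a minimal numpy sketch of the forward pass (random weights stand in for learned ones; sizes are illustrative):

import numpy as np

def sigma(z):            # non-linear activation (here a sigmoid)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):          # turns scores into probability-like outputs
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=4)                            # input vector
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)     # first layer (learned in practice)
W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)     # second layer (learned in practice)

h = sigma(W1 @ x + b1)           # linear model + non-linear activation
y = softmax(W2 @ h + b2)         # second layer + softmax
print(y, y.sum())                # probability-like outputs summing to 1
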
  18. Neural network

    Biologically inspired representation: layers of neurons with weights and biases (W1, b1), (W2, b2) and activation functions such as ReLU, sigmoid, or tanh.

  19. Neural network

    Biologically inspired representation: layers of neurons with weights and biases (W1, b1), (W2, b2) and activation functions such as ReLU, sigmoid, or tanh. The weights and biases are learned!

  20. Neural network

    Biologically inspired representation: layers of neurons with weights and biases (W1, b1), (W2, b2) and activation functions such as ReLU, sigmoid, or tanh. The weights and biases are learned! Hyperparameters to tune: how many hidden layers? How many neurons per layer? Which activation? Which regularization?
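
    The same kind of fully connected network in Keras, with the hyperparameter choices made explicit (two hidden layers, 64 neurons each, ReLU activations, dropout as regularization; all values are illustrative):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(784,)),  # hidden layer 1
    layers.Dropout(0.5),                                      # regularization
    layers.Dense(64, activation="relu"),                      # hidden layer 2
    layers.Dense(10, activation="softmax"),                   # output layer
])
model.summary()   # the weights and biases listed here are what gets learned
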
  21. Convolutional Neural Networks

    Intuition: local information matters most. Convolutional filters (aka kernels) slide over the layers and extract information locally, but with an overall invariance.

  22. Convolutional Neural Networks

    Intuition: local information matters most. Convolutional filters (aka kernels) slide over the layers and extract information locally, but with an overall invariance. New hyperparameters to tune: filter size? Stride? Padding? How many filters?
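
    In Keras, those new hyperparameters appear directly as arguments of the convolutional layer (the filter count, kernel size, stride and padding below are illustrative values):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, kernel_size=(3, 3), strides=1, padding="same",
                  activation="relu", input_shape=(224, 224, 3)),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.summary()
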
  23. A deep network

    Millions of parameters! Able to model everything... if you manage to optimize it!
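
    To get a feeling for the scale, a classic deep network such as VGG16 (available in keras.applications) has on the order of a hundred million parameters:

from tensorflow.keras.applications import VGG16

model = VGG16(weights=None)                     # build the architecture only
print(f"{model.count_params():,} parameters")   # roughly 138 million
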
  24. A data revolution?

    Cause and effect, before and after: business expertise and data (before) vs. data and data expertise (after).
  25. Data management Training dataset

    Raster: BD-Ortho (PHR, 50 cm), Haute-Garonne, 328 images of 10000x10000 px, RGB. Vector: OpenStreetMap buildings, ~1M polygons. Goal: create 224x224 samples (image / building mask).

  26. Data management Training dataset

    Raster: BD-Ortho (PHR, 50 cm), Haute-Garonne, 328 images of 10000x10000 px, RGB. Vector: OpenStreetMap buildings, ~1M polygons. Goal: create 224x224 samples (image / building mask), plus checks and filtering, data augmentation, and class balancing.
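
    A hypothetical sketch of this sample-creation step, assuming the ortho image and the rasterized building mask are already loaded as numpy arrays (the function names, filtering threshold and flip-based augmentation are illustrative assumptions, not the actual production code):

import numpy as np

def extract_patches(image, mask, size=224, keep_ratio=0.05):
    """Cut aligned (image, building-mask) patches; drop patches whose mask is
    almost empty, as a crude class-balance filter."""
    h, w = mask.shape
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            m = mask[i:i + size, j:j + size]
            if m.mean() >= keep_ratio:
                yield image[i:i + size, j:j + size], m

def augment(img, m, rng=np.random.default_rng()):
    """Basic data augmentation: random horizontal and vertical flips."""
    if rng.random() < 0.5:
        img, m = img[:, ::-1], m[:, ::-1]
    if rng.random() < 0.5:
        img, m = img[::-1, :], m[::-1, :]
    return img, m
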
  27. Framework

    To do (the 1st time vs. the 2nd time): network research, training flow, data management. Many frameworks, most of them with a Python API. Our choice: … + Keras.
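
    A minimal Keras training flow, with random arrays standing in for the real 224x224 image/mask samples (everything below is illustrative):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

x_train = np.random.rand(8, 224, 224, 3).astype("float32")            # images
y_train = (np.random.rand(8, 224, 224, 1) > 0.5).astype("float32")    # masks

model = keras.Sequential([
    layers.Conv2D(8, 3, padding="same", activation="relu",
                  input_shape=(224, 224, 3)),
    layers.Conv2D(1, 1, activation="sigmoid"),   # per-pixel building probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=4, epochs=1)
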
  28. Network architecture

    Many existing structures, much ongoing research: http://www.asimovinstitute.org/neural-network-zoo/. The network structure depends on: the task to solve (segmentation, classification, etc.), the amount of training data, and GPU/CPU constraints for training and inference → operating cost.
  29. Network architecture

    Our choice: the U-Net architecture. Image → down-sampling (convolutions) → features → up-sampling → segmentation.
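
    A deliberately tiny U-Net-style sketch in Keras: convolutional down-sampling, up-sampling and a skip connection (depths and filter counts are illustrative, far smaller than a real U-Net):

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(224, 224, 3))

c1 = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
p1 = layers.MaxPooling2D(2)(c1)                         # down-sampling

c2 = layers.Conv2D(32, 3, padding="same", activation="relu")(p1)    # features

u1 = layers.UpSampling2D(2)(c2)                         # up-sampling
u1 = layers.concatenate([u1, c1])                       # skip connection
c3 = layers.Conv2D(16, 3, padding="same", activation="relu")(u1)

outputs = layers.Conv2D(1, 1, activation="sigmoid")(c3)   # building mask
model = keras.Model(inputs, outputs)
model.summary()
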
  30. Summary

    Construction of neural networks: requires strong know-how and experience. Deep learning solves complicated problems, but with complicated models. Complicated models are complicated to train: start simple and increase complexity.
  31. Summary

    Construction of neural networks: requires strong know-how and experience. Deep learning solves complicated problems, but with complicated models. Complicated models are complicated to train: start simple and increase complexity. Network learning: frameworks help a lot, but many tricks remain to be implemented; it requires strong methodological know-how. The learning rate is probably the most important parameter to tune.
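
    In Keras, for instance, the learning rate is set directly on the optimizer, and a callback such as ReduceLROnPlateau can lower it automatically when the validation loss stops improving (the values are illustrative):

from tensorflow import keras

optimizer = keras.optimizers.Adam(learning_rate=1e-3)
lr_callback = keras.callbacks.ReduceLROnPlateau(monitor="val_loss",
                                                factor=0.5, patience=5)
# Used as: model.compile(optimizer=optimizer, ...) and
#          model.fit(..., callbacks=[lr_callback])
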
  32. Summary

    Construction of neural networks: requires strong know-how and experience. Deep learning solves complicated problems, but with complicated models. Complicated models are complicated to train: start simple and increase complexity. Network learning: frameworks help a lot, but many tricks remain to be implemented; it requires strong methodological know-how. The learning rate is probably the most important parameter to tune. Datasets: data management can be a mess! Mind class balance and dataset reliability. Data augmentation is a must.
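
    Two common Keras-level answers to these dataset issues, as a sketch (the weights and augmentation parameters are illustrative): per-class loss weighting and on-the-fly data augmentation.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Class balance: weight the loss so errors on the rare class cost more.
class_weight = {0: 1.0, 1: 5.0}   # passed as model.fit(..., class_weight=class_weight)

# Data augmentation: random geometric perturbations of the training images.
augmenter = ImageDataGenerator(rotation_range=10,
                               horizontal_flip=True,
                               vertical_flip=True,
                               zoom_range=0.1)
# Used as: model.fit(augmenter.flow(x_train, y_train, batch_size=16), ...)
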
  33. Activities / Building Footprint Extraction

    Internal research project. Building footprints from airborne and Pléiades imagery, using the OCS-GE and OSM databases. Robust and solid results: 98% correct classification.
  34. Activities / Ship Detection

    Airbus DS. SPOT 6/7 imagery + Pléiades from Airbus OneAtlas. Detection aiming at high recall before validation by experts. Key challenges: high size variability, haze, rough seas.
  35. Activities / Real-time Object Detection

    Ongoing work. Key challenges: computation time, bandwidth issues, network architecture optimization.
  36. Offline Handwriting Recognition

    Challenges: the input is a variable-sized two-dimensional image; the output is a variable-sized sequence of characters; the cursive nature of handwriting makes a prior segmentation into characters difficult. Source: http://www.tbluche.com/files/MeetupSaoPaulo2017.pdf
  37. Handwriting Recognition: Historical System

    Input image → preprocessing → sliding window → feature extraction → Hidden Markov Models (emission model = Gaussian mixtures, transition models = states / characters) → vocabulary → language model. Source: http://www.tbluche.com/files/MeetupSaoPaulo2017.pdf
  38. Handwriting Recognition: State of the Art

    Input line / paragraph image, preprocessing, sliding window, feature extraction, Hidden Markov Models (emission model = Gaussian mixtures, transition models = states / characters), vocabulary, language model: deep learning now replaces the core of this pipeline. Graves, A., & Schmidhuber, J. (2009). Offline handwriting recognition with multidimensional recurrent neural networks. In Advances in Neural Information Processing Systems (pp. 545-552). Source: http://www.tbluche.com/files/MeetupSaoPaulo2017.pdf
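
    A generic deep-learning recipe for this task (not the MDLSTM network of Graves & Schmidhuber, and not Magellium's model): convolutional features, a bidirectional recurrent layer over the width axis, and a character-probability output that would be trained with a CTC loss. Sizes are illustrative, and the image width is fixed here for simplicity, whereas real systems handle variable widths.

from tensorflow import keras
from tensorflow.keras import layers

n_chars = 80                                 # alphabet size (+1 for the CTC blank)
inputs = keras.Input(shape=(32, 128, 1))     # text-line image: height 32, width 128

x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D((2, 2))(x)           # -> 16 x 64
x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
x = layers.MaxPooling2D((2, 2))(x)           # -> 8 x 32

x = layers.Permute((2, 1, 3))(x)             # make the width axis the time axis
x = layers.Reshape((32, 8 * 64))(x)          # one feature vector per time step
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
outputs = layers.Dense(n_chars + 1, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.summary()
# Training would pair this output with a CTC loss (e.g. keras.backend.ctc_batch_cost),
# so no prior segmentation into characters is needed.
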
  39. Handwritten Digit Recognition

    SHOM has several tens of thousands of bathymetric charts, each with an average of several thousand depth readings. Planned completion of manual digitization: 10-20 years. The challenge was to create ergonomic production software for digitizing the survey values.
  40. Handwritten Digit Recognition > Convolutional Network

    Digit recognition in deep learning is merely a sanity-check use case: it is trivial to get >99% accuracy on common datasets (MNIST, SD19...). But this is handwriting from previous centuries, with heterogeneous handwritten digits (quill pen, lettering stencil...) and a small training set (~2000 examples). Hence: training on exogenous data and fine-tuning on SHOM's dataset.
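
    A sketch of that "train on exogenous data, then fine-tune on the small target set" idea in Keras (the architecture, the freezing strategy and the dataset names are illustrative assumptions, not the actual SHOM pipeline):

from tensorflow import keras
from tensorflow.keras import layers

# Convolutional feature extractor, first trained on a large digit dataset.
base = keras.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu"),
    layers.Flatten(),
], name="conv_base")

pretrain = keras.Sequential([base, layers.Dense(10, activation="softmax")])
pretrain.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# pretrain.fit(mnist_images, mnist_labels, ...)    # large exogenous dataset

base.trainable = False   # keep the learned features, re-train only the new head
finetune = keras.Sequential([base, layers.Dense(10, activation="softmax")])
finetune.compile(optimizer=keras.optimizers.Adam(1e-4),
                 loss="sparse_categorical_crossentropy")
# finetune.fit(shom_images, shom_labels, ...)      # ~2000 domain-specific examples
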
  41. Offline Handwritten Signature Verification

    Samples from the GPDS-960 dataset: each row contains three genuine signatures from the same user and a skilled forgery. Source: Hafemann et al., http://arxiv.org/abs/1705.05787
  42. Offline Handwritten Signature Verification

    Goal: learn to identify signature forgeries. How-to: learn signature features (CNN), then learn to distinguish genuine signatures from forgeries (carefully crafted loss function). State of the art: 1.72 - 4.64% EER (Equal Error Rate, the point where False Acceptance Rate = False Rejection Rate). Source: Hafemann et al., http://arxiv.org/abs/1705.05787
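
    A generic sketch of such a verification setup: a shared CNN embeds two signature images, and a distance-based (contrastive-style) loss pushes genuine pairs together and forgeries apart. This is one common formulation, not necessarily the loss used by Hafemann et al.; the image size and architecture are illustrative.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Shared CNN that maps a signature image to a 128-d embedding.
embed = keras.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(150, 220, 1)),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(128),
])

a = keras.Input(shape=(150, 220, 1))   # reference signature
b = keras.Input(shape=(150, 220, 1))   # questioned signature
dist = layers.Lambda(
    lambda t: tf.norm(t[0] - t[1], axis=1, keepdims=True))([embed(a), embed(b)])
model = keras.Model([a, b], dist)

def contrastive_loss(y_true, d, margin=1.0):
    # y_true = 1 for a genuine pair, 0 for a forgery pair.
    y_true = tf.cast(y_true, d.dtype)
    return tf.reduce_mean(y_true * tf.square(d) +
                          (1.0 - y_true) * tf.square(tf.maximum(margin - d, 0.0)))

model.compile(optimizer="adam", loss=contrastive_loss)
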
  43. PUTTING KNOWLEDGE ON THE MAP Sébastien Bosch Unité GEO [email protected]

    François De Vieilleville Unité EO [email protected] Thomas Ristorcelli Unité IA [email protected]