Pycon ZA
October 10, 2019

Deep Neural Networks for Video Applications by Alex Conway

Most CCTV video cameras exist as a sort of time machine for insurance purposes. Deep neural networks make it easy to convert video into actionable data, which can be used to trigger real-time anomaly alerts and optimize complex business processes. In addition to commercial applications, deep learning can be used to analyze large amounts of video recorded from the point of view of animals, to study complex behavior patterns that would otherwise be impossible to analyze. This talk will present some theory of deep neural networks for video applications, as well as academic research and several applied real-world industrial examples, with code examples in Python.

Transcript

  1. DEEP NEURAL NETWORKS for Video Applications. Alex Conway, alex @ numberboost.com. PYCONZA Keynote 2019. Neither confidential nor proprietary - please distribute ;)
  2. 2016 MultiChoice Innovation Competition 1st Prize Winners; 2017 Mercedes-Benz Innovation Competition 1st Prize Winners; 2018 Lloyd’s Register Innovation Competition 1st Prize Winners; 2019 NTT & Dimension Data Innovation Competition 1st Prize Winners
  3. ORIGINAL FILM: Rear Window (1954). PIX2PIX MODEL OUTPUT: Fully Automated. RE-MASTERED BY HAND: Painstakingly. https://hackernoon.com/remastering-classic-films-in-tensorflow-with-pix2pix-f4d551fa0503
  4. NEURAL NETWORKS: A set of neurons with randomly initialized weights and non-linear activation functions, connected in a network, whose weights are optimized (learned) using training data to minimize prediction error
  5. (DEEP) NEURAL NETWORKS: inputs, hidden layer 1, hidden layer 2, hidden layer 3, outputs. Note: Outputs of one layer are inputs into the next layer. This (non-convolutional) architecture is called a “multi-layered perceptron”
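
A minimal sketch of the multi-layered perceptron described on slides 4 and 5, in plain Python. The layer sizes, the ReLU activation, the weight range and the helper names (`init_layer`, `dense`, `forward`) are assumptions made for illustration and are not taken from the talk; no training happens here, only the forward pass.

```python
import random


def relu(x):
    """A common non-linear activation: negative values become zero."""
    return max(0.0, x)


def init_layer(n_in, n_out, rng):
    """Randomly initialized weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases, activation):
    """One layer: a weighted sum per neuron, followed by the activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Outputs of one layer are the inputs into the next layer."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases, relu)
    return activations


rng = random.Random(0)
sizes = [4, 5, 5, 5, 2]  # inputs, three hidden layers, outputs (illustrative sizes)
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
outputs = forward([0.5, -1.0, 2.0, 0.0], layers)
print(len(outputs))
```

With untrained random weights the outputs are meaningless; learning consists of adjusting `weights` and `biases` so that the outputs match the training data.
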
  6. HOW DOES A NEURAL NETWORK LEARN? New weight = Old weight - (Learning rate) x (“How much error increases when we increase this weight”)
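
The update rule on this slide can be sketched for a single weight. The squared-error function, the learning rate of 0.1 and the toy data point below are assumptions chosen to keep the example small; "how much error increases when we increase this weight" is the derivative of the error with respect to the weight.

```python
def error(weight, x, target):
    """Squared prediction error of a one-weight model: prediction = weight * x."""
    return (weight * x - target) ** 2


def error_gradient(weight, x, target):
    """How much the error increases when we increase the weight (d error / d weight)."""
    return 2.0 * (weight * x - target) * x


def update(weight, learning_rate, gradient):
    """New weight = old weight - learning rate x gradient."""
    return weight - learning_rate * gradient


weight = 0.0
learning_rate = 0.1
x, target = 2.0, 6.0  # toy data point; the error is zero at weight = 3.0

for step in range(50):
    weight = update(weight, learning_rate, error_gradient(weight, x, target))

print(round(weight, 4))
```

After the loop the weight ends up very close to 3.0, the value at which the prediction matches the target. A real network applies the same rule to every weight at once, using gradients computed by backpropagation.
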
  7. Scalar: 1. Vector: 1, 3, 3, 7, … Matrix: [[1, 2, 3] [3, 2, 1] [3, 4, 5] [7, 8, 9] …]. Tensor: several such matrices stacked: [[1, 2, 3] [3, 2, 1] [3, 4, 5] [7, 8, 9] …] [[1, 2, 3] [3, 2, 1] [3, 4, 5] [7, 8, 9] …]
  8. Image tensor: 500 x 500 x 3 = 750’000 values. 60 second video at 10 FPS tensor: 500 x 500 x 3 x 10 x 60 = 450’000’000 values
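
The arithmetic on this slide can be checked in a few lines. The axis order (frames first) and the one-byte-per-value storage estimate are assumptions added here; the slide only gives the dimensions and the totals.

```python
from math import prod

image_shape = (500, 500, 3)  # height, width, colour channels
fps, seconds = 10, 60
video_shape = (fps * seconds, *image_shape)  # frames, height, width, channels

image_values = prod(image_shape)
video_values = prod(video_shape)

# Assumption: one byte per value (8-bit colour), uncompressed.
video_megabytes = video_values / 1_000_000

print(image_values, video_values, video_megabytes)
```

Under that assumption one minute of low-frame-rate video already holds about 450 MB of raw values, which is why video models need to be careful about how much input they process at once.
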
  10. Zeiler, M.D. and Fergus, R., 2014, September. Visualizing and understanding convolutional networks. In European conference on computer vision (pp. 818-833).
  11. Zeiler, M.D. and Fergus, R., 2014, September. Visualizing and understanding convolutional networks. In European conference on computer vision (pp. 818-833).
  12. IMAGENET TOP-5 ERROR RATE: Traditional Image Processing Methods; AlexNet (8 Layers); ZFNet (8 Layers); GoogLeNet (22 Layers); ResNet (152 Layers); SENet (Ensemble); TSNet (Ensemble)
  14. Fine-tuning A CNN To Solve A New Problem: 96.3% accuracy in under 2 minutes for classifying products into categories (WITH ONLY 3467 TRAINING IMAGES!!1!)
  15. CNN … P(A) = 0.005, P(B) = 0.002, P(C) = 0.98, P(9) = 0.001, P(0) = 0.03
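
One way to turn the per-class probabilities on this slide into a prediction is to pick the most probable class. The sketch below uses the numbers from the slide; the `predict` helper and the `min_confidence` threshold are hypothetical additions for illustration and are not part of the talk.

```python
# Per-class probabilities as shown on the slide (classes hidden behind "…" are omitted).
probabilities = {"A": 0.005, "B": 0.002, "C": 0.98, "9": 0.001, "0": 0.03}


def predict(probabilities, min_confidence):
    """Return the most probable label, or None if the model is not confident enough."""
    label = max(probabilities, key=probabilities.get)
    if probabilities[label] < min_confidence:
        return None
    return label


prediction = predict(probabilities, min_confidence=0.5)
print(prediction)
```

Returning `None` below the threshold lets downstream code skip frames where the classifier is unsure instead of acting on a weak guess.
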
  16. “CENTROID TRACKING”: For each object with an ID in frame t, compute the distance to the centroid of every object in frame t + 1 and assign the same ID provided the distance is less than a threshold, else assign a new ID. https://www.pyimagesearch.com/2018/07/23/simple-object-tracking-with-opencv/
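
A minimal sketch of the centroid tracking rule on this slide, using only the standard library. It is not the code from the linked article: the function names, the closest-pair-first matching order and the decision to drop tracks that find no match are assumptions made to keep the example short.

```python
import math


def centroid(box):
    """Centre point of a bounding box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def update_tracks(tracks, detections, next_id, max_distance):
    """Carry object IDs from frame t over to frame t + 1.

    tracks:     {object_id: (x, y)} centroids in frame t
    detections: [(x, y), ...] centroids detected in frame t + 1
    Returns the tracks for frame t + 1 and the next unused ID.
    """
    # Distance from every tracked object to every new detection, closest first.
    pairs = sorted(
        (math.dist(position, detection), object_id, index)
        for object_id, position in tracks.items()
        for index, detection in enumerate(detections)
    )

    new_tracks = {}
    used_ids, used_detections = set(), set()
    for distance, object_id, index in pairs:
        if distance >= max_distance:
            break  # all remaining pairs are even further apart
        if object_id in used_ids or index in used_detections:
            continue  # each ID and each detection is matched at most once
        new_tracks[object_id] = detections[index]
        used_ids.add(object_id)
        used_detections.add(index)

    # Detections with no nearby existing object get a new ID.
    for index, detection in enumerate(detections):
        if index not in used_detections:
            new_tracks[next_id] = detection
            next_id += 1

    return new_tracks, next_id


frame_t = {0: (10.0, 10.0), 1: (100.0, 50.0)}
frame_t1 = [(104.0, 53.0), (12.0, 11.0), centroid((290, 290, 310, 310))]
tracks, next_id = update_tracks(frame_t, frame_t1, next_id=2, max_distance=20.0)
print(tracks, next_id)
```

In this usage example objects 0 and 1 keep their IDs because each moved only a few pixels, while the detection at (300, 300) is far from both and becomes object 2. The detections would normally come from an object detector run on each frame.
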
  18. Few-Shot Adversarial Learning of Realistic Neural Talking Head Models. Network 1: CNN embedder compresses faces & landmarks to vector
  19. Few-Shot Adversarial Learning of Realistic Neural Talking Head Models. Network 2: Generator takes landmarks and synthesizes photo
  20. Few-Shot Adversarial Learning of Realistic Neural Talking Head Models. Network 3: Discriminator learns to tell apart real and synthesized photos
  21. GREAT FREE RESOURCES
    Deep Learning Indaba: http://www.deeplearningindaba.com
    Jeremy Howard & Rachel Thomas: http://course.fast.ai
    Andrej Karpathy’s Class on Computer Vision: http://cs231n.github.io
    Richard Socher’s Class on NLP (great RNN resource): http://web.stanford.edu/class/cs224n/
    Keras docs: https://keras.io/