Deep Learning Introduction with KotlinDL

Alexey Zinoviev

February 23, 2021

  1. Introduction to Deep Learning
    with KotlinDL
    Alexey Zinovyev, ML Engineer, Apache Ignite PMC
    JetBrains

  2. Bio
    1. Java & Kotlin developer
    2. Distributed ML enthusiast
    3. Apache Ignite PMC
    4. TensorFlow Contributor
    5. ML engineer at JetBrains
    6. Happy father and husband
    7. https://github.com/zaleslaw

  3. Motivation
    1. Kotlin has set a course to become a convenient language for data science
    2. There is no modern data science without neural networks
    3. All deep learning frameworks are good enough at image recognition
    4. Convolutional neural networks (CNNs) are the gold standard for image
    recognition
    5. Training, Transfer Learning, and Inference are now available for different CNN
    architectures in Kotlin with the KotlinDL library

  4. Agenda
    1. Neural Network Intro
    2. Deep Learning
    3. Required and optional math knowledge
    4. Primitives or building blocks
    a. Activation Functions
    b. Loss Functions
    c. Initializers
    d. Optimizers
    e. Layers
    5. 5 Major Scientific breakthroughs in DL
    6. KotlinDL Demo

  5.–10. Some basic terms
    1. Model
    2. Inference
    3. Training
    4. Transfer Learning
    5. Evaluation
    6. Train/validation/test datasets

  11. Neural Network Intro

  12. The life of one neuron

  13. The place of one neuron in its family

  14. Forward propagation

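    In standard notation (an addition for the reader, not from the slide): for a
    dense layer l, the forward pass computes

    $$a^{(l)} = \sigma\left(W^{(l)} a^{(l-1)} + b^{(l)}\right)$$

    where W^{(l)} and b^{(l)} are the layer's weights and bias, a^{(l-1)} is the
    previous layer's output (with a^{(0)} = x, the input), and \sigma is the
    activation function.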

  15. Cat/Dog neural network architecture in Kotlin

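    The slide shows the model definition in code. A minimal sketch in the same
    spirit, written against the KotlinDL 0.1/0.2 Sequential API (package paths,
    image size, and layer parameters here are illustrative assumptions and may
    differ between versions):

    import org.jetbrains.kotlinx.dl.api.core.Sequential
    import org.jetbrains.kotlinx.dl.api.core.activation.Activations
    import org.jetbrains.kotlinx.dl.api.core.layer.convolutional.Conv2D
    import org.jetbrains.kotlinx.dl.api.core.layer.convolutional.ConvPadding
    import org.jetbrains.kotlinx.dl.api.core.layer.core.Dense
    import org.jetbrains.kotlinx.dl.api.core.layer.core.Input
    import org.jetbrains.kotlinx.dl.api.core.layer.pooling.MaxPool2D
    import org.jetbrains.kotlinx.dl.api.core.layer.reshaping.Flatten

    // A small VGG-style cat/dog classifier: Conv2D + MaxPool2D blocks,
    // then Flatten and Dense layers ending in two output neurons.
    val model = Sequential.of(
        Input(64, 64, 3),                            // 64x64 RGB image
        Conv2D(filters = 32, kernelSize = longArrayOf(3, 3),
               activation = Activations.Relu, padding = ConvPadding.SAME),
        MaxPool2D(poolSize = intArrayOf(1, 2, 2, 1),
                  strides = intArrayOf(1, 2, 2, 1)),
        Conv2D(filters = 64, kernelSize = longArrayOf(3, 3),
               activation = Activations.Relu, padding = ConvPadding.SAME),
        MaxPool2D(poolSize = intArrayOf(1, 2, 2, 1),
                  strides = intArrayOf(1, 2, 2, 1)),
        Flatten(),
        Dense(outputSize = 128, activation = Activations.Relu),
        Dense(outputSize = 2, activation = Activations.Linear)  // cat vs. dog logits
    )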

  16. MNIST example

  17. MNIST Subset

  18. Backward propagation

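    In the same notation as above (not from the slide): backpropagation applies
    the chain rule layer by layer. With z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}:

    $$\delta^{(L)} = \nabla_{a} L \odot \sigma'(z^{(L)}), \qquad
    \delta^{(l)} = \left( (W^{(l+1)})^{T} \delta^{(l+1)} \right) \odot \sigma'(z^{(l)}), \qquad
    \frac{\partial L}{\partial W^{(l)}} = \delta^{(l)} \left( a^{(l-1)} \right)^{T}$$

    The error signal \delta flows backward from the output layer L, and each
    layer's gradient is local: it needs only its own activation and the error
    from the layer above.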

  19. Full training with some maths

  20. Deep Learning is just...

  21. A way to approximate an unknown function

  22. Math

  23. Need to keep in mind

    Some trigonometry, exponentials, and logarithms;

    Linear Algebra: vectors, vector spaces;

    Linear Algebra: inverse and transposed matrices, matrix decompositions,
    eigenvectors, the Kronecker–Capelli theorem;

    Mathematical Analysis: continuous, monotonic, differentiable functions;

    Mathematical Analysis: derivatives, partial derivatives, the Jacobian;

    Methods of one-dimensional and multidimensional optimization;

    Gradient Descent and all its variations;

    A background in optimization methods and convex analysis won't hurt either

  24. N-dimensional space in theory

  25. Matrix multiplication (friendly reminder)

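    A worked 2×2 example (an addition, not from the slide): the entry in row i,
    column j of AB is the dot product of row i of A with column j of B:

    $$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}
    \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} =
    \begin{pmatrix} 1 \cdot 5 + 2 \cdot 7 & 1 \cdot 6 + 2 \cdot 8 \\
                    3 \cdot 5 + 4 \cdot 7 & 3 \cdot 6 + 4 \cdot 8 \end{pmatrix} =
    \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}$$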

  26. Loss Functions

  27. Loss Functions

    Each loss function can be reused as a metric

    A loss function should be differentiable

    Not every metric can be a loss function (a metric may not have a derivative)

    Loss functions can be very complex

    Loss functions differ between regression and classification tasks

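    Two standard examples (not from the slide): mean squared error for
    regression and cross-entropy for classification:

    $$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad
    \mathrm{CE} = - \frac{1}{n} \sum_{i=1}^{n} \sum_{c=1}^{C} y_{i,c} \log \hat{y}_{i,c}$$

    Both are differentiable in the model's predictions \hat{y}, which is
    exactly what makes them usable as loss functions.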

  28. An optimization problem [Loss Optimization Problem]

  29. Most widely used:

  30. Gradients

  31. Gradient Descent

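    The update rule in standard notation (not from the slide): with learning
    rate \eta, each step moves the parameters \theta against the gradient of
    the loss L:

    $$\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t)$$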

  32. Optimizers

  33. Optimizers: SGD with memory

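    The "memory" is a velocity term. In the classical momentum formulation
    (standard notation, not from the slide), with momentum coefficient \gamma:

    $$v_{t+1} = \gamma v_t + \eta \, \nabla_{\theta} L(\theta_t), \qquad
    \theta_{t+1} = \theta_t - v_{t+1}$$

    Past gradients keep contributing to the step, which damps oscillations and
    speeds movement along consistent directions.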

  34. Optimizers

    SGD

    SGD with Momentum

    Adam

    RMSProp

    AdaDelta
    ...

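    All of these are available as classes in KotlinDL. A hedged sketch
    (constructor parameter names follow the 0.1/0.2 API and may differ between
    versions):

    import org.jetbrains.kotlinx.dl.api.core.optimizer.Adam
    import org.jetbrains.kotlinx.dl.api.core.optimizer.SGD

    val plainSgd = SGD(learningRate = 0.1f)   // vanilla stochastic gradient descent
    val adam = Adam(learningRate = 0.001f)    // adaptive moments, a common default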

  35. As a result, it converges faster

  36. Wire it together with KotlinDL

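    A minimal end-to-end sketch, closely following the KotlinDL README examples
    (the mnist() helper and package layout are from KotlinDL 0.2 and may differ
    in other versions):

    import org.jetbrains.kotlinx.dl.api.core.Sequential
    import org.jetbrains.kotlinx.dl.api.core.activation.Activations
    import org.jetbrains.kotlinx.dl.api.core.layer.core.Dense
    import org.jetbrains.kotlinx.dl.api.core.layer.core.Input
    import org.jetbrains.kotlinx.dl.api.core.layer.reshaping.Flatten
    import org.jetbrains.kotlinx.dl.api.core.loss.Losses
    import org.jetbrains.kotlinx.dl.api.core.metric.Metrics
    import org.jetbrains.kotlinx.dl.api.core.optimizer.Adam
    import org.jetbrains.kotlinx.dl.dataset.mnist

    fun main() {
        val (train, test) = mnist()           // downloads and wraps MNIST

        val model = Sequential.of(
            Input(28, 28, 1),
            Flatten(),
            Dense(outputSize = 300),
            Dense(outputSize = 10, activation = Activations.Linear)  // class logits
        )

        model.use {
            it.compile(
                optimizer = Adam(),
                loss = Losses.SOFT_MAX_CROSS_ENTROPY_WITH_LOGITS,
                metric = Metrics.ACCURACY
            )
            it.fit(dataset = train, epochs = 10, batchSize = 100)
            val accuracy = it.evaluate(dataset = test, batchSize = 100)
                .metrics[Metrics.ACCURACY]
            println("Accuracy: $accuracy")
        }
    }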

  37. Activation Functions

  38. Activation functions

    Activation functions transform the outputs coming out of each layer of a
    neural network.

    They are required to add non-linearity

    They should have a derivative (to be used in backward propagation)

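    For reference, the four activations shown on the next slides (standard
    definitions, not from the slide):

    $$\mathrm{linear}(x) = x, \qquad \sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
    \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \qquad
    \mathrm{ReLU}(x) = \max(0, x)$$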

  39. Linear

  40. Sigmoid

  41. Tanh

  42. ReLU

  43. Hm..

  44. Initializers

  45. Vanishing gradient problem

  46. Exploding gradient problem

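    A one-line intuition for both problems (standard reasoning, not from the
    slides): by the chain rule, the gradient reaching layer l is a product of
    per-layer factors,

    $$\frac{\partial L}{\partial a^{(l)}} =
    \frac{\partial L}{\partial a^{(L)}} \prod_{k=l+1}^{L} \frac{\partial a^{(k)}}{\partial a^{(k-1)}}$$

    If the factors are consistently smaller than 1 the product shrinks toward
    zero (vanishing); if consistently larger than 1 it blows up (exploding).
    Careful initialization keeps these factors close to 1.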

  47. Initializers

    Zeros

    Ones

    Random [Uniform or Normal]

    Xavier / Glorot

    He
    ...

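    In KotlinDL, initializers are passed per layer. A hedged sketch (class and
    parameter names per the 0.1/0.2 API; they may differ between versions):

    import org.jetbrains.kotlinx.dl.api.core.activation.Activations
    import org.jetbrains.kotlinx.dl.api.core.initializer.HeNormal
    import org.jetbrains.kotlinx.dl.api.core.initializer.Zeros
    import org.jetbrains.kotlinx.dl.api.core.layer.core.Dense

    val dense = Dense(
        outputSize = 128,
        activation = Activations.Relu,
        kernelInitializer = HeNormal(),   // a good default for ReLU layers
        biasInitializer = Zeros()
    )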

  48. Layers

  49. Dense

  50. Cat/Dog neural network architecture in Kotlin

  51. Conv2d: filters

  52. Output with filters

  53. Pooling (subsampling layer)

  54. Dropout

  55. Everything now available in Kotlin

  56.–59. How does it work?

  60. KotlinDL Demo

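    A minimal inference sketch in the spirit of the demo, close to the KotlinDL
    README examples (class and method names per the 0.1/0.2 API; the model
    directory and blank test image are placeholders):

    import java.io.File
    import org.jetbrains.kotlinx.dl.api.inference.TensorFlowInferenceModel

    fun main() {
        TensorFlowInferenceModel.load(File("model/my_model")).use {
            it.reshape(28, 28, 1)               // input shape of the saved model
            val image = FloatArray(28 * 28)     // placeholder 28x28 grayscale image
            val predictedLabel = it.predict(image)
            println("Predicted label: $predictedLabel")
        }
    }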

  61. KotlinDL Limitations
    1. Currently useful only for image recognition tasks and regression ML
    2. Only a limited set of layers is supported
    3. Only a tiny number of preprocessing methods
    4. Only VGG-like architectures are supported
    5. No Android support

  62. KotlinDL Roadmap
    1. New models: Inception, ResNet, DenseNet
    2. Rich Dataset API
    3. GPU settings
    4. Maven Central Availability
    5. Functional API
    6. New layers: BatchNorm, Add, Concatenate, DepthwiseConv2d
    7. Regularization for layers
    8. New metrics framework
    9. Conversion to TFLite (for mobile devices)
    10. ONNX support
    11. ML algorithms

  63. Useful links
    1. https://github.com/JetBrains/KotlinDL
    2. https://kotlinlang.org/docs/data-science-overview.html#kotlin-libraries
    3. #deeplearning channel on the Kotlin Slack (join it if you haven't yet)
    4. Feel free to join the discussions on GitHub
    5. Follow @zaleslaw and @KotlinForData on Twitter

  64. The End

  65. An LSTM consists of LSTM “neurons”

  66. 5 Major Scientific breakthroughs

  67. 5 Major Scientific breakthroughs
    1. Non-linearity in the early 90's (made it possible to solve new problems)
    2. ReLU, the simplest and cheapest non-linearity (made it possible to converge
    on new tasks)
    3. Batch Normalization (helps convergence and speeds up training)
    4. Xavier/He Initialization (solves the vanishing/exploding gradient problem)
    5. Adam optimizer (gives better performance)
