
Deep Learning with PyTorch


Introduction to Machine and Deep Learning basics, and an overview of the main PyTorch features.

This deck is the introductory overview for the Deep Learning with PyTorch course @ FBK

Valerio Maggio

April 22, 2020


Transcript

1. Me. Pun who? Background in CS; PhD in Machine Learning for Software Engineering; current research: ML/DL for BioMedicine; wearing face masks since…
2. ML and Data Science. The Data Science Venn diagram, by Drew Conway. A (toy) Data Science pipeline: Data Loading → Preprocessing → Model Learning → API Interface. Adapted from “What about tests in Machine Learning projects?”, Sarah Diot-Girard, EuroSciPy 2019.
3. Machine Learning. “Machine learning is the science (and art) of programming computers so they can learn from data” (Aurélien Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow; source: bit.ly/ml-simple-definition). “(ML) focuses on teaching computers how to learn without the need to be programmed for specific tasks” (S. Pal & A. Gulli, Deep Learning with Keras).
4. Machine Learning. “Machine learning teaches machines how to carry out tasks by themselves. It is that simple. The complexity comes with the details” (Luis Pedro Coelho, Building Machine Learning Systems with Python). And that’s probably one of the reasons why you’re here :)
5. (Machine) Learning is about DATA. Data are one of the most important parts of an ML solution. Importance: Data >> Model? Learning by examples: data + algorithms. Data preparation is crucial!
6. BioMedicine: another data case? Contemporary Life Science is about data: recent advances in sequencing technologies and instruments (e.g. “bio-images”); huge datasets generated at an incredible pace; from human observation to data analysis; cheminformatics (drug discovery). Research impact → social and human impact.
7. Why Deep Learning, btw? A subset of ML with a very specific model: (Deep?) Neural Networks. State of the art today, though the theory dates back to the ’50s/’80s; what is (relatively) new: hardware acceleration to train, and learning structure + composability (2018/19).
8. Deep Learning. A multi-layer feed-forward neural network that starts with a fully connected input layer, followed by multiple hidden layers of non-linear transformations.
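As a sketch, such a network can be written in PyTorch as a stack of fully connected (Linear) layers with non-linearities in between; all layer sizes below are illustrative assumptions, not taken from the deck:

    import torch.nn as nn

    # A multi-layer feed-forward network: a fully connected input layer,
    # followed by hidden layers of non-linear transformations.
    model = nn.Sequential(
        nn.Linear(784, 256),  # fully connected input layer (sizes are assumptions)
        nn.ReLU(),            # non-linear transformation
        nn.Linear(256, 64),   # hidden layer
        nn.ReLU(),
        nn.Linear(64, 10),    # output layer
    )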
9. More details… Summary: a Neural Network is built from layers, each of which is: a matrix multiplication, then a bias addition, then a non-linearity. The values of the parameters W and b are learned for each layer using Back-Propagation.
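The same recipe, written out by hand for a single layer in PyTorch; the shapes and the random initialisation are illustrative assumptions:

    import torch

    x = torch.randn(32, 100)                      # a batch of 32 observations
    W = torch.randn(100, 50, requires_grad=True)  # weights W, to be learned
    b = torch.zeros(50, requires_grad=True)       # bias b, to be learned

    # One layer: matrix multiplication, add the bias, apply a non-linearity.
    h = torch.relu(x @ W + b)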
10. Machine Learning for dummies (a.k.a. ML explained to computer scientists). Note: I *am* a computer scientist. (Deep) Machine Learning = Matrix Multiplication + Random Number Generation (t ≅ 2k).
11. Supervised learning: (raw) data → features + labels (the supervision) → ML/DL Model Training → Trained Model; then (unseen) data → Test → Predictions.
12. Unsupervised learning: the same pipeline, without label supervision: (raw) data → features → ML/DL Model Training → Trained Model; then (unseen) data → Test → Similarities/likelihood.
13. Deep Supervised learning: (raw) data + labels (the supervision) → DL Model Training, with the features learned by the model itself → Trained Model; then (unseen) data → Test → Predictions.
14. Supervised training loop breakdown…
   • (raw) Data, a.k.a. Observations: input items about which we want to predict something. We usually denote an observation with x.
   • Labels, a.k.a. Targets (i.e. Ground Truth): the labels corresponding to observations; these are usually the things being predicted. Following standard ML/DL notation, we use y to refer to these.
   • Model, f(x) = ŷ: a mathematical expression or function that takes an observation x and predicts the value of its target label.
   • Predictions, a.k.a. Estimates: the values of the targets generated by the model, usually referred to as ŷ.
   • Parameters, a.k.a. Weights (in DL terminology): the parameters of the model; we refer to them using w.
   • Loss function L(y, ŷ): a function that compares how far off a prediction is from its target for observations in the training data, assigning a scalar real value called the loss. The lower the loss, the better the model is predicting.
   Source: D. Rao et al., Natural Language Processing with PyTorch, O’Reilly, 2019.
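A minimal sketch tying each of these terms to a line of PyTorch; the toy data, the linear model, the MSE loss, and the learning rate are illustrative assumptions, not taken from the deck:

    import torch
    import torch.nn as nn

    # Toy observations x and targets y (illustrative only).
    x = torch.randn(100, 3)    # observations x
    y = torch.randn(100, 1)    # labels / targets y (ground truth)

    model = nn.Linear(3, 1)    # the model f(x) = y_hat; its weights are the parameters w
    loss_fn = nn.MSELoss()     # the loss function L(y, y_hat)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(100):
        y_hat = model(x)            # predictions y_hat (estimates)
        loss = loss_fn(y_hat, y)    # scalar loss: how far off the predictions are
        optimizer.zero_grad()
        loss.backward()             # gradients of the loss w.r.t. the parameters
        optimizer.step()            # update the parameters to lower the loss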
15. (DL) Terms: everyone on the same page? (also ref: bit.ly/nvidia-dl-glossary)
   • Epochs
   • Batches and mini-batch learning
   • Parameters vs HyperParameters (e.g. weights vs layers)
   • Loss & Optimiser (e.g. Cross Entropy & SGD)
   • Transfer learning
   • Gradient & Backward Propagation
   • Tensor
   (Several of these terms appear in the mini-batch training sketch after this list.)
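A short sketch showing several of these terms in working PyTorch code: epochs, mini-batches (via DataLoader), Cross Entropy loss, and the SGD optimiser. The dataset shape, batch size, and learning rate are illustrative assumptions:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Toy classification data (illustrative): 256 samples, 10 features, 3 classes.
    X = torch.randn(256, 10)
    y = torch.randint(0, 3, (256,))
    loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)  # mini-batches

    model = torch.nn.Linear(10, 3)
    loss_fn = torch.nn.CrossEntropyLoss()                    # loss
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # optimiser

    for epoch in range(5):        # one epoch = one full pass over the dataset
        for xb, yb in loader:     # one mini-batch at a time
            loss = loss_fn(model(xb), yb)
            optimizer.zero_grad()
            loss.backward()       # gradient & backward propagation
            optimizer.step()

    # batch_size, lr and the number of epochs are hyperparameters;
    # the tensors inside `model` (weights and bias) are parameters.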
16. Python has its say: Machine Learning & Deep Learning. “There should be one -- and preferably only one -- obvious way to do it” (The Zen of Python).
17. Deep Learning Frameworks: Static Graph vs Dynamic Graph. [Diagram: the computational graph of a Linear (or Dense) model, σ(xᵀW + b): x, W and b feed a multiply node, an add node, then σ. Next to it, the model view: input x1 (epoch 1, batch 1) flows through layers fc1 … fc5 to the prediction y’, compared against y1 by the loss L.]
18. Deep Learning Frameworks: Static Graph vs Dynamic Graph. [Same diagram, one step later: input x2 (epoch 1, batch 2) flows through fc1 … fc5; with a dynamic graph the computational graph is rebuilt on every pass.]
19. Deep Learning Frameworks: Static Graph vs Dynamic Graph. Backwards and gradients calculation: Backprop. [Diagram: the same Linear graph traversed backwards; Autograd records the operations executed in the forward pass.]
20. Deep Learning Frameworks: Static Graph vs Dynamic Graph. Backwards and gradients calculation: Backprop with Autograd, record & replay. [Diagram: the recorded graph is replayed backwards to compute the gradients.]
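A minimal sketch of the dynamic-graph (“record & replay”) behaviour described above, using PyTorch’s autograd on the same σ(xᵀW + b) expression; the scalar values are illustrative assumptions:

    import torch

    # Define-by-run: the graph is recorded while the forward pass executes.
    x = torch.tensor(2.0, requires_grad=True)
    W = torch.tensor(3.0, requires_grad=True)
    b = torch.tensor(1.0, requires_grad=True)

    y = torch.sigmoid(x * W + b)  # forward pass: autograd records *, + and sigmoid
    y.backward()                  # replay the recorded graph backwards
    print(W.grad)                 # dy/dW, computed by back-propagation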
21. Deep Learning Course. Approach: Data Scientist: always prefer dev/practical aspects (tools & software); work on the full pipeline (e.g. data preparation); emphasis on the implementation. Perspective: Researcher: no off-the-shelf (so no “black-box”) solutions; references and further readings to know more.