
Deep Learning for Data Scientists

Neural networks have seen renewed interest from data scientists and machine learning researchers for their ability to accurately classify high-dimensional data, including images, sounds, and text. In this session we will discuss the fundamental algorithms behind neural networks, such as back-propagation and gradient descent. We will develop an intuition for how to train a deep neural network on large data sets. We will then use the algorithms we have developed to train a simple handwritten digit recognizer, and illustrate how to generalize the same technique to larger images using convolutional neural networks. In the second and final part of this presentation, we will show you how to apply the same algorithms using Keras and TensorFlow, Python libraries for deep learning on large datasets. Attendees will learn how to implement a simple neural network, monitor its training progress, and test its accuracy over time. Prior experience with Python and some basic algebra is a prerequisite.
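The Keras/TensorFlow digit recognizer described above might look roughly like the following minimal sketch, assuming Keras 2.x with its bundled MNIST loader; the layer sizes, optimizer, and epoch count here are illustrative guesses, not the presenter's actual code:

    # Minimal MNIST classifier sketch (assumes Keras 2.x with a TensorFlow backend)
    from keras.datasets import mnist
    from keras.models import Sequential
    from keras.layers import Dense, Flatten
    from keras.utils import to_categorical

    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0          # scale pixels to [0, 1]
    y_train, y_test = to_categorical(y_train), to_categorical(y_test)

    model = Sequential([
        Flatten(input_shape=(28, 28)),    # 28x28 image -> 784-vector
        Dense(128, activation='relu'),    # one hidden layer (size is an assumption)
        Dense(10, activation='softmax'),  # one output per digit class
    ])
    model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=5, validation_split=0.1)  # watch progress per epoch
    print(model.evaluate(x_test, y_test))                        # held-out loss and accuracy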

Breandan Considine

April 28, 2017

Transcript

  1. Early Speech Recognition • Requires lots of hand-crafted feature engineering • Poor results: >25% word error rate (WER) for HMM architectures
  2. Machine learning, for humans • Self-improvement • Language learning • Computer training • Special education • Reading comprehension • Content generation
  3.–8. What’s a Tensor? • A “tensor” is just an n-dimensional array • Useful for working with complex data • We use (tiny) tensors every day!
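For a concrete feel for what “n-dimensional array” means, here is a tiny sketch using NumPy; the library choice is an assumption, since the slides name none here:

    # Tensors of increasing rank, as plain NumPy arrays
    import numpy as np

    scalar = np.array(5)                   # rank 0: a single number
    vector = np.array([1, 2, 3])           # rank 1: a list of numbers
    matrix = np.array([[1, 2], [3, 4]])    # rank 2: a grid of numbers
    image  = np.zeros((28, 28, 3))         # rank 3: e.g. height x width x RGB

    for t in (scalar, vector, matrix, image):
        print(t.ndim, t.shape)             # rank and dimensions of each tensor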
  9. [Figure: the two output classes, 0 and 1]
  10.–11. Cool learning algorithm

      def classify(datapoint, weights):
          # weighted sum of the inputs, with a constant 1 prepended as the bias input
          prediction = sum(x * y for x, y in zip([1] + datapoint, weights))
          if prediction < 0:
              return 0
          else:
              return 1
  12. Cool learning algorithm

      class Datum:
          def __init__(self, features, label):
              self.features = [1] + features   # prepend the constant bias input
              self.label = label
  13.–15. Cool learning algorithm

      def train(data_set):
          # RATE (learning rate) and threshold (error tolerance) are assumed to be
          # defined globally; illustrative values appear in the usage sketch below.
          weights = [0] * len(data_set[0].features)
          total_error = threshold + 1
          while total_error > threshold:
              total_error = 0
              for item in data_set:
                  # item.features already includes the bias, which classify
                  # prepends itself, so strip it before classifying
                  error = item.label - classify(item.features[1:], weights)
                  weights = [w + RATE * error * i
                             for w, i in zip(weights, item.features)]
                  total_error += abs(error)
          return weights
  16. Cool learning algorithm

      weights = [w + RATE * error * i for w, i in zip(weights, item.features)]

      [Diagram: a perceptron, with inputs 1, i1, i2, …, in multiplied by weights w0, w1, w2, …, wn and combined by a summation Σ]
  17.–18. Cool learning algorithm (the same train code as slides 13.–15., shown again)
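The slides never define RATE or threshold, so the values in the following usage sketch are illustrative guesses; it trains the perceptron above on the logical OR function:

    RATE = 0.1       # learning rate (illustrative value)
    threshold = 0    # stop once every training example is classified correctly

    # Learn logical OR, which is linearly separable
    data_set = [Datum([0, 0], 0), Datum([0, 1], 1),
                Datum([1, 0], 1), Datum([1, 1], 1)]
    weights = train(data_set)
    print(classify([1, 1], weights))   # expected output: 1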
  19. Even Cooler Algorithm! (Backprop)

      train(trainingSet):
          initialize network weights randomly
          until average error stops decreasing (or you get tired):
              for each sample in trainingSet:
                  prediction = network.output(sample)
                  compute error (prediction - sample.output)
                  compute error of (hidden -> output) layer weights
                  compute error of (input -> hidden) layer weights
                  update weights across the network
          save the weights
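As a concrete counterpart to the pseudocode, here is a minimal NumPy sketch of the same loop for a single hidden layer; the sigmoid activation, squared-error gradient, layer sizes, learning rate, and epoch count are all illustrative assumptions. It trains on XOR, the classic function a single perceptron cannot learn:

    import numpy as np

    def sigmoid(x): return 1 / (1 + np.exp(-x))

    def with_bias(A):                       # prepend a constant 1 column (bias input)
        return np.hstack([np.ones((len(A), 1)), A])

    def feed_forward(X, W1, W2):
        H = sigmoid(with_bias(X) @ W1)      # hidden layer activations
        return H, sigmoid(with_bias(H) @ W2)

    def train(X, Y, hidden=8, rate=0.5, epochs=5000):
        rng = np.random.default_rng(0)      # initialize network weights randomly
        W1 = rng.normal(size=(X.shape[1] + 1, hidden))
        W2 = rng.normal(size=(hidden + 1, Y.shape[1]))
        for _ in range(epochs):
            H, P = feed_forward(X, W1, W2)
            dP = (P - Y) * P * (1 - P)             # error at the output layer
            dH = (dP @ W2[1:].T) * H * (1 - H)     # error pushed back to hidden layer
            W2 -= rate * with_bias(H).T @ dP       # update weights across the network
            W1 -= rate * with_bias(X).T @ dH
        return W1, W2

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    Y = np.array([[0], [1], [1], [0]])             # XOR labels
    W1, W2 = train(X, Y)
    print(feed_forward(X, W1, W2)[1].round())      # expected: [[0],[1],[1],[0]]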
  20. What is a kernel? • A kernel is just a matrix • Used for edge detection, blurs, filters
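To see “a kernel is just a matrix” in action, here is a short sketch that slides a 3x3 edge-detection kernel over a toy image in plain NumPy; the library, kernel values, and image are assumptions for illustration:

    import numpy as np

    def convolve(image, kernel):
        kh, kw = kernel.shape
        out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
        for r in range(out.shape[0]):
            for c in range(out.shape[1]):
                # each output pixel is the kernel dotted with an image patch
                out[r, c] = (image[r:r + kh, c:c + kw] * kernel).sum()
        return out

    edge_kernel = np.array([[-1, -1, -1],
                            [-1,  8, -1],
                            [-1, -1, -1]])   # responds strongly at intensity edges

    image = np.zeros((6, 6))
    image[:, 3:] = 1                          # a vertical edge down the middle
    print(convolve(image, edge_kernel))       # large values appear along the edge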
  21. Data Science/Engineering • Data selection • Data processing • Formatting & Cleaning • Sampling • Data transformation • Feature scaling & Normalization • Decomposition & Aggregation • Dimensionality reduction
  22. Common Mistakes • Training set – 70%/30% split • Test set – Do not show this to your model! • Sensitivity vs. specificity • Overfitting
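One way to honor the 70/30 split and keep the test set away from the model, assuming scikit-learn is available (the dataset and classifier here are just for illustration):

    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42)   # hold out 30% as a test set

    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)                 # the model only ever sees the training set
    print(model.score(X_test, y_test))          # evaluate once, on unseen data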
  23. Training your own model • Requirements: a clean, labeled data set; a clear decision problem; patience and/or GPUs • Before you start, ask yourself: can I solve this problem more easily?
  24. Preparing data for ML • Generating labels • Dimensionality reduction • Determining salient features • Visualizing the shape of your data • Correcting statistical bias • Getting data in the right format
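A sketch of two steps from the list above, feature scaling and dimensionality reduction, assuming scikit-learn; the dataset is just for illustration, and projecting to 2 components is a common way to visualize the shape of your data:

    from sklearn.datasets import load_digits
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA

    X, _ = load_digits(return_X_y=True)
    X_scaled = StandardScaler().fit_transform(X)        # zero mean, unit variance per feature
    X_2d = PCA(n_components=2).fit_transform(X_scaled)  # project 64 features down to 2
    print(X.shape, '->', X_2d.shape)                    # (1797, 64) -> (1797, 2)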
  25. A brief look at unsupervised learning • Where did my labels go? • Mostly clustering, separation, association • Many different methods • Self-organizing maps • Expectation-maximization • Association rule learning • Recommender systems
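“Where did my labels go?” in miniature: a clustering sketch with k-means, assuming scikit-learn (the slide names the method family, not a library):

    from sklearn.cluster import KMeans
    from sklearn.datasets import load_digits

    X, _ = load_digits(return_X_y=True)      # deliberately ignore the labels
    kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)
    print(kmeans.labels_[:20])               # cluster assignments, learned without labels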
  26. Data pre-processing • Data selection • Data processing • Formatting & Cleaning • Sampling • Data transformation • Feature scaling & Normalization • Decomposition & Aggregation • Dimensionality reduction
  27. Further resources • CS231 Course Notes • Deeplearning4j Examples • Visualizing MNIST • Neural Networks and Deep Learning • Andrew Ng’s Machine Learning class • Awesome Public Datasets • Hacker’s Guide to Neural Networks
  28. Further resources • Code for slides: github.com/breandan/ml-exercises • Hacker’s Guide to Neural Networks, Andrej Karpathy • Neural Networks Demystified, Stephen Welch • Machine Learning, Andrew Ng: https://www.coursera.org/learn/machine-learning • Awesome public data sets: github.com/caesar0301/awesome-public-datasets