Deep Learning on Java [JavaDay Tokyo 2017]

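Machine learning is making remarkable progress across a wide range of real-world applications, including computer vision, speech recognition, and language processing. In Java, too, libraries such as Deeplearning4j (DL4J), which build on newer Spark-based tooling, make it possible to apply these techniques to large data sets.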

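In this session you will learn the basic building blocks of machine learning, such as gradient descent, backpropagation, and model training and evaluation. It gives an overview of deep learning, including how to build a supervised learning model. No prior machine learning experience is required. We hope it offers some hints on how to develop custom models that surface new insights and unknown patterns in big data.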

Breandan Considine

May 16, 2017

Transcript

  1. Who am I? • Background in Computer Science, Machine Learning • Worked for a small ad-tech startup out of university • Spent two years as Developer Advocate @JetBrains • Interested in machine learning and speech recognition • Enjoy writing code, traveling to conferences, reading • Say hello! @breandan | breandan.net | [email protected]
  2. Early Speech Recognition • Requires lots of handmade feature engineering • Poor results: >25% WER for HMM architectures
  3. What is machine learning? • Prediction • Categorization • Anomaly detection • Personalization • Adaptive control • Playing games
  4. Machine learning, for humans • Self-improvement • Language learning • Computer training • Special education • Reading comprehension • Content generation
  5. What’s a Tensor? • A “tensor” is just an n-dimensional array • Useful for working with complex data • We use (tiny) tensors every day!
  6. What’s a Tensor? 't' • A “tensor” is just an n-dimensional array • Useful for working with complex data • We use (tiny) tensors every day!
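
To make "n-dimensional array" concrete, here is a minimal sketch using plain nested Python lists. The helper `shape` and the sample values are illustrative assumptions, not something taken from the slides, and the helper assumes rectangular (non-ragged) nesting.

```python
def shape(tensor):
    """Return the dimensions of a rectangular nested list, outermost first."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        if not tensor:          # an empty list has no inner level to inspect
            break
        tensor = tensor[0]      # assume every element at this level has the same shape
    return tuple(dims)


scalar = 7                                                  # rank 0: a single number
vector = [1, 2, 3]                                          # rank 1
matrix = [[1, 2, 3], [4, 5, 6]]                             # rank 2: 2 rows x 3 columns
image = [[[0, 0, 0] for _ in range(4)] for _ in range(2)]   # rank 3: 2 x 4 pixels x 3 channels

for name, value in [("scalar", scalar), ("vector", vector), ("matrix", matrix), ("image", image)]:
    print(name, shape(value))
```

The rank of each value is simply the length of the tuple that `shape` returns, which is why a single number, a list, a table, and a small colour image can all be treated as tensors.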
  12. Cool learning algorithm

          def classify(datapoint, weights):
              prediction = sum(x * y for x, y in zip([1] + datapoint, weights))
  13. Cool learning algorithm

          def classify(datapoint, weights):
              prediction = sum(x * y for x, y in zip([1] + datapoint, weights))
              if prediction < 0:
                  return 0
              else:
                  return 1
  15. Cool learning algorithm

          class Datum:
              def __init__(self, features, label):
                  self.features = [1] + features
                  self.label = label
  16. Cool learning algorithm

          def train(data_set):
              weights = [0] * len(data_set[0].features)
              total_error = threshold + 1
              while total_error > threshold:
                  total_error = 0
                  for item in data_set:
                      error = item.label - classify(item.features, weights)
                      weights = [w + RATE * error * i for w, i in zip(weights, item.features)]
                      total_error += abs(error)
  19. Cool learning algorithm

          weights = [w + RATE * error * i for w, i in zip(weights, item.features)]

      [Diagram: inputs 1, i1, i2, …, in are multiplied by weights w0, w1, w2, …, wn and summed (Σ)]
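
The classify, Datum, and train fragments on these slides can be assembled into one runnable perceptron. The sketch below is one possible assembly under stated assumptions: `RATE`, `THRESHOLD`, the `max_epochs` guard, the returned weights, and the AND-gate data are not given in the transcript. As transcribed, both `Datum` and `classify` prepend the constant bias input `1`; this version prepends it only once, in `Datum`, so that every feature lines up with its own weight.

```python
RATE = 0.5       # assumed learning rate; the slides do not give a value
THRESHOLD = 0    # assumed stopping threshold: stop after an epoch with no errors


class Datum:
    def __init__(self, features, label):
        self.features = [1] + features  # prepend the constant bias input once
        self.label = label


def classify(features, weights):
    # features already carry the bias input, so it is not prepended again here
    prediction = sum(x * y for x, y in zip(features, weights))
    return 0 if prediction < 0 else 1


def train(data_set, max_epochs=1000):
    weights = [0] * len(data_set[0].features)
    for _ in range(max_epochs):  # guard for data that is not linearly separable
        total_error = 0
        for item in data_set:
            error = item.label - classify(item.features, weights)
            # nudge each weight in proportion to the error and to its input
            weights = [w + RATE * error * i for w, i in zip(weights, item.features)]
            total_error += abs(error)
        if total_error <= THRESHOLD:
            break
    return weights


# Logical AND is linearly separable, so the perceptron rule converges on it.
and_gate = [Datum([0, 0], 0), Datum([0, 1], 0), Datum([1, 0], 0), Datum([1, 1], 1)]
weights = train(and_gate)
predictions = [classify(d.features, weights) for d in and_gate]
print(weights, predictions)
```

A single perceptron like this can only separate classes with a straight line (or hyperplane); data such as XOR would run until `max_epochs` without reaching zero error.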
  22. Backpropagation

          train(trainingSet):
              initialize network weights randomly
              until average error stops decreasing (or you get tired):
                  for each sample in trainingSet:
                      prediction = network.output(sample)
                      compute error (prediction - sample.output)
                      compute error of (hidden -> output) layer weights
                      compute error of (input -> hidden) layer weights
                      update weights across the network
              save the weights
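
The backpropagation pseudocode can be made concrete with a small pure-Python network. This is a sketch under several assumptions that the slide does not state: one hidden layer, sigmoid activations, a squared-error loss, a fixed number of epochs in place of "until average error stops decreasing", and hypothetical names (`forward`, `gradients`, `train`, `XOR`).

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Return (hidden activations, network output) for one input vector."""
    xb = [1.0] + x  # prepend the bias input
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    hb = [1.0] + hidden
    output = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
    return hidden, output


def gradients(x, target, w_hidden, w_out):
    """Gradients of 0.5 * (output - target)**2 with respect to both weight layers."""
    hidden, output = forward(x, w_hidden, w_out)
    xb, hb = [1.0] + x, [1.0] + hidden
    # error of the (hidden -> output) layer
    delta_out = (output - target) * output * (1.0 - output)
    grad_out = [delta_out * v for v in hb]
    # error of the (input -> hidden) layer; w_out[j + 1] skips the bias weight
    delta_hidden = [delta_out * w_out[j + 1] * h * (1.0 - h) for j, h in enumerate(hidden)]
    grad_hidden = [[d * v for v in xb] for d in delta_hidden]
    return grad_hidden, grad_out


def train(training_set, n_hidden=3, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(training_set[0][0])
    # initialize network weights randomly
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, target in training_set:
            _, output = forward(x, w_hidden, w_out)
            total += 0.5 * (output - target) ** 2
            grad_hidden, grad_out = gradients(x, target, w_hidden, w_out)
            # update weights across the network
            w_out = [w - rate * g for w, g in zip(w_out, grad_out)]
            w_hidden = [[w - rate * g for w, g in zip(row, g_row)]
                        for row, g_row in zip(w_hidden, grad_hidden)]
        history.append(total / len(training_set))  # average error for this epoch
    return w_hidden, w_out, history


XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
w_hidden, w_out, history = train(XOR)
print("average error, first epoch:", history[0], "last epoch:", history[-1])
```

Watching `history` is the "until average error stops decreasing" check from the pseudocode done by hand; a real training loop would typically stop early once the average error plateaus.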
  23. What is a kernel? • A kernel is just a matrix • Used for edge detection, blurs, filters
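
As an illustration of a kernel being "just a matrix", the sketch below slides a 3x3 kernel over a small grayscale image. The function name `convolve2d`, the sample image, and the choice of a Sobel kernel and a box blur are assumptions made for the example. The kernel is not flipped, so strictly speaking this is cross-correlation, which is what most deep learning libraries call convolution.

```python
def convolve2d(image, kernel):
    """Slide the kernel over every position where it fits and sum the element-wise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(k_rows) for j in range(k_cols))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
BOX_BLUR = [[1 / 9] * 3 for _ in range(3)]       # averages each 3x3 neighbourhood

# A 4x4 image that is dark on the left half and bright on the right half.
image = [[0, 0, 1, 1] for _ in range(4)]

edges = convolve2d(image, SOBEL_X)
blurred = convolve2d(image, BOX_BLUR)
print(edges)
print(blurred)
```

Changing only the numbers in the kernel changes the effect, which is the idea convolutional networks build on: they learn the kernel values instead of having them written by hand.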
  24. Data Science/Engineering • Data selection • Data processing • Formatting & Cleaning • Sampling • Data transformation • Feature scaling & Normalization • Decomposition & Aggregation • Dimensionality reduction
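
Of the steps listed, feature scaling and normalization are the easiest to show in a few lines. The two helpers below are a minimal sketch with hypothetical names; both assume the input column is not constant, since a constant column would cause a division by zero.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and divide by the population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


ages = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample column
print(min_max_scale(ages))
print(standardize(ages))
```

Scaling matters for gradient-based training because features with large numeric ranges would otherwise dominate the weight updates.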
  25. Common Mistakes • Training set – 70%/30% split • Test set – Do not show this to your model! • Sensitivity vs. specificity • Overfitting
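
Two of these points are easy to get wrong in code: holding out a test set, and reporting sensitivity and specificity separately instead of a single accuracy figure. The sketch below assumes binary labels encoded as 0 and 1; the function names, the fixed seed, and the sample predictions are illustrative only.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into a training part and a held-out test part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


train_rows, test_rows = train_test_split(range(10))
print(len(train_rows), len(test_rows))

# Made-up labels and predictions for five test examples.
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0], [1, 0, 1, 0, 1])
print(sens, spec)
```

The test rows should be used only once, for the final evaluation; a model that scores far better on the training rows than on the test rows is overfitting.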
  26. Training your own model • Requirements • Clean, labeled data set • Clear decision problem • Patience and/or GPUs • Before you start
  27. Preparing data for ML • Generating Labels • Dimensionality reduction • Determining salient features • Visualizing the shape of your data • Correcting statistical bias • Getting data in the right format
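
One common instance of "getting data in the right format" is turning categorical labels into numeric vectors that a network can be trained against. The helper below is a minimal sketch of one-hot encoding; the name `one_hot` and the sample labels are assumptions for the example.

```python
def one_hot(labels):
    """Map each distinct label to a vector with a single 1 in that label's position."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    vectors = [[1 if index[label] == i else 0 for i in range(len(classes))]
               for label in labels]
    return classes, vectors


classes, vectors = one_hot(["cat", "dog", "cat", "bird"])
print(classes)
print(vectors)
```

Sorting the class names keeps the column order stable between runs, so the same label always maps to the same position.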
  28. Further resources • CS231 Course Notes • Deeplearning4j Examples • Visualizing MNIST • Neural Networks and Deep Learning • Andrew Ng’s Machine Learning class • Awesome Public Datasets • Hacker’s Guide to Neural Networks