
Deep Learning on Java (ConFoo Vancouver)

Machine learning has seen a flurry of recent progress thanks to the growth of big data, fast hardware, and clever algorithms. And Java is uniquely well suited to the task, with new Spark-friendly tools like H2O and Deeplearning4j (DL4J). With these open source libraries, developers can harness deep learning technologies to discover untapped patterns in big data. Join us and learn how to train your own machine learning models using open source data and tools.

Breandan Considine

December 07, 2016

Transcript

1. Who am I?
   • Background in Computer Science, Machine Learning
   • Worked for a small ad-tech startup out of university
   • Spent two years as a Developer Advocate at JetBrains
   • Interested in machine learning and speech recognition
   • Enjoy writing code, traveling to conferences, reading
   • Say hello! @breandan | breandan.net | [email protected]
2. Outline for this talk
   • Introduce an algorithm for classifying stuff
   • A brief introduction to deep neural networks
   • Learning in a nutshell
   • Data selection and preparation
   • Let's see some code!
3–5. Cool learning algorithm (built up over three slides; shown here in full)

    classify(datapoint, weights):
        prediction = 0
        for i from 0 to weights.size:
            prediction += weights[i] * datapoint[i]
        if prediction < 0 return 0
        else return 1
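In Java, the classify step above is just a dot product followed by a threshold. A minimal sketch; the array-based signature is illustrative, not from the slides:

    // Perceptron classification: weighted sum of inputs, thresholded at 0.
    static int classify(double[] datapoint, double[] weights) {
        double prediction = 0;
        for (int i = 0; i < weights.length; i++)
            prediction += weights[i] * datapoint[i];
        return prediction < 0 ? 0 : 1;
    }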
6–10. Cool learning algorithm: training (built up over five slides; shown here in full)

    train(List of samples):
        weights = array[samples[0].inputs.length + 1]
        for each sample in samples:
            sample.input.prepend(1)  // "Bias"
        while totalError is greater than some threshold:
            totalError = 0
            for each sample in samples:
                error = sample.output - classify(sample.input, weights)
                for i from 0 to weights.length:
                    weights[i] += RATE * error * sample.inputs[i]
                totalError += |error|
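The corresponding training loop in Java, reusing classify from above. A minimal sketch, assuming each input vector already has the leading 1 ("bias") prepended; RATE and THRESHOLD are illustrative values:

    static final double RATE = 0.1;
    static final double THRESHOLD = 0.5;

    static double[] train(double[][] inputs, int[] outputs) {
        double[] weights = new double[inputs[0].length];
        double totalError = Double.MAX_VALUE;
        // Note: this loop only terminates if the data is (close to)
        // linearly separable or the threshold is loose enough.
        while (totalError > THRESHOLD) {
            totalError = 0;
            for (int s = 0; s < inputs.length; s++) {
                int error = outputs[s] - classify(inputs[s], weights);
                for (int i = 0; i < weights.length; i++)
                    weights[i] += RATE * error * inputs[s][i];
                totalError += Math.abs(error);
            }
        }
        return weights;
    }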
11. Even Cooler Algorithm! (Backpropagation)

    train(trainingSet):
        initialize network weights randomly
        until average error stops decreasing (or you get tired):
            for each sample in trainingSet:      // one epoch of training
                prediction = network.output(sample)
                compute error (prediction – sample.output)
                compute error of (hidden -> output) layer weights
                compute error of (input -> hidden) layer weights
                update weights across the network
        save the weights
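To make the update step concrete, here is a minimal, self-contained Java sketch of one backpropagation update for a one-hidden-layer network with sigmoid activations and squared-error loss. All names (TinyNet, w1, w2, rate) are illustrative, not from the slides, and bias terms are omitted for brevity:

    import java.util.Random;

    public class TinyNet {
        static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

        double[][] w1; // input -> hidden weights
        double[] w2;   // hidden -> output weights
        double rate = 0.1;

        TinyNet(int in, int hidden) {
            Random r = new Random(42);
            w1 = new double[hidden][in];
            w2 = new double[hidden];
            for (int h = 0; h < hidden; h++) {
                w2[h] = r.nextGaussian() * 0.1;
                for (int i = 0; i < in; i++) w1[h][i] = r.nextGaussian() * 0.1;
            }
        }

        // One forward + backward pass for a single sample; returns |error|.
        double train(double[] x, double target) {
            // Forward pass: hidden activations, then the output.
            double[] hidden = new double[w1.length];
            for (int h = 0; h < w1.length; h++) {
                double sum = 0;
                for (int i = 0; i < x.length; i++) sum += w1[h][i] * x[i];
                hidden[h] = sigmoid(sum);
            }
            double out = 0;
            for (int h = 0; h < hidden.length; h++) out += w2[h] * hidden[h];
            out = sigmoid(out);

            // Backward pass: output delta, then per-unit hidden deltas.
            double outDelta = (target - out) * out * (1 - out);
            for (int h = 0; h < hidden.length; h++) {
                // Hidden delta uses the pre-update w2[h].
                double hiddenDelta = outDelta * w2[h] * hidden[h] * (1 - hidden[h]);
                w2[h] += rate * outDelta * hidden[h];
                for (int i = 0; i < x.length; i++)
                    w1[h][i] += rate * hiddenDelta * x[i];
            }
            return Math.abs(target - out);
        }
    }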
12. Data pre-processing
    • Data selection
    • Data processing
    • Formatting & cleaning
    • Sampling
    • Data transformation
    • Feature scaling & normalization (sketched below)
    • Decomposition & aggregation
    • Dimensionality reduction
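One of the transformation steps above, feature scaling, fits in a few lines. A minimal sketch of min-max normalization; the helper name and in-place update are illustrative:

    // Rescale each feature column to [0, 1] so no feature dominates training.
    static void minMaxScale(double[][] data) {
        int cols = data[0].length;
        for (int j = 0; j < cols; j++) {
            double min = Double.MAX_VALUE, max = -Double.MAX_VALUE;
            for (double[] row : data) {
                min = Math.min(min, row[j]);
                max = Math.max(max, row[j]);
            }
            double range = max - min;
            if (range == 0) continue; // constant feature, leave as-is
            for (double[] row : data)
                row[j] = (row[j] - min) / range;
        }
    }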
13. Common Mistakes
    • Training set – 70%/30% split (see the sketch below)
    • Test set – Do not show this to your model!
    • Sensitivity vs. specificity
    • Overfitting
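The 70%/30% split above is easy to get right with one shuffle. A minimal sketch; the helper and its seed parameter are illustrative:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;

    // Shuffle once with a fixed seed (reproducible), then cut the list in two.
    static <T> List<List<T>> trainTestSplit(List<T> samples, double trainFraction, long seed) {
        List<T> shuffled = new ArrayList<>(samples);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) (shuffled.size() * trainFraction);
        List<List<T>> result = new ArrayList<>();
        result.add(new ArrayList<>(shuffled.subList(0, cut)));                 // training set
        result.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));   // test set: never shown during training
        return result;
    }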
14. Training your own model
    • Requirements
      • Clean, labeled data set
      • Clear decision problem
      • Patience and/or GPUs
    • Before you start
15. Multi-layer Network Configuration

    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .optimizationAlgo(STOCHASTIC_GRADIENT_DESCENT)
        .learningRate(0.006)
        .list()
        .layer(0, new DenseLayer.Builder()
            .nIn(numRows * numColumns).nOut(1000)
            .activation("relu")
            .weightInit(WeightInit.XAVIER).build())
        .layer(1, new OutputLayer.Builder(NEGATIVELOGLIKELIHOOD)
            .nIn(1000).nOut(outputNum).activation("softmax")
            .weightInit(WeightInit.XAVIER).build())
        .pretrain(false).backprop(true)
        .build();
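A hedged sketch of how this configuration is typically turned into a trained network; trainIter, numEpochs, and the listener interval are assumptions, not from the slide:

    MultiLayerNetwork model = new MultiLayerNetwork(conf);
    model.init();
    model.setListeners(new ScoreIterationListener(100)); // log the score every 100 iterations
    for (int epoch = 0; epoch < numEpochs; epoch++)
        model.fit(trainIter); // one full pass over the training DataSetIterator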
16. Evaluation

    log.info("Evaluating model....");
    Evaluation evaluator = new Evaluation(outputNum);
    while (dataSetIterator.hasNext()) {
        DataSet next = dataSetIterator.next();
        INDArray output = model.output(next.getFeatureMatrix()); // predictions for this batch
        evaluator.eval(next.getLabels(), output);
    }
    log.info(evaluator.stats());
17. References
    • Andrej Karpathy, CS231n Course Notes: http://cs231n.github.io/
    • DL4J: https://github.com/deeplearning4j/deeplearning4j
    • Michael Nielsen, Neural Networks and Deep Learning: http://neuralnetworksanddeeplearning.com/
    • Andrew Ng, Machine Learning (Stanford/Coursera): https://class.coursera.org/ml-003/lecture