
Deep Learning on Java

Machine learning has made rapid progress recently thanks to the growth of big data, fast hardware, and clever algorithms. The JVM is no stranger to machine learning either: with libraries such as H2O and DL4J, and tooling built on Spark and Hadoop, developers can harness distributed hardware and deep learning to discover untapped patterns and relationships in big data. In this session, learn how to train a classifier to recognize handwritten digits and how to build your own models using open source data sets. No prior experience is required.

Breandan Considine

June 28, 2016

Transcript

  1. 0 1

  2. Cool learning algorithm
     classify(datapoint, weights):
       prediction = 0
       for i from 0 to weights.size:
         prediction += weights[i] * datapoint[i]
  3. Cool learning algorithm
     classify(datapoint, weights):
       prediction = 0
       for i from 0 to weights.size:
         prediction += weights[i] * datapoint[i]
       if prediction < 0 return 0
       else return 1
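A compilable Java version of the classifier above might look like the following; the signature, and the assumption that the bias input has already been prepended to datapoint, are mine rather than the deck's:

    // Perceptron prediction: a weighted sum of the inputs, thresholded at zero.
    // Assumes datapoint already carries the bias input, so it has the same
    // length as weights.
    static int classify(double[] datapoint, double[] weights) {
      double prediction = 0;
      for (int i = 0; i < weights.length; i++)
        prediction += weights[i] * datapoint[i];
      return prediction < 0 ? 0 : 1;
    }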
  5. Cool learning algorithm
     train(List of samples):
       weights = array[samples[0].input.length + 1]
       while totalError is greater than some threshold:
         totalError = 0
         for each sample in samples:
  6. Cool learning algorithm
     train(List of samples):
       weights = array[samples[0].input.length + 1]
       while totalError is greater than some threshold:
         totalError = 0
         for each sample in samples:
           sample.input.prepend(1) // “Bias”
  7. Cool learning algorithm
     train(List of samples):
       weights = array[samples[0].input.length + 1]
       while totalError is greater than some threshold:
         totalError = 0
         for each sample in samples:
           sample.input.prepend(1) // “Bias”
           error = sample.output - classify(sample.input, weights)
  8. Cool learning algorithm
     train(List of samples):
       weights = array[samples[0].input.length + 1]
       while totalError is greater than some threshold:
         totalError = 0
         for each sample in samples:
           sample.input.prepend(1) // “Bias”
           error = sample.output - classify(sample.input, weights)
           for i from 0 to weights.length:
             weights[i] += RATE * error * sample.input[i]
  9. Cool learning algorithm
     train(List of samples):
       weights = array[samples[0].input.length + 1]
       while totalError is greater than some threshold:
         totalError = 0
         for each sample in samples:
           sample.input.prepend(1) // “Bias”
           error = sample.output - classify(sample.input, weights)
           for i from 0 to weights.length:
             weights[i] += RATE * error * sample.input[i]
           totalError += |error|
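A hedged Java sketch of this full training loop, reusing classify from the sketch above; Sample, rate and threshold are illustrative names, and the bias input is assumed to be prepended once up front rather than on every pass:

    import java.util.List;

    // input[0] is the bias term, fixed at 1.0.
    record Sample(double[] input, int output) {}

    static double[] train(List<Sample> samples, double rate, double threshold) {
      double[] weights = new double[samples.get(0).input().length];
      double totalError;
      do {
        totalError = 0;
        for (Sample sample : samples) {
          // Perceptron update rule: nudge each weight toward the correct label.
          int error = sample.output() - classify(sample.input(), weights);
          for (int i = 0; i < weights.length; i++)
            weights[i] += rate * error * sample.input()[i];
          totalError += Math.abs(error);
        }
      } while (totalError > threshold); // stop once an epoch's total error is small
      return weights;
    }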
  10. Even Cooler Algorithm!
      train(trainingSet):
        initialize network weights randomly
        until average error stops decreasing (or you get tired):
          for each sample in trainingSet:
            prediction = network.output(sample)
            compute error (prediction - sample.output)
            compute error of (hidden -> output) layer weights
            compute error of (input -> hidden) layer weights
            update weights across the network
        save the weights
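The outline above maps onto a network with one hidden layer. Below is a minimal backpropagation sketch in Java, assuming a sigmoid activation and squared error; the class name, layer sizes and learning rate are my choices, not the deck's:

    import java.util.Random;

    class TinyNet {
      final int in, hid, out;
      final double[][] w1, w2; // (input -> hidden) and (hidden -> output) weights
      final double rate = 0.1; // illustrative learning rate

      TinyNet(int in, int hid, int out) {
        this.in = in; this.hid = hid; this.out = out;
        Random rng = new Random(42);
        w1 = new double[hid][in];
        w2 = new double[out][hid];
        // Initialize network weights randomly, as the slide says.
        for (double[] row : w1) for (int i = 0; i < in; i++) row[i] = rng.nextGaussian() * 0.1;
        for (double[] row : w2) for (int j = 0; j < hid; j++) row[j] = rng.nextGaussian() * 0.1;
      }

      static double sigmoid(double x) { return 1 / (1 + Math.exp(-x)); }

      // One stochastic-gradient step on a single sample; returns its squared error.
      double trainOn(double[] x, double[] target) {
        // Forward pass: input -> hidden -> output.
        double[] h = new double[hid], y = new double[out];
        for (int j = 0; j < hid; j++) {
          double s = 0;
          for (int i = 0; i < in; i++) s += w1[j][i] * x[i];
          h[j] = sigmoid(s);
        }
        for (int k = 0; k < out; k++) {
          double s = 0;
          for (int j = 0; j < hid; j++) s += w2[k][j] * h[j];
          y[k] = sigmoid(s);
        }
        // Backward pass: output-layer deltas first, then hidden-layer deltas.
        double[] dOut = new double[out], dHid = new double[hid];
        double err = 0;
        for (int k = 0; k < out; k++) {
          double e = target[k] - y[k];
          err += e * e;
          dOut[k] = e * y[k] * (1 - y[k]); // error scaled by the sigmoid derivative
        }
        for (int j = 0; j < hid; j++) {
          double s = 0;
          for (int k = 0; k < out; k++) s += w2[k][j] * dOut[k];
          dHid[j] = s * h[j] * (1 - h[j]);
        }
        // Update weights across the network.
        for (int k = 0; k < out; k++)
          for (int j = 0; j < hid; j++) w2[k][j] += rate * dOut[k] * h[j];
        for (int j = 0; j < hid; j++)
          for (int i = 0; i < in; i++) w1[j][i] += rate * dHid[j] * x[i];
        return err;
      }
    }

Calling trainOn over the training set until the average returned error stops decreasing matches the loop sketched on the slide.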
  11. Training your own model
      • Requirements
        • Clean, labeled data set
        • Clear decision problem
        • Patience and/or GPUs
      • Before you start
  12. Common Mistakes
      • Training set – 70%/30% split
      • Test set – do not show this to your model!
      • Sensitivity vs. specificity (sketched below)
      • Overfitting
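Two of these points translate directly into code. A sketch of the 70%/30% split and of sensitivity/specificity for a binary classifier, reusing the hypothetical Sample type from the earlier perceptron sketch:

    import java.util.Collections;
    import java.util.List;

    class EvalBasics {
      // Sensitivity (recall): fraction of actual positives caught, TP / (TP + FN).
      static double sensitivity(int tp, int fn) { return tp / (double) (tp + fn); }

      // Specificity: fraction of actual negatives cleared, TN / (TN + FP).
      static double specificity(int tn, int fp) { return tn / (double) (tn + fp); }

      // Hold out 30% of the data; shuffle first so the split is unbiased.
      // Touch the held-out test set only once, after training is finished.
      static void split(List<Sample> samples) {
        Collections.shuffle(samples);
        int cut = (int) (samples.size() * 0.7);
        List<Sample> trainSet = samples.subList(0, cut);
        List<Sample> testSet = samples.subList(cut, samples.size());
        // ... train on trainSet only; report sensitivity/specificity on testSet
      }
    }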
  13. Multi-layer Network Configuration
      MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
          .optimizationAlgo(STOCHASTIC_GRADIENT_DESCENT)
          .learningRate(0.006)
          .list()
          .layer(0, new DenseLayer.Builder()
              .nIn(numRows * numColumns).nOut(1000)
              .activation("relu")
              .weightInit(WeightInit.XAVIER).build())
          .layer(1, new OutputLayer.Builder(NEGATIVELOGLIKELIHOOD)
              .nIn(1000).nOut(outputNum).activation("softmax")
              .weightInit(WeightInit.XAVIER).build())
          .pretrain(false).backprop(true)
          .build();
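To put this configuration to work, the usual sequence in the 0.x-era DL4J API shown on the slide is to wrap it in a MultiLayerNetwork and fit it against a DataSetIterator; numEpochs and mnistTrain are placeholders here, not names from the deck:

    MultiLayerNetwork model = new MultiLayerNetwork(conf);
    model.init();
    model.setListeners(new ScoreIterationListener(100)); // log the score every 100 iterations
    for (int i = 0; i < numEpochs; i++) {
      model.fit(mnistTrain); // mnistTrain: a DataSetIterator over the MNIST training split
    }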
  14. Evaluation
      log.info("Evaluating model....");
      Evaluation evaluator = new Evaluation(outputNum);
      while (dataSetIterator.hasNext()) {
          DataSet next = dataSetIterator.next();
          INDArray output = model.output(next.getFeatureMatrix());
          evaluator.eval(next.getLabels(), output);
      }
      log.info(evaluator.stats());
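For a classifier like this one, evaluator.stats() prints a confusion matrix along with accuracy, precision, recall, and F1, which is usually enough to spot the sensitivity/specificity and overfitting problems flagged on the previous slide.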
  15. References
      • Andrej Karpathy, CS231n Course Notes: http://cs231n.github.io/
      • DL4J: https://github.com/deeplearning4j/deeplearning4j
      • Michael Nielsen, Neural Networks and Deep Learning: http://neuralnetworksanddeeplearning.com/
      • Andrew Ng, Machine Learning class, Stanford/Coursera: https://class.coursera.org/ml-003/lecture