Slide 1

Slide 1 text

SER 594
Software Engineering for Machine Learning
Lecture 10: Connecting the Dots
Dr. Javier Gonzalez-Sanchez
[email protected]
javiergs.engineering.asu.edu | javiergs.com
PERALTA 230U
Office Hours: By appointment

Slide 2

Slide 2 text

Javier Gonzalez-Sanchez | SER 594 | Spring 2022
Assignment 03
§ Create 3 models for the Iris Dataset (a CSV table)
§ Play with the activation function, loss/error function, weight initialization, updaters, and epochs to create a good model, a bad model, and one in between.
§ Explain in detail your choices and the results for the 3 models
§ Submit a paper (PDF) with your models (DL4J code) and their descriptions/explanations.
§ You have a week to work, as usual

Slide 3

Slide 3 text

Previously …

Slide 4

Slide 4 text

Weight Initialization
a) UNIFORM
§ A too-large initialization leads to exploding gradients (partial derivatives)
§ A too-small initialization leads to vanishing gradients (partial derivatives)
b) XAVIER
§ The mean of the activations should be zero.
§ The variance of the activations should stay the same across every layer.
// variance: a statistical measurement of the spread between numbers in a data set
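A minimal plain-Java sketch of the Xavier idea above. It assumes one common formulation (Glorot initialization), where weights are drawn from a zero-mean Gaussian with variance 2 / (fanIn + fanOut). The class and method names are hypothetical and are not DL4J API; DL4J's own XAVIER option may differ in detail.

```java
import java.util.Random;

// Sketch only: XavierSketch and its methods are hypothetical, not DL4J API.
final class XavierSketch {

    // Fills a fanIn x fanOut matrix with samples from N(0, 2 / (fanIn + fanOut)).
    static double[][] xavier(int fanIn, int fanOut, Random rng) {
        double std = Math.sqrt(2.0 / (fanIn + fanOut));
        double[][] w = new double[fanIn][fanOut];
        for (int i = 0; i < fanIn; i++) {
            for (int j = 0; j < fanOut; j++) {
                w[i][j] = rng.nextGaussian() * std;
            }
        }
        return w;
    }

    // Sample mean of all weights.
    static double mean(double[][] w) {
        double sum = 0.0;
        int n = 0;
        for (double[] row : w) {
            for (double v : row) {
                sum += v;
                n++;
            }
        }
        return sum / n;
    }

    // Sample variance (spread around the mean) of all weights.
    static double variance(double[][] w) {
        double m = mean(w);
        double sum = 0.0;
        int n = 0;
        for (double[] row : w) {
            for (double v : row) {
                sum += (v - m) * (v - m);
                n++;
            }
        }
        return sum / n;
    }

    public static void main(String[] args) {
        double[][] w = xavier(200, 300, new Random(42));
        System.out.printf("target variance = %.5f%n", 2.0 / (200 + 300));
        System.out.printf("sample mean = %.5f, sample variance = %.5f%n", mean(w), variance(w));
    }
}
```

The sample mean should land close to zero and the sample variance close to the target, which is the property the XAVIER bullets describe.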

Slide 5

Slide 5 text

Activation Functions

Slide 6

Slide 6 text

Error Function
§ Mean squared error (MSE)
§ Negative Log Likelihood (NLL)
Likelihood that the observed data y would be produced by parameter values w: L(y, w)
The likelihood is in the range 0 to 1.
Taking the log simplifies the derivatives.
Log-likelihood values are then in the range -Infinity to 0.
Negating them makes the range 0 to Infinity.
Used in tandem with SoftMax.
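The two error functions above can be sketched in plain Java as follows. This is an illustration under simple assumptions (one example at a time, NLL applied to SoftMax output for a single true class); the class and method names are hypothetical and are not DL4J API.

```java
// Sketch only: LossSketch and its methods are hypothetical, not DL4J API.
final class LossSketch {

    // Mean squared error between a target vector and a prediction.
    static double mse(double[] target, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < target.length; i++) {
            double d = target[i] - predicted[i];
            sum += d * d;
        }
        return sum / target.length;
    }

    // SoftMax turns raw scores into probabilities in (0, 1) that sum to 1.
    static double[] softmax(double[] z) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : z) {
            max = Math.max(max, v);
        }
        double[] p = new double[z.length];
        double sum = 0.0;
        for (int i = 0; i < z.length; i++) {
            p[i] = Math.exp(z[i] - max); // subtracting max avoids overflow
            sum += p[i];
        }
        for (int i = 0; i < z.length; i++) {
            p[i] /= sum;
        }
        return p;
    }

    // Negative log likelihood of the true class: 0 when p = 1, growing as p -> 0.
    static double nll(double[] probabilities, int trueClass) {
        return -Math.log(probabilities[trueClass]);
    }

    public static void main(String[] args) {
        double[] p = softmax(new double[] {2.0, 1.0, 0.1});
        System.out.println("NLL if class 0 is correct: " + nll(p, 0));
        System.out.println("NLL if class 2 is correct: " + nll(p, 2));
        System.out.println("MSE: " + mse(new double[] {1, 0, 0}, p));
    }
}
```

A confident correct prediction gives an NLL near 0, and a confident wrong one gives a large NLL, matching the 0 to Infinity range.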

Slide 7

Slide 7 text

Updater | GDA, ADAM, NADAM, Nesterovs Momentum
// Original Choice
// Adaptive LR
// Velocity Momentum coefficient
// Nesterovs Momentum
// Momentum + Adaptive
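To make the "velocity" and "momentum coefficient" annotations concrete, here is a plain-Java sketch comparing plain gradient descent with classical momentum on the toy function f(w) = w². It is an assumption-laden illustration: it shows classical momentum only, not the Nesterov look-ahead variant and not ADAM or NADAM, and the names are hypothetical rather than DL4J API.

```java
// Sketch only: UpdaterSketch and its methods are hypothetical, not DL4J API.
final class UpdaterSketch {

    // Gradient of the toy loss f(w) = w^2.
    static double grad(double w) {
        return 2.0 * w;
    }

    // Plain gradient descent: step against the gradient, scaled by the learning rate.
    static double sgd(double w, double learningRate, int steps) {
        for (int i = 0; i < steps; i++) {
            w -= learningRate * grad(w);
        }
        return w;
    }

    // Classical momentum: the velocity accumulates past gradients,
    // decayed by the momentum coefficient mu at each step.
    static double momentum(double w, double learningRate, double mu, int steps) {
        double velocity = 0.0;
        for (int i = 0; i < steps; i++) {
            velocity = mu * velocity - learningRate * grad(w);
            w += velocity;
        }
        return w;
    }

    public static void main(String[] args) {
        System.out.println("sgd      after 50 steps:  " + sgd(1.0, 0.1, 50));
        System.out.println("momentum after 300 steps: " + momentum(1.0, 0.1, 0.9, 300));
    }
}
```

Both runs approach the minimum at w = 0; the momentum run overshoots and oscillates on the way because the velocity carries it past the minimum.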

Slide 8

Slide 8 text

Number of hidden layers

Slide 9

Slide 9 text

One more thing

Slide 10

Slide 10 text

Epoch
§ An epoch is one complete pass of the entire training dataset through the machine learning algorithm; the number of epochs is the number of such passes completed.

Slide 11

Slide 11 text

Epoch
How many epochs in our XOR made from scratch?

Slide 12

Slide 12 text

Epoch
How many epochs in our MNIST example?

Slide 13

Slide 13 text

Epoch value
§ Training can run for hundreds or thousands of epochs; the process is set to continue until the model error is sufficiently minimized.
§ Tutorials and examples use values like 10
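The "continue until the error is sufficiently minimized" idea can be sketched as an epoch loop with a stopping threshold and an upper bound. This plain-Java example assumes a toy one-weight model y = w · x trained by per-sample gradient descent; the class, method, and parameter names are hypothetical and are not DL4J API.

```java
// Sketch only: EpochSketch and its methods are hypothetical, not DL4J API.
final class EpochSketch {

    // Trains y = w * x and returns how many epochs were run.
    // One epoch = one full pass over every (x, y) pair in the training data.
    static int train(double[] xs, double[] ys, double learningRate,
                     double tolerance, int maxEpochs) {
        double w = 0.0;
        for (int epoch = 1; epoch <= maxEpochs; epoch++) {
            double sumSquaredError = 0.0;
            for (int i = 0; i < xs.length; i++) {
                double error = w * xs[i] - ys[i];
                w -= learningRate * 2.0 * error * xs[i]; // gradient of error^2
                sumSquaredError += error * error;
            }
            // Stop early once the mean squared error of this pass is small enough.
            if (sumSquaredError / xs.length < tolerance) {
                return epoch;
            }
        }
        return maxEpochs; // budget exhausted before reaching the tolerance
    }

    public static void main(String[] args) {
        double[] xs = {1, 2, 3};
        double[] ys = {2, 4, 6};
        System.out.println("epochs with a budget of 1000: " + train(xs, ys, 0.01, 1e-6, 1000));
        System.out.println("epochs with a budget of 10:   " + train(xs, ys, 0.01, 1e-6, 10));
    }
}
```

With a small budget such as 10 the loop stops at the budget rather than at the error threshold, which is why tutorial values like 10 may leave a model undertrained.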

Slide 14

Slide 14 text

Epoch value
DataSetIterator
DataSet

Slide 15

Slide 15 text

Connecting the Dots

Slide 16

Slide 16 text

Iris dataset
// replace the labels with 0, 1, 2
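One way to do that label replacement is sketched below in plain Java. It assumes each CSV row ends with the species name as used in the common Iris files (Iris-setosa, Iris-versicolor, Iris-virginica); which species maps to which integer is an arbitrary choice here, and the class and method names are hypothetical.

```java
import java.util.Map;

// Sketch only: IrisLabelSketch is hypothetical; the label-to-integer mapping is an assumption.
final class IrisLabelSketch {

    static final Map<String, Integer> LABELS = Map.of(
            "Iris-setosa", 0,
            "Iris-versicolor", 1,
            "Iris-virginica", 2);

    // Replaces the trailing text label of one CSV row with its integer code.
    static String encodeLabel(String csvLine) {
        int lastComma = csvLine.lastIndexOf(',');
        String label = csvLine.substring(lastComma + 1).trim();
        Integer code = LABELS.get(label);
        if (code == null) {
            throw new IllegalArgumentException("Unknown label: " + label);
        }
        return csvLine.substring(0, lastComma + 1) + code;
    }

    public static void main(String[] args) {
        System.out.println(encodeLabel("5.1,3.5,1.4,0.2,Iris-setosa"));
        System.out.println(encodeLabel("6.3,3.3,6.0,2.5,Iris-virginica"));
    }
}
```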

Slide 17

Slide 17 text

Remember this
IrisDataSetIterator (a, b)

Slide 18

Slide 18 text

A Second Option
RecordReader recordReader = new CSVRecordReader(0, ','); // skipNumLines, delimiter
recordReader.initialize(
    new FileSplit(new ClassPathResource("iris.txt").getFile())
);
DataSetIterator iterator = new RecordReaderDataSetIterator(
    recordReader, // source
    150,          // rows (batch size: the whole dataset in one batch)
    4,            // inputs (also the column index of the label)
    3             // labels (number of classes)
);

Slide 19

Slide 19 text

A Second Option
DataSet allData = iterator.next();
allData.shuffle(42); // fixed seed; alternatively System.currentTimeMillis()
// 65% training and 35% testing
SplitTestAndTrain testAndTrain = allData.splitTestAndTrain(0.65);
DataSet trainingData = testAndTrain.getTrain();
DataSet testData = testAndTrain.getTest();
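What the shuffle and the 65/35 split do can be sketched with plain Java collections. This is not how DL4J implements `shuffle` or `splitTestAndTrain`; it is a stand-in under the assumption that rows are shuffled with a seeded random generator and then cut at a fraction of the row count. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch only: SplitSketch is hypothetical and does not mirror DL4J internals.
final class SplitSketch {

    // Returns [train, test] after a seeded shuffle; the input list is not modified.
    static <T> List<List<T>> shuffleAndSplit(List<T> rows, long seed, double trainFraction) {
        List<T> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed)); // same seed, same order
        int trainSize = (int) (shuffled.size() * trainFraction);
        List<T> train = new ArrayList<>(shuffled.subList(0, trainSize));
        List<T> test = new ArrayList<>(shuffled.subList(trainSize, shuffled.size()));
        return List.of(train, test);
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 150; i++) {
            rows.add(i);
        }
        List<List<Integer>> parts = shuffleAndSplit(rows, 42L, 0.65);
        System.out.println("train rows: " + parts.get(0).size());
        System.out.println("test rows:  " + parts.get(1).size());
    }
}
```

A fixed seed such as 42 makes the split reproducible between runs, which helps when comparing models; a time-based seed gives a different split each run.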

Slide 20

Slide 20 text

Homework 03
Write a paper:
§ Report 3 attempts to create a model (good, bad, regular).
§ Compare the differences and explain how you selected the configuration parameters
§ Explain your network architecture and how you decided on it
§ Add pictures of your code and the eval.stats() per model
Do not forget Academic Integrity!

Slide 21

Slide 21 text

Model Listeners
The Observer Pattern

Slide 22

Slide 22 text

Listeners
https://refactoring.guru/design-patterns/observer
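A minimal plain-Java sketch of the Observer pattern as it applies to training listeners. The subject (a toy trainer) notifies every registered listener after each iteration. The interface and class names here are hypothetical stand-ins and are not the DL4J listener API; the halving "score" is a placeholder for a real training step.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: all names are hypothetical, not DL4J API.
final class ObserverSketch {

    // Observer: anything that wants to be told when an iteration finishes.
    interface IterationListener {
        void iterationDone(int iteration, double score);
    }

    // Subject: keeps a list of observers and notifies them during fit().
    static final class ToyTrainer {
        private final List<IterationListener> listeners = new ArrayList<>();

        void addListener(IterationListener listener) {
            listeners.add(listener);
        }

        void fit(int iterations) {
            double score = 1.0;
            for (int i = 1; i <= iterations; i++) {
                score *= 0.5; // placeholder for a real training step
                for (IterationListener listener : listeners) {
                    listener.iterationDone(i, score);
                }
            }
        }
    }

    public static void main(String[] args) {
        ToyTrainer trainer = new ToyTrainer();
        // Register the listener before calling fit(), or it will miss the events.
        trainer.addListener((iteration, score) ->
                System.out.println("iteration " + iteration + " score " + score));
        trainer.fit(3);
    }
}
```

The trainer knows nothing about what listeners do with the events, so logging, timing, or plotting can be added without changing the training code.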

Slide 23

Slide 23 text

// add this before calling fit()
model.setListeners(new MyListener());

Slide 24

Slide 24 text

MyListener

Slide 25

Slide 25 text

Output …

Slide 26

Slide 26 text

class ScoreIterationListener
§ Score iteration listener. Reports the score (value of the loss function) during training every N iterations
// print the score at every iteration
model.setListeners(new ScoreIterationListener(1));

Slide 27

Slide 27 text

class TimeIterationListener
§ It writes the remaining time in minutes and the estimated end date of the process to the INFO logs.
§ Remaining time is estimated from the amount of time spent training so far and the total number of iterations specified by the user
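The estimate described above can be sketched as a simple proportion: average time per completed iteration multiplied by the iterations still to run. This plain-Java example is an assumption about the arithmetic, not a copy of the DL4J implementation, and the names are hypothetical.

```java
// Sketch only: EtaSketch is hypothetical and does not mirror DL4J internals.
final class EtaSketch {

    // Estimates remaining time from elapsed time and iteration counts.
    static long remainingMillis(long elapsedMillis, int iterationsDone, int totalIterations) {
        if (iterationsDone <= 0) {
            throw new IllegalArgumentException("Need at least one completed iteration");
        }
        double millisPerIteration = (double) elapsedMillis / iterationsDone;
        return Math.round(millisPerIteration * (totalIterations - iterationsDone));
    }

    public static void main(String[] args) {
        long remaining = remainingMillis(60_000L, 10, 100);
        System.out.println("remaining minutes: " + remaining / 60_000.0);
    }
}
```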

Slide 28

Slide 28 text

Deploy the Model

Slide 29

Slide 29 text

Save the Model

Slide 30

Slide 30 text

Load the Model
Load the updater

Slide 31

Slide 31 text

Questions

Slide 32

Slide 32 text

SER 594
Software Engineering for Machine Learning
Javier Gonzalez-Sanchez, Ph.D.
[email protected]
Spring 2022
Copyright. These slides can only be used as study material for the class CSE205 at Arizona State University. They cannot be distributed or used for another purpose.