JGS594 Lecture 10

Software Engineering for Machine Learning
Connecting the Dots
(202202)

Javier Gonzalez-Sanchez

February 10, 2022

Transcript

  1. jgs
    SER 594
    Software Engineering for
    Machine Learning
    Lecture 10: Connecting the Dots
    Dr. Javier Gonzalez-Sanchez
    [email protected]
    javiergs.engineering.asu.edu | javiergs.com
    PERALTA 230U
    Office Hours: By appointment

  2. Javier Gonzalez-Sanchez | SER 594 | Spring 2022 | 2
    Assignment 03
    § Create 3 models for the Iris dataset (a CSV table).
    § Experiment with the activation function, loss/error function, weight initialization,
    updater, and number of epochs to create a good model, a bad model, and one in
    between.
    § Explain in detail your choices and the results for the 3 models.
    § Submit a paper (PDF) with your models (DL4J code) and their
    descriptions/explanations.
    § You have a week to work, as usual.

  3. Previously …

  4. Weight Initialization
    a) UNIFORM
    § A too-large initialization leads to exploding gradients (partial derivatives).
    § A too-small initialization leads to vanishing gradients (partial derivatives).
    b) XAVIER
    § The mean of the activations should be zero.
    § The variance of the activations should stay the same across every layer.
    (Variance: a statistical measurement of the spread between numbers in a data set.)
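
To make the Xavier idea concrete, here is a minimal sketch in plain Java of one common formulation (Glorot uniform), where weights are drawn from U(-a, +a) with a = sqrt(6 / (fanIn + fanOut)), so their variance is 2 / (fanIn + fanOut). This is an assumption for illustration: the class and method names are hypothetical, and DL4J's own XAVIER option may use a different distribution.

```java
import java.util.Random;

// Sketch of Xavier (Glorot) uniform initialization for one dense layer.
// Names are hypothetical; this is not DL4J code.
final class XavierSketch {
    // Bound of the uniform range for a layer with fanIn inputs and fanOut outputs.
    static double limit(int fanIn, int fanOut) {
        return Math.sqrt(6.0 / (fanIn + fanOut));
    }

    // Returns a fanIn x fanOut weight matrix drawn from U(-limit, +limit).
    static double[][] init(int fanIn, int fanOut, long seed) {
        Random random = new Random(seed);
        double bound = limit(fanIn, fanOut);
        double[][] weights = new double[fanIn][fanOut];
        for (int i = 0; i < fanIn; i++) {
            for (int j = 0; j < fanOut; j++) {
                // Map [0, 1) to [-bound, +bound).
                weights[i][j] = (2.0 * random.nextDouble() - 1.0) * bound;
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        double[][] w = init(4, 3, 42L); // 4 inputs and 3 outputs, as in Iris
        System.out.println("limit = " + limit(4, 3) + ", w[0][0] = " + w[0][0]);
    }
}
```

The range shrinks as the layer gets wider, which is what keeps the spread of the activations roughly constant from layer to layer.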

  5. Activation Functions

  6. Error Function
    § Mean squared error (MSE)
    § Negative log likelihood (NLL)
    L(y, w): the likelihood that the observed data y would be produced by parameter values w.
    The likelihood is in the range 0 to 1.
    Taking the log simplifies the derivatives.
    Log-likelihood values are then in the range -infinity to 0.
    Negating them gives the range 0 to +infinity.
    Used in tandem with SoftMax.
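
The pairing of SoftMax and NLL can be sketched in plain Java as follows. This is an illustrative sketch for a single example, not DL4J's implementation; the class and method names are hypothetical.

```java
// Softmax followed by negative log likelihood for one example.
// Names are hypothetical; this is not DL4J code.
final class NllSketch {
    // Softmax: turns raw scores into probabilities that sum to 1.
    static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) max = Math.max(max, s);
        double sum = 0.0;
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            out[i] = Math.exp(scores[i] - max); // subtract max for numerical stability
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) out[i] /= sum;
        return out;
    }

    // NLL: minus the log of the probability assigned to the true class.
    static double nll(double[] probabilities, int trueClass) {
        return -Math.log(probabilities[trueClass]);
    }

    public static void main(String[] args) {
        double[] p = softmax(new double[] {2.0, 0.5, 0.1});
        System.out.println("loss if class 0 is correct: " + nll(p, 0));
        System.out.println("loss if class 2 is correct: " + nll(p, 2));
    }
}
```

A probability close to 1 for the true class gives a loss close to 0; a probability close to 0 gives a very large loss.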

  7. Updater | GDA, ADAM, NADAM, Nesterovs Momentum
    Slide annotations: Original Choice; Adaptive LR; Velocity; Momentum coefficient;
    Nesterovs Momentum; Momentum + Adaptive
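
The "velocity" and "momentum coefficient" annotations can be illustrated with a minimal sketch of classical momentum in plain Java, minimizing f(x) = x^2. The function, the hyperparameter values, and all names are assumptions chosen for illustration; this is neither the Nesterov variant nor DL4J's updater code.

```java
// Gradient descent with classical momentum on f(x) = x^2 (gradient 2x).
// Names and values are hypothetical; this is not DL4J code.
final class MomentumSketch {
    static double minimize(double start, double learningRate, double momentum, int steps) {
        double x = start;
        double velocity = 0.0;
        for (int i = 0; i < steps; i++) {
            double gradient = 2.0 * x;
            // The velocity keeps a fraction of the previous step (momentum coefficient)
            // and adds the new gradient step.
            velocity = momentum * velocity - learningRate * gradient;
            x += velocity;
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println("with momentum 0.9: " + minimize(5.0, 0.1, 0.9, 500));
        System.out.println("plain gradient descent: " + minimize(5.0, 0.1, 0.0, 500));
    }
}
```

With a momentum coefficient of 0 the update reduces to plain gradient descent.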

  8. Number of hidden layers

  9. One more thing

  10. Epoch
    § An epoch is one complete pass of the machine learning algorithm over the
    entire training dataset; the number of epochs is the number of such passes
    completed.

  11. Epoch
    § How many epochs in our XOR made from scratch?

  12. Epoch
    § How many epochs in our MNIST example?

  13. Epoch value
    § Training can run for hundreds or thousands of epochs; the process
    continues until the model error is sufficiently minimized.
    § Tutorials and examples use values like 10.
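
A small arithmetic sketch in plain Java shows how epochs relate to iterations (parameter updates). It assumes one update per mini-batch and that a final partial batch still counts as an update; the names are hypothetical.

```java
// Counts parameter updates for a given dataset size, batch size, and number of epochs.
// Names are hypothetical; this is not DL4J code.
final class EpochSketch {
    static int iterations(int examples, int batchSize, int epochs) {
        int batchesPerEpoch = (examples + batchSize - 1) / batchSize; // ceiling division
        return batchesPerEpoch * epochs;
    }

    public static void main(String[] args) {
        // The whole dataset in one batch: one update per epoch.
        System.out.println(iterations(150, 150, 10));
        // Three batches of 50 per epoch.
        System.out.println(iterations(150, 50, 10));
    }
}
```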

  14. Epoch value
    DataSetIterator
    DataSet

  15. Connecting the Dots

  16. Iris dataset
    Note: replace the class labels with 0, 1, 2.
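
The label replacement can be sketched in plain Java as below. It assumes the CSV rows end with the species names used in the common Iris file (Iris-setosa, Iris-versicolor, Iris-virginica); the class and method names are hypothetical.

```java
import java.util.Map;

// Rewrites the text label in the last CSV column as a class index 0, 1, or 2.
// Names are hypothetical; the label strings are an assumption about the file.
final class IrisLabelSketch {
    static final Map<String, Integer> LABELS = Map.of(
            "Iris-setosa", 0,
            "Iris-versicolor", 1,
            "Iris-virginica", 2);

    static String encode(String csvRow) {
        int lastComma = csvRow.lastIndexOf(',');
        String label = csvRow.substring(lastComma + 1).trim();
        Integer index = LABELS.get(label);
        if (index == null) {
            throw new IllegalArgumentException("Unknown label: " + label);
        }
        // Keep the four measurements and swap the label for its index.
        return csvRow.substring(0, lastComma + 1) + index;
    }

    public static void main(String[] args) {
        System.out.println(encode("5.1,3.5,1.4,0.2,Iris-setosa"));
    }
}
```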

  17. Remember this
    IrisDataSetIterator(a, b) // a: batch size, b: number of examples

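  18. A Second Option
    import org.datavec.api.records.reader.RecordReader;
    import org.datavec.api.records.reader.impl.csv.CSVRecordReader;
    import org.datavec.api.split.FileSplit;
    import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
    import org.nd4j.common.io.ClassPathResource; // package name varies across DL4J versions
    import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

    RecordReader recordReader = new CSVRecordReader(0, ','); // skipNumLines, delimiter
    recordReader.initialize(
        new FileSplit(new ClassPathResource("iris.txt").getFile())
    );
    DataSetIterator iterator = new RecordReaderDataSetIterator(
        recordReader, // source
        150, // batch size: all 150 rows in one batch
        4, // label index: the column after the 4 inputs
        3 // number of classes (labels 0, 1, 2)
    );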

  19. A Second Option
    import org.nd4j.linalg.dataset.DataSet;
    import org.nd4j.linalg.dataset.SplitTestAndTrain;

    DataSet allData = iterator.next(); // one batch holding all 150 rows
    allData.shuffle(42); // fixed seed; use System.currentTimeMillis() to vary per run
    // 65% training and 35% testing
    SplitTestAndTrain testAndTrain = allData.splitTestAndTrain(0.65);
    DataSet trainingData = testAndTrain.getTrain();
    DataSet testData = testAndTrain.getTest();
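
What a seeded shuffle followed by a 65/35 split does can be sketched in plain Java on row indices. This is an illustration only: the names are hypothetical, and truncating the training size to an integer is an assumption of this sketch, not a statement about DL4J's rounding.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Seeded shuffle of row indices, then a cut at the training fraction.
// Names are hypothetical; this is not DL4J code.
final class SplitSketch {
    // Returns two lists: training indices first, test indices second.
    static List<List<Integer>> shuffleAndSplit(int rows, double trainFraction, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < rows; i++) indices.add(i);
        Collections.shuffle(indices, new Random(seed)); // same seed, same order
        int trainSize = (int) (rows * trainFraction);   // truncation: an assumption
        List<Integer> train = new ArrayList<>(indices.subList(0, trainSize));
        List<Integer> test = new ArrayList<>(indices.subList(trainSize, rows));
        return List.of(train, test);
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = shuffleAndSplit(150, 0.65, 42L);
        System.out.println("train = " + parts.get(0).size() + ", test = " + parts.get(1).size());
    }
}
```

Shuffling before the split matters when the rows are sorted by class: without it, the test portion could miss entire classes that appear in training, or the reverse.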

  20. Homework 03
    Write a paper:
    § Report 3 attempts to create a model (good, bad, regular).
    § Compare the differences and explain how you selected the configuration
    parameters.
    § Explain your network architecture and how you decided on it.
    § Add pictures of your code and the eval.stats() output per model.
    Do not forget Academic Integrity!

  21. Model Listeners
    The Observer Pattern

  22. Listeners
    https://refactoring.guru/design-patterns/observer

  23. // add this before calling fit()
    model.setListeners(new MyListener());
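
The Observer pattern behind model listeners can be sketched in plain Java without DL4J. Everything here is hypothetical and for illustration: ProgressListener, ToyModel, and EveryNListener are not DL4J types, and the "score" is a made-up decreasing number rather than a real loss.

```java
import java.util.ArrayList;
import java.util.List;

// Observer: anything that wants to be told about training progress.
interface ProgressListener {
    void iterationDone(int iteration, double score);
}

// Subject: a stand-in for a model that notifies its listeners after each iteration.
final class ToyModel {
    private final List<ProgressListener> listeners = new ArrayList<>();

    void addListener(ProgressListener listener) {
        listeners.add(listener);
    }

    void fit(int iterations) {
        for (int i = 1; i <= iterations; i++) {
            double score = 1.0 / i; // fake, steadily decreasing loss
            for (ProgressListener listener : listeners) {
                listener.iterationDone(i, score);
            }
        }
    }
}

// Concrete observer: reports the score every N iterations.
final class EveryNListener implements ProgressListener {
    private final int frequency;
    final List<Integer> reported = new ArrayList<>();

    EveryNListener(int frequency) {
        this.frequency = frequency;
    }

    @Override
    public void iterationDone(int iteration, double score) {
        if (iteration % frequency == 0) {
            reported.add(iteration);
            System.out.println("iteration " + iteration + " score " + score);
        }
    }
}

final class ObserverDemo {
    public static void main(String[] args) {
        ToyModel model = new ToyModel();
        model.addListener(new EveryNListener(5)); // register before calling fit()
        model.fit(20);
    }
}
```

The model does not need to know what a listener does with the notification, which is why logging, timing, or plotting can be added without touching the training loop.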

  24. MyListener

  25. Output

  26. class ScoreIterationListener
    § Reports the score (the value of the loss function) during
    training every N iterations.
    // print the score at every iteration
    model.setListeners(new ScoreIterationListener(1));

  27. class TimeIterationListener
    § Writes the remaining time in minutes and the estimated end date of the
    process to the INFO logs.
    § Remaining time is estimated from the time spent training so far and
    the total number of iterations specified by the user.
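
One plausible reading of that estimate is a linear extrapolation: the average time per finished iteration multiplied by the iterations left. The plain Java sketch below shows the arithmetic under that assumption; it is not DL4J's implementation, and the names are hypothetical.

```java
// Linear extrapolation of remaining training time.
// Names are hypothetical; this is not DL4J code.
final class RemainingTimeSketch {
    static long remainingMillis(long elapsedMillis, int iterationsDone, int totalIterations) {
        if (iterationsDone <= 0) {
            throw new IllegalArgumentException("At least one iteration must be finished");
        }
        double perIteration = (double) elapsedMillis / iterationsDone;
        return Math.round(perIteration * (totalIterations - iterationsDone));
    }

    public static void main(String[] args) {
        long remaining = remainingMillis(60_000L, 10, 100);
        System.out.println("remaining minutes: " + remaining / 60_000L);
    }
}
```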

  28. Deploy the Model

  29. Save the Model

  30. Load the Model
    Load the updater

  31. Questions

  32. SER 594 Software Engineering for Machine Learning
    Javier Gonzalez-Sanchez, Ph.D.
    [email protected]
    Spring 2022
    Copyright. These slides can only be used as study material for the class CSE205 at Arizona State University.
    They cannot be distributed or used for another purpose.