W, X, Y
https://medium.com/analytics-vidhya/neural-networks-in-a-nutshell-with-java-b4a635a2c4af
Neurons and Layers
https://medium.com/analytics-vidhya/neural-networks-in-a-nutshell-with-java-b4a635a2c4af
Back propagation
Did I mention something called the learning rate? Review the details (math) here:
https://medium.com/analytics-vidhya/neural-networks-in-a-nutshell-with-java-b4a635a2c4af
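Concretely, the learning rate scales how far each weight moves against its partial derivative on every update. A minimal sketch of the rule (the array names and parameter eta are illustrative assumptions, not from the slides):

// Gradient-descent update: step each weight opposite its gradient;
// the learning rate eta controls the step size.
static void update(double[] w, double[] grad, double eta) {
    for (int i = 0; i < w.length; i++) {
        w[i] -= eta * grad[i];
    }
}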
DL4J | Our Model
* Dense layer – a layer that is fully connected with its preceding layer: every neuron in the layer is connected to every neuron of the preceding layer. (A sketch of the model follows.)
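A minimal sketch of such a model in DL4J (assuming a 1.0.0-beta-style API; the seed and hidden-layer size are assumptions, chosen here for the 4-feature, 3-class iris dataset used later in the deck):

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class IrisModel {
    public static MultiLayerNetwork build() {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(42)                       // assumption: fixed seed for reproducibility
                .weightInit(WeightInit.XAVIER)  // Xavier initialization (discussed below)
                .list()
                // dense (fully connected) hidden layer: 4 iris features in, 10 neurons out (assumed size)
                .layer(new DenseLayer.Builder()
                        .nIn(4).nOut(10)
                        .activation(Activation.RELU).build())
                // output layer: SoftMax + negative log-likelihood, 3 iris classes
                .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVE_LOGLIKELIHOOD)
                        .nIn(10).nOut(3)
                        .activation(Activation.SOFTMAX).build())
                .build();
        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init();
        return model;
    }
}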
Definition

                     Actual positive   Actual negative
Predicted positive   TP                FP
Predicted negative   FN                TN

// How much we can trust the model when it predicts a Positive
Precision = TP / (TP + FP)
// Measures the ability of the model to find all Positive instances
Recall = TP / (TP + FN)
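Both metrics are one-liners once the counts are available; a small sketch (the counts in main are hypothetical):

public class Metrics {
    // Precision: of everything predicted positive, the fraction that was actually positive
    static double precision(int tp, int fp) { return (double) tp / (tp + fp); }
    // Recall: of everything actually positive, the fraction that was found
    static double recall(int tp, int fn) { return (double) tp / (tp + fn); }

    public static void main(String[] args) {
        int tp = 40, fp = 10, fn = 5; // hypothetical counts
        System.out.println("precision = " + precision(tp, fp)); // 0.8
        System.out.println("recall    = " + recall(tp, fn));    // ~0.889
    }
}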
Weight Initialization | Xavier
§ A too-large initialization leads to exploding gradients (partial derivatives).
§ A too-small initialization leads to vanishing gradients (partial derivatives).
Advice:
§ The mean of the activations should be zero.
§ The variance of the activations should stay the same across every layer.
// variance: a statistical measurement of the spread between numbers in a data set
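A sketch of the idea behind that advice, using the common Glorot/Xavier uniform formula (the exact variant, with bound sqrt(6 / (nIn + nOut)), is an assumption; libraries differ in details):

import java.util.Random;

public class Xavier {
    // Initialize an nIn x nOut weight matrix with mean 0 and a variance
    // scaled so activations keep roughly the same spread at every layer.
    static double[][] init(int nIn, int nOut, Random rnd) {
        double limit = Math.sqrt(6.0 / (nIn + nOut)); // Glorot uniform bound
        double[][] w = new double[nIn][nOut];
        for (int i = 0; i < nIn; i++)
            for (int j = 0; j < nOut; j++)
                w[i][j] = (rnd.nextDouble() * 2 - 1) * limit; // uniform in [-limit, limit]
        return w;
    }
}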
Activation Functions | ReLU
§ ReLU – rectified linear activation function: f(x) = max(0, x)
§ A popular activation function for hidden layers
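ReLU and its derivative are one-liners; a sketch (taking the derivative at exactly 0 as 0 is a convention, not from the slide):

// ReLU: pass positive values through, clamp negatives to zero
static double relu(double x) { return Math.max(0.0, x); }
// Its derivative, as used during backpropagation
static double reluPrime(double x) { return x > 0 ? 1.0 : 0.0; }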
Activation Functions | SoftMax
§ Unlike Sigmoid, where each output is computed independently, SoftMax couples all outputs together.
§ The most popular activation function for output layers handling multiple classes.
§ Outputs are probabilities: non-negative values that sum to 1.
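A sketch of SoftMax over raw scores (subtracting the maximum before exponentiating is a standard numerical-stability trick, not something shown on the slide):

// Turn raw scores z into probabilities that sum to 1.
static double[] softmax(double[] z) {
    double max = Double.NEGATIVE_INFINITY;
    for (double v : z) max = Math.max(max, v); // shift by max for stability
    double sum = 0.0;
    double[] out = new double[z.length];
    for (int i = 0; i < z.length; i++) {
        out[i] = Math.exp(z[i] - max);
        sum += out[i];
    }
    for (int i = 0; i < z.length; i++) out[i] /= sum;
    return out;
}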
Error Function | Negative Log-Likelihood
§ The SoftMax function is used in tandem with the negative log-likelihood.
§ L(y, w) is the likelihood that the observed data y would be produced by parameter values w. Likelihood values are in the range 0 to 1.
§ Taking the log facilitates the derivatives.
§ Log-likelihood values are then in the range -Infinity to 0.
§ Negating them makes the range 0 to Infinity, giving a loss to minimize.
https://hea-www.harvard.edu/AstroStat/aas227_2016/lecture1_Robinson.pdf
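Paired with the SoftMax sketch above, the per-example loss is just the negative log of the probability assigned to the true class; a sketch (the method name is mine):

// Negative log-likelihood for one example:
// probs = SoftMax output, target = index of the true class.
// Returns 0 when probs[target] = 1 and grows toward infinity as it nears 0.
static double nll(double[] probs, int target) {
    return -Math.log(probs[target]);
}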
Epoch
§ An epoch is one complete pass of the entire training dataset through the machine learning algorithm; the number of epochs indicates how many such passes have been completed.
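With the DL4J model sketched earlier, running multiple epochs is typically a loop over fit(); the epoch count and iterator name here are assumptions:

// model: the MultiLayerNetwork built earlier; trainData: a DataSetIterator over the training set
int epochs = 100; // assumed value
for (int i = 0; i < epochs; i++) {
    trainData.reset();    // rewind so each epoch sees the entire dataset
    model.fit(trainData); // one full pass = one epoch
}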
Can we create a GUI?
[Mockup: input fields for SEPAL LENGTH, SEPAL WIDTH, PETAL LENGTH, and PETAL WIDTH; an IRIS TYPE output label; and a Calculate button.]
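A minimal Swing sketch of that form; the classify() method is a hypothetical hook into the trained network:

import javax.swing.*;
import java.awt.GridLayout;

public class IrisGui {
    public static void main(String[] args) {
        String[] labels = {"SEPAL LENGTH:", "SEPAL WIDTH:", "PETAL LENGTH:", "PETAL WIDTH:"};
        JTextField[] fields = new JTextField[4];
        JLabel result = new JLabel("IRIS TYPE: ?");
        JButton calculate = new JButton("Calculate");

        JPanel panel = new JPanel(new GridLayout(0, 2));
        for (int i = 0; i < 4; i++) {
            panel.add(new JLabel(labels[i]));
            panel.add(fields[i] = new JTextField(8));
        }
        panel.add(calculate);
        panel.add(result);

        calculate.addActionListener(e -> {
            double[] x = new double[4];
            for (int i = 0; i < 4; i++) x[i] = Double.parseDouble(fields[i].getText());
            result.setText("IRIS TYPE: " + classify(x));
        });

        JFrame frame = new JFrame("Iris Classifier");
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setContentPane(panel);
        frame.pack();
        frame.setVisible(true);
    }

    // Hypothetical: feed the four features to the trained network, return the predicted class name
    static String classify(double[] features) {
        return "setosa"; // placeholder
    }
}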
Filters
§ A set of standard filters (kernels) is used to perform common tasks: blur, sharpen, and border (edge) detection. (See the array and convolution sketches below.)

Blur:
0.1 0.1 0.1
0.1 0.1 0.1
0.1 0.1 0.1

Sharpen:
 0 -1  0
-1  5 -1
 0 -1  0

Borders, vertical:
-1  0  1
-2  0  2
-1  0  1

Borders, horizontal:
-1 -2 -1
 0  0  0
 1  2  1
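The same kernels written as Java arrays (used by the convolution sketch on the next slide; the constant names are mine):

static final double[][] BLUR = {
    {0.1, 0.1, 0.1},
    {0.1, 0.1, 0.1},
    {0.1, 0.1, 0.1}
};
static final double[][] SHARPEN = {
    { 0, -1,  0},
    {-1,  5, -1},
    { 0, -1,  0}
};
static final double[][] BORDER_VERTICAL = {
    {-1, 0, 1},
    {-2, 0, 2},
    {-1, 0, 1}
};
static final double[][] BORDER_HORIZONTAL = {
    {-1, -2, -1},
    { 0,  0,  0},
    { 1,  2,  1}
};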
Stride
§ The stride is the number of pixels by which the filter is shifted over the input matrix.
§ When the stride equals 1, we move the filter 1 pixel at a time; when the stride equals 2, we move the filter 2 pixels at a time, and so on.

Example: a 3x3 all-ones filter applied to a 7x7 input with stride 2; each output value is the sum of the 3x3 region the filter covers:

Input:                      Filter:     Output:
 0  1  2  3  4  5  6        1 1 1        99 117 135
10 11 12 13 14 15 16        1 1 1       279 297 315
20 21 22 23 24 25 26        1 1 1       459 477 495
30 31 32 33 34 35 36
40 41 42 43 44 45 46
50 51 52 53 54 55 56
60 61 62 63 64 65 66
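A sketch of the sliding-window operation with a configurable stride (as is conventional in CNNs, this is cross-correlation: no kernel flip); applied to the slide's example it reproduces 99, 117, 135, 279, and the rest:

// Slide the kernel over the input, stepping `stride` pixels at a time,
// summing the element-wise products at each position.
static double[][] convolve(double[][] input, double[][] kernel, int stride) {
    int k = kernel.length;
    int outRows = (input.length - k) / stride + 1;
    int outCols = (input[0].length - k) / stride + 1;
    double[][] out = new double[outRows][outCols];
    for (int r = 0; r < outRows; r++)
        for (int c = 0; c < outCols; c++)
            for (int i = 0; i < k; i++)
                for (int j = 0; j < k; j++)
                    out[r][c] += input[r * stride + i][c * stride + j] * kernel[i][j];
    return out;
}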
Padding
§ A pixel in the corner gets covered by the filter only once, while the middle pixels get covered more than once.
§ Padding refers to the number of pixels added around an image when it is being processed.
[Figure: a 5x5 binary image surrounded by a one-pixel border of zeros, giving a 7x7 padded input.]
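A zero-padding sketch (the slide's figure uses a pad width of 1):

// Surround the input with `pad` rings of zeros, so border pixels are
// covered by the filter as often as interior pixels.
static double[][] zeroPad(double[][] input, int pad) {
    int rows = input.length, cols = input[0].length;
    double[][] out = new double[rows + 2 * pad][cols + 2 * pad]; // Java arrays start zeroed
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            out[r + pad][c + pad] = input[r][c];
    return out;
}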
Pooling
§ Pooling is downscaling of the image obtained from the previous layers.
§ It can be compared to shrinking an image to reduce its pixel density.
§ Options: max-pooling, average-pooling, sum-pooling.

Example: 2x2 max pooling keeps the maximum of each 2x2 block:

Input:            Output:
 0  1  2  3       11 13
10 11 12 13       31 33
20 21 22 23
30 31 32 33
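A 2x2 max-pooling sketch; applied to the slide's input it yields {{11, 13}, {31, 33}} (assumes even input dimensions):

// Downscale by keeping the maximum of each non-overlapping 2x2 block.
static double[][] maxPool2x2(double[][] input) {
    int rows = input.length / 2, cols = input[0].length / 2;
    double[][] out = new double[rows][cols];
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            out[r][c] = Math.max(
                Math.max(input[2 * r][2 * c],     input[2 * r][2 * c + 1]),
                Math.max(input[2 * r + 1][2 * c], input[2 * r + 1][2 * c + 1]));
    return out;
}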
SER 594 Software Engineering for Machine Learning
Javier Gonzalez-Sanchez, Ph.D.
[email protected]
Spring 2022
Copyright. These slides can only be used as study material for the class SER 594 at Arizona State University. They cannot be distributed or used for another purpose.