
NMM Lunch & Learn: Machine Learning Techniques

Having a conversation with a machine is no longer a figment of our imagination or science fiction. We are having intelligent conversations with machines every day. AI (Artificial Intelligence) or Machine Learning techniques have made this reality possible.

The personal assistant in your smartphone, video games, and the marketing industry have all been using Machine Learning for a long time now to analyze and predict the actions of their clients. The ultimate goal of machine learning is to predict something based on pre-existing conditions. Sometimes this prediction is as easy as answering a question with yes or no, but sometimes it is as complicated as identifying the model and manufacturer of an airplane flying 36,000 feet above our heads.

New Media Manitoba, Red River College and The Ace Project Space are happy to present this informative Lunch+Learn that will try to explain some of the techniques used in machine learning.

Haider Al-Saidi presented Machine Learning Techniques on October 20th at Red River College for New Media Manitoba.

Transcript

  1. Machine learning Haider Al-Saidi “Learning is a change in cognitive

    structures that occurs as a result of experience.” Author Unknown
  2. Which one is written by a human? 1. “Kitty couldn’t fall
    asleep for a long time. Her nerves were strained as two tight strings, and even a glass of hot wine, that Vronsky made her drink, did not help her. Lying in bed she kept going over and over that monstrous scene at the meadow.” 2. “A shallow magnitude 4.7 earthquake was reported Monday morning five miles from Westwood, California, according to the U.S. Geological Survey. The temblor occurred at 6:25 a.m. Pacific time at a depth of 5.0 miles.” 3. “I was laid out sideways on a soft American van seat, several young men still plying me with vodkas that I dutifully drank, because for a Russian it is impolite to refuse.” NY Times quiz: which of the above passages was written by a human?
  3. Machine learning at Red River College  Coming up: a 15-
    hour introductory Machine Learning course  Various research projects in Machine Learning  Planning a Centre for Machine Learning Studies  Hardware with high computational capabilities to support ML activities  More academic courses and seminars to be offered in the field
  4. The Field of Artificial Intelligence Artificial Intelligence Machine Learning Deep

    Learning
  5. Charles Babbage (1791-1871)

  6. Lady Ada Lovelace - 1837 “It can do whatever we

    know how to order it to perform. It can follow analysis; but it has no power of anticipating any analytical relations or truths.” Lady Lovelace describing the “Analytical Engine” by Charles Babbage
  7. Channel 4 Humanoid

  8. ML Applications  Natural Language Processing (NLP)  Virtual Agents
     Decision Management  Robotic Process Automation  Weather and Earthquake Prediction  Netflix movie suggestions  Spam message identification  Biometric recognition  Medical diagnosis and analysis  Machine vision  Understanding the human genome  “Hitchhiker’s Guide To the Galaxy”, a book by Douglas Adams
  9. Plato vs. Aristotle

  10. Scientific Modeling Observe a phenomenon 1 Construct a model for

    that phenomenon 2 Make predictions using this model 3 “Introduction to Statistical Learning Theory” by Bousquet, Boucheron, & Lugosi
  11. Crossing a street  Pedestrian crossing from the green side
    of the street to the red side.  Δd = v · Δt (no acceleration)  Δd = v₀ · Δt + ½ · a · Δt² (with acceleration)  (The slide’s diagram labels the distances d1 and d2, the speeds C1 and C2, and the pedestrian m.)
  12. Identifying the features space  Feature: a measurable property
    of an observable.  Examples: distance, speed, colour, weight, …  Feature space (from the slide):
      Distance  | Speed     | Δ    | m
      20 meters | -20 km/hr | +5 m | 2 km/hr
      30 meters | +20 km/hr | +5 m | 2 km/hr
      40 meters | +35 km/hr | -5 m | 2 km/hr
      50 meters | +30 km/hr | -5 m | 2 km/hr
  13. Collecting the training set  The training set pairs the features with labels/classes:
      Distance  | Speed     | Δ    | m       | Label
      20 meters | -20 km/hr | +5 m | 2 km/hr | Don’t cross
      30 meters | +20 km/hr | +5 m | 2 km/hr | Cross
      40 meters | +35 km/hr | -5 m | 2 km/hr | Don’t cross
      50 meters | +30 km/hr | -5 m | 2 km/hr | Cross
  14. Decision Boundary  The same training set, now with a decision boundary separating the Cross region of the feature space from the Don’t cross region.
  15. Machine Learning Approach (Classification)  Identify: identify the features space  Collect: collect a training set (features + labels)  Divide: use the training set to divide the feature space into regions separated by decision boundaries (Training)  Classify: the decision boundaries become the classification rules  Apply: apply the new rules to new unclassified data (Classification)
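The Identify/Collect/Divide/Classify/Apply steps can be sketched with the simplest possible learner: a single threshold on one feature. This is not from the talk; the crossing data below is invented for illustration, and a one-feature threshold stands in for a real decision boundary.

```python
# A toy "learner" for the crossing example: find the single distance
# threshold that best separates Cross from Don't cross.

def fit_stump(distances, labels):
    """Try every midpoint between neighbouring sorted samples; keep the best split."""
    best_threshold, best_correct = None, -1
    pts = sorted(zip(distances, labels))
    for (a, _), (b, _) in zip(pts, pts[1:]):
        t = (a + b) / 2
        # Count how many samples the rule "cross if distance >= t" gets right.
        correct = sum((d >= t) == (lbl == "cross") for d, lbl in pts)
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold

# Training set: distance to the oncoming car and the recorded decision.
distances = [20, 30, 40, 50, 60, 70]
labels = ["don't", "don't", "don't", "cross", "cross", "cross"]
t = fit_stump(distances, labels)
print(t)                                 # 45.0 -- the learned decision boundary
print("cross" if 55 >= t else "don't")   # apply the rule to new data
```

The learned threshold is the one-dimensional analogue of the decision boundaries on the slide.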
  16. Training-Prediction system overview Training Dataset Training Algorithm Classification New Data

    Result
  17. Logistic Regression Features Space Cross Don’t cross

  18. Optimizing Logistic Regression A B C
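Logistic regression as shown on these slides can be sketched from scratch: a sigmoid over a weighted sum, with gradient descent on the log-loss. The single-feature crossing dataset below is invented for illustration, not taken from the talk.

```python
import math

def sigmoid(z):
    """Squash the weighted sum into (0, 1), read as P(cross)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit(xs, ys, lr=0.5, epochs=2000):
    """Stochastic gradient descent on the log-loss for one feature plus bias."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            # (p - y) is the gradient of the log-loss w.r.t. the weighted sum.
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

# Distance to an oncoming car (metres): label 1 = cross, 0 = don't cross.
xs = [20, 25, 30, 35, 40, 45, 50, 55]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit([x / 10 for x in xs], ys)  # scale features to keep the exponent tame
print([1 if sigmoid(w * x / 10 + b) > 0.5 else 0 for x in xs])
```

The point where the sigmoid crosses 0.5 is the decision boundary the optimization slide is tuning.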

  19. Support Vector Machine (SVM)  The idea of SVM is
    to find the line, plane, or hyperplane that maximizes the margin around the decision boundary.
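A linear SVM can be sketched by sub-gradient descent on the hinge loss (the approach behind solvers like Pegasos; the talk does not specify an algorithm, and the two point clusters below are invented).

```python
# Minimal linear SVM sketch: minimise lam*||w||^2 + mean hinge loss
# by per-sample sub-gradient steps. Labels are +1 / -1.

def train_svm(points, labels, lr=0.01, lam=0.01, epochs=500):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            # Regularisation always shrinks w a little...
            w = [wi - lr * 2 * lam * wi for wi in w]
            # ...and any point inside the margin pushes the boundary away.
            if margin < 1:
                w = [w[0] + lr * y * x1, w[1] + lr * y * x2]
                b += lr * y
    return w, b

points = [(1, 1), (2, 1), (1, 2), (5, 5), (6, 5), (5, 6)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_svm(points, labels)
preds = [1 if w[0] * x1 + w[1] * x2 + b > 0 else -1 for x1, x2 in points]
print(preds)
```

The `margin < 1` test is what makes this an SVM rather than a perceptron: points are pushed on until they sit at least a full margin away from the boundary.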
  20. From Regression to Perceptron  Given inputs x1, x2, …, xn and
    weights w1, w2, …, wn, compute R = w1·x1 + w2·x2 + ⋯ + wn·xn = w · x. (The slide’s diagram feeds the inputs through the weights into a summation node, with an Output of R/L.)
  21. The Perceptron Model (Source: Wikipedia)  Diagram: inputs x1, x2,
    x3, …, xn with weights w1, w2, w3, …, wn feed a summation node; a training path feeds the output back to adjust the weights.
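The perceptron on these two slides can be sketched in a few lines: a step function over the weighted sum, with weights nudged toward each misclassified training example. The logical-AND dataset below is a standard illustration, not from the talk.

```python
# Minimal perceptron: weighted sum + step activation, trained by the
# classic perceptron learning rule.

def predict(weights, bias, x):
    """Step activation: fire (1) if the weighted sum clears zero."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

def train(samples, epochs=20, lr=0.1):
    """samples: list of (input_vector, target) pairs with 0/1 targets."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            # Misclassified samples (error != 0) pull the weights toward them.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Learn a linearly separable function (logical AND):
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train(data)
print([predict(w, b, x) for x, _ in data])  # [0, 0, 0, 1]
```

The "training path" on the slide corresponds to the `error` feedback updating the weights.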
  22. Neural Network Source: “Using neural nets to recognize handwritten digits”

    Book by: Michael Nielsen
  23. Deep Learning Part of Machine Learning Multiple layers to determine

    what it needs to learn Unsupervised (adaptive) learning techniques
  24. Deep Learning  Deep learning is a type of machine

    learning in which a model learns to perform classification tasks directly from images, text, or sound. Deep learning is usually implemented using a neural network architecture. The term “deep” refers to the number of layers in the network—the more layers, the deeper the network. Traditional neural networks contain only 2 or 3 layers, while deep networks can have hundreds. “Introducing Deep Learning with Matlab” Mathworks  In Machine Learning we feed the features to the machine, in Deep Learning the machine will learn the features itself.
  25. K-Nearest Neighbors (KNN)  Based on the majority vote of the closest K
    neighbors  Let k = 5  The unknown point is u  Out of the 5 closest points to u, {a, b, d, e} belong to one class while c belongs to a different class  u must belong to the same class as a, b, d, and e
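KNN as described on the slide fits in a few lines: sort labelled points by distance to the unknown point and take the majority label among the k closest. The apple/orange points below are invented for illustration.

```python
import math
from collections import Counter

def knn_classify(training, u, k=5):
    """training: list of (point, label) pairs; u: the unknown point."""
    # Keep the k labelled points closest to u by Euclidean distance.
    nearest = sorted(training, key=lambda p: math.dist(p[0], u))[:k]
    # Majority vote among the k neighbours decides u's class.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

training = [
    ((1, 1), "apple"), ((1, 2), "apple"), ((2, 1), "apple"), ((2, 2), "apple"),
    ((1.5, 2.5), "orange"), ((9, 9), "orange"),
]
print(knn_classify(training, (1.5, 1.5)))  # apple (4 of the 5 nearest vote apple)
```

As on the slide, one dissenting neighbour (here the nearby orange) is outvoted 4 to 1.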
  26. So, is the green thing an apple or an orange?
    The picture is from one of PMI’s ads signifying their uniqueness
  27. Invoking probability theory  How about problems where a Euclidean metric
    is not an option?  Examples: identifying spam messages, separating objects by colours  In these cases we rely on probability theory for classification.
  28. Decision Tree  Uses Shannon entropy to split the tree, choosing the
    splits that most reduce the information entropy
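The entropy criterion the slide mentions can be sketched directly: Shannon entropy over the class proportions, and information gain as the entropy drop produced by a split. The labels below are illustrative, not from the talk.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H = -sum p_i * log2(p_i) over the class proportions."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(parent, splits):
    """Entropy drop from splitting the parent node into the given subsets."""
    n = len(parent)
    return entropy(parent) - sum(len(s) / n * entropy(s) for s in splits)

# A 50/50 node is maximally uncertain; a pure node has zero entropy.
print(entropy(["cross", "stop", "cross", "stop"]))  # 1.0

# A split that cleanly separates the classes recovers all of the entropy.
parent = ["s", "s", "s", "h", "h"]
gain = information_gain(parent, [["s", "s", "s"], ["h", "h"]])
print(round(gain, 3))  # 0.971
```

A decision tree builder simply picks, at each node, the feature split with the highest such gain.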
  29. Identifying SPAM e-mails  Using a Naïve Bayes approach to identify spam
    messages. Training documents and classes:
      Please don’t forget the milk, red label is the best | h
      Secure your investment with our bank                | s
      Our bank offers the best mortgage rates             | s
      You need to pick up the kids                        | h
      No need to think, our bank is the best              | s
  30. Using Bayesian Probabilities  [Table on the slide: for each of the 26
    vocabulary words (please, don’t, forget, the, milk, red, label, …, no, think), the count of its occurrences in the spam (s) and ham (h) documents, and the corresponding probabilities written as counts over 26.]
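The Naïve Bayes classifier behind these two slides can be sketched from the five training documents; the 26-word vocabulary matches the slide’s denominators. One assumption here: add-one (Laplace) smoothing is used so unseen words don’t zero out a class score, which the slide’s raw counts would.

```python
import math
from collections import Counter

# The five training documents from the slide: h = ham, s = spam.
docs = [
    ("please don't forget the milk red label is the best", "h"),
    ("secure your investment with our bank", "s"),
    ("our bank offers the best mortgage rates", "s"),
    ("you need to pick up the kids", "h"),
    ("no need to think our bank is the best", "s"),
]

counts = {"h": Counter(), "s": Counter()}
class_docs = Counter()
for text, label in docs:
    class_docs[label] += 1
    counts[label].update(text.split())

vocab = set(counts["h"]) | set(counts["s"])  # 26 distinct words

def score(text, label):
    """log P(label) + sum of log P(word | label), with add-one smoothing."""
    total = sum(counts[label].values())
    s = math.log(class_docs[label] / len(docs))
    for word in text.split():
        s += math.log((counts[label][word] + 1) / (total + len(vocab)))
    return s

def classify(text):
    return max(("h", "s"), key=lambda lbl: score(text, lbl))

print(classify("our bank has the best rates"))  # s
print(classify("please pick up the kids"))      # h
```

"Naïve" refers to the assumption that words occur independently given the class, which is what lets the per-word probabilities simply multiply (add, in log space).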
  31. Eliza the Psychotherapist- 1966 http://www.masswerk.at/elizabot/eliza.html Weizenbaum, Joseph "ELIZA – A

    Computer Program For the Study of Natural Language Communication Between Man and Machine" Communications of the ACM; Volume 9 , Issue 1 (January 1966): p 36-45
  32. Tools • I like Python, but any other language will

    do • You may want to implement all the algorithms from scratch
  33. Differential Analyser - 1935

  34. Entscheidungsproblem (Decision Problem)  Posed by David Hilbert in 1928. The
    problem is stated as: is there an algorithm that takes logical statements as input and produces, based on this input, a “YES” or “NO” answer that is always correct?
  35. Alan Turing – Father of Computer Science (1935)  Worked on
    a solution to the Entscheidungsproblem (Decision Problem)  In 1935 he proposed what is now known as the “Turing Machine”  Turing test: “A computer would deserve to be called intelligent if it could deceive a human into believing that it was a human.”
  36. Turing Machine  “Unless there is something mystical or magical about
    human thinking, intelligence can be achieved by computer” [Our Final Invention, by James Barrat]  It is a hypothetical machine consisting of an infinite recording tape divided into cells. The symbol in each cell instructs the machine to move the tape back and forth and places the machine in one state of a pre-defined table of states.
  37. Turing Machine - Example  [Diagram: a recording head over a tape of 0s and
    1s, driven by a state table. State A: write 1; if 0 move R, if 1 move L. State B: if 0 write 1, if 1 write 0; if 0 move R, if 1 move L. State C: write 1; if 0 move R, if 1 move L; then HALT.]
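The table-driven machine on the slide can be sketched as a tiny simulator. The three-rule program below (flip every bit, then halt) is invented for illustration; it is not the state table from the slide.

```python
# Transition table: (state, read) -> (write, move, next_state).
# 'R'/'L' move the head; reaching state 'HALT' stops the machine.
program = {
    ("A", "0"): ("1", "R", "A"),    # flip 0 -> 1, keep scanning right
    ("A", "1"): ("0", "R", "A"),    # flip 1 -> 0
    ("A", "_"): ("_", "R", "HALT"), # blank cell: nothing left to flip
}

def run(tape, program, state="A"):
    tape = list(tape) + ["_"]  # '_' marks the blank end of the tape
    head = 0
    while state != "HALT":
        write, move, state = program[(state, tape[head])]
        tape[head] = write
        head += 1 if move == "R" else -1
    return "".join(tape).rstrip("_")

print(run("01101", program))  # 10010
```

Everything the machine "knows" lives in that transition table, which is exactly the point of the model.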
  38. Elliott 401 - 1954

  39. Expert System 1970’s-1980’s  Emulates human experts  Contains a
    knowledge base holding the data that represents the rules and the facts  Never gained momentum  Lacked intelligence  Few applications in:  Medicine  Business  Law
  40. Tensorflow playground  Tensorflow Playground app  JavaScript app
     Free
  41. Data classification  Nominal data: {True, False}, {Red, Green,
    Blue}, {1, 0}, {Apple, Orange, Kiwi, Mango}  Continuous data:  Temperature readings  Humidity data  Probabilities
  42. Features, and Inputs  Feature: a measurable property of
    an observable.  Example: Colour is a feature of an apple (observable).  Input: the measurement observed for a certain feature that belongs to an observable.  Example: The input of the colour feature is Red.  The inputs vector is denoted by x = (x1, x2, …, xn), where n is the number of features.  We also define the set of input vectors x(j), where j represents the jth input vector.
  43. Adding a Sigmoid Function  [Diagram: inputs x0, x1, x2, …, xn with
    weights w0, w1, w2, …, wn feed a summation node, followed by a sigmoid that produces the output; a training algorithm adjusts the weights.]
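The change on this slide, replacing the perceptron’s hard step with a sigmoid, is a one-liner. A minimal sketch (the weights and inputs below are invented):

```python
import math

def sigmoid(z):
    """Squash the weighted sum into (0, 1) so the output reads as a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(weights, bias, x):
    """One sigmoid unit: weighted sum of inputs, then the sigmoid."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

print(sigmoid(0))                                   # 0.5
print(round(neuron([2.0, -1.0], 0.5, [1.0, 1.0]), 3))
```

Unlike the step function, the sigmoid is differentiable, which is what makes gradient-based training algorithms possible.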
  44. Outputs, Targets, Labels, and Training Examples  Output y(j) is the
    output for the jth input vector.  If the output is a desired value for a known input vector then it is called a target or a label.  A training example is the set that contains an input vector and a target: TE(j) = {x(j), t(j)}
  45. [Diagram: input vectors x(1), x(2), x(3), … are fed into Training, which
    produces a hypothesis h(x) used for Classification.]
  46. Data sets  Dataset: a set or collection of value sets
     Datasets are presented as tables where each column contains the measurements for a particular feature.  One of the columns is reserved for the Label  The dataset is a training dataset if it contains a label column, or a testing dataset if it does not  Example columns: Colour | Shape | Weight | Label
  47. The MNIST Datasets  It’s a set of handwritten
    digits with the corresponding labels  MNIST set: http://yann.lecun.com/exdb/mnist/
  48. The Iris Dataset  Introduced by the British statistician and biologist
    Ronald Fisher in his 1936 paper  Source: Wikipedia
  49. Classification vs. Regression  If the target variable of the

    output () is continuous, then the learning problem is a regression problem.  If the target variable of the output () is nominal (discrete), then the learning problem is a classification problem.
  50. What is noise?  Any unwanted or undesirable outcome of
    an experiment  Noise is associated with measurements  Example:  Assume the desired outcome of an experiment to identify a motorized vehicle is “car”  If the outcome is “bicycle”, then that measurement is noise
  51. Training-Prediction system overview Supervised system Training Dataset Training Algorithm Classification

    New Data Prediction Action