
Machine Learning: The Bare Math Behind Libraries Part 2 - Unsupervised learning

Second part of a talk about basics of Machine Learning. It covers: Hebbian learning, WTA, WTM and introduces the concept of Kohonen's network.

Medwith

April 30, 2020


Transcript

  1. Why would we let them?
     • Less complex mathematical apparatus than in supervised learning.
     • It is similar to discovering the world on your own.
  2. Why would we let them?
     Used mostly for sorting and grouping when:
     • The sorting key can’t easily be figured out.
     • The data is very complex and finding the key is not trivial.
  3. Hebbian learning
     • Works similarly to nature.
     • Great for beginners and biological simulations :)
     • Simple Hebbian learning rule: $\Delta w_{ij} = \eta \cdot x_{ij} \cdot y_i$
       where $\Delta w_{ij}$ is the change of the j-th weight of the i-th neuron, $\eta$ is the learning coefficient, $x_{ij}$ is the j-th input of the i-th neuron, and $y_i$ is the output of the i-th neuron.
  4. Hebbian learning
     • Works similarly to nature.
     • Great for beginners and biological simulations :)
     • Generalised Hebbian learning rule: $\Delta w_{ij} = F(x_{ij}, y_i)$
       with the same symbols as before: $\Delta w_{ij}$ is the change of the j-th weight of the i-th neuron, $x_{ij}$ is the j-th input of the i-th neuron, and $y_i$ is the output of the i-th neuron.
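A minimal sketch of the simple Hebbian rule above, assuming NumPy and a plain linear activation for the neuron output (the slides do not fix an activation here, so that choice and all the numbers below are illustrative):

```python
import numpy as np

def hebbian_update(weights, inputs, eta=0.1):
    # Neuron output y: taken here as the weighted sum of the inputs
    # (linear activation) -- an assumption for illustration only.
    y = np.dot(weights, inputs)
    # Simple Hebbian rule: delta_w_j = eta * x_j * y
    delta_w = eta * inputs * y
    return weights + delta_w

# Hypothetical three-input neuron
w = np.array([0.2, 0.5, 0.1])
x = np.array([1.0, 0.0, 0.5])
print(hebbian_update(w, x))
```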
  5. Hebb’s neuron model
     Diagram of a single neuron: inputs x_1, x_2, ..., x_n plus a constant input 1, weights w_1, w_2, ..., w_n plus the bias weight w_0, a summation node Σ producing s, and the output y. The weights change according to $\Delta w_{ij} = F(x_{ij}, y_i)$.
  6. Hebb’s neuron model
     Worked example: weights 0.230, 0.010, 0.900, 0.110; inputs 0.200, 0.300, 0.100 plus the constant input 1.
  7. Hebb’s neuron model
     Weighted inputs: 0.230·0.200 = 0.046, 0.010·0.300 = 0.003, 0.900·0.100 = 0.090, 0.110·1 = 0.110.
  8. Hebb’s neuron model
     Sum of the weighted inputs: s = 0.046 + 0.003 + 0.090 + 0.110 = 0.249.
  9. Hebb’s neuron model
     The activation maps s = 0.249 to the output y = 0.562.
  10. Hebb’s neuron model
     The neuron now outputs y = 0.562 for the inputs 0.200, 0.300, 0.100.
  11. Hebb’s neuron model
     Resulting weight changes: +0.011 for the input 0.200, +0.016 for the input 0.300, +0.005 for the input 0.100, +0.056 for the constant input 1.
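The numbers on slides 6-11 can be reproduced with a short script. This is a sketch assuming a sigmoid activation and a learning coefficient of 0.1; neither is stated explicitly on the slides, but both are consistent with the printed values:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Weights for the three inputs plus the bias weight (slide 6)
w = np.array([0.230, 0.010, 0.900, 0.110])
# Inputs plus the constant bias input 1
x = np.array([0.200, 0.300, 0.100, 1.0])

products = w * x          # 0.046, 0.003, 0.090, 0.110 (slide 7)
s = products.sum()        # 0.249 (slide 8)
y = sigmoid(s)            # ~0.562 (slide 9)

eta = 0.1                 # assumed learning coefficient
delta_w = eta * x * y     # ~ +0.011, +0.017, +0.006, +0.056 (slide 11, rounded)
print(products, s, y, delta_w)
```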
  12. Marvel database to the rescue
      Hero              Intelligence  Strength  Speed  Durability  Energy projection  Fighting skills
      Iron Man                     6         6      5           6                  6                4
      Spiderman                    4         4      3           3                  1                4
      Black Panther                5         3      2           3                  3                5
      Wolverine                    2         4      2           4                  1                7
      Thor                         2         7      7           6                  6                4
      Dr Strange                   4         2      7           2                  6                6
      Hulk                         2         7      3           7                  5                4
      Cpt. America                 3         3      2           3                  1                6
      Mr Fantastic                 6         2      2           5                  1                3
      Human Torch                  2         2      5           2                  5                3
      Invisible Woman              3         2      3           6                  5                3
      The Thing                    3         6      2           6                  1                5
      Luke Cage                    3         4      2           5                  1                4
      She Hulk                     3         7      3           6                  1                4
      Ms Marvel                    2         6      2           6                  1                4
      Daredevil                    3         3      2           2                  4                5
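For experimenting with the competitive-learning algorithms later in the talk, the table can be kept as plain feature vectors. A sketch (the names and values come from the table above; the dictionary layout itself is just one convenient choice):

```python
import numpy as np

# Columns: Intelligence, Strength, Speed, Durability, Energy projection, Fighting skills
heroes = {
    "Iron Man":        [6, 6, 5, 6, 6, 4],
    "Spiderman":       [4, 4, 3, 3, 1, 4],
    "Black Panther":   [5, 3, 2, 3, 3, 5],
    "Wolverine":       [2, 4, 2, 4, 1, 7],
    "Thor":            [2, 7, 7, 6, 6, 4],
    "Dr Strange":      [4, 2, 7, 2, 6, 6],
    "Hulk":            [2, 7, 3, 7, 5, 4],
    "Cpt. America":    [3, 3, 2, 3, 1, 6],
    "Mr Fantastic":    [6, 2, 2, 5, 1, 3],
    "Human Torch":     [2, 2, 5, 2, 5, 3],
    "Invisible Woman": [3, 2, 3, 6, 5, 3],
    "The Thing":       [3, 6, 2, 6, 1, 5],
    "Luke Cage":       [3, 4, 2, 5, 1, 4],
    "She Hulk":        [3, 7, 3, 6, 1, 4],
    "Ms Marvel":       [2, 6, 2, 6, 1, 4],
    "Daredevil":       [3, 3, 2, 2, 4, 5],
}
X = np.array(list(heroes.values()), dtype=float)  # shape (16, 6)
```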
  13. Hebbian learning weaknesses
      • Unstable.
      • Prone to growing the weights ad infinitum.
      • Some groups can trigger no response.
      • Some groups may trigger a response from too many neurons.
  14. Learning with concurrency
      • You try to generalize the input vector in the weights vector.
      • Instead of checking the reaction to the input, you check the distance between both vectors.
      • Ideally, each neuron specializes in generalizing one class.
      • Two main strategies:
        – Winner Takes All (WTA)
        – Winner Takes Most (WTM)
  15. Idea behind
      Example $x = (1.0, 2.0, 3.0)$, neuron weights $w = (3.0, 2.0, 2.0)$.
      Distance per component: $d_i = w_i - x_i = (2.0, 0.0, -1.0)$.
      Euclidean distance: $\sqrt{\sum_{i=1}^{n} d_i^2} = \sqrt{5}$.
  16. Idea behind
      Example $x = (1.0, 2.0, 3.0)$, neuron weights $w = (3.0, 2.0, 2.0)$, distance $d_i = w_i - x_i = (2.0, 0.0, -1.0)$.
      Learning coefficient $\eta = 0.100$.
      Learning step: $\Delta w_i = \eta \cdot d_i = (0.2, 0.0, -0.1)$.
  17. Idea behind
      With distance $d_i = (2.0, 0.0, -1.0)$, learning coefficient $\eta = 0.100$ and learning step $\Delta w_i = \eta \cdot d_i = (0.2, 0.0, -0.1)$, the new weights are $w'_i = w_i - \Delta w_i$:
      $2.8 = 3.0 - 0.2$, $2.0 = 2.0 - 0.0$, $2.1 = 2.0 - (-0.1)$.
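A sketch of the update step on slides 15-17, assuming NumPy (the variable names are illustrative):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])         # example (slide 15)
w = np.array([3.0, 2.0, 2.0])         # neuron weights

d = w - x                             # per-component distance: [2.0, 0.0, -1.0]
euclidean = np.sqrt((d ** 2).sum())   # sqrt(5)

eta = 0.100                           # learning coefficient (slide 16)
delta_w = eta * d                     # learning step: [0.2, 0.0, -0.1]
w_new = w - delta_w                   # new weights: [2.8, 2.0, 2.1] (slide 17)
print(d, euclidean, delta_w, w_new)
```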
  18. Learning with concurrency
      • Gives more diverse groups.
      • Less prone to clustering (than Hebb’s).
      • Searches a wider spectrum of answers.
      • A first step towards more complex networks.
  19. Learning with concurrency - weaknesses
      • WTA works best if the teaching examples are evenly distributed in the solution space.
      • WTM works best if the weight vectors are evenly distributed in the solution space.
      • Both can still get stuck in a local optimum.
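To make WTA and WTM concrete, here is a minimal competitive-learning sketch (this is not the talk's reference code; the function name, the rank-based neighbourhood weighting and all parameter values are illustrative assumptions):

```python
import numpy as np

def train_competitive(X, n_neurons=3, eta=0.1, epochs=20, winners=1, seed=0):
    """Competitive learning on the rows of X.

    winners=1 gives Winner Takes All (only the closest neuron learns);
    winners>1 gives a simple Winner Takes Most variant, where the k closest
    neurons learn with a strength that decays with their rank.
    """
    rng = np.random.default_rng(seed)
    # Weights start as random vectors in the data range (random init, as the talk notes)
    W = rng.uniform(X.min(), X.max(), size=(n_neurons, X.shape[1]))
    for _ in range(epochs):
        for x in X:
            dists = np.linalg.norm(W - x, axis=1)      # distance of each neuron to the example
            order = np.argsort(dists)                  # neurons ranked by closeness
            for rank, i in enumerate(order[:winners]):
                strength = 1.0 / (1.0 + rank)          # rank-based weighting (assumption)
                W[i] -= eta * strength * (W[i] - x)    # move the neuron towards the example
    return W

# Usage sketch: cluster the Marvel feature vectors from slide 12 into 3 groups
# prototypes = train_competitive(X, n_neurons=3, winners=1)   # WTA
# prototypes = train_competitive(X, n_neurons=3, winners=2)   # WTM-style
```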
  20. Kohonen’s self-organizing map
      • The most popular self-organizing network with a concurrency algorithm.
      • It teaches groups of neurons with the WTM algorithm.
      • Special features:
        – Neurons are organised in a grid.
        – Nevertheless, they are treated as a single layer.
  21. Kohonen’s self-organizing map
      $w_{ij}(s+1) = w_{ij}(s) + \Theta(k_{best}, i, s) \cdot \eta(s) \cdot (I_j(s) - w_{ij}(s))$
      where $s$ is the epoch number, $k_{best}$ is the best-matching neuron, $w_{ij}(s)$ is the j-th weight of the i-th neuron, $\Theta(k_{best}, i, s)$ is the neighbourhood function, $\eta(s)$ is the learning coefficient for epoch $s$, and $I_j(s)$ is the j-th chunk of the example for epoch $s$.
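A sketch of the update rule above for a one-dimensional grid of neurons, assuming a Gaussian neighbourhood function for Θ (the slides do not fix a particular Θ, grid shape or schedule, so those choices are assumptions):

```python
import numpy as np

def som_epoch(W, I, eta, sigma):
    """One Kohonen update: W has one weight row per neuron on a 1-D grid."""
    k_best = np.argmin(np.linalg.norm(W - I, axis=1))        # best-matching neuron
    grid_dist = np.abs(np.arange(len(W)) - k_best)           # distance on the grid
    theta = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))     # Gaussian neighbourhood Θ
    return W + theta[:, None] * eta * (I - W)                # w(s+1) = w(s) + Θ·η·(I - w(s))

# Usage sketch: shrink eta and sigma from epoch to epoch
# W = np.random.default_rng(0).random((10, 6))
# for s, I in enumerate(X):
#     W = som_epoch(W, I, eta=0.5 / (1 + s), sigma=3.0 / (1 + s))
```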
  22. SOM model
      Figure: SOM model. By Mcld - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=10373592
  23. Common weaknesses of artificial neuron systems
      • We are still dependent on randomized weights.
      • All algorithms can get stuck in a local optimum.
  24. Call to Action!
      • Implement the WTA algorithm in your favourite programming language.
      • Add a ranking to WTA to create the WTM algorithm.
      • Don’t worry about performance.
      • Check how it works: generate your own data sets, or get simple ones from the Internet.
      • You’ll gain intuition and understand the basic mathematical apparatus.
      • If you feel comfortable, try coding a self-organising map from WTM.
  25. Bibliography
      • Presentation + code: https://bitbucket.org/medwith/public/downloads/mluvr-coffeJugPart2.zip
      • https://www.coursera.org/learn/machine-learning
      • https://www.coursera.org/specializations/deep-learning
      • Math for Machine Learning - Amazon Training and Certification
      • Linear and Logistic Regression - Amazon Training and Certification
      • Grus J., Data Science from Scratch: First Principles with Python
      • Patterson J., Gibson A., Deep Learning: A Practitioner's Approach
      • Trask A., Grokking Deep Learning
      • Stroud K. A., Booth D. J., Engineering Mathematics
      • https://github.com/massie/octave-nn - neural network Octave implementation
      • https://www.desmos.com/calculator/dnzfajfpym - Nanananana … Batman equation ;)
      • https://xkcd.com/605/ - extrapolating ;)
      • http://dilbert.com/strip/2013-02-02 - Dilbert & Machine Learning