
Machine Learning: The Bare Math Behind Libraries Part 2 - Unsupervised learning

Second part of a talk about basics of Machine Learning. It covers: Hebbian learning, WTA, WTM and introduces the concept of Kohonen's network.

Medwith

April 30, 2020


Transcript

  1. Why would we let them?
     • Less complex mathematical apparatus than in supervised learning.
     • It is similar to discovering the world on your own.
  2. Why would we let them?
     Used mostly for sorting and grouping when:
     • The sorting key can’t easily be figured out.
     • The data is very complex and finding the key is not trivial.
  3. Hebbian learning
     • Works similarly to nature.
     • Great for beginners and biological simulations :)
     • Simple Hebbian learning rule: $\Delta w_{ij} = \eta \cdot x_{ij} \cdot y_i$
       where $\Delta w_{ij}$ is the change of the j-th weight of the i-th neuron, $\eta$ is the learning coefficient, $x_{ij}$ is the j-th input of the i-th neuron, and $y_i$ is the output of the i-th neuron.
  4. Hebbian learning
     • Works similarly to nature.
     • Great for beginners and biological simulations :)
     • Generalised Hebbian learning rule: $\Delta w_{ij} = F(x_{ij}, y_i)$
       with the same symbols as before: $\Delta w_{ij}$ is the change of the j-th weight of the i-th neuron, $x_{ij}$ is the j-th input of the i-th neuron, and $y_i$ is the output of the i-th neuron.
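A minimal sketch of the simple Hebbian rule above, assuming NumPy and a plain linear activation for the neuron output (the slides do not fix an activation here, so that choice and all the numbers below are illustrative):

```python
import numpy as np

def hebbian_update(weights, inputs, eta=0.1):
    # Neuron output y: taken here as the weighted sum of the inputs
    # (linear activation) -- an assumption for illustration only.
    y = np.dot(weights, inputs)
    # Simple Hebbian rule: delta_w_j = eta * x_j * y
    delta_w = eta * inputs * y
    return weights + delta_w

# Hypothetical three-input neuron
w = np.array([0.2, 0.5, 0.1])
x = np.array([1.0, 0.0, 0.5])
print(hebbian_update(w, x))
```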
  5. Hebb’s neuron model
     Diagram of a single neuron: inputs x_1, x_2, ..., x_n plus a constant input 1, weights w_1, w_2, ..., w_n plus the bias weight w_0, a summation node Σ producing s, and the output y. The weights change according to $\Delta w_{ij} = F(x_{ij}, y_i)$.
  6. Hebb’s neuron model
     Worked example: weights 0.230, 0.010, 0.900, 0.110; inputs 0.200, 0.300, 0.100 plus the constant input 1.
  7. Hebb’s neuron model
     Weighted inputs: 0.230·0.200 = 0.046, 0.010·0.300 = 0.003, 0.900·0.100 = 0.090, 0.110·1 = 0.110.
  8. Hebb’s neuron model
     Sum of the weighted inputs: s = 0.046 + 0.003 + 0.090 + 0.110 = 0.249.
  9. Hebb’s neuron model
     The activation maps s = 0.249 to the output y = 0.562.
  10. Hebb’s neuron model
     The neuron now outputs y = 0.562 for the inputs 0.200, 0.300, 0.100.
  11. Hebb’s neuron model
     Resulting weight changes: +0.011 for the input 0.200, +0.016 for the input 0.300, +0.005 for the input 0.100, +0.056 for the constant input 1.
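The numbers on slides 6-11 can be reproduced with a short script. This is a sketch assuming a sigmoid activation and a learning coefficient of 0.1; neither is stated explicitly on the slides, but both are consistent with the printed values:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Weights for the three inputs plus the bias weight (slide 6)
w = np.array([0.230, 0.010, 0.900, 0.110])
# Inputs plus the constant bias input 1
x = np.array([0.200, 0.300, 0.100, 1.0])

products = w * x          # 0.046, 0.003, 0.090, 0.110 (slide 7)
s = products.sum()        # 0.249 (slide 8)
y = sigmoid(s)            # ~0.562 (slide 9)

eta = 0.1                 # assumed learning coefficient
delta_w = eta * x * y     # ~ +0.011, +0.017, +0.006, +0.056 (slide 11, rounded)
print(products, s, y, delta_w)
```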
  12. Marvel database to the rescue
      Hero              Intelligence  Strength  Speed  Durability  Energy projection  Fighting skills
      Iron Man                     6         6      5           6                  6                4
      Spiderman                    4         4      3           3                  1                4
      Black Panther                5         3      2           3                  3                5
      Wolverine                    2         4      2           4                  1                7
      Thor                         2         7      7           6                  6                4
      Dr Strange                   4         2      7           2                  6                6
      Hulk                         2         7      3           7                  5                4
      Cpt. America                 3         3      2           3                  1                6
      Mr Fantastic                 6         2      2           5                  1                3
      Human Torch                  2         2      5           2                  5                3
      Invisible Woman              3         2      3           6                  5                3
      The Thing                    3         6      2           6                  1                5
      Luke Cage                    3         4      2           5                  1                4
      She Hulk                     3         7      3           6                  1                4
      Ms Marvel                    2         6      2           6                  1                4
      Daredevil                    3         3      2           2                  4                5
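For experimenting with the competitive-learning algorithms later in the talk, the table can be kept as plain feature vectors. A sketch (the names and values come from the table above; the dictionary layout itself is just one convenient choice):

```python
import numpy as np

# Columns: Intelligence, Strength, Speed, Durability, Energy projection, Fighting skills
heroes = {
    "Iron Man":        [6, 6, 5, 6, 6, 4],
    "Spiderman":       [4, 4, 3, 3, 1, 4],
    "Black Panther":   [5, 3, 2, 3, 3, 5],
    "Wolverine":       [2, 4, 2, 4, 1, 7],
    "Thor":            [2, 7, 7, 6, 6, 4],
    "Dr Strange":      [4, 2, 7, 2, 6, 6],
    "Hulk":            [2, 7, 3, 7, 5, 4],
    "Cpt. America":    [3, 3, 2, 3, 1, 6],
    "Mr Fantastic":    [6, 2, 2, 5, 1, 3],
    "Human Torch":     [2, 2, 5, 2, 5, 3],
    "Invisible Woman": [3, 2, 3, 6, 5, 3],
    "The Thing":       [3, 6, 2, 6, 1, 5],
    "Luke Cage":       [3, 4, 2, 5, 1, 4],
    "She Hulk":        [3, 7, 3, 6, 1, 4],
    "Ms Marvel":       [2, 6, 2, 6, 1, 4],
    "Daredevil":       [3, 3, 2, 2, 4, 5],
}
X = np.array(list(heroes.values()), dtype=float)  # shape (16, 6)
```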
  13. Hebbian learning weaknesses
      • Unstable.
      • Prone to growing the weights ad infinitum.
      • Some groups can trigger no response.
      • Some groups may trigger a response from too many neurons.
  14. Learning with concurrency
      • You try to generalize the input vector in the weights vector.
      • Instead of checking the reaction to the input, you check the distance between both vectors.
      • Ideally, each neuron specializes in generalizing one class.
      • Two main strategies:
        – Winner Takes All (WTA)
        – Winner Takes Most (WTM)
  15. Idea behind
      Example $x = (1.0, 2.0, 3.0)$, neuron weights $w = (3.0, 2.0, 2.0)$.
      Distance per component: $d_i = w_i - x_i = (2.0, 0.0, -1.0)$.
      Euclidean distance: $\sqrt{\sum_{i=1}^{n} d_i^2} = \sqrt{5}$.
  16. Idea behind
      Example $x = (1.0, 2.0, 3.0)$, neuron weights $w = (3.0, 2.0, 2.0)$, distance $d_i = w_i - x_i = (2.0, 0.0, -1.0)$.
      Learning coefficient $\eta = 0.100$.
      Learning step: $\Delta w_i = \eta \cdot d_i = (0.2, 0.0, -0.1)$.
  17. Idea behind
      With distance $d_i = (2.0, 0.0, -1.0)$, learning coefficient $\eta = 0.100$ and learning step $\Delta w_i = \eta \cdot d_i = (0.2, 0.0, -0.1)$, the new weights are $w'_i = w_i - \Delta w_i$:
      $2.8 = 3.0 - 0.2$, $2.0 = 2.0 - 0.0$, $2.1 = 2.0 - (-0.1)$.
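A sketch of the update step on slides 15-17, assuming NumPy (the variable names are illustrative):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])         # example (slide 15)
w = np.array([3.0, 2.0, 2.0])         # neuron weights

d = w - x                             # per-component distance: [2.0, 0.0, -1.0]
euclidean = np.sqrt((d ** 2).sum())   # sqrt(5)

eta = 0.100                           # learning coefficient (slide 16)
delta_w = eta * d                     # learning step: [0.2, 0.0, -0.1]
w_new = w - delta_w                   # new weights: [2.8, 2.0, 2.1] (slide 17)
print(d, euclidean, delta_w, w_new)
```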
  18. Learning with concurrency
      • Gives more diverse groups.
      • Less prone to clustering (than Hebb’s).
      • Searches a wider spectrum of answers.
      • A first step towards more complex networks.
  19. Learning with concurrency - weaknesses
      • WTA works best if the teaching examples are evenly distributed in the solution space.
      • WTM works best if the weight vectors are evenly distributed in the solution space.
      • Both can still get stuck in a local optimum.
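To make WTA and WTM concrete, here is a minimal competitive-learning sketch (this is not the talk's reference code; the function name, the rank-based neighbourhood weighting and all parameter values are illustrative assumptions):

```python
import numpy as np

def train_competitive(X, n_neurons=3, eta=0.1, epochs=20, winners=1, seed=0):
    """Competitive learning on the rows of X.

    winners=1 gives Winner Takes All (only the closest neuron learns);
    winners>1 gives a simple Winner Takes Most variant, where the k closest
    neurons learn with a strength that decays with their rank.
    """
    rng = np.random.default_rng(seed)
    # Weights start as random vectors in the data range (random init, as the talk notes)
    W = rng.uniform(X.min(), X.max(), size=(n_neurons, X.shape[1]))
    for _ in range(epochs):
        for x in X:
            dists = np.linalg.norm(W - x, axis=1)      # distance of each neuron to the example
            order = np.argsort(dists)                  # neurons ranked by closeness
            for rank, i in enumerate(order[:winners]):
                strength = 1.0 / (1.0 + rank)          # rank-based weighting (assumption)
                W[i] -= eta * strength * (W[i] - x)    # move the neuron towards the example
    return W

# Usage sketch: cluster the Marvel feature vectors from slide 12 into 3 groups
# prototypes = train_competitive(X, n_neurons=3, winners=1)   # WTA
# prototypes = train_competitive(X, n_neurons=3, winners=2)   # WTM-style
```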
  20. Kohonen’s self-organizing map
      • The most popular self-organizing network with a concurrency algorithm.
      • It teaches groups of neurons with the WTM algorithm.
      • Special features:
        – Neurons are organised in a grid.
        – Nevertheless, they are treated as a single layer.
  21. Kohonen’s self-organizing map
      $w_{ij}(s+1) = w_{ij}(s) + \Theta(k_{best}, i, s) \cdot \eta(s) \cdot (I_j(s) - w_{ij}(s))$
      where $s$ is the epoch number, $k_{best}$ is the best-matching neuron, $w_{ij}(s)$ is the j-th weight of the i-th neuron, $\Theta(k_{best}, i, s)$ is the neighbourhood function, $\eta(s)$ is the learning coefficient for epoch $s$, and $I_j(s)$ is the j-th chunk of the example for epoch $s$.
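A sketch of the update rule above for a one-dimensional grid of neurons, assuming a Gaussian neighbourhood function for Θ (the slides do not fix a particular Θ, grid shape or schedule, so those choices are assumptions):

```python
import numpy as np

def som_epoch(W, I, eta, sigma):
    """One Kohonen update: W has one weight row per neuron on a 1-D grid."""
    k_best = np.argmin(np.linalg.norm(W - I, axis=1))        # best-matching neuron
    grid_dist = np.abs(np.arange(len(W)) - k_best)           # distance on the grid
    theta = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))     # Gaussian neighbourhood Θ
    return W + theta[:, None] * eta * (I - W)                # w(s+1) = w(s) + Θ·η·(I - w(s))

# Usage sketch: shrink eta and sigma from epoch to epoch
# W = np.random.default_rng(0).random((10, 6))
# for s, I in enumerate(X):
#     W = som_epoch(W, I, eta=0.5 / (1 + s), sigma=3.0 / (1 + s))
```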
  22. SOM model
      Figure: SOM model. By Mcld - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=10373592
  23. Common weaknesses of artificial neuron systems
      • We are still dependent on randomized weights.
      • All algorithms can get stuck in a local optimum.
  24. Call to Action!
      • Implement the WTA algorithm in your favourite programming language.
      • Add a ranking to WTA to create the WTM algorithm.
      • Don’t worry about performance.
      • Check how it works: generate your own data sets, or get simple ones from the Internet.
      • You’ll gain intuition and understand the basic mathematical apparatus.
      • If you feel comfortable, try coding a self-organising map from WTM.
  25. Bibliography
      • Presentation + code: https://bitbucket.org/medwith/public/downloads/mluvr-coffeJugPart2.zip
      • https://www.coursera.org/learn/machine-learning
      • https://www.coursera.org/specializations/deep-learning
      • Math for Machine Learning - Amazon Training and Certification
      • Linear and Logistic Regression - Amazon Training and Certification
      • Grus J., Data Science from Scratch: First Principles with Python
      • Patterson J., Gibson A., Deep Learning: A Practitioner's Approach
      • Trask A., Grokking Deep Learning
      • Stroud K. A., Booth D. J., Engineering Mathematics
      • https://github.com/massie/octave-nn - neural network Octave implementation
      • https://www.desmos.com/calculator/dnzfajfpym - Nanananana … Batman equation ;)
      • https://xkcd.com/605/ - extrapolating ;)
      • http://dilbert.com/strip/2013-02-02 - Dilbert & Machine Learning