Munich Datageeks - Introduction to SVM using Python

Accompanying IPython Notebook found at:
http://bit.ly/1g7VEfG

Miguel Cabrera

March 25, 2014

Transcript

  1. INTRODUCTION TO SUPPORT VECTOR MACHINES USING PYTHON USING THOR'S HAMMER

    Miguel Cabrera @mfcabrera http://mfcabrera.com
  2. OVERVIEW

    •  Tools
    •  Intro to Machine Learning
    •  Intuition behind SVM
    •  Basic usage
    •  Some situations
    •  Examples
  3. TOOLS Every hero needs weapons

  4. None
  5. None
  6. From http://ipython.org:

    •  Powerful interactive shells (terminal and Qt-based).
    •  A browser-based notebook with support for code, text, mathematical expressions, inline plots and other rich media.
    •  Support for interactive data visualization and use of GUI toolkits.
    •  Flexible, embeddable interpreters to load into your own projects.
    •  Easy to use, high performance tools for parallel computing.
  7. Machine LEARNING

  8. Machine Learning

    •  Supervised: Classification, Regression
    •  Unsupervised: Clustering, Feature Learning
    •  Reinforcement Learning, Recommender Systems, etc.

    No reason for Natalie Portman to be
  9. Supervised x1 x2

  10. UNSUPERVISED x1 x2

  11. Supervised learning workflow (diagram)

    Training: text, documents, images, sounds + labels → feature vectors → machine learning algorithm → predictive model.
    Prediction: new text, documents, images, sounds → feature vector → predictive model → expected label.
    Adapted from: https://speakerdeck.com/ogrisel/machine-learning-in-python-with-scikit-learn
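The workflow on this slide can be sketched in a few lines of scikit-learn. This is only an illustration under assumptions: the toy corpus, the labels, and the choice of `TfidfVectorizer` with `LinearSVC` are invented here and are not taken from the talk or its notebook.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus, invented for illustration only.
train_texts = [
    "win money now",
    "cheap pills offer",
    "free prize claim now",
    "meeting at noon tomorrow",
    "lunch with the team",
    "project report attached",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Training: raw text -> feature vectors -> learning algorithm -> predictive model.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_texts, train_labels)

# Prediction: new text -> feature vector -> predictive model -> predicted label.
new_texts = ["claim your free money", "team meeting tomorrow"]
predicted = model.predict(new_texts)
print(list(predicted))
```

Wrapping the vectorizer and the classifier in one pipeline means new text goes through exactly the same feature extraction that was fitted on the training data.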
  12. Applications Source: https://speakerdeck.com/ogrisel/trends-in-machine-learning-2

  13. Spam Classification

  14. TOPIC Classification

  15. Sentiment analysis

  16. OBJECT CLASSIFICATION

  17. Support vector machines

  18. Support vector machines?

    •  Effective in high dimensional spaces.
    •  Still effective in cases where number of dimensions is greater than the number of samples.
    •  Uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
    •  Easy to use and train.
    •  Versatile: different kernel functions can be specified for the decision function.
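To make the "subset of training points" idea concrete, here is a minimal sketch that fits a linear SVM and counts its support vectors. The synthetic data from `make_blobs` and all parameter values are assumptions chosen for illustration, not material from the talk.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data (an assumption for this sketch).
X, y = make_blobs(n_samples=100, centers=2, cluster_std=0.6, random_state=0)

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# Only the support vectors are kept for the decision function.
n_support = len(clf.support_vectors_)
print(n_support, "support vectors out of", len(X), "training points")
```

The fitted model exposes the retained points as `support_vectors_` and the per-class counts as `n_support_`.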
  19. WHEN?

  20. Source: http://peekaboo-vision.blogspot.de/2013/01/machine-learning-cheat-sheet-for-scikit.html

  21. THOR LIKES HIS HAMMER....

  22. FOR THE FOLLOWING SECTIONS SEE THE IPYTHON NOTEBOOK FOUND AT:

    http://bit.ly/1g7VEfG
  23. INTUITION BEHIND SVM
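The notebook covers this section in detail; as a rough stand-in, the sketch below illustrates the usual maximum-margin intuition for a linear SVM. The six points are made up, and the very large `C` is an assumption used to approximate a hard margin. For a linear kernel the decision function is w·x + b and the margin width is 2 / ||w||.

```python
import numpy as np
from sklearn.svm import SVC

# Hand-picked, linearly separable points (invented for this sketch).
X = np.array([[0, 0], [1, 1], [0, 1], [3, 3], [4, 4], [4, 3]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane

# Width of the margin between the two classes.
margin_width = 2.0 / np.linalg.norm(w)

# A point halfway between the closest opposite-class points (1, 1) and (3, 3).
midpoint_score = clf.decision_function([[2.0, 2.0]])[0]

print("w =", w, "b =", b)
print("margin width =", margin_width)
print("support vectors:", clf.support_vectors_)
```

Here the closest points of the two classes are (1, 1) and (3, 3), so the widest possible margin equals the distance between them and the hyperplane passes through their midpoint.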

  24. KERNELS
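A small sketch of why kernels matter, assuming synthetic concentric-circle data from `make_circles` (not the data used in the notebook): a linear kernel cannot separate the two rings, while an RBF kernel can.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Same data, two different kernels; accuracy is measured on the training set.
linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)
rbf_acc = SVC(kernel="rbf", gamma="scale").fit(X, y).score(X, y)

print("linear kernel accuracy:", linear_acc)
print("RBF kernel accuracy:", rbf_acc)
```

Training accuracy is used only to keep the sketch short; a held-out set or cross-validation gives a fairer comparison.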

  25. MULTICLASS
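As a hedged illustration of multiclass handling in scikit-learn (the digits dataset is an assumption, chosen because it has ten classes): `SVC` trains one-vs-one classifiers internally, and `decision_function_shape` only controls whether the scores are reported per class pair or per class.

```python
from sklearn.datasets import load_digits
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
n_classes = len(set(y))

# "ovo": one score per pair of classes, n_classes * (n_classes - 1) / 2 columns.
ovo = SVC(kernel="linear", decision_function_shape="ovo").fit(X, y)
# "ovr": scores aggregated to one column per class.
ovr = SVC(kernel="linear", decision_function_shape="ovr").fit(X, y)

ovo_shape = ovo.decision_function(X[:1]).shape
ovr_shape = ovr.decision_function(X[:1]).shape

print("classes:", n_classes)
print("one-vs-one scores:", ovo_shape)
print("one-vs-rest style scores:", ovr_shape)
```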

  26. UNBALANCED DATA
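One common way to handle unbalanced classes in scikit-learn is the `class_weight` parameter. The sketch below is an assumption about what such an example could look like, using invented synthetic data with a 20:1 class ratio. It compares minority-class recall with and without `class_weight="balanced"`.

```python
import numpy as np
from sklearn.metrics import recall_score
from sklearn.svm import SVC

# Invented data: 500 majority samples around (0, 0), 25 minority samples around (2, 2).
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(500, 2), 0.5 * rng.randn(25, 2) + [2.0, 2.0]])
y = np.array([0] * 500 + [1] * 25)

plain = SVC(kernel="linear", C=1.0).fit(X, y)
# "balanced" scales C inversely to class frequency, so minority errors cost more.
balanced = SVC(kernel="linear", C=1.0, class_weight="balanced").fit(X, y)

# Recall on the minority class (label 1), measured on the training data.
recall_plain = recall_score(y, plain.predict(X))
recall_balanced = recall_score(y, balanced.predict(X))

print("minority recall, default weights:", recall_plain)
print("minority recall, balanced weights:", recall_balanced)
```

Higher minority recall usually comes at the price of more false positives on the majority class, so the right weighting depends on the application.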

  27. TRAINING AND CROSS VALIDATION
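A minimal sketch of cross-validated training and parameter search, assuming the iris dataset and a small, arbitrary grid over `C` and `gamma` (neither is taken from the notebook). It uses `cross_val_score` and `GridSearchCV` from `sklearn.model_selection`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# 5-fold cross-validation of a single, fixed configuration.
scores = cross_val_score(SVC(kernel="rbf", C=1.0, gamma="scale"), X, y, cv=5)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())

# Grid search: every (C, gamma) combination is scored with 5-fold cross-validation.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
```

After fitting, `search` is refit on the full data with the best parameters and can be used directly for prediction.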

  28. Conclusion

    •  (I)Python / SciPy / NumPy / scikit-learn are awesome :D!
    •  SVM is a mature algorithm, straightforward to use, and works well in most cases.
    •  Being able to use kernels allows SVM to learn complex decision boundaries.
    •  Using LibSVM / LibLinear-based libraries allows reusing models across languages, or at least prototyping in Python.
  29. QUESTIONS?

  30. THANK YOU