Munich Datageeks - Introduction to SVM using Python

Accompanying IPython Notebook found at:
http://bit.ly/1g7VEfG

Miguel Cabrera

March 25, 2014

Transcript

  1. INTRODUCTION TO Support vector machines
    USING PYTHON
    USING THOR‘s HAMMER
    Miguel Cabrera
    @mfcabrera
    http://mfcabrera.com

  2. OVERVIEW
    Tools
    Intro to Machine Learning
    Intuition behind SVM
    Basic usage
    Some situations
    Examples

  3. TOOLS
    Every hero needs weapons

  4. From http://ipython.org:
    •  Powerful interactive shells (terminal
    and Qt-based).
    •  A browser-based notebook with
    support for code, text, mathematical
    expressions, inline plots and other
    rich media.
    •  Support for interactive data
    visualization and use of GUI toolkits.
    •  Flexible, embeddable interpreters to
    load into your own projects.
    •  Easy to use, high performance tools
    for parallel computing.

  5. Machine LEARNING

  6. Machine Learning
    Supervised
    Classification
    Regression
    Unsupervised
    Clustering
    Feature Learning
    Reinforcement Learning, Recommender Systems, etc.
    No reason for Natalie Portman to be

  7. Supervised
    x1
    x2

  8. UNSUPERVISED
    x1
    x2

  9. Supervised (workflow diagram)
    Training: text, documents, images, sounds + labels
    -> feature vectors -> machine learning algorithm -> predictive model
    Prediction: new text, document, images, sounds
    -> feature vector -> predictive model -> expected label
    Adapted from:
    https://speakerdeck.com/ogrisel/machine-learning-in-python-with-scikit-learn
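
The workflow in the diagram (labelled training documents turned into feature vectors, an algorithm fit to them, and the resulting model applied to new documents) can be sketched with scikit-learn. This is a minimal illustration rather than the notebook's code: the tiny corpus, its labels, and the choice of `TfidfVectorizer` with `LinearSVC` are assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus: label 1 = spam, label 0 = not spam.
train_docs = [
    "win money now",
    "cheap pills win prize",
    "free money prize",
    "meeting at noon tomorrow",
    "lunch with the team",
    "project meeting notes",
]
train_labels = [1, 1, 1, 0, 0, 0]

# Training: text -> feature vectors -> learning algorithm -> predictive model.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_docs, train_labels)

# Prediction: new text -> feature vector -> predictive model -> label.
new_docs = ["free prize money", "team meeting tomorrow"]
predicted = model.predict(new_docs)
print(predicted)
```

Wrapping the vectorizer and the classifier in one pipeline means new documents go through exactly the same feature extraction as the training data.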

  10. Applications
    Source: https://speakerdeck.com/ogrisel/trends-in-machine-learning-2

  11. Spam Classification

  12. TOPIC Classification

  13. Sentiment analysis

  14. OBJECT CLASSIFICATION

  15. Support vector machines

  16. Support vector machines?
    •  Effective in high dimensional spaces.
    •  Still effective in cases where the number of
    dimensions is greater than the number of
    samples.
    •  Uses a subset of training points in the decision
    function (called support vectors), so it is also
    memory efficient.
    •  Easy to use and train
    •  Versatile: different kernel functions can be
    specified for the decision function.
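
The point about support vectors can be checked directly: after fitting, only a subset of the training points is kept in the decision function. The sketch below assumes scikit-learn's `SVC` and a synthetic two-cluster dataset from `make_blobs`; the dataset and its parameters are illustrative choices, not taken from the talk.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data (illustrative assumption).
X, y = make_blobs(n_samples=200, centers=2, cluster_std=0.6, random_state=0)

# The kernel is a constructor argument; "linear", "poly", "rbf" and
# "sigmoid" are the built-in choices.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# Only the support vectors are stored for use in the decision function.
n_support = clf.support_vectors_.shape[0]
print("training points:", X.shape[0])
print("support vectors:", n_support)
print("support vectors per class:", clf.n_support_)
```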

  17. Source: http://peekaboo-vision.blogspot.de/2013/01/machine-learning-cheat-sheet-for-scikit.html

  18. THOR LIKES HIS HAMMER....

  19. FOR THE FOLLOWING SECTIONS
    SEE THE IPYTHON NOTEBOOK
    FOUND AT:
    http://bit.ly/1g7VEfG

  20. INTUITION BEHIND SVM

  21. UNBALANCED DATA
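
The slide only names the topic; the details are in the notebook. One common way to handle unbalanced classes with an SVM, shown here as an assumption rather than as the notebook's approach, is to weight errors on the rare class more heavily through the `class_weight` parameter of scikit-learn's `SVC`. The data below is synthetic and hypothetical.

```python
import numpy as np
from sklearn.metrics import recall_score
from sklearn.svm import SVC

rng = np.random.RandomState(0)

# Hypothetical unbalanced data: 500 majority points and 25 minority
# points drawn from overlapping Gaussian clusters.
X_major = rng.randn(500, 2)
X_minor = rng.randn(25, 2) + [2.0, 2.0]
X = np.vstack([X_major, X_minor])
y = np.array([0] * 500 + [1] * 25)

# Same model, without and with class weights inversely proportional
# to the class frequencies.
plain = SVC(kernel="linear").fit(X, y)
weighted = SVC(kernel="linear", class_weight="balanced").fit(X, y)

# Recall on the minority class, measured on the training data only
# to keep the sketch short.
recall_plain = recall_score(y, plain.predict(X))
recall_weighted = recall_score(y, weighted.predict(X))
print("minority recall, unweighted:", recall_plain)
print("minority recall, balanced:  ", recall_weighted)
```

Reweighting usually trades some precision on the rare class for better recall, so both should be inspected on held-out data before choosing.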

  22. TRAINING AND
    CROSS VALIDATION
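
The training and cross-validation material itself is in the notebook. As a rough sketch of what this step typically looks like with scikit-learn (the iris dataset, the RBF kernel and the parameter grid are assumptions chosen for illustration), `cross_val_score` estimates accuracy over several folds and `GridSearchCV` uses the same folds to pick `C` and `gamma`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# 5-fold cross-validation of a single SVM configuration.
scores = cross_val_score(SVC(kernel="rbf", C=1.0, gamma="scale"), X, y, cv=5)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())

# Grid search over C and gamma; each candidate is scored by 5-fold CV.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print("best parameters:", search.best_params_)
print("best CV accuracy:", search.best_score_)
```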

  23. Conclusion
    •  (I)Python / SciPy / NumPy / scikit-learn are
    awesome :D!
    •  SVM is a mature algorithm, straightforward to
    use, and works well in most cases.
    •  Being able to use kernels allows SVM to learn
    complex decision boundaries.
    •  Using LibSVM / LibLinear-based libraries allows
    reusing models across languages, or at least
    prototyping in Python.
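
The claim about kernels and complex decision boundaries can be illustrated on data that no straight line separates. The sketch below assumes scikit-learn's `SVC` (which wraps LibSVM) and the synthetic `make_circles` dataset; both are illustrative choices rather than material from the talk.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

# Training accuracy is enough here to show the difference in what
# each kernel is able to represent.
linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)
rbf_acc = SVC(kernel="rbf", gamma="scale").fit(X, y).score(X, y)

print("linear kernel accuracy:", linear_acc)
print("RBF kernel accuracy:   ", rbf_acc)
```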
