Intro to Machine Learning Classification with Python

How to get started with classification, whether you're a developer or an aspiring data scientist. Start by breaking the problem down and asking the right questions.

Dana Engebretson

August 24, 2016

Transcript

  1. Objectives (@bigdana)

    classification vs regression; navigate better: use an API or scikit-learn?; multi-class vs multi-label classification; performance evaluation; down sampling; overfitting
  2. The Data

    9 Democratic debates and 12 Republican debates, annotated with: “applause”, “cheering”, “laughter”, “booing”. Source: http://www.presidency.ucsb.edu/debates.php
    Hillary: total words: 56118; total turns: 564
    Donald: total words: 40699; total turns: 774
  3. Binary Classification: Hillary vs Donald

    “I think the big problem this country has is being politically correct.”
  4. Binary Classification: No Applause vs Applause

    “Oh, for goodness — that's not going to happen. I'm not even answering that question.”
  5. Multi-class classification: positive, controversial, neutral, negative

    “First of all, there's nobody on this stage that's more pro Israel than I am. OK. There's nobody. I am pro-Israel.”
    positive: laughter, cheering OR applause
    controversial: (laughter, cheering OR applause) AND booing
    negative: booing
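The label rules on this slide can be sketched as a small function. This is a minimal illustration, not the speaker's code: the function name `reaction_label` is hypothetical, the reading of the rule (any of laughter, cheering, or applause counts as positive; positive plus booing counts as controversial) is an interpretation of the slide, and treating a turn with no annotations as "neutral" is an assumption the slide does not state.

```python
POSITIVE_REACTIONS = {"laughter", "cheering", "applause"}


def reaction_label(annotations):
    """Map a turn's audience annotations to one of four classes.

    Assumption: a turn with no annotations at all is "neutral".
    """
    found = set(annotations)
    has_positive = bool(found & POSITIVE_REACTIONS)
    has_booing = "booing" in found

    if has_positive and has_booing:
        return "controversial"
    if has_positive:
        return "positive"
    if has_booing:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tags in (["applause"], ["cheering", "booing"], ["booing"], []):
        print(tags, "->", reaction_label(tags))
```

Each turn gets exactly one class here, which is what makes this multi-class rather than multi-label; a multi-label setup would instead keep the four annotation flags as separate targets.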
  6. Try a Machine Learning API

    IBM Watson; Azure Machine Learning API; AWS API; BigML; Google Prediction API
  7. Pros and Cons of Google Prediction API

    pros: hosted model; stream training data
    cons: standard rate limit of 1 prediction per second; the confusion matrix is not super intuitive for evaluating performance
  8. What you’ll need to know/learn

    pandas; feature selection; how to work with an API (OAuth credentials); how to interpret a confusion matrix
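Since interpreting a confusion matrix is listed as a prerequisite, here is a minimal sketch of what its four cells mean for the binary applause case. The labels below are made up for illustration (1 = applause, 0 = no applause), and `confusion_counts` is a hypothetical helper, not part of any API named in the deck.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count the four cells of a binary confusion matrix (1 = applause)."""
    pairs = Counter(zip(y_true, y_pred))
    return {
        "tp": pairs[(1, 1)],  # predicted applause, and there was applause
        "fp": pairs[(0, 1)],  # predicted applause, but there was none
        "fn": pairs[(1, 0)],  # missed a turn that did get applause
        "tn": pairs[(0, 0)],  # correctly predicted no applause
    }


# Hypothetical true labels and predictions for eight turns.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]

cells = confusion_counts(y_true, y_pred)

if __name__ == "__main__":
    print(cells)
```

Reading the cells by name (rather than by row and column position) sidesteps the question of which axis a given tool uses for the true labels.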
  9. I want to work with prebuilt models

    Scikit-learn models: Logistic Regression; Support Vector Machines; Stochastic Gradient Descent; Nearest Neighbors; Decision Trees; Ensemble Methods: Gradient Tree Boosting
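To make "prebuilt models" concrete, here is a minimal sketch that trains one of the listed scikit-learn models, Logistic Regression, on text. The sentences and labels are invented placeholders, not debate data, and turning text into TF-IDF features is an assumption of this sketch; the deck does not say which features were used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: 1 = applause, 0 = no applause.
texts = [
    "we will create jobs for every family",
    "thank you all for being here tonight",
    "let me just answer that question",
    "we are going to win and win big",
    "well I would have to look at the details",
    "I am not going to respond to that",
]
labels = [1, 1, 0, 1, 0, 0]

# Vectorize the text, then fit the classifier, as one estimator.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

pred = model.predict(["we will win for every family"])

if __name__ == "__main__":
    print(pred)
```

Swapping in another listed model (for example `sklearn.svm.LinearSVC` or `sklearn.ensemble.GradientBoostingClassifier`) only changes the last step of the pipeline.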
  10. What you will need to know/learn

    pandas; feature selection; down sampling; which models to play with and how; how to interpret precision, recall and f1 scores for performance evaluation
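Down sampling here means shrinking the larger class so both classes are the same size before training. A minimal pandas sketch, assuming a DataFrame with a text column and a 0/1 label column; the column names, the `downsample` helper, and the placeholder rows are all hypothetical.

```python
import pandas as pd


def downsample(df, label_col, seed=0):
    """Randomly drop rows so every class has as many rows as the rarest one."""
    n_min = df[label_col].value_counts().min()
    parts = [
        group.sample(n=n_min, random_state=seed)
        for _, group in df.groupby(label_col)
    ]
    # Shuffle so the classes are not stored in contiguous blocks.
    return pd.concat(parts).sample(frac=1, random_state=seed).reset_index(drop=True)


# Hypothetical imbalanced data: 2 applause turns, 8 without applause.
df = pd.DataFrame({
    "text": [f"turn {i}" for i in range(10)],
    "applause": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
})

balanced = downsample(df, "applause")

if __name__ == "__main__":
    print(balanced["applause"].value_counts())
```

The trade-off is that the discarded majority-class rows are simply lost, so this suits datasets where the majority class is plentiful.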
  11. I want to learn how to build my own models!

    Machine Learning by Andrew Ng on Coursera: https://www.coursera.org/learn/machine-learning
    TensorFlow: https://github.com/aymericdamien/TensorFlow-Examples
    Hybrid approach: Scikit-Flow
  12. What you will need to know/learn

    Machine Learning by Andrew Ng on Coursera: matrix algebra; calculus
    TensorFlow: a machine learning course?
  13. 80/20: Training set / Test set (Clinton)

    X:
    “Well, first of all, it's great to be here in New York, and I am delighted to...... have this chance to discuss the issues that are important to our future. I was so honored to serve as a”
    “Well, it is true that now that the spotlight is pretty bright here in New York, some things have been said and Senator Sanders did call me unqualified. I've been called a”
    “Well, let me...”
    “...let me just say...”
    “...let me...let me say...”
    y: 1 1 0 0 0
  14. 80/20: Training set / Test set (Trump)

    X:
    “I fully understand.”
    “I cannot say. I have to respect the person that, if it's not me, the person that wins, if I do win, and I'm leading by quite a bit, that's what I want to do. I can totally make that pledge. If”
    “Well, I've given him plenty of money.”
    “I will not make the pledge at this time.”
    “Only Rosie O'Donnell.”
    y: 0 0 0 0 0
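An 80/20 split like the one on these two slides can be sketched with scikit-learn's `train_test_split`. The turns and labels below are placeholders rather than the debate data, and the fixed `random_state` is only there to make the split repeatable.

```python
from sklearn.model_selection import train_test_split

# Hypothetical data: X holds the turn text, y the applause label (1 or 0).
X = [f"turn {i}" for i in range(10)]
y = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]

# Hold out 20% of the turns for testing; train on the remaining 80%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

if __name__ == "__main__":
    print(len(X_train), "training turns,", len(X_test), "test turns")
```

The model only ever sees the training set; scoring it on the held-out test set is what reveals overfitting.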
  15. Excellent Precision

    If you say it gets applause, it does. You might miss some turns that should get applause.
  16. Excellent Recall

    You are really good at catching all the applause! You sometimes flag a lot of turns that aren't applause, though.
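The two slides above describe precision and recall in words; the standard formulas are precision = TP / (TP + FP) and recall = TP / (TP + FN), with F1 as their harmonic mean. A minimal sketch follows, using made-up counts to mimic each slide's scenario; `precision_recall_f1` is a hypothetical helper name.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# "Excellent precision": every predicted applause is real, but half are missed.
high_precision = precision_recall_f1(tp=5, fp=0, fn=5)

# "Excellent recall": every applause is caught, but half the predictions are wrong.
high_recall = precision_recall_f1(tp=10, fp=10, fn=0)

if __name__ == "__main__":
    print("precision, recall, f1:", high_precision)
    print("precision, recall, f1:", high_recall)
```

Both scenarios end up with the same F1 score, which is why it helps to look at precision and recall separately rather than at F1 alone.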