ISI Programming Course - 06 - Scikit Learn

Jungwon Seo

October 22, 2018

  1. Supervised vs. Unsupervised

     Supervised Learning       | Unsupervised Learning
     Known number of classes   | Unknown number of classes
     Uses training dataset     | Uses input dataset
     For prediction            | For analysis
  2. Supervised Learning

     [Diagram: training feeds {features, class} pairs from the training data
     into the Model; testing feeds {features} from the testing data into the
     trained Model, which outputs the result: Class A? Class B? Class C?]
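     A minimal sketch of this training/testing flow in scikit-learn, assuming
     the bundled Iris data and a logistic regression as stand-ins (neither is
     named on the slide):

       from sklearn.datasets import load_iris
       from sklearn.linear_model import LogisticRegression
       from sklearn.model_selection import train_test_split

       X, y = load_iris(return_X_y=True)    # {features, class} pairs

       # Split into training data and testing data.
       X_train, X_test, y_train, y_test = train_test_split(
           X, y, test_size=0.3, random_state=0)

       model = LogisticRegression(max_iter=1000)
       model.fit(X_train, y_train)          # Training: learn from {features, class}
       result = model.predict(X_test)       # Testing: {features} -> predicted class
       print(model.score(X_test, y_test))   # Result: accuracy on unseen data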
  3. Classifier vs. Regressor

     [Diagram: from the same student info, a trained classifier outputs
     Pass / Fail, while a trained regressor outputs a score (e.g. 95/100).]
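     A hedged sketch of the contrast, on invented "student info" features
     (study hours, attendance rate); all numbers here are made up for
     illustration:

       import numpy as np
       from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

       X = np.array([[2, 0.5], [10, 0.9], [4, 0.6], [12, 0.95]])  # hours, attendance
       pass_fail = np.array([0, 1, 0, 1])   # discrete class label
       score = np.array([40, 92, 55, 97])   # continuous score out of 100

       clf = DecisionTreeClassifier().fit(X, pass_fail)   # classifier
       reg = DecisionTreeRegressor().fit(X, score)        # regressor

       print(clf.predict([[8, 0.8]]))   # a label, e.g. 1 (Pass)
       print(reg.predict([[8, 0.8]]))   # a number, e.g. a score near 92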
  4. Classification algorithms

     • Linear Classifiers: Logistic Regression, Naive Bayes Classifier
     • Decision Trees
     • Random Forest
     • Support Vector Machines
     • Neural Networks
     • Nearest Neighbor
     • Genetic Algorithm
     • …
  5. Logistic Regression

     • We train the model to find the weights (B).
     • The goal is to find the function that divides the data best (see the
       sketch below).
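     A minimal sketch, assuming the bundled breast-cancer dataset (our choice,
     not the slide's): after fitting, the learned weights B are exposed as
     coef_ and intercept_:

       from sklearn.datasets import load_breast_cancer
       from sklearn.linear_model import LogisticRegression

       X, y = load_breast_cancer(return_X_y=True)
       model = LogisticRegression(max_iter=5000).fit(X, y)  # training finds B

       print(model.coef_)        # learned weights B1..Bn, one per feature
       print(model.intercept_)   # bias term B0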
  6. Decision Tree

     [Flowchart: the joke "Mansplaining Decision Tree", whose split questions
     ("Did she ask?", "Do you know better than her?", "Did you ask if she
     needed it explained?") lead down Yes/No branches to Mansplaining or
     Not mansplaining leaves.]
  7. Decision Tree

     • The most important part is determining the split node, e.g. Gender?
       F : M, or Height? over 170 : under 170.
     • There are several indices you can use, e.g. entropy, Gini,
       misclassification error.
     • The training data is used to build the tree (see the sketch below).
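     A sketch of this in scikit-learn: the split index named on the slide is
     the criterion parameter ("gini" or "entropy"); the dataset is our choice:

       from sklearn.datasets import load_iris
       from sklearn.tree import DecisionTreeClassifier, export_text

       X, y = load_iris(return_X_y=True)
       tree = DecisionTreeClassifier(criterion="entropy", max_depth=3)
       tree.fit(X, y)            # the training data builds the tree

       print(export_text(tree))  # inspect the split nodes it chose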
  8. Random Forest

     • The randomness comes in while building each tree.
     • It is an improved version of the decision tree.
     • But you cannot explain its decision process (it is a black box); a
       sketch follows.
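     A sketch showing where the randomness enters: bootstrap row sampling per
     tree and a random feature subset at each split (the parameter names are
     real scikit-learn ones; the dataset is our choice):

       from sklearn.datasets import load_iris
       from sklearn.ensemble import RandomForestClassifier

       X, y = load_iris(return_X_y=True)
       forest = RandomForestClassifier(
           n_estimators=100,     # number of trees
           bootstrap=True,       # random row sample for each tree
           max_features="sqrt",  # random feature subset at each split
           random_state=0,
       ).fit(X, y)

       # feature_importances_ gives a partial peek into the black box.
       print(forest.feature_importances_)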
  9. Support Vector Machine

     1. Find the border with the largest margin.
     2. For the non-linear case, map inputs into high-dimensional feature
        spaces (see the sketch below).
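     A sketch of both points, with make_moons as our stand-in dataset: a
     linear max-margin border, and the kernel trick for the non-linear case:

       from sklearn.datasets import make_moons
       from sklearn.svm import SVC

       X, y = make_moons(noise=0.2, random_state=0)

       linear_svm = SVC(kernel="linear", C=1.0).fit(X, y)    # max-margin border
       rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X, y)  # implicit high-dim mapping

       print(linear_svm.score(X, y))  # struggles: moons are not linearly separable
       print(rbf_svm.score(X, y))     # the RBF kernel handles the non-linearity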
  10. SVM vs. Logistic Reg.

     • Logistic regression focuses on maximizing the probability of the data.
       The farther the data lies from the separating hyperplane (on the
       correct side), the happier LR is.
     • An SVM tries to find the separating hyperplane that maximizes the
       distance from the closest points (the support vectors) to the
       hyperplane. If a point is not a support vector, it doesn't really
       matter.
     http://www.cs.toronto.edu/~kswersky/wp-content/uploads/svm_vs_lr.pdf
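     A small sketch of the contrast on invented blob data: LR assigns a
     probability to every point, while the fitted SVM keeps only the support
     vectors that define its margin:

       from sklearn.datasets import make_blobs
       from sklearn.linear_model import LogisticRegression
       from sklearn.svm import SVC

       X, y = make_blobs(n_samples=100, centers=2, random_state=0)

       lr = LogisticRegression().fit(X, y)
       svm = SVC(kernel="linear").fit(X, y)

       print(lr.predict_proba(X[:3]))       # every point gets a probability
       print(len(svm.support_vectors_), "of", len(X),
             "points are support vectors")  # the rest don't matter to the SVM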
  11. K-Nearest Neighbor

     1. How do we determine K?
     2. What kind of distance shall we use? Euclidean? Manhattan? (Both
        questions map to constructor arguments in the sketch below.)
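     A minimal sketch with our own dataset choice:

       from sklearn.datasets import load_iris
       from sklearn.neighbors import KNeighborsClassifier

       X, y = load_iris(return_X_y=True)

       # K is n_neighbors; the distance is the metric parameter.
       knn_euclid = KNeighborsClassifier(n_neighbors=5, metric="euclidean").fit(X, y)
       knn_manhat = KNeighborsClassifier(n_neighbors=5, metric="manhattan").fit(X, y)

       print(knn_euclid.score(X, y))
       print(knn_manhat.score(X, y))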
  12. 2. Calculating similarities

     • Minkowski Distance
     • Manhattan Distance
     • Euclidean Distance
     • Chebyshev Distance
     • Cosine Distance
     Source Code
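     A sketch of the listed distances on two toy vectors, using SciPy (the
     vectors are invented here); note that Minkowski with p=1 is Manhattan and
     with p=2 is Euclidean:

       from scipy.spatial import distance

       a, b = [1, 2, 3], [4, 0, 3]

       print(distance.minkowski(a, b, p=3))  # Minkowski (order 3)
       print(distance.cityblock(a, b))       # Manhattan
       print(distance.euclidean(a, b))       # Euclidean
       print(distance.chebyshev(a, b))       # Chebyshev
       print(distance.cosine(a, b))          # Cosine distance = 1 - cosine similarity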