Slide 1

Welcome to the Covid Coding Program

Slide 2

Let’s start with the basics of Machine Learning! I’m Charmi Chokshi, an ML Engineer at Shipmnts.com and a passionate tech speaker. A critical thinker and your mentor of the day! Let’s connect: @CharmiChokshi

Slide 3

Let’s classify some points!

Slide 4

Which hyperplane to choose?

Slide 5

Which hyperplane to choose?

Slide 6

Margin

Slide 7

Which hyperplane to choose?

Slide 8

Which hyperplane to choose?

Slide 9

Robust to Outliers

Slide 10

Which hyperplane to choose?

Slide 11

Tada!! Introducing a new feature

Slide 12

Kernel trick
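
A minimal sketch of the idea, assuming scikit-learn (not code from the slides): make_circles produces ring-shaped classes that no straight line can separate, and the RBF kernel lets SVC separate them without ever constructing the higher-dimensional features explicitly.

from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# A linear SVM cannot do much better than chance here...
linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)

# ...while the RBF kernel implicitly maps the points into a space
# where a separating hyperplane exists (the kernel trick).
rbf_acc = SVC(kernel="rbf").fit(X, y).score(X, y)

print(f"linear: {linear_acc:.2f}  rbf: {rbf_acc:.2f}")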

Slide 13

Hyperplane

Slide 14

Large Margin
- In logistic regression, we take the output of the linear function and squash the value into the range [0, 1] using the sigmoid function. If the squashed value is greater than a threshold value (0.5), we assign it the label 1; otherwise we assign it the label 0.
- In SVM, we take the output of the linear function and, if that output is greater than 1, we identify it with one class; if the output is less than -1, we identify it with the other class. Since the threshold values are changed to 1 and -1 in SVM, we obtain this reinforcement range of values ([-1, 1]), which acts as the margin.
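
A minimal sketch, assuming NumPy, of the two decision rules just described; the names sigmoid, logistic_predict, and svm_predict are illustrative, not from the slides.

import numpy as np

def sigmoid(z):
    # Squash a raw linear score into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def logistic_predict(w, b, x):
    # Logistic regression: squash w.x + b, then threshold at 0.5.
    return 1 if sigmoid(np.dot(w, x) + b) > 0.5 else 0

def svm_predict(w, b, x):
    # SVM: the sign of the raw score decides the class at prediction time;
    # during training, scores of +1 and -1 mark the edges of the margin,
    # so the band (-1, 1) is what the learner tries to keep free of points.
    return 1 if np.dot(w, x) + b >= 0 else -1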

Slide 15

Cost Function
- In the SVM algorithm, we look to maximize the margin between the data points and the hyperplane. The loss function that helps maximize the margin is the hinge loss.
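
In standard notation (the slide shows only the prose), the hinge loss for one example with label y in {-1, +1} and raw score f(x) can be written in LaTeX as:

c(x, y, f(x)) = \max\big(0,\; 1 - y \cdot f(x)\big)

It is zero once y \cdot f(x) \ge 1, i.e., when the example is classified correctly and sits on or beyond the margin, and it grows linearly as the score moves the wrong way.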

Slide 16

Cost Function
- The cost is 0 if the predicted value and the actual value are of the same sign and the example lies on or beyond the margin; otherwise, we calculate the loss value. We also add a regularization parameter to the cost function. The objective of the regularization parameter is to balance margin maximization and loss. After adding the regularization parameter, the cost function looks as below.
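
The slide’s formula image is not reproduced in this transcript; the standard regularized hinge-loss objective it refers to, in LaTeX, is:

\min_{w} \;\; \lambda \lVert w \rVert^{2} \; + \; \sum_{i=1}^{n} \max\big(0,\; 1 - y_i \langle x_i, w \rangle\big)

where \lambda controls the trade-off between maximizing the margin (small \lVert w \rVert) and minimizing the total hinge loss.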

Slide 17

Pros and Cons
Pros:
○ It works really well when there is a clear margin of separation.
○ It is effective in high-dimensional spaces.
○ It uses a subset of the training points in the decision function (called support vectors), so it is also memory efficient.
Cons:
○ It doesn’t perform well on large data sets because the required training time is higher.
○ It also doesn’t perform very well when the data set has more noise, i.e., when the target classes overlap.
○ SVM doesn’t directly provide probability estimates; these are calculated using an expensive five-fold cross-validation, available through the related SVC class of the Python scikit-learn library.
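
A minimal scikit-learn sketch of the SVC usage mentioned in the last bullet; the toy data set is illustrative only. Setting probability=True is what triggers the expensive internal cross-validated calibration.

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Toy data, purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# probability=True enables probability estimates via internal
# cross-validated Platt scaling, which makes fitting noticeably slower.
clf = SVC(kernel="rbf", probability=True)
clf.fit(X, y)

print(clf.predict(X[:3]))        # hard class labels
print(clf.predict_proba(X[:3]))  # calibrated class probabilities
print(len(clf.support_))         # only these support vectors are stored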

Slide 18

Comparing 10 Classification Algorithms

Slide 19

Thank You!