Things that you should know in ML

Krunal Kapadiya Engineer @ @krunal3kapadiya #IndiaMLCC #UFest18

What is Machine Learning?
Input, Output, Magic

Types of machine learning
Based on human supervision:
- Supervised ML
- Unsupervised ML
- Semi-supervised ML
- Reinforcement learning

Types of machine learning
Based on learning:
- Online learning
- Offline learning
Based on data patterns:
- Instance-based
- Model-based

Measures of central tendency
- What is Mean? The average value of the dataset
- What is Median? The middle value of the sorted dataset
- What is Mode? The most frequently occurring value in the dataset
http://bit.ly/MeasureCentralTendency
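The three measures can be computed with Python's standard `statistics` module. This is a minimal sketch; the sample `values` list is made up for illustration and is not from the talk.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
values = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(values)      # sum of the values divided by their count
median_value = statistics.median(values)  # middle of the sorted values
mode_value = statistics.mode(values)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

With an even number of values, the median is the average of the two middle values.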

Skewness
Remember:
- Negative skew: mean is less than mode
- Positive skew: mean is greater than mode
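As a rough check of this rule of thumb, the sign of the skew can be computed from the data. The sketch below uses the moment-based skewness coefficient, which is one common definition and an assumption here, since the slide does not name a formula. The `skewness` helper and the sample data are hypothetical.

```python
import statistics

def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance (population)
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data with a long right tail: the mean (3.6) exceeds the mode (2).
right_tailed = [1, 2, 2, 3, 10]
# Mirroring the data flips the tail to the left, so the sign of the skew flips too.
left_tailed = [-v for v in right_tailed]

print(skewness(right_tailed) > 0)  # positive skew
print(skewness(left_tailed) < 0)   # negative skew
```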

Data scientist ≠ Data engineer
Machine learning ≠ Data scientist

Training set and Testing set
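One simple way to produce the two sets is to shuffle the data and cut it at a fixed ratio. The sketch below is an illustration only: the `train_test_split` helper, the 80/20 ratio, and the seed are assumptions, not something the slide specifies.

```python
import random

def train_test_split(data, test_ratio=0.2, seed=0):
    """Shuffle a copy of the data and split it into (train, test) lists."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    test_size = int(len(shuffled) * test_ratio)
    return shuffled[test_size:], shuffled[:test_size]

# Hypothetical dataset of ten samples.
samples = list(range(10))
train, test = train_test_split(samples)
print(len(train), len(test))
```

The model is fitted on `train` only, and `test` is held back to estimate how it behaves on unseen data.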

“A”
- Autoregression: a time series regression model
- Activation function
  - Sigmoid: used in binary classification
  - ReLU: helps in hidden layers
  - Softmax: mostly used in multiclass classification
- A/B testing: shows which technique performs better
- Accuracy: the share of correctly predicted values
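The three activation functions listed above are short enough to write out directly. This is a plain-Python sketch for illustration; real frameworks provide vectorized versions.

```python
import math

def sigmoid(x):
    """Squashes any real number into (0, 1), useful as a binary class probability."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Passes positive values through and clips negative values to zero."""
    return max(0.0, x)

def softmax(scores):
    """Turns a list of scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))
print(relu(-3.0), relu(2.5))
print(softmax([1.0, 2.0, 3.0]))
```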

When you learn ML in 24 hours

“B”
- Bagging: trains multiple models and combines all their predictions into the final prediction
- Box plot: displays the range of variation in the data
- Backpropagation: updates weights to reduce errors
- Batches: small chunks that the data is split into
- Batch normalization: improves the performance and stability of DNNs
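Splitting data into batches can be sketched with a small generator. The `make_batches` name and the batch size of 2 are hypothetical choices for illustration.

```python
def make_batches(data, batch_size):
    """Yield consecutive chunks of at most batch_size items."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

# Hypothetical dataset of five samples split into batches of two.
batches = list(make_batches([0, 1, 2, 3, 4], batch_size=2))
print(batches)
```

The last batch can be smaller than the others when the dataset size is not a multiple of the batch size.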

“C”
- CNN: Convolutional Neural Network
- Classification: the labels are known
- Cost function: the lower the cost function, the better the model's accuracy
- Confusion matrix: displays the performance of the model
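A confusion matrix is a count of (actual, predicted) label pairs. The sketch below builds one with `collections.Counter`; the `confusion_matrix` helper and the cat/dog labels are made up for illustration.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

# Hypothetical labels for five samples.
actual = ["cat", "dog", "dog", "cat", "dog"]
predicted = ["cat", "dog", "cat", "cat", "dog"]

matrix = confusion_matrix(actual, predicted)
for (true_label, predicted_label), count in sorted(matrix.items()):
    print(f"actual={true_label} predicted={predicted_label}: {count}")
```

Pairs where both labels match are correct predictions; every other pair is a specific kind of mistake.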

“D”
- Dropout: hidden units are randomly dropped during training to prevent overfitting
- Data augmentation
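Dropout can be sketched as randomly zeroing activations during training. The version below is "inverted" dropout, which also rescales the surviving values. That variant, the `dropout` helper name, and the 0.5 rate are assumptions for illustration, not details from the slide.

```python
import random

def dropout(activations, rate, rng):
    """Zero each activation with probability `rate` (must be below 1).

    Survivors are divided by the keep probability so the expected
    value of each activation stays the same.
    """
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # seeded only to make the sketch reproducible
hidden = [1.0] * 1000   # hypothetical hidden-layer activations
dropped = dropout(hidden, rate=0.5, rng=rng)
print(sum(1 for v in dropped if v == 0.0), "of", len(dropped), "units dropped")
```

Dropout is applied only while training; at prediction time all units are kept.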

“E”
- Eager execution: operations run immediately, without waiting for graph execution
- Epoch: one full pass over the training set
- Early stopping: stops training early to prevent overfitting
“F”
- Forward propagation: data flows one way only, from input to output, with no backward pass
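Early stopping is usually driven by the validation loss: training stops once the loss has not improved for a few epochs in a row. The sketch below assumes a "patience" counter, which is a common convention rather than something stated on the slide. The `early_stopping_epoch` helper and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)  # never triggered: train for all epochs

# Hypothetical validation loss per epoch: it improves, then starts rising.
losses = [0.9, 0.7, 0.6, 0.65, 0.7, 0.8]
stop_epoch = early_stopping_epoch(losses, patience=2)
print(stop_epoch)
```

Here the best loss is reached at epoch 3, and two non-improving epochs later the loop stops.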

“G”
- Gradient descent
  - Batch GD
  - Stochastic gradient descent
  - Mini-batch GD
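The three variants differ only in how many samples feed each weight update. The sketch below fits a single weight `w` in the model `y = w * x` with a mean squared error cost. The model, data, learning rate, and `gradient_descent` helper are all assumptions chosen to keep the example small.

```python
import random

def gradient_descent(xs, ys, batch_size, learning_rate=0.01, epochs=200, seed=0):
    """Fit y = w * x by minimizing mean squared error.

    batch_size == len(xs)  -> batch GD
    batch_size == 1        -> stochastic GD
    anything in between    -> mini-batch GD
    """
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the mean squared error with respect to w on this batch.
            grad = sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= learning_rate * grad
    return w

# Hypothetical data generated from y = 2 * x, so w should approach 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w_batch = gradient_descent(xs, ys, batch_size=4)
w_stochastic = gradient_descent(xs, ys, batch_size=1)
w_mini = gradient_descent(xs, ys, batch_size=2)
print(w_batch, w_stochastic, w_mini)
```

Smaller batches give more frequent but noisier updates; larger batches give smoother but less frequent ones.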

“G”eneralization (a.k.a. out-of-sample error)
- A measure of accuracy on previously unseen data
- The difference between expected and empirical error
- Poor generalization mostly occurs in deep learning models: the model works fine on the training set but does not fit real data

“H”
- Hyperparameters: values set before training the model, e.g. batch size, number of trees
- Histogram: used to determine skewness
“I”
- Imputation: a data wrangling step that fills in missing values
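A common form of imputation replaces each missing value with the mean of the observed ones. Mean filling is only one strategy and is an assumption here; the `impute_mean` helper and the sample column are hypothetical.

```python
import statistics

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical column with one missing entry.
column = [1.0, None, 3.0]
filled = impute_mean(column)
print(filled)
```

For skewed columns the median is often a safer fill value than the mean.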

“L”
- Learning rate: the step size used when minimizing the cost function
- LSTM: building units of an RNN, used for speech prediction and rhythm learning
“M”
- MLP (multilayer perceptron): a.k.a. fully connected layers
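To make "fully connected" concrete, here is a sketch of one forward pass through a tiny MLP with a single hidden layer. The layer sizes, weights, and the `dense`, `relu`, and `mlp_forward` helpers are all made up for illustration; a trained network would learn its weights.

```python
def dense(inputs, weights, biases):
    """Fully connected layer: every output sees every input."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def relu(values):
    return [max(0.0, v) for v in values]

def mlp_forward(inputs):
    # Hypothetical fixed weights: 2 inputs -> 2 hidden units -> 1 output.
    hidden = relu(dense(inputs, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]))
    return dense(hidden, [[2.0, 1.0]], [0.5])

output = mlp_forward([1.0, 2.0])
print(output)
```

The hidden layer computes `[-1.0, 1.5]`, ReLU clips it to `[0.0, 1.5]`, and the output layer combines those values into a single number.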

“N”
- NumPy
  - linspace
  - random
  - array
  - arange
“O”
- Outliers: values that lie far away from the pattern of the rest of the dataset
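One widely used way to flag outliers is the interquartile range (IQR) rule: anything more than 1.5 × IQR outside the quartiles is treated as an outlier. The rule, the `find_outliers` helper, and the sample data are assumptions for illustration; the slide does not prescribe a method.

```python
import statistics

def find_outliers(values, factor=1.5):
    """Return the values lying more than factor * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical measurements with one value far from the rest.
measurements = [10, 12, 11, 13, 12, 11, 100]
outliers = find_outliers(measurements)
print(outliers)
```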

“P”
- Pandas
  - DataFrames
  - Series
- Pooling: used to reduce parameters and prevent overfitting
“R”
- Regression: predicts continuous values, typically floating-point numbers

Pooling
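As an illustration of pooling, the sketch below applies 2x2 max pooling with stride 2 to a small grid, keeping only the largest value in each block. Max pooling is one common variant and is assumed here; the `max_pool_2x2` helper and the input grid are hypothetical.

```python
def max_pool_2x2(matrix):
    """Keep the maximum of each non-overlapping 2x2 block."""
    return [
        [
            max(matrix[r][c], matrix[r][c + 1],
                matrix[r + 1][c], matrix[r + 1][c + 1])
            for c in range(0, len(matrix[0]) - 1, 2)
        ]
        for r in range(0, len(matrix) - 1, 2)
    ]

# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
pooled = max_pool_2x2(feature_map)
print(pooled)
```

The 4x4 input shrinks to 2x2, which is how pooling reduces the amount of data the later layers have to process.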

Validating model
Confusion matrix
- If false negatives are OK, high precision is required, e.g. spam filter
- If false positives are OK, high recall is required, e.g. medical diagnosis
Metrics: precision, recall, F1 score, accuracy
Accuracy = correctly classified points / total points
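The four metrics can be computed directly from true and predicted labels. The sketch below uses the standard definitions for binary labels where 1 is the positive class; the `classification_metrics` helper and the label lists are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Return precision, recall, F1 score and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp)  # of the predicted positives, how many were right
    recall = tp / (tp + fn)     # of the actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical labels for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

This sketch does not guard against division by zero, which happens when the model predicts no positives or the data contains none.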

Let’s Go For It
1. Look at the dataset
2. Write down the columns and their correlations
3. Make questions derived from the dataset
4. Exploratory analysis with visualization
5. Frame the problem
6. Create a solution by building a model

Exploratory Analysis
- Look at the rows and tables
- Find correlated columns
- Display them in charts
- Give a summary based on the graphs
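Finding correlated columns usually means computing a correlation coefficient for each pair of numeric columns. The sketch below implements the Pearson coefficient by hand. Pearson is an assumption, since the slide does not name a measure, and the `pearson` helper and the `budget` and `revenue` columns are hypothetical examples.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical columns: revenue rises with budget in this made-up data.
budget = [1.0, 2.0, 3.0, 4.0]
revenue = [2.0, 4.1, 5.9, 8.2]

print(pearson(budget, revenue))
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none.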

TMDB Notebook (Dataset)

Reference
https://www.analyticsvidhya.com/blog/2017/05/25-must-know-terms-concepts-for-beginners-in-deep-learning/
https://ml-cheatsheet.readthedocs.io

Thank You
https://krunal3kapadiya.app/ @krunal3kapadiya #IndiaMLCC #Ufest18
