Slide 1

Slide 1 text

Workshop Episode: Introduction to Machine Learning for Data Science. 29th of June, 2020, Purwadhika Classroom @Google Hangouts. Hello, data geeks!

Slide 2

Slide 2 text

Greetings! Just call me Fiqry (with or without a suffix). I currently work as a Data Scientist @Bukalapak, and I have also worked remotely as a Technical Content Reviewer @Packt Publishing. I am passionate about Time Series Analytics, Immersive Computing (VR & AR), and Gamification in Business. That's it!

Slide 3

Slide 3 text

Disclaimer

Slide 4

Slide 4 text

Please expect these things:
- Know the workshop scope: everything will be delivered at an introductory level, so please don't expect to be an expert Data Scientist/ML Engineer after attending this workshop.
- The programming language is Python: there will be a hands-on coding session in Python. If you are not familiar with Python (or don't have any programming background), expect only to follow the chosen syntax at a high level.
- No repetition for disconnection accidents: there won't be any content repetition during the workshop once you are disconnected, but you can rewind the material later using the video recorded by Purwadhika.

Slide 5

Slide 5 text

This talk will be conducted at three different levels:
- High-level (Module 1): talk across various spectrums, from technology, business, and sociology to economics, to widen and enlarge the point of view and paradigm. 100% theory, 0% practice, about 1 hour.
- Med-level (Module 2): talk specifically about one or two domains, describing the process from upstream to downstream, with a bit of coding. 50% theory, 50% practice, about 1 hour.
- Low-level (Module 3): talk fundamentally about one domain, answering the "how" question; get ready to get your hands dirty. 0% theory, 100% practice, about 1 hour.

Slide 6

Slide 6 text

Bookmark the Star: slides that have a star in the top-left corner are very important, so give them extra focus.

Slide 7

Slide 7 text

T-Shape Outcome: this workshop expects you to become a T-shaped person in the Data Science industry.

Slide 8

Slide 8 text

“In God we trust. All others must bring data” —W. Edwards Deming

Slide 9

Slide 9 text

Table of Contents
01 - Data Science, an unpredictable tale. High-level discussion: how to get inspired by data science from a different perspective.
02 - Dismantle the Machine Learning Engine. Med-level discussion: an end-to-end look at how machine learning works. Try to code.
03 - Hands-on real industry case. Low-level discussion: get your hands dirty on data forecasting and predictive modelling.

Slide 10

Slide 10 text

01 - Data Science, an unpredictable tale. High-level discussion: how to get inspired by data science from a different perspective.

Slide 11

Slide 11 text

Data Science is changing the world. Would you like to take part in it?

Slide 12

Slide 12 text

02 - Dismantle the Machine Learning Engine. Med-level discussion: an end-to-end look at how machine learning works. Try to code.

Slide 13

Slide 13 text

Dismantle the Machine Learning Engine: Understanding the Levels of Data Science
- Entry: understand the differences between Supervised Learning, Unsupervised Learning, and Deep Learning. Know how to determine and present the best model. Early level of EDA (Exploratory Data Analysis).
- Medior: understand how to increase model accuracy, handle data problems (imbalanced data, missing values), and be proficient in model selection. Able to do Feature Engineering.
- Senior: expert at increasing model accuracy (new Deep Learning architectures), very proficient at handling data problems, able to propose new algorithms for different data cases.
From here, the path splits into a Managerial Route and a Technical Route.

Slide 14

Slide 14 text

Dismantle the Machine Learning Engine: Things to Touch and Say Hello To
- Say hello to Machine Learning: greetings to the Machine Learning types, such as Supervised Learning and Unsupervised Learning.
- Tools and End-to-End Process: get in touch with the end-to-end process of doing Machine Learning, along with the most recent tools.
- Data, Metrics, and Model Selection: understand the behavior of data and how to select a model by its metrics.

Slide 15

Slide 15 text

Commonly Used Terms
- Variable: a medium to store data/information/values.
- Data: a value or piece of information; it can be numeric, alphabetic, a picture, a video, etc.
- Machine Learning Model: a simplified program that can be taught from data (input) to predict an output.

Slide 16

Slide 16 text

Commonly Used Terms
- Variable X: the open variable used to predict Y (independent/feature variable).
- Variable Y: the label variable determined by X (dependent/response variable).

Slide 17

Slide 17 text

Commonly Used Terms
- Algorithm: a sequence of steps to solve a problem.
- Train & Test Data: training data is used to train the ML model; test data is used to test the ML model's accuracy.
- Feature Engineering: applying domain knowledge to produce new features based on existing features.
A small sketch of these terms in code follows below.
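The slides don't include code here, but as a small illustrative sketch (with made-up heights, weights, and labels, and assuming pandas is installed), the terms above look like this in Python:

```python
import pandas as pd

# Data: values stored in variables (hypothetical example)
df = pd.DataFrame({
    "height_cm": [145, 154, 177, 150, 170],   # Variable X (feature)
    "weight_kg": [48, 55, 80, 52, 68],
    "label":     ["Short", "Average", "Tall", "Average", "Average"],  # Variable Y
})

# Feature Engineering: produce a new feature from existing ones
df["bmi"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2

# Train & Test data: hold out part of the rows to test the model later
train_df = df.iloc[:4]   # used to train the ML model
test_df = df.iloc[4:]    # used to test the ML model's accuracy
print(df)
```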

Slide 18

Slide 18 text

Say Hello to Machine Learning!

Slide 19

Slide 19 text

What do you think about Machine Learning?

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

A machine that can make its own decisions. A machine that learns from data. A machine that can predict data. A living computer. An algorithm. Math and stats stuff.

Slide 22

Slide 22 text

Say Hello to Machine Learning: Definition and Its Derivatives
- Traditional Programming: Input (Data) + Static Code/Syntax -> Output (Data)
- Machine Learning Programming: Input (Data) + Output (Data) -> Train ML Model (Learn) -> ML Model (Program) -> Prediction (from Input)

Slide 23

Slide 23 text

https://christophm.github.io/interpretable-ml-book/terminology.html Normal Programming vs Machine Learning

Slide 24

Slide 24 text

Say Hello to Machine Learning: Definition and Its Derivatives
Traditional Programming
- Input: Height = [145, 154, 177, 150, 170]
- Static Program (IF logic): IF Height < 150 then Short; IF Height >= 150 and Height <= 175 then Average; IF Height > 175 then Tall
- Output: [Short, Average, Tall, Average, Average]
Machine Learning Programming
- Input: Height = [145, 154, 177, 150, 170], Classification_Label = [Short, Average, Tall, Average, Average]
- Machine Learning Algorithm: train the model using Height and Classification_Label
- Prediction using the trained ML model: New Height = 190 -> New Classification Label = Tall
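As an illustrative sketch of the two columns above (not part of the original slides; scikit-learn's DecisionTreeClassifier is just one possible choice of learning algorithm):

```python
# Traditional programming: the rules are written by hand
def classify_height(height):
    if height < 150:
        return "Short"
    elif height <= 175:
        return "Average"
    else:
        return "Tall"

heights = [145, 154, 177, 150, 170]
print([classify_height(h) for h in heights])
# ['Short', 'Average', 'Tall', 'Average', 'Average']

# Machine learning programming: the rules are learned from input + output pairs
from sklearn.tree import DecisionTreeClassifier

X = [[145], [154], [177], [150], [170]]                 # input (Height)
y = ["Short", "Average", "Tall", "Average", "Average"]  # output (labels)

model = DecisionTreeClassifier().fit(X, y)              # train the model
print(model.predict([[190]]))                           # expected: ['Tall'] (a learned rule)
```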

Slide 25

Slide 25 text

Say Hello to Machine Learning: Machine Learning Venn Diagram

Slide 26

Slide 26 text

Say Hello to Machine Learning: Definition and Its Derivatives
- Supervised Machine Learning: teach the ML model using both the predictor variable X and the label variable Y.
- Unsupervised Machine Learning: teach the ML model using the predictor variable X only, and let the model work out the label variable Y by itself.

Slide 27

Slide 27 text

Say Hello to Machine Learning: Supervised Machine Learning

Slide 28

Slide 28 text

Say Hello to Machine Learning: Supervised Machine Learning
Diagram: COW DATA (Height, Color, Weight) + COW CLASS (A, B, C) -> "Supervised" ML Model -> Cow Group A / Cow Group B / Cow Group C

Slide 29

Slide 29 text

Say Hello to Machine Learning: Supervised Machine Learning

Slide 30

Slide 30 text

Say Hello to Machine Learning: Supervised Machine Learning

Slide 31

Slide 31 text

Data, Metrics, and Model Selection: Popular Supervised Machine Learning Algorithms
Linear Regression, XGBoost (XGB), Random Forest (RF), Support Vector Machine (SVM)
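A minimal, non-authoritative sketch of fitting two of these supervised algorithms with scikit-learn on a built-in toy dataset (XGBoost needs a separate package and is omitted here):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

for model in (RandomForestClassifier(random_state=42), SVC()):
    model.fit(X_train, y_train)                                 # supervised: learn from X and y
    print(type(model).__name__, model.score(X_test, y_test))    # accuracy on unseen data
```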

Slide 32

Slide 32 text

Say Hello to Machine Learning: Supervised Machine Learning
- Regression: Dynamic Pricing (Surge Pricing), House Price Prediction
- Classification: CAPTCHA Security, Email Spam Filtering

Slide 33

Slide 33 text

Say Hello to Machine Learning: Unsupervised Machine Learning

Slide 34

Slide 34 text

Say Hello to Machine Learning: Unsupervised Machine Learning
Diagram: COW DATA (Height, Color, Weight) -> "Unsupervised" ML Model -> Cow Group A / Cow Group B / Cow Group C (the COW CLASS (A, B, C) labels are not given to the model)

Slide 35

Slide 35 text

Say Hello to Machine Learning: Unsupervised Machine Learning

Slide 36

Slide 36 text

Say Hello to Machine Learning: Unsupervised Machine Learning. Clustering | Dimensionality Reduction

Slide 37

Slide 37 text

Say Hello to Machine Learning: Unsupervised Machine Learning
- Clustering: User Segmentation
- Dimensionality Reduction: User Segmentation

Slide 38

Slide 38 text

Data, Metrics, and Model Selection: Popular Unsupervised Machine Learning Algorithms
Hierarchical Clustering, t-SNE, K-Means, Principal Component Analysis (PCA)
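A minimal sketch, assuming scikit-learn and its built-in iris toy dataset, of two of these unsupervised algorithms; note that no labels are passed to either model:

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)      # deliberately ignore the labels

# Clustering: let the model discover groups on its own
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print(kmeans.labels_[:10])             # cluster id assigned to each row

# Dimensionality reduction: compress 4 features into 2 components
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print(X_2d.shape, pca.explained_variance_ratio_)
```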

Slide 39

Slide 39 text

Say Hello to Machine Learning: Recap of Supervised & Unsupervised Machine Learning

Slide 40

Slide 40 text

Say Hello to Machine Learning: Supervised and Unsupervised Recap

Slide 41

Slide 41 text

Say Hello to Machine Learning: The Machine Learning Model Space

Slide 42

Slide 42 text

Tools and End-to-End Process

Slide 43

Slide 43 text

Tools and End-to-End Process: Data Science Workflow in Industry
1. Ingestion and Analysis (50%): starts with collecting data, preprocessing it, and doing exploratory data analysis.
2. Modeling and Evaluation (25%): utilizing machine learning algorithms to build automation, and evaluating the accuracy of the built model.
3. Adjustment and Deployment (25%): adjusting the model weight matrices to be stored in a microservice and creating an architecture workflow for the data pipeline, ready to deploy.

Slide 44

Slide 44 text

Tools and End-to-End Process: Step 1 - Ingestion and Analysis
1. Research Hypothesis: conduct the research flow along with the hypotheses that might solve the problem.
2. Data Query Schema: determine which data to take, which tables, which features, etc.
3. Data Retrieval: retrieve the data according to the query schema, whether from a data warehouse or by scraping the internet.
4. Data Preprocessing: clean the whole dataset, e.g., control outliers, transform or standardize, handle null values, etc.
5. Analysis and Visualization: analyze the preprocessed data and support or refute the research hypotheses with graphs or descriptive statistics.

Slide 45

Slide 45 text

Tools and End-to-End Process: Step 2 - Modeling and Evaluation
1. Research Methodology: choose an ML algorithm appropriate to the research objective.
2. Feature Engineering: produce new features from the existing features.
3. Train Machine Learning Model: train the ML model on the training data.
4. Model Evaluation: evaluate model accuracy using the test data.
5. Model Selection: select the best model by the highest accuracy/interpretability.
A minimal coded version of this step is sketched below.
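A minimal sketch of this step, assuming scikit-learn and a toy dataset; the candidate algorithms and the accuracy metric are illustrative choices, not the workshop's prescribed ones:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# 1. Research methodology: a classification problem, so pick candidate classifiers
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=42),
}

# 2. Feature engineering is skipped here for brevity
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)                                     # 3. train on train data
    scores[name] = accuracy_score(y_test, model.predict(X_test))    # 4. evaluate on test data

# 5. Model selection: keep the model with the highest accuracy
best = max(scores, key=scores.get)
print(scores, "-> selected:", best)
```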

Slide 46

Slide 46 text

Tools and End-to-End Process: Step 3 - Adjustment and Deployment
1. Deployment to Production: store the model weight matrices in a container that runs their requirements and dependencies.
2. Adjustment and Communication: ensure the model pipeline runs smoothly from upstream to downstream.
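A minimal sketch of persisting a trained model so a container or microservice can load it later; joblib is one common choice (it is installed alongside scikit-learn), not necessarily the tooling used in the workshop:

```python
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=42).fit(X, y)

joblib.dump(model, "model.joblib")        # store the fitted weights/parameters on disk

# Inside the production service (e.g., a microservice endpoint), reload and predict
loaded = joblib.load("model.joblib")
print(loaded.predict(X[:1]))
```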

Slide 47

Slide 47 text

Data, Metrics, and Model Selection

Slide 48

Slide 48 text

Data, Metrics, and Model Selection: Data and Its Derivatives
DATA (no encoding needed):
- Numerical (numbers): Discrete [0, 1, 2, 3, 4, ..., N] or Continuous [0.1, 0.001, ..., 1]
- Categorical (text, alphabetic): Nominal, no hierarchy (gender, address) or Ordinal, with a hierarchy (education level)

Slide 49

Slide 49 text

Data, Metrics, and Model Selection: Data and Its Derivatives
DATA (needs encoding): Image, Video, Sound

Slide 50

Slide 50 text

Data, Metrics, and Model Selection: Data and Its Derivatives
Common data problems: Missing Values, Duplicates, High Value Gap, Imbalanced Data

Slide 51

Slide 51 text

Data, Metrics, and Model Selection: Data and Its Derivatives
Missing Value: NA = Not Available; it might appear as NULL, NaN, etc. It is the universal symbol for a missing value.

Slide 52

Slide 52 text

Data, Metrics, and Model Selection: Data and Its Derivatives
Missing Value handling:
1. Drop the columns/rows that contain missing values.
2. Substitute with statistical values: Mean | Median | Mode (most frequent).

Slide 53

Slide 53 text

Data, Metrics, and Model Selection: Data and Its Derivatives
Missing Value (figure: example averages AVG = 20.25, AVG = 53.5, AVG = 74)

Slide 54

Slide 54 text

Data, Metrics, and Model Selection: Data and Its Derivatives - Missing Value

Slide 55

Slide 55 text

Data, Metrics, and Model Selection: Data and Its Derivatives
Duplicates: just remove them! A pandas sketch for missing values and duplicates follows below.
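A minimal pandas sketch, with made-up data, of the two fixes above (dropping or imputing missing values, then removing duplicates):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "age":  [22, np.nan, 35, 35, 41],
    "city": ["Jakarta", "Bandung", None, None, "Surabaya"],
})

# Missing values, option 1: drop rows (or columns) that contain NA
dropped = df.dropna()

# Missing values, option 2: substitute with a statistic (mean / median / mode)
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["city"] = filled["city"].fillna(filled["city"].mode()[0])

# Duplicates: just remove them
deduped = filled.drop_duplicates()
print(deduped)
```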

Slide 56

Slide 56 text

Data, Metrics, and Model Selection: Data and Its Derivatives
High Value Gap. Task: predict the competition winner, given a set of predictor variables and a target variable.

Slide 57

Slide 57 text

Data, Metrics, and Model Selection: Data and Its Derivatives
High Value Gap: Normalization Technique. Normalizing ensures that the features do not have massively different scales, so the optimization can converge and remains feasible.

Slide 58

Slide 58 text

Data, Metrics, and Model Selection: Data and Its Derivatives
High Value Gap: Normalization Technique. Normalizing ensures that the features do not have massively different scales, so the optimization can converge. Two common techniques: the min-max scaler and standardization to a normal distribution.
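A minimal sketch of the two techniques named above, using scikit-learn's MinMaxScaler and StandardScaler on made-up age/salary values with a large value gap:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[25, 4_000_000],
              [32, 12_500_000],
              [47, 60_000_000]], dtype=float)   # age vs salary: a huge value gap

# Min-max scaler: squeeze every feature into the [0, 1] range
print(MinMaxScaler().fit_transform(X))

# Standardization ("normal distribution"): zero mean, unit variance per feature
print(StandardScaler().fit_transform(X))
```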

Slide 59

Slide 59 text

Data, Metrics, and Model Selection: Data and Its Derivatives - Imbalanced Data

Slide 60

Slide 60 text

Data, Metrics, and Model Selection: Data and Its Derivatives
Train/Test Split
- Train: the part on which your ML algorithms are actually trained to build a model (about 60% of your data).
- Validation: used to validate your various model fits (about 20% of your data).
- Test: used to test your model hypothesis; left untouched and unseen until the model and hyperparameters are decided (about 20% of your data).
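A minimal sketch of a 60/20/20 split, assuming scikit-learn; two calls to train_test_split are one simple way to get it:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First split off the untouched test set (20%)
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Then split the remainder into train (60% overall) and validation (20% overall)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))   # roughly 60% / 20% / 20% of the rows
```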

Slide 61

Slide 61 text

Data, Metrics, and Model Selection: Metrics and Their Derivatives
ML metrics, chosen based on the target variable:
- Numerical (numbers): distance-based metrics that show the condition (RMSE, MAE, ...) and percentage metrics that are interpretable (MAPE, R2, ...)
- Categorical (text, alphabetic): interpretability/meaningful metrics (Precision, Recall, ...) and reliability/stability metrics (AUC, ROC, ...)

Slide 62

Slide 62 text

Data, Metrics, and Model Selection: Metrics and Their Derivatives (Numerical)
MAPE = (100% / n) * sum over all points of | (actual - predicted) / actual |
Example (Model A evaluation):
- MAPE = 7.9% ~ the model is, on average, only 7.9% off when predicting Y.
- R2 = 88% ~ the given variable X explains 88% of the variance of the target variable Y.
Distance principle: "the lower, the better". Percentage principle: "follows a rule of thumb".
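A minimal sketch computing these two percentage metrics on made-up actual/predicted values (MAPE by hand with NumPy, R2 via scikit-learn):

```python
import numpy as np
from sklearn.metrics import r2_score

actual = np.array([100.0, 120.0, 140.0, 160.0])
predicted = np.array([95.0, 128.0, 131.0, 170.0])

mape = np.mean(np.abs((actual - predicted) / actual)) * 100   # MAPE in percent
r2 = r2_score(actual, predicted)                              # share of variance in Y explained

print(f"MAPE = {mape:.1f}%")
print(f"R2   = {r2:.2f}")
```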

Slide 63

Slide 63 text

Data, Metrics, and Model Selection: Metrics and Their Derivatives (Categorical)
Case: predicting pregnancy (the event) for a person (man/woman).
- False Positive (FP): predict an event when there is no event (bad)
- False Negative (FN): predict no event when there is an event (bad)
- True Positive (TP): predict an event when there is an event (good)
- True Negative (TN): predict no event when there is no event (good)
Event: pregnancy. Logic: a man can't be pregnant; a woman can be pregnant.
- FP: the ML model predicts a man is pregnant.
- FN: the ML model predicts a woman is not pregnant (but in reality she is pregnant).

Slide 64

Slide 64 text

Data, Metrics, and Model Selection: Metrics and Their Derivatives (Categorical)
Case: predicting pregnancy (the event) for a person (man/woman).
- False Positive (FP): predicted the man is pregnant, but he is actually not pregnant.
- False Negative (FN): predicted the woman is not pregnant, but she is actually pregnant.
- True Positive (TP): predicted the woman is pregnant, and she is actually pregnant.
- True Negative (TN): predicted the man is not pregnant, and he is actually not pregnant.
Precision = TP / (TP + FP) -> Stay Aggressive
Recall = TP / (TP + FN) -> Stay Careful
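A minimal sketch of these counts and the two formulas, with made-up labels where 1 marks the event (pregnant) and 0 marks no event:

```python
from sklearn.metrics import precision_score, recall_score, confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 0, 1]   # actual condition
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]   # model's prediction

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, fp, fn, tn)                                    # counts of TP / FP / FN / TN

print("precision =", precision_score(y_true, y_pred))    # TP / (TP + FP)
print("recall    =", recall_score(y_true, y_pred))       # TP / (TP + FN)
```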

Slide 65

Slide 65 text

Data, Metrics, and Model Selection: Metrics and Their Derivatives (Categorical)
Precision: we want the ML model to predict events aggressively; whether the event exists or not, its prediction must be correct.
Recall: we want the ML model to predict events carefully (it is better not to predict an event than to make a wrong prediction).
Another example: a rain prediction for a man.
- False Positive (FP): the man is told to bring an umbrella, but there is actually no rain the whole day.
- False Negative (FN): the man is told not to bring an umbrella, but it actually rains the whole day.
If you were a businessman, which risk would you minimize first: FP or FN?
Precision = TP / (TP + FP) -> Stay Aggressive
Recall = TP / (TP + FN) -> Stay Careful

Slide 66

Slide 66 text

Data, Metrics, and Model Selection: Metrics and Their Derivatives (Categorical)
Precision: we want the ML model to predict events aggressively; whether the event exists or not, its prediction must be correct.
Recall: we want the ML model to predict events carefully (it is better not to predict an event than to make a wrong prediction).
Another example: an email spam flagger.
- False Positive (FP): an email is flagged as spam by the system, but it is actually not a spam message.
- False Negative (FN): an email is not flagged as spam by the system, but it actually is spam and full of phishing links.
If you were a businessman, which risk would you minimize first: FP or FN?
Precision = TP / (TP + FP) -> Stay Aggressive
Recall = TP / (TP + FN) -> Stay Careful

Slide 67

Slide 67 text

Data, Metrics, and Model Selection: Metrics and Their Derivatives (Categorical)
AUC Score: the closer to 1.0, the better. *The best metric for describing model reliability, especially on an imbalanced dataset.
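A minimal sketch of the AUC score on a made-up, imbalanced set of labels and predicted probabilities, using scikit-learn:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]          # only 2 positives out of 10: imbalanced
y_prob = [0.1, 0.2, 0.15, 0.05, 0.3, 0.25, 0.7, 0.35, 0.8, 0.6]  # predicted probabilities

print("AUC =", roc_auc_score(y_true, y_prob))     # closer to 1.0 means a more reliable ranking
```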

Slide 68

Slide 68 text

Data, Metrics, and Model Selection: Model Selection and Its Derivatives
Model selection based on the bias-variance tradeoff:
- Underfitting: high bias, low variance
- Overfitting: low bias, high variance

Slide 69

Slide 69 text

Data, Metrics, and Model Selection: Model Selection and Its Derivatives
Bias-Variance Tradeoff
- Bias is the difference between the average prediction of our model and the correct value we are trying to predict.
- Variance is the variability of the model's prediction for a given data point; it tells us how spread out the predictions are.

Slide 70

Slide 70 text

Data, Metrics, and Model Selection: Model Selection and Its Derivatives - Bias-Variance Tradeoff

Slide 71

Slide 71 text

Data, Metrics, and Model Selection: Model Selection and Its Derivatives
Underfitting
- The model is unable to capture the underlying pattern of the data
- High bias, low variance
- Usually caused by too little training data
- Or the model is too simple and has very few parameters

Slide 72

Slide 72 text

Data, Metrics, and Model Selection: Model Selection and Its Derivatives
Overfitting
- The model captures the noise along with the underlying pattern in the data
- Low bias, high variance
- Often caused by training on a very noisy dataset
- Or the model is too complex and has too many parameters
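A minimal sketch contrasting underfitting and overfitting: the same noisy data is fit with a too-simple and a too-complex polynomial model, and the train/test scores show the gap (made-up data, scikit-learn assumed):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(42)
X = np.sort(rng.uniform(0, 1, 40)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=40)   # pattern + noise

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)

for degree in (1, 15):   # degree 1 tends to underfit, degree 15 tends to overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(f"degree={degree:2d}  train R2={model.score(X_train, y_train):.2f}  "
          f"test R2={model.score(X_test, y_test):.2f}")
```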

Slide 73

Slide 73 text

Data, Metrics, and Model Selection: Model Selection and Its Derivatives

Slide 74

Slide 74 text

Confused by Concepts and Theories? Let’s code!

Slide 75

Slide 75 text

Special Part: How to Be the Most Wanted Data Scientist

Slide 76

Slide 76 text

Special Part: Some Notes on How to Be an Expert Data Scientist in This Era
- Take a serious focus on Explainable AI: interpretable ML is good, but the explainable kind matters even more. (This skill set is one of the most promising fields of ML.)
- Know your domain science: Data Science is an iterative process, and anyone can become a DS by following the guided process. If you want to be different, show your domain expertise.
- Throne your impact, not your certificate(s): having a lot of learning under your belt is awesome, but most importantly show side projects/analyses whose impact can be quantified; that is the strongest proof of a Data Scientist.

Slide 77

Slide 77 text

Special Part: Some Notes on How to Be an Expert Data Scientist in This Era
Recommended Books/Courses: Udacity Intro to ML | Coursera ML | E-Book

Slide 78

Slide 78 text

“Without data, you're just another person with an opinion” —W. Edwards Deming

Slide 79

Slide 79 text

Thank you! Contact me for further questions. LinkedIn: Fiqry Revadiansyah | Telegram: @fiqryr