DevCoach 185: Machine Learning | Supervised Learning

Machine Learning: Supervised Learning
Muhammad Fikry Rizal, Curriculum Developer

Hi, I’m M. Fikry Rizal 👋
About Me: Muhammad Fikry Rizal, https://github.com/mfikryrz
Latest Work Experiences:
• AI/ML Curriculum Developer, Dicoding (2024 - present)
• Data Engineer Intern, Torche Education (2022 - 2023)
Education:
• UIN Syarif Hidayatullah Jakarta (2020 - 2024), Bachelor Degree, Physics
• Bangkit Academy 2023 (2023), Machine Learning

Machine Learning Overview
1. Hi, Machine Learning!
2. Machine Learning Workflow
3. Supervised Learning: Classification
4. Supervised Learning: Regression
5. Unsupervised Learning: Clustering
6. Feature Engineering Techniques
7. Overfitting and Underfitting
8. Machine Learning Model Optimization

Hi, Machine Learning!
1. Hi, Machine Learning!
2. Machine Learning Workflow
3. Supervised Learning: Classification
4. Supervised Learning: Regression

“A field of study that gives computers the ability to learn without being explicitly programmed.”
Arthur Samuel

Categories of Machine Learning

Practical Use Cases of Machine Learning

Machine Learning Tools

Machine Learning Workflow

Data Collecting

Understanding Data: Sources and Formats
Sources (Using Datasets from Selected Sources):
• UC Irvine Machine Learning Repository
• Kaggle Dataset
• Google Dataset Search Engine
• TensorFlow Dataset
• Satu Data Indonesia
Formats:
• CSV (Comma-Separated Values)
• Excel Files
• JSON (JavaScript Object Notation)
• HTML
• SQL Database

Refining Raw Data: The Art of Data Cleaning

Exploratory & Explanatory Data Analysis
• Exploratory Data Analysis (EDA)
◦ Understanding the structure, characteristics, and patterns in the data.
• Explanatory Data Analysis (ExDA)
◦ Communicating the findings or insights obtained to a broader audience.

Exploratory / Explanatory

Refining Raw Data: The Art of Data Cleaning
• Identifying and Handling Missing Values
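One common way to handle missing values is to count them per feature and then fill the gaps with the mean or median of the observed values. The sketch below is only an illustration using the Python standard library; the function names and the `ages` column are hypothetical and do not come from the slides.

```python
from statistics import mean, median


def count_missing(values):
    """Count entries that are missing (represented here as None)."""
    return sum(1 for v in values if v is None)


def impute_missing(values, strategy="median"):
    """Return a copy of `values` with None replaced by the mean or median
    of the observed (non-missing) entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if v is None else v for v in values]


# Hypothetical feature column with two missing entries and one outlier (120).
ages = [22, None, 35, 29, None, 120]
print(count_missing(ages))
print(impute_missing(ages, "median"))
print(impute_missing(ages, "mean"))
```

In this toy column the median fill is less affected by the outlier than the mean fill, which is one reason the median is often preferred for skewed features.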

Refining Raw Data: The Art of Data Cleaning
Feature Scaling
• Normalization
◦ Features with Different Scales.
◦ Distance-Based Models.
◦ Data Not Normally Distributed.
• Standardization
◦ Normally Distributed Data.
◦ Regression-Based Models.
◦ Data with Different Scales.
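As a rough illustration of the two scaling approaches, the sketch below assumes that "normalization" means min-max scaling to the range [0, 1] and that "standardization" means z-score scaling (subtract the mean, divide by the standard deviation). Both function names are hypothetical, and only the Python standard library is used.

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no scale information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Z-score scaling: result has mean 0 and (population) std deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature values.
print(normalize([10, 20, 30]))
print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
```

Libraries such as scikit-learn provide equivalent transformers; the plain-Python version is shown only to make the arithmetic visible.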

Data Splitting
• Training Set
◦ A subset of data used to train the model.
◦ Common Percentage: Typically 60-80% of the total dataset.
• Validation Set
◦ A subset of data used for validation during the training process.
◦ Common Percentage: Typically 10-20% of the total dataset.
• Test Set
◦ A subset of data used for final testing after the model has been trained and tuned.
◦ Common Percentage: Typically 10-20% of the total dataset.
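The three subsets above can be produced by shuffling the data once and slicing it. The sketch below assumes a 70/15/15 split, which is just one choice inside the ranges listed; the function name, the default ratios, and the seed are hypothetical.

```python
import random


def train_val_test_split(rows, train=0.70, val=0.15, seed=42):
    """Shuffle `rows` and split them into training, validation, and test sets.

    Whatever remains after the training and validation slices becomes the test set.
    """
    if not (0 < train < 1 and 0 <= val < 1 and train + val < 1):
        raise ValueError("train and val must be fractions that sum to less than 1")
    shuffled = list(rows)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    train_set = shuffled[:n_train]
    val_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]
    return train_set, val_set, test_set


# Hypothetical dataset of 100 row ids.
train_set, val_set, test_set = train_val_test_split(range(100))
print(len(train_set), len(val_set), len(test_set))
```

Shuffling before slicing matters when the original rows are ordered (for example by date or by class), since an unshuffled slice would give the test set a different distribution from the training set.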

Crafting Powerful Machine Learning Models

Deployment & Monitoring

Quiz #1 DevCoach 185
During data preprocessing, you find that several features have missing values. Which of the following options is the most appropriate way to handle missing values in the context of a machine learning model?
a) Dropping the feature
b) Filling them in with the mean or median of the feature

Supervised Learning: Classification

Classification Algorithms

Algorithm: Decision Tree
Advantages
• Can Handle Both Categorical and Numerical Data
• No Need for Feature Scaling
• Flexible and Customizable
Disadvantages
• Can Be Too Flexible (Prone to Overfitting)
• Sensitive to Noise
• Can Grow Too Large (Complex Trees)
How It Works
• Step 1: Initial Data Splitting
• Step 2: Feature Selection and Data Partitioning
• Step 3: Branch and Node Formation
• Step 4: Creation of Leaf Nodes
• Step 5: Using the Model for Prediction
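The "How It Works" steps can be sketched in a few functions. The slides do not name a splitting criterion, so this example assumes Gini impurity and numeric features with threshold splits; the function names, the `max_depth` limit, and the toy data are all hypothetical. It is a teaching sketch, not a replacement for a library implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 means every label in the group is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Step 2: pick the (feature, threshold) with the lowest weighted Gini."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Steps 3 and 4: grow branches until a group is pure or max_depth is hit."""
    split = None
    if depth < max_depth and gini(labels) > 0:
        split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Step 5: follow the branches until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Hypothetical toy data: two numeric features per email.
rows = [[1, 0], [2, 1], [8, 0], [9, 1]]
labels = ["ham", "ham", "spam", "spam"]
tree = build_tree(rows, labels)
print(tree)
print(predict(tree, [3, 1]))
```

The `max_depth` parameter is one simple guard against the "too flexible" and "too large" disadvantages listed above: a shallower tree is less likely to memorize noise.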

Algorithm: Random Forest
Advantages
• High Accuracy
• Robust Against Overfitting
• Ability to Handle Imbalanced Data
• Handles Missing Data
• Identifies Important Features
Disadvantages
• High Memory Requirements
• Low Interpretability
• Slow Prediction Speed
• Less Effective on Small Datasets
• Long Training Time

Model Evaluation

Model Evaluation
Suppose we have a model's predictions for a dataset of 100 emails, each classified as spam or not spam (ham).
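To make the 100-email scenario concrete, the sketch below counts the four confusion-matrix cells and derives accuracy, precision, recall, and F1 from them. The transcript does not include the actual counts from the slide, so the numbers used here (40 true positives, 10 false positives, 5 false negatives, 45 true negatives) are invented purely for illustration, as are the function names.

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Return (TP, FP, FN, TN), treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1  # predicted spam, actually ham
        elif actual == positive:
            fn += 1  # predicted ham, actually spam
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Compute the usual metrics from the four confusion-matrix cells."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical results for 100 emails (NOT the numbers from the slide).
y_true = ["spam"] * 40 + ["ham"] * 10 + ["spam"] * 5 + ["ham"] * 45
y_pred = ["spam"] * 40 + ["spam"] * 10 + ["ham"] * 5 + ["ham"] * 45

counts = confusion_counts(y_true, y_pred)
print(counts)
print(classification_metrics(*counts))
```

Precision answers "of the emails flagged as spam, how many really were spam?", while recall answers "of the real spam emails, how many did the model catch?".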

Hands-on Coding: Let's Build It

Quiz #2 DevCoach 185
A patient takes a test to detect cancer. The test result indicates that the patient does not have cancer, but the patient actually does have cancer. What type of classification error is this?
a) True Positive
b) True Negative
c) False Positive
d) False Negative

Supervised Learning: Regression

Regression Algorithm

Algorithm: Linear Regression
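The transcript keeps only the title of this slide, so the following is a generic sketch rather than the presenter's own example. It assumes simple linear regression with a single feature, fitted by ordinary least squares; the function names and the data points are hypothetical.

```python
from statistics import mean


def fit_linear_regression(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict_linear(slope, intercept, x):
    """Predict a continuous value for a new input x."""
    return slope * x + intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_linear_regression(xs, ys)
print(slope, intercept)
print(predict_linear(slope, intercept, 5))
```

Unlike the classifiers above, the output here is a continuous number rather than a class label, which is the defining difference between regression and classification.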

Hands-on Coding: Let's Build It

Quiz #3 DevCoach 185
If a model produces its output as a probability percentage, for example an 85% chance of rain, is the model more likely to be a classification model or a regression model?

Thank You

Feedback! dicoding.id/devcoachfeedback
