
[CS Foundation] AIML - 3 - Common Issue

x-village
August 15, 2018

Transcript

  1. Underfitting • Address underfitting: 1. Use a more complex model 2. Increase the number of features 3. Remove outliers
  2. Underfitting • Increase the number of features. Example: Level 1, 2, 3, 4, 5 → ATK 0, 10, 25, 50, 100. A hypothesis y using only x (Level) underfits this accelerating curve; adding an x² feature lets the model fit the ATK values.
  3. Underfitting • Remove outliers: points far from the overall trend drag a simple fitted model away from the rest of the data. (A code sketch of the first two fixes follows.)
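A minimal sketch of the first two fixes, using the Level/ATK numbers from the slide; PolynomialFeatures supplies the extra x² feature:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

level = np.array([[1], [2], [3], [4], [5]])  # feature x (Level)
atk = np.array([0, 10, 25, 50, 100])         # target y (ATK)

# Underfit: a straight line cannot follow the accelerating curve
simple = LinearRegression().fit(level, atk)

# Add Level^2 so the (still linear) model can bend
poly = PolynomialFeatures(degree=2, include_bias=False)
level2 = poly.fit_transform(level)           # columns: [x, x^2]
richer = LinearRegression().fit(level2, atk)

print(simple.score(level, atk), richer.score(level2, atk))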
  4. Overfitting • Address overfitting: 1. Use a simpler model 2. Collect more training data 3. Validation 4. Regularization (a validation sketch follows)
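The deck lists validation without code; a minimal sketch, on synthetic stand-in data, of holding out a validation set so that overfitting shows up as a train/validation gap:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.rand(100, 5)
y = X @ np.array([1.0, 2.0, 0.0, 0.0, 0.0]) + 0.1 * rng.randn(100)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train, y_train)
# A training score far above the validation score signals overfitting
print(model.score(X_train, y_train), model.score(X_val, y_val))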
  8. Regularization • Example with regression: keep all the features, but reduce the values of the weights θ • Add a penalty to the original cost function: (original cost function) + 1000·θ3² + 1000·θ4², where 1000 stands for any big number • Minimizing the penalized cost drives θ3 ≈ 0 and θ4 ≈ 0
  9. Regularization • Use smaller values for the weights θ_j • Regularized cost: [Cost Function] + λ · Σ_{j=1}^{n} θ_j², where λ is the regularization parameter and the sum is the regularization term
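The same cost written out in NumPy; the data and θ values below are placeholders, not from the deck:

import numpy as np

def regularized_cost(theta, X, y, lam):
    residual = X @ theta - y
    cost = np.mean(residual ** 2)            # cost function (MSE)
    penalty = lam * np.sum(theta[1:] ** 2)   # λ Σ_{j=1..n} θ_j² (θ_0 not penalized)
    return cost + penalty

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # first column = bias term
y = np.array([1.0, 2.0, 3.0])
theta = np.array([0.0, 1.0])
print(regularized_cost(theta, X, y, lam=1000.0))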
  10. Regularized Linear Regression • How does λ affect the performance? • A large λ can work fine (e.g. λ = 1000) • A small λ leads to overfitting • Q: What if λ is extremely large (e.g. λ = 10^10)? The penalty dominates the cost function, so the hypothesis ends up with θ1, θ2, θ3, θ4 ≈ 0 → underfitting
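In scikit-learn, λ is the alpha parameter of Ridge; a sketch on synthetic data showing the weights shrinking toward 0 as alpha grows:

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.rand(50, 4)
y = X @ np.array([3.0, -2.0, 1.0, 0.5]) + 0.1 * rng.randn(50)

for alpha in [0.01, 1000, 1e10]:    # small, large, extremely large
    model = Ridge(alpha=alpha).fit(X, y)
    print(alpha, model.coef_)       # at alpha = 1e10 all weights ≈ 0 → underfitting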
  11. Regularization • Ridge regression uses L2 regularization (penalty = Σ θ_j²) • Minimizing cost + penalty balances two pulls: minimize the cost vs. minimize the penalty
  12. Regularization • Lasso (least absolute shrinkage and selection operator) uses L1 regularization (penalty = Σ |θ_j|) • Again the solution trades off minimizing the cost against minimizing the penalty
  13. Regularization • Lasso regression vs. Ridge regression • Lasso shrinks the less important weights to exactly 0 (e.g. θ1 = 0), while Ridge keeps all weights nonzero (θ1, θ2 ≠ 0) • This makes Lasso work well for feature selection (see the sketch below)
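A sketch of the contrast on synthetic data where only the first feature is informative; Lasso zeroes the useless weights, Ridge merely shrinks them:

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.RandomState(0)
X = rng.randn(100, 3)
y = 5.0 * X[:, 0] + 0.1 * rng.randn(100)   # only feature 0 matters

print("Ridge:", Ridge(alpha=1.0).fit(X, y).coef_)   # small but nonzero weights
print("Lasso:", Lasso(alpha=0.5).fit(X, y).coef_)   # unimportant weights exactly 0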
  14. Exercise - (4) • TASK: Call the Ridge, Lasso, and ElasticNet models • Requirements • Use the boston data from sklearn.datasets to train these three models • Print the weights of the three models after training
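One possible solution sketch. Note that load_boston existed in the scikit-learn of 2018 but was removed in scikit-learn 1.2; on a current install, substitute another regression dataset such as load_diabetes:

from sklearn.datasets import load_boston   # removed in scikit-learn >= 1.2
from sklearn.linear_model import ElasticNet, Lasso, Ridge

X, y = load_boston(return_X_y=True)

for Model in (Ridge, Lasso, ElasticNet):
    model = Model().fit(X, y)              # default hyperparameters
    print(Model.__name__, model.coef_)     # the weights after training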
  15. Sparse data • The problems with sparse data • Space complexity: storing mostly zeros requires large memory • Time complexity: much time is spent on useless calculations over the zero entries
  16. Sparse data • Dealing with sparse data: 1. Use other data structures to store the data (tool: scipy.sparse) 2. Dimensionality reduction
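A minimal scipy.sparse sketch; the CSR format stores only the nonzero entries, cutting both the memory footprint and the wasted arithmetic:

import numpy as np
from scipy import sparse

dense = np.zeros((1000, 1000))
dense[0, 0] = 1.0
dense[500, 3] = 2.0

compressed = sparse.csr_matrix(dense)   # keeps only the 2 nonzero entries
print(dense.nbytes)                     # 8,000,000 bytes for the dense array
print(compressed.data.nbytes + compressed.indices.nbytes + compressed.indptr.nbytes)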
  17. Sparse data • Dimensionality reduction (figure: reducing data from 3D to 2D) • Figure source: Opinion Space: A scalable tool for browsing online comments, Scientific Figure on ResearchGate. Available from: https://www.researchgate.net/Dimensionality-reduction-from-3D-to-2D_fig4_221519024 [accessed 15 Aug 2018]
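The deck does not name a tool for this step; one scikit-learn option that works directly on scipy.sparse input is TruncatedSVD:

from scipy import sparse
from sklearn.decomposition import TruncatedSVD

X = sparse.random(100, 50, density=0.01, random_state=0)  # mostly zeros
svd = TruncatedSVD(n_components=2, random_state=0)
X_2d = svd.fit_transform(X)     # 50 sparse dimensions -> 2 dense ones
print(X_2d.shape)               # (100, 2)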
  18. Imbalanced data • What is imbalanced data? • Training data example: fan messages for two singers, a few "I love 梨明!" posts drowned out by "張學酉! 張學酉! We love you!" posts, so one class vastly outnumbers the other
  19. Imbalanced data • The problem with imbalanced data • Testing data: Truth: 梨明 fan, 張學酉 fan, 張學酉 fan, 張學酉 fan; Predict: 張學酉 fan, 張學酉 fan, 張學酉 fan, 張學酉 fan • The model only predicts the majority class!
  20. Imbalanced data • Dealing with imbalanced data • Collect more training data • Resample the training data • Use other evaluation metrics
  21. Imbalanced data • Dealing with imbalanced data • Collect more training data, but beware: the data trend in each period (early vs. late) may be different!
  22. Imbalanced data • Dealing with imbalanced data • Resample the training data: over-sample (collect more of) the under-represented class, or under-sample (remove some of) the over-represented class (see the sketch below)
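A resampling sketch with sklearn.utils.resample; the fan labels stand in for the deck's two classes, and over-sampling the minority is shown (under-sampling the majority works the same way):

import numpy as np
from sklearn.utils import resample

X = np.arange(10).reshape(-1, 1)
y = np.array(["梨明"] * 2 + ["張學酉"] * 8)   # 2 vs 8: imbalanced

minority = X[y == "梨明"]
# Over-sample: redraw minority rows with replacement until the classes match
extra = resample(minority, replace=True, n_samples=6, random_state=0)
X_balanced = np.vstack([X, extra])
y_balanced = np.concatenate([y, ["梨明"] * 6])
print(np.unique(y_balanced, return_counts=True))   # 8 of each class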
  23. Imbalanced data • Dealing with imbalanced data • Use other evaluation metrics:

                          (Predicted) positive   (Predicted) negative
      (Actual) positive   True Positive (TP)     False Negative (FN)
      (Actual) negative   False Positive (FP)    True Negative (TN)

      Precision = TP / (TP + FP)
      Recall = TP / (TP + FN)
      Accuracy = (TP + TN) / (TP + FN + TN + FP)
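A sketch of these metrics on the deck's four test examples: accuracy looks fine at 0.75 even though the model never predicts the minority class, which the minority-class recall exposes:

from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = ["梨明", "張學酉", "張學酉", "張學酉"]
y_pred = ["張學酉", "張學酉", "張學酉", "張學酉"]   # majority-only predictions

print(accuracy_score(y_true, y_pred))   # 0.75, misleadingly high
# Treat the minority fan class as the positive class
print(precision_score(y_true, y_pred, pos_label="梨明", zero_division=0))  # 0.0
print(recall_score(y_true, y_pred, pos_label="梨明", zero_division=0))     # 0.0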
  25. Exercise - (5) • TASK: Practice GridSearchCV • Requirements • Use GridSearchCV to tune the Ridge model: try alpha = [1, 5, 10] and find the alpha value that gives the best result
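One possible solution sketch (the same load_boston caveat as Exercise (4) applies):

from sklearn.datasets import load_boston   # removed in scikit-learn >= 1.2
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = load_boston(return_X_y=True)

search = GridSearchCV(Ridge(), param_grid={"alpha": [1, 5, 10]}, cv=5)
search.fit(X, y)
print(search.best_params_["alpha"], search.best_score_)   # the best alpha found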