[CS Foundation] AIML - 2 - Regression


x-village

August 14, 2018


Transcript

  1. Machine Learning • Classification vs. Regression
     Q1: Can one Ultraman defeat a Lv. 1 monster? (a yes/no answer: classification)
     Q2: How many Ultramen (Man! Seven! Neos! Man! Taro! Tiga!) does it take to defeat a Lv. 1 monster? (a number: regression)
  2. Regression • What is ‘regression’ analysis?
     Monster level:           1  2  3  4  5  6   7   8   9  10
     Ultramen needed to win:  1  1  2  3  6  7  11  13  13  15
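The table above is itself a tiny regression dataset: monster level in, number of Ultramen out. As a quick illustration (not part of the deck's exercises), a least-squares line can be fit to it with NumPy:

```python
import numpy as np

# Monster level (input) and Ultramen needed (output), from the table above.
level = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
heroes = np.array([1, 1, 2, 3, 6, 7, 11, 13, 13, 15])

# Fit a degree-1 polynomial (a straight line) by least squares.
slope, intercept = np.polyfit(level, heroes, 1)
print(f"heroes ≈ {slope:.2f} * level + {intercept:.2f}")
```

The fitted line lets you predict the hero count for monster levels not in the table.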
  3. Regression
     Features: x(i) = [x_1, …, x_d]; Outputs: y(i)
     Models: linear regression and polynomial regression
  4. Linear Regression • Model representation
     Hypothesis: h_θ(x) = θ_0 + θ_1·x, a function that maps from X to Y; θ_1 is the weight.
     Choose θ so that h_θ(x) is close to y for the training examples.
  5. Linear Regression • Definition of cost function
     Error = prediction − ground truth. The cost function is the mean square error (MSE):
     J(θ_0, θ_1) = (1/2m) · Σᵢ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)², where h_θ(x) is the hypothesis.
     Goal: minimize J(θ_0, θ_1).
  6. Linear Regression • Look into the cost function
     Simplified case: fix θ_0 = 0, so the hypothesis is h_θ(x) = θ_1·x with a single weight θ_1,
     the cost function becomes J(θ_1), and the goal is to minimize J(θ_1).
     Training examples: (1, 1), (2, 2), (3, 3).
  7. Linear Regression • Look into the cost function
     For the training examples (1, 1), (2, 2), (3, 3) and hypothesis h_θ(x) = θ_1·x:
     J(1)   = 1/(2·3) · (0² + 0² + 0²) = 0
     J(0.5) = 1/(2·3) · ((0.5 − 1)² + (1 − 2)² + (1.5 − 3)²) ≈ 0.58
     J(1.5) = 1/(2·3) · ((1.5 − 1)² + (3 − 2)² + (4.5 − 3)²) ≈ 0.58
     Plotting J(θ_1) against θ_1 gives a bowl-shaped curve with its minimum at θ_1 = 1.
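The worked values above can be reproduced with a few lines of code. A minimal sketch of the simplified (θ_0 = 0) cost function, using the three training examples from the slide:

```python
import numpy as np

def cost(theta1, x, y):
    """J(theta1) = (1 / 2m) * sum((theta1 * x_i - y_i)^2), with theta_0 fixed to 0."""
    m = len(x)
    errors = theta1 * x - y
    return np.sum(errors ** 2) / (2 * m)

# The three training examples used on the slide.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])

print(cost(1.0, x, y))   # 0.0
print(cost(0.5, x, y))   # ≈ 0.583
print(cost(1.5, x, y))   # ≈ 0.583
```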
  8. Linear Regression • Look into the cost function
     [Plot: the hypothesis lines for θ_1 = 0.5, 1, 1.5 alongside the corresponding points on the J(θ_1) curve.]
  9. Linear Regression • Look into the cost function
     [Plot: 3-D surface of J(θ_0, θ_1) over the (θ_0, θ_1) plane.]
  10. Gradient Descent • Gradient descent
     Cost function: J(θ_0, θ_1); goal: minimize J(θ_0, θ_1).
     OUTLINE
     • Start with some θ_0, θ_1
     • Keep changing θ_0, θ_1 to reduce J(θ_0, θ_1) until we hopefully end up at a minimum
  11. Gradient Descent • Gradient descent algorithm
     repeat until convergence {
         θ_j := θ_j − α · ∂J(θ_0, θ_1)/∂θ_j    (for j = 0, 1, updated simultaneously)
     }
     α is the learning rate; “:=” assigns the value on the right side to the left side.
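The update rule above can be sketched in a few lines. This is a minimal implementation for the two-parameter linear model, assuming the toy data (1, 1), (2, 2), (3, 3) from the earlier slides; the starting point (0, 0), step size, and iteration count are illustrative choices, not values from the deck:

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, iterations=1000):
    """Minimize J(theta0, theta1) by repeated simultaneous updates."""
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = theta0 + theta1 * x - y
        # Partial derivatives of J = (1/2m) * sum(errors^2).
        grad0 = np.sum(errors) / m
        grad1 = np.sum(errors * x) / m
        # Simultaneous update: both gradients are computed before either theta changes.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
print(gradient_descent(x, y))  # approaches (0, 1) for this data
```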
  12. Gradient Descent • Gradient descent intuition
     The derivative term is the slope of the tangent line at the current value of θ_1.
     Positive slope: θ_1 := θ_1 − α · (positive value), so θ_1 becomes smaller and the cost becomes smaller.
  13. Gradient Descent • Gradient descent intuition
     Negative slope: θ_1 := θ_1 − α · (negative value), so θ_1 becomes bigger and the cost becomes smaller.
  14. Gradient Descent • Gradient descent intuition
     If the learning rate is too big, the update can overshoot the minimum: gradient descent may fail to converge, or even diverge.
  15. Gradient Descent • Gradient descent intuition
     If the learning rate is too small, gradient descent can be slow, taking many tiny steps to reach the minimum.
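Both failure modes are easy to observe numerically. A small sketch, again assuming the toy data (1, 1), (2, 2), (3, 3) with θ_0 fixed to 0; the specific α values are chosen for illustration:

```python
import numpy as np

def run(alpha, iterations=50):
    """One-parameter gradient descent on J(theta1) for the data (1,1), (2,2), (3,3)."""
    x = np.array([1.0, 2.0, 3.0])
    y = np.array([1.0, 2.0, 3.0])
    m = len(x)
    theta1 = 0.0
    for _ in range(iterations):
        theta1 -= alpha * np.sum((theta1 * x - y) * x) / m
    return theta1

print(run(alpha=0.1))    # converges close to the optimum theta1 = 1
print(run(alpha=0.45))   # overshoots every step: |theta1| blows up (diverges)
print(run(alpha=0.001))  # still far from 1 after 50 steps: too slow
```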
  16. Exercise - (1) • Requirements
     1. Implement the hypothesis function and the cost function.
  17. Exercise - (1) • Requirements
     2. Test (θ_0, θ_1) = (0, 0), (1, 1), (10, −1) and print the computed cost for each.
     3. Observe the relationship between the regression line and the cost for different θ values.
  18. Ordinary Least Square • Ordinary least square (OLS)
     Both methods minimize the same cost function J:
     Gradient descent: repeat until convergence { θ_j := θ_j − α · ∂J/∂θ_j }
     OLS: solve ∂J/∂θ = 0 directly for θ.
  19. Ordinary Least Square • OLS vs. Gradient descent
     OLS solves for the optimal θ directly in closed form; gradient descent starts from an initial value of θ_1 and reaches the optimum by iterative computation.
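The closed-form OLS solution for linear regression is the normal equation, θ = (XᵀX)⁻¹Xᵀy. A minimal sketch with NumPy, assuming the same toy data as the earlier slides:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])

# Design matrix with a column of ones so theta_0 acts as the intercept.
X = np.column_stack([np.ones_like(x), x])

# Normal equation: solve (X^T X) theta = X^T y, the closed-form OLS solution.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # [theta_0, theta_1] = [0, 1] for this data
```

Unlike gradient descent, no initial value, learning rate, or iteration loop is involved.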
  20. Evaluation • Evaluation metrics for regression
     • Mean square error (MSE)
     • Root mean square error (RMSE)
     • Mean absolute error (MAE)
     All three measure the difference between predicted and true values; smaller is better.
  21. Evaluation • Evaluation metrics for regression
     • R-squared score (R² score): how well the predictions fit the real data; the best value is 1.
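All four metrics are available in sklearn.metrics (RMSE is just the square root of MSE). A short sketch; the y_true and y_pred values are made-up illustrative numbers, not from the deck:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = np.array([1.0, 2.0, 3.0])   # ground truth (illustrative)
y_pred = np.array([1.1, 1.9, 3.2])   # model predictions (illustrative)

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                  # RMSE: square root of MSE
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)        # best possible value is 1
print(mse, rmse, mae, r2)
```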
  22. Exercise - (2) • TASK: Use sklearn to implement linear regression
     • Sample code
     • Requirements
     • Use the data from Exercise - (1) to train LinearRegression() and SGDRegressor()
     • Print the weight values (θ) obtained after training with each method
     • Compare the results of the two methods
  23. Exercise - (3) • TASK: Use sklearn.metrics to evaluate models
     • Requirements
     • Print the RMSE of the two models from Exercise - (2) (for now, use the training data in place of test data)
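The deck's sample code is not included in this transcript, so the following is only a sketch of how Exercises (2) and (3) might fit together. Since the Exercise (1) dataset is not shown either, the toy data (1, 1), (2, 2), (3, 3) from the earlier slides stands in for it; the SGDRegressor settings are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.metrics import mean_squared_error

# Single-feature data standing in for the Exercise (1) dataset.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([1.0, 2.0, 3.0])

# LinearRegression solves the least-squares problem in closed form (OLS).
ols = LinearRegression().fit(X, y)
print("OLS:", ols.intercept_, ols.coef_)   # theta_0 and theta_1

# SGDRegressor minimizes the same cost iteratively, like gradient descent.
sgd = SGDRegressor(max_iter=10000, tol=1e-6, random_state=0).fit(X, y)
print("SGD:", sgd.intercept_, sgd.coef_)

# Exercise (3): RMSE of each model on the training data (standing in for test data).
for model in (ols, sgd):
    rmse = np.sqrt(mean_squared_error(y, model.predict(X)))
    print(type(model).__name__, "RMSE:", rmse)
```

The closed-form weights should be near-exact, while the SGD weights are close but not identical, which is exactly the comparison the exercise asks you to observe.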