Slide 1

Slide 1 text

4 Classification

Slide 2

Slide 2 text

4.1 An Overview of Classification
4.2 Why Not Linear Regression?
4.3 Logistic Regression
  4.3.1 The Logistic Model
  4.3.2 Estimating the Regression Coefficients
  4.3.3 Making Predictions
  4.3.4 Multiple Logistic Regression
  4.3.5 Logistic Regression for >2 Response Classes
4.4 Linear Discriminant Analysis
  4.4.1 Using Bayes' Theorem for Classification
  4.4.2 Linear Discriminant Analysis for p = 1
  4.4.3 Linear Discriminant Analysis for p > 1
  4.4.4 Quadratic Discriminant Analysis
4.5 A Comparison of Classification Methods
4.6 Lab: Logistic Regression, LDA, QDA, and KNN
4.7 Exercises

Slide 3

Slide 3 text

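4 Classification
The linear regression model explained in Chapter 3 assumed a quantitative response variable.
In many problems, however, the response variable is qualitative.
This chapter explains methods for predicting a qualitative response. This is called classification.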

Slide 4

Slide 4 text

4.1 An Overview of Classification

Slide 5

Slide 5 text

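4.1 An Overview of Classification
This is the credit card Default data (10,000 people).
Income is annual income and Balance is the monthly credit card balance.
Blue points are people who paid properly (default=No); orange points are people who did not pay (default=Yes).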

Slide 6

Slide 6 text

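4.1 An Overview of Classification
Chapter 4 explains methods for predicting Default ($Y$) from Balance ($X_1$) and Income ($X_2$).
Because $Y$ is not a quantitative variable, the linear regression explained in Chapter 3 cannot be used.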

Slide 7

Slide 7 text

4.2 Why Not Linear Regression?

Slide 8

Slide 8 text

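4.2 Why Not Linear Regression?
Why can't linear regression be used for a qualitative response?
Suppose we try applying linear regression to data on the condition of emergency-room patients.
Define the response variable $Y$ as follows: $Y = 1$ for stroke, $Y = 2$ for drug overdose, $Y = 3$ for epileptic seizure.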

Slide 9

Slide 9 text

4.2 Why Not Linear Regression?
This coding places drug overdose between stroke and epileptic seizure.
It implies that the difference between stroke and drug overdose equals the difference between drug overdose and epileptic seizure.
In reality there is no such relationship.

Slide 10

Slide 10 text

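4.2 Why Not Linear Regression?
If the values of the response formed ordered levels such as mild, moderate, and severe, and the gap from mild to moderate were equal to the gap from moderate to severe, then a 1, 2, 3 coding might be reasonable.
In general, though, there is no natural way to turn a qualitative response with three or more levels into something linear regression can handle.
With two levels there is a better way.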

Slide 11

Slide 11 text

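4.2 Why Not Linear Regression?
Redefine the response like this: $Y = 0$ for stroke, $Y = 1$ for drug overdose.
In this case linear regression can be used.
However, the fitted values can fall outside the range [0,1], so they can no longer be interpreted as probabilities.
Interestingly, in this case the result is the same as that of linear discriminant analysis (LDA), explained in Section 4.4.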

Slide 12

Slide 12 text

4.3 Logistic Regression

Slide 13

Slide 13 text

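4.3 Logistic Regression
Consider the credit card default data.
The response variable takes two values, default=Yes and default=No.
Logistic regression does not model the response $Y$ directly. Instead it models the probability that $Y$ belongs to a particular category.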

Slide 14

Slide 14 text

4.3 Logistic Regression
Left: linear regression. Some estimated probabilities are negative.
Right: logistic regression. The estimates stay within [0,1] over the whole range.

Slide 15

Slide 15 text

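4.3 Logistic Regression
The probability of default given balance can be written as $\Pr(\text{default} = \text{Yes} \mid \text{balance})$.
We abbreviate it as $p(\text{balance})$.
The probability of default can be predicted for any value of balance.
A company that wants to be cautious in predicting who will default can also set a lower threshold on this probability.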

Slide 16

Slide 16 text

4.3.1 The Logistic Model

Slide 17

Slide 17 text

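4.3.1 The Logistic Model
How should we model the relationship between $X$ and $p(X)$?
Prediction by ordinary linear regression will not do.
We want the probability of default to stay within [0,1] for every value of balance.
This problem is not specific to the default data.
Whenever regression is applied to a two-valued response, in principle there will always be cases where $p(X) > 1$ or $p(X) < 0$.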

Slide 18

Slide 18 text

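4.3.1 The Logistic Model
To solve this problem, $p(X)$ must be modeled with a function whose value lies in [0,1] for every $X$.
There are of course many such functions.
Logistic regression uses the logistic function, $p(X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$.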

Slide 19

Slide 19 text

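4.3.1 The Logistic Model
The model is fitted by maximum likelihood.
Maximum likelihood is explained in the next section.
The earlier figure (right) was also a fitted logistic function.
The probability of default gets arbitrarily close to 0 but never reaches 0.
The same holds at the other end when balance is high.
The logistic function always draws an S-shaped curve.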

Slide 20

Slide 20 text

4.3.1 The Logistic Model
Rearranging the logistic model slightly gives $\frac{p(X)}{1 - p(X)} = e^{\beta_0 + \beta_1 X}$.

Slide 21

Slide 21 text

4.3.1 The Logistic Model
The quantity $p(X)/(1 - p(X))$ is called the odds.
Odds have long been used in many places, not only in horse racing.
The odds take values from 0 to $\infty$.

Slide 22

Slide 22 text

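4.3.1 The Logistic Model
Taking the logarithm of the odds gives $\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta_1 X$.
The left-hand side is called the logit (log-odds).
In this form the model is linear in $X$.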
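The chain from the logistic function to the odds and the logit can be checked numerically. The following is a minimal sketch; the coefficient values are hypothetical and chosen only for illustration.

```python
import math

def logistic(x, beta0, beta1):
    """p(X) = e^(beta0 + beta1*x) / (1 + e^(beta0 + beta1*x))."""
    z = beta0 + beta1 * x
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds p / (1 - p); ranges over (0, infinity)."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds; linear in x under the logistic model."""
    return math.log(odds(p))

# Hypothetical coefficients, not taken from the slides.
beta0, beta1 = -3.0, 0.5

for x in (-10.0, 0.0, 6.0, 20.0):
    p = logistic(x, beta0, beta1)
    # p always stays strictly between 0 and 1,
    # and logit(p) recovers beta0 + beta1 * x.
    print(x, round(p, 4), round(logit(p), 4))
```

However extreme x becomes, p stays inside (0, 1), while the logit moves linearly with x.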

Slide 23

Slide 23 text

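4.3.1 The Logistic Model
In the linear regression model of Chapter 3, $\beta_1$ was the increase in $Y$ for a one-unit increase in $X$.
That is not the case for the logistic regression model.
The change in $p(X)$ for a one-unit increase in $X$ depends on the value of $X$.
Still, when $\beta_1$ is positive, $p(X)$ increases as $X$ increases and decreases as $X$ decreases.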

Slide 24

Slide 24 text

4.3.2 Estimating the Regression Coefficients

Slide 25

Slide 25 text

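4.3.2 Estimating the Regression Coefficients
$\beta_0$ and $\beta_1$ must be estimated from the training data.
In Chapter 3 the coefficients were estimated by least squares.
Nonlinear least squares could be used here as well, but maximum likelihood is the better choice.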

Slide 26

Slide 26 text

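4.3.2 Estimating the Regression Coefficients
The function $\ell(\beta_0, \beta_1) = \prod_{i: y_i = 1} p(x_i) \prod_{i': y_{i'} = 0} (1 - p(x_{i'}))$ is called the likelihood function.
We find the $\beta_0$ and $\beta_1$ that maximize the likelihood function.
The mathematical details of maximum likelihood are not covered in this book.
Fitting a logistic regression model is easy with R.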
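To make the idea of maximizing the likelihood concrete, here is a minimal sketch that fits a one-predictor logistic regression by Newton's method on the log-likelihood. The data are made up for illustration, and Newton's method is one common way to do the maximization; it is not claimed to be what R does internally.

```python
import numpy as np

def design(x):
    """Design matrix with an intercept column."""
    return np.column_stack([np.ones(len(x)), np.asarray(x, dtype=float)])

def log_likelihood(beta, X, y):
    """Log of the likelihood: sum_i [ y_i * z_i - log(1 + e^{z_i}) ]."""
    z = X @ beta
    return float(np.sum(y * z - np.logaddexp(0.0, z)))

def fit_logistic(x, y, n_iter=25):
    """Maximize the log-likelihood with Newton's method."""
    X = design(x)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)                       # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Made-up, non-separable toy data (illustration only).
x = np.arange(1.0, 11.0)
y = np.array([0, 0, 0, 1, 0, 1, 0, 1, 1, 1])

beta_hat = fit_logistic(x, y)
print("beta0, beta1 =", beta_hat)
print("log-likelihood =", log_likelihood(beta_hat, design(x), y))
```

At the maximum the gradient of the log-likelihood is zero, which is an easy way to check the fit.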

Slide 27

Slide 27 text

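4.3.2 Estimating the Regression Coefficients
Here is the result of running logistic regression in R.
It is read the same way as the linear regression output in Chapter 3.
The z-statistic plays the same role as the t-statistic.
For example, the z-value for $\beta_1$ is $\hat{\beta}_1 / \mathrm{SE}(\hat{\beta}_1)$.
The size (absolute value) of this quantity determines whether the null hypothesis $H_0: \beta_1 = 0$ can be rejected.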

Slide 28

Slide 28 text

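4.3.2 Estimating the Regression Coefficients
The null hypothesis $H_0: \beta_1 = 0$ says that $p(X) = \frac{e^{\beta_0}}{1 + e^{\beta_0}}$.
That is, the probability of default does not depend on the value of balance.
Because the p-value is very small, $H_0$ can be rejected.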

Slide 29

Slide 29 text

4.3.3 Making Predictions

Slide 30

Slide 30 text

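4.3.3 Making Predictions
Once the coefficients are estimated, it is easy to compute the probability of default for any balance.
For example, with a balance of $1,000 the probability of default is below 1%.
With a balance of $2,000 it jumps to 58.6%.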
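The prediction step is just the logistic function evaluated at the fitted coefficients. The table of fitted coefficients is not reproduced in the slide text, so the values below are assumptions: an intercept of about -10.6513 and a balance coefficient of about 0.0055, which reproduce the 58.6% quoted above.

```python
import math

# Assumed coefficient estimates (not shown in the slide text).
BETA0_HAT = -10.6513
BETA1_HAT = 0.0055

def p_default(balance):
    """Estimated probability of default for a given balance in dollars."""
    z = BETA0_HAT + BETA1_HAT * balance
    return math.exp(z) / (1.0 + math.exp(z))

p1000 = p_default(1000)
p2000 = p_default(2000)
print(f"balance = 1000: {p1000:.5f}")
print(f"balance = 2000: {p2000:.3f}")
```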

Slide 31

Slide 31 text

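4.3.3 Making Predictions
Predictors may also be dummy variables.
The default data also records whether each person is a student.
The table shows the result of logistic regression with student coded as 1 and non-student as 0.
The coefficient for student is positive and its p-value is significant.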

Slide 32

Slide 32 text

4.3.4 Multiple Logistic Regression

Slide 33

Slide 33 text

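4.3.4 Multiple Logistic Regression
Next, consider predicting a 0/1 response from several predictors.
Extend the earlier formula: $\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p$.
Reversing the earlier rearrangement gives $p(X) = \frac{e^{\beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p}}{1 + e^{\beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p}}$.
As before, $\beta_0, \beta_1, \ldots, \beta_p$ are estimated by maximum likelihood.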

Slide 34

Slide 34 text

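4.3.4 Multiple Logistic Regression
Here is the result of logistic regression with balance, income, and student as predictors and default as the response.
The p-values for balance and student are small, so both are associated with default.
The negative coefficient for student means that students are less likely to default than non-students.
This is the opposite of the result in the earlier table.
How can that be?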

Slide 35

Slide 35 text

4.3.4 Multiple Logistic Regression
This figure is used for the explanation.
Orange is students and blue is non-students.

Slide 36

Slide 36 text

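4.3.4 Multiple Logistic Regression
The solid lines in the left figure show the average default rate as a function of balance.
The multiple logistic regression result said that, for a given balance and income, students are less likely to default than non-students.
The solid lines in the left figure do indeed show this.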

Slide 37

Slide 37 text

4.3.4 Multiple Logistic Regression
The horizontal broken lines in the left figure show the overall average default rates for students and non-students.
Here the result is reversed.

Slide 38

Slide 38 text

4.3.4 Multiple Logistic Regression
The right figure explains this puzzle.
student and balance are correlated.
Students tend to have higher balances, and that leads to default.

Slide 39

Slide 39 text

4.3.4 Multiple Logistic Regression
For the same balance, students have a lower default rate than non-students.
Overall, however, students tend to have higher balances, so their overall default rate is higher.

Slide 40

Slide 40 text

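4.3.4 Multiple Logistic Regression
This difference matters a lot to a credit card company.
A student about whom nothing else is known is riskier than a non-student.
But a student is less risky than a non-student with the same balance.
This phenomenon is called confounding.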

Slide 41

Slide 41 text

4.3.5 Logistic Regression for >2 Response Classes

Slide 42

Slide 42 text

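4.3.5 Logistic Regression for >2 Response Classes
The two-class logistic regression model explained in the previous sections can be extended to three or more classes.
In practice it is not used much.
The reason is that discriminant analysis, explained in the next section, is the usual choice for multi-class classification.
So the book does not explain it; just remember that such an extension exists.
It can be done in R, by the way.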

Slide 43

Slide 43 text

4.4 Linear Discriminant Analysis

Slide 44

Slide 44 text

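4.4 Linear Discriminant Analysis
We now consider a different approach from logistic regression.
Its results are very similar to those of logistic regression.
So why consider another method when logistic regression already exists?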

Slide 45

Slide 45 text

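4.4 Linear Discriminant Analysis
The reasons are:
- When the classes are well separated, the parameter estimates from logistic regression are quite unstable. Linear discriminant analysis does not have this problem.
- When n is small and X is normally distributed, the linear discriminant model is even more stable than the logistic regression model.
- Linear discriminant analysis can also be used when there are three or more classes.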

Slide 46

Slide 46 text

4.4.1 Using Bayes' Theorem for Classification

Slide 47

Slide 47 text

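4.4.1 Using Bayes' Theorem for Classification
Suppose we want to classify an observation into one of $K$ ($K \ge 2$) classes.
That is, the qualitative response $Y$ takes $K$ values.
Let $\pi_k$ be the probability that an observation belongs to class $k$.
Let $f_k(X) \equiv \Pr(X = x \mid Y = k)$.
This is the density function of $X$ in class $k$.
In other words, $f_k(x)$ takes a large value when there is a high probability that $X \approx x$ in class $k$.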

Slide 48

Slide 48 text

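4.4.1 Using Bayes' Theorem for Classification
Now apply Bayes' theorem: $\Pr(Y = k \mid X = x) = \frac{\pi_k f_k(x)}{\sum_{l=1}^{K} \pi_l f_l(x)}$.
From here on we write $p_k(X) = \Pr(Y = k \mid X)$.
As the formula shows, we want to estimate $\pi_k$ and $f_k(x)$, because then $p_k(X)$ is known.
Estimating $\pi_k$ is easy: it is simply the proportion of each class.
Estimating $f_k(x)$ is hard unless some form is assumed for the density.
The following sections explain how to do that.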
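The posterior computation in Bayes' theorem can be sketched in a few lines. This is an illustration only: the priors and the normal class densities below are hypothetical choices, and any density functions could be passed in.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a one-dimensional normal distribution."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def posterior(x, priors, densities):
    """p_k(x) = pi_k f_k(x) / sum_l pi_l f_l(x), for every class k."""
    joint = [pi * f(x) for pi, f in zip(priors, densities)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical two-class setup (illustration only).
priors = [0.7, 0.3]
densities = [
    lambda x: normal_pdf(x, mu=-1.0, sigma=1.0),
    lambda x: normal_pdf(x, mu=2.0, sigma=1.0),
]

post = posterior(0.5, priors, densities)
print(post)  # the posteriors sum to 1; the larger one decides the class
```

Exactly at x = 0.5 the two densities are equal (it is the midpoint of the two means), so the posterior equals the prior.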

Slide 49

Slide 49 text

4.4.2 Linear Discriminant Analysis for p = 1

Slide 50

Slide 50 text

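4.4.2 Linear Discriminant Analysis for p = 1
First assume $p = 1$, that is, there is a single predictor.
Assume $f_k(x)$ is a normal density: $f_k(x) = \frac{1}{\sqrt{2\pi}\sigma_k} \exp\left(-\frac{1}{2\sigma_k^2}(x - \mu_k)^2\right)$.
Further assume $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_K^2 \equiv \sigma^2$.
That is, all classes share the same variance.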

Slide 51

Slide 51 text

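4.4.2 Linear Discriminant Analysis for p = 1
Combining Bayes' theorem with the normal density gives
$p_k(x) = \frac{\pi_k \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{1}{2\sigma^2}(x - \mu_k)^2\right)}{\sum_{l=1}^{K} \pi_l \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{1}{2\sigma^2}(x - \mu_l)^2\right)}$.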

Slide 52

Slide 52 text

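4.4.2 Linear Discriminant Analysis for p = 1
Taking the logarithm and keeping only the terms that depend on $k$ gives $\delta_k(x) = x \cdot \frac{\mu_k}{\sigma^2} - \frac{\mu_k^2}{2\sigma^2} + \log \pi_k$.
For a given $x$, the class $k$ with the largest $\delta_k$ is the most likely class.
It is called linear discriminant analysis because $\delta_k$ is a linear function of $x$.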

Slide 53

Slide 53 text

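4.4.2 Linear Discriminant Analysis for p = 1
For example, let $K = 2$ and $\pi_1 = \pi_2$.
Then $x$ is assigned to class 1 when $\delta_1 > \delta_2$, that is, when $2x(\mu_1 - \mu_2) > \mu_1^2 - \mu_2^2$.
Class 2 is handled in the same way.
The boundary is found by solving $\delta_1 = \delta_2$ for $x$, which gives $x = \frac{\mu_1 + \mu_2}{2}$.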

Slide 54

Slide 54 text

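4.4.2 Linear Discriminant Analysis for p = 1
The next figure shows an example.
Here $\mu_1 = -1.25$, $\mu_2 = 1.25$, and $\sigma_1^2 = \sigma_2^2 = 1$.
With $\pi_1 = \pi_2 = 0.5$, the black broken line in the figure (the decision boundary) can be drawn.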
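The example can be checked numerically with the one-dimensional discriminant $\delta_k(x) = x\mu_k/\sigma^2 - \mu_k^2/(2\sigma^2) + \log \pi_k$. The sketch below uses the parameter values stated on the slide; the function names are just for illustration.

```python
import math

def delta(x, mu, sigma2, prior):
    """One-dimensional LDA discriminant function."""
    return x * mu / sigma2 - mu ** 2 / (2 * sigma2) + math.log(prior)

# Parameters from the slide's example.
mu1, mu2 = -1.25, 1.25
sigma2 = 1.0
pi1 = pi2 = 0.5

# With equal priors the boundary is the midpoint of the two means.
boundary = (mu1 + mu2) / 2

def classify(x):
    """Assign x to the class with the larger discriminant."""
    return 1 if delta(x, mu1, sigma2, pi1) > delta(x, mu2, sigma2, pi2) else 2

print("boundary:", boundary)
print(classify(-0.1), classify(0.1))
```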

Slide 55

Slide 55 text

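4.4.2 Linear Discriminant Analysis for p = 1
The right figure is a histogram of 20 observations from each class.
When all the parameters are known, as here, the Bayes classifier is easy to compute.
The decision boundary obtained by linear discriminant analysis from the data is shown as the solid black line.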

Slide 56

Slide 56 text

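4.4.2 Linear Discriminant Analysis for p = 1
In practice, even if we know that $X$ follows a normal distribution, $\mu_1, \ldots, \mu_K$, $\pi_1, \ldots, \pi_K$, and $\sigma^2$ still have to be estimated.
Linear discriminant analysis (LDA) approximates the Bayes classifier by plugging parameter estimates into the earlier formula.
The following estimates are commonly used.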

Slide 57

Slide 57 text

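4.4.2 Linear Discriminant Analysis for p = 1
$n$ is the total number of observations and $n_k$ is the number of observations in class $k$.
$\hat{\mu}_k = \frac{1}{n_k} \sum_{i: y_i = k} x_i$ and $\hat{\sigma}^2 = \frac{1}{n - K} \sum_{k=1}^{K} \sum_{i: y_i = k} (x_i - \hat{\mu}_k)^2$ need to be estimated.
$\hat{\mu}_k$ is simply the mean of class $k$.
$\hat{\sigma}^2$ can be seen as a weighted average of the sample variances of the $K$ classes.
For $\pi_k$ the proportion in the data will do, that is, $\hat{\pi}_k = n_k / n$.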
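These plug-in estimates are short to write down in code. The following is a minimal sketch with made-up data; the function name `lda_estimates` is hypothetical.

```python
import numpy as np

def lda_estimates(x, y):
    """Return class means, class proportions, and the pooled variance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y).tolist()
    n, K = len(x), len(classes)
    mu = {k: float(x[y == k].mean()) for k in classes}   # class means
    pi = {k: float(np.mean(y == k)) for k in classes}    # n_k / n
    # Pooled variance: within-class squared deviations divided by n - K.
    ss = sum(float(((x[y == k] - mu[k]) ** 2).sum()) for k in classes)
    sigma2 = ss / (n - K)
    return mu, pi, sigma2

# Made-up data for illustration.
x = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
y = [1, 1, 1, 2, 2, 2]

mu_hat, pi_hat, sigma2_hat = lda_estimates(x, y)
print(mu_hat, pi_hat, sigma2_hat)
```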

Slide 58

Slide 58 text

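4.4.2 Linear Discriminant Analysis for p = 1
To repeat, LDA assumes that the observations in each class follow a normal distribution with a class-specific mean and a common variance.
It then plugs the estimates of these parameters into the Bayes classifier.
Section 4.4.4 relaxes this further by letting each class have its own variance $\sigma_k^2$.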

Slide 59

Slide 59 text

4.4.3 Linear Discriminant Analysis for p > 1

Slide 60

Slide 60 text

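4.4.3 Linear Discriminant Analysis for p > 1
We extend LDA to the case of several predictors.
To do so, assume that the observations follow a multivariate normal distribution with a class-specific mean vector and a covariance matrix common to all classes.
In a multivariate normal distribution each variable follows a one-dimensional normal distribution, and the variables may be correlated with one another.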

Slide 61

Slide 61 text

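4.4.3 Linear Discriminant Analysis for p > 1
These are plots of the multivariate normal density for $p = 2$.
In both figures, cutting the surface along the $x_1$ axis or the $x_2$ axis gives a cross-section shaped like a one-dimensional normal distribution.
Left: $\mathrm{Var}(X_1) = \mathrm{Var}(X_2)$ and $\mathrm{Cor}(X_1, X_2) = 0$.
Right: $X_1$ and $X_2$ have a correlation of 0.7.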

Slide 62

Slide 62 text

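4.4.3 Linear Discriminant Analysis for p > 1
The multivariate normal density is $f(x) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)$.
We write $X \sim N(\mu, \Sigma)$.
$E(X) = \mu$ is the $p$-dimensional mean vector.
$\mathrm{Cov}(X) = \Sigma$ is the $p \times p$ covariance matrix.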

Slide 63

Slide 63 text

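4.4.3 Linear Discriminant Analysis for p > 1
Substituting $f_k(X = x)$ into Bayes' theorem and taking logarithms as before gives $\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + \log \pi_k$.
For $p = 1$ it was $\delta_k(x) = x \cdot \frac{\mu_k}{\sigma^2} - \frac{\mu_k^2}{2\sigma^2} + \log \pi_k$.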
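The multivariate discriminant $\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + \log \pi_k$ can be evaluated with NumPy. This is a minimal sketch; the mean vectors, covariance matrix, and priors are hypothetical.

```python
import numpy as np

def lda_discriminant(x, mu_k, Sigma, pi_k):
    """delta_k(x) for LDA with a covariance matrix shared by all classes."""
    # Solve Sigma * v = mu_k instead of forming the inverse explicitly.
    sinv_mu = np.linalg.solve(Sigma, mu_k)
    return float(x @ sinv_mu - 0.5 * mu_k @ sinv_mu + np.log(pi_k))

# Hypothetical two-class, two-predictor setup (illustration only).
Sigma = np.array([[1.0, 0.7],
                  [0.7, 1.0]])
mus = [np.array([-1.0, -1.0]), np.array([1.0, 1.0])]
pis = [0.5, 0.5]

def classify(x):
    """Return the index of the class with the largest discriminant."""
    scores = [lda_discriminant(x, m, Sigma, p) for m, p in zip(mus, pis)]
    return int(np.argmax(scores))

print(classify(np.array([-0.5, -0.2])), classify(np.array([0.8, 0.3])))
```

With a 1 x 1 covariance matrix the same function reduces to the $p = 1$ formula.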

Slide 64

Slide 64 text

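4.4.3 Linear Discriminant Analysis for p > 1
Here is an example.
The three ellipses are the regions that contain 95% of the probability for each class.
The broken lines are the Bayes decision boundaries.
The right figure shows 20 observations from each class and the decision boundaries computed from them by LDA (solid black lines).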

Slide 65

Slide 65 text

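4.4.3 Linear Discriminant Analysis for p > 1
The Bayes decision boundary is the set of $x$ for which $\delta_k(x) = \delta_l(x)$.
That is, it satisfies $x^T \Sigma^{-1} \mu_k - \frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k = x^T \Sigma^{-1} \mu_l - \frac{1}{2} \mu_l^T \Sigma^{-1} \mu_l$ for $k \ne l$ (the $\log \pi$ terms cancel when the classes have equal prior probabilities).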

Slide 66

Slide 66 text

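4.4.3 Linear Discriminant Analysis for p > 1
We apply LDA to the credit card data from before.
The predictors are balance and student.
10,000 training observations were used.
The training error rate was 2.75%.
That looks very accurate, but two caveats are needed.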

Slide 67

Slide 67 text

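4.4.3 Linear Discriminant Analysis for p > 1
First:
The training error rate is naturally lower than the error rate on real data.
In other words, we can expect the error rate to go up when new data are fed to this classifier, because the parameters were estimated so as to work well on the training data.
This is more pronounced the larger the ratio of $p$ to $n$.
Here $p = 2$ and $n = 10000$, so there is probably little to worry about.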

Slide 68

Slide 68 text

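4.4.3 Linear Discriminant Analysis for p > 1
Second:
The proportion of defaults in the training data was 3.33%.
So even a classifier that predicts that nobody ever defaults has an error rate of 3.33%.
That is only slightly higher than the error rate of the LDA classifier.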

Slide 69

Slide 69 text

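4.4.3 Linear Discriminant Analysis for p > 1
There are two kinds of error:
- predicting that a person who defaults will not default, and
- predicting that a person who does not default will default.
It is important to think about which kind of error occurs.
A table like the following (a confusion matrix) is convenient.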

Slide 70

Slide 70 text

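4.4.3 Linear Discriminant Analysis for p > 1
According to the table, LDA predicted that 104 people would default.
Of these, 81 actually defaulted and 23 did not.
So only 23 of the 9667 people who did not default were misclassified.
That looks very accurate.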

Slide 71

Slide 71 text

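4.4.3 Linear Discriminant Analysis for p > 1
However, of the 333 people who defaulted, as many as 252 were missed.
The overall error rate looks low, but the error rate among people who defaulted is high.
From the point of view of a card company trying to identify high-risk people, a figure of 252/333 = 75.7% may well be unacceptable.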

Slide 72

Slide 72 text

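4.4.3 Linear Discriminant Analysis for p > 1
Classification performance also matters in fields such as medicine and biology.
The terms used there are sensitivity and specificity.
Sensitivity is the rate at which a test is correctly positive when the disease is present; here it is low, at 24.3%.
Specificity is the rate at which a test is correctly negative when the disease is absent; here it is high, at 99.8%.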
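The figures quoted on these slides all follow from four counts in the confusion matrix. A small sketch, using the counts stated above (81, 23, 252, and the remaining 9644 of the 9667 non-defaulters); the variable names are just for illustration.

```python
# Counts from the confusion matrix described on the slides.
tp = 81      # defaulted, predicted default
fp = 23      # did not default, predicted default
fn = 252     # defaulted, predicted no default
tn = 9667 - fp  # did not default, predicted no default

n = tp + fp + fn + tn
error_rate = (fp + fn) / n
sensitivity = tp / (tp + fn)   # share of defaulters that are caught
specificity = tn / (tn + fp)   # share of non-defaulters correctly cleared
miss_rate = fn / (tp + fn)     # share of defaulters that are missed

print(f"n = {n}")
print(f"error rate  = {error_rate:.2%}")
print(f"sensitivity = {sensitivity:.1%}")
print(f"specificity = {specificity:.1%}")
print(f"miss rate   = {miss_rate:.1%}")
```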

Slide 73

Slide 73 text

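4.4.3 Linear Discriminant Analysis for p > 1
Why is the sensitivity so low?
The Bayes classifier has the lowest misclassification rate of all classifiers (when the Gaussian model is correct).
This means it minimizes the total number of errors, regardless of which class the errors come from.
A card company, on the other hand, may particularly want to avoid misclassifying people who will actually default.
In such a case LDA is modified to match the card company's needs.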

Slide 74

Slide 74 text

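4.4.3 Linear Discriminant Analysis for p > 1
The Bayes classifier assigns an observation to the class with the largest posterior probability $p_k(X)$.
For example, in the two-class default problem, an observation is classified as default=Yes when $\Pr(\text{default} = \text{Yes} \mid X = x) > 0.5$.
In other words, it classifies using a 50% threshold on the posterior probability.
To reduce the error rate among people who default, this threshold should be lowered.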

Slide 75

Slide 75 text

4.4.3 Linear Discriminant Analysis for p > 1
The table shows the result of lowering the threshold to 20%.
Of the 333 people who defaulted, 138 (41.4%) were missed.
With the earlier 50% threshold it was 75.7%.
That is a big improvement.

Slide 76

Slide 76 text

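4.4.3 Linear Discriminant Analysis for p > 1
But of course it is not all good news.
The overall error rate rose from 2.75% to 3.73%.
From the credit card company's point of view the 20% threshold may still be the better choice.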

Slide 77

Slide 77 text

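4.4.3 Linear Discriminant Analysis for p > 1
Here is a plot of the error rates as the threshold is varied.
The horizontal axis is the threshold from before.
The various error rates are plotted as functions of the threshold.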

Slide 78

Slide 78 text

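4.4.3 Linear Discriminant Analysis for p > 1
The solid black line is the overall error rate.
The blue broken line is the error rate for predicting that a person who defaults will not default.
The orange dotted line is the error rate for predicting that a person who does not default will default.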

Slide 79

Slide 79 text

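4.4.3 Linear Discriminant Analysis for p > 1
The overall error rate is lowest when the threshold is 0.5.
However, the right threshold changes depending on which error rate we want to reduce.
Of course, the threshold is also decided using various other information.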
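The trade-off described above can be reproduced with a small function that applies a threshold to posterior probabilities and reports the three error rates. The probabilities and labels below are made up for illustration; they are not the default data.

```python
import numpy as np

def error_rates(prob_yes, is_yes, threshold):
    """Overall error, miss rate among true Yes, false-alarm rate among true No."""
    pred_yes = prob_yes > threshold
    overall = float(np.mean(pred_yes != is_yes))
    miss = float(np.mean(~pred_yes[is_yes]))        # Yes predicted as No
    false_alarm = float(np.mean(pred_yes[~is_yes]))  # No predicted as Yes
    return overall, miss, false_alarm

# Made-up posterior probabilities and true labels (illustration only).
prob_yes = np.array([0.02, 0.05, 0.10, 0.15, 0.25, 0.30, 0.60, 0.80])
is_yes = np.array([False, False, False, False, True, False, True, True])

for t in (0.5, 0.2):
    overall, miss, false_alarm = error_rates(prob_yes, is_yes, t)
    print(f"threshold {t}: overall={overall:.3f} miss={miss:.3f} false alarm={false_alarm:.3f}")
```

Lowering the threshold reduces the miss rate at the cost of more false alarms, which is the same pattern as in the slides.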

Slide 80

Slide 80 text

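4.4.3 Linear Discriminant Analysis for p > 1
This is the ROC curve.
The vertical axis is the rate at which people who default are predicted to default, that is, the sensitivity.
The horizontal axis is the rate at which people who do not default are predicted to default, that is, 1 - specificity.
The ROC curve is obtained by computing these two rates at every threshold and plotting them.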

Slide 81

Slide 81 text

4.4.3 Linear Discriminant Analysis for p > 1

Slide 82

Slide 82 text

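4.4.3 Linear Discriminant Analysis for p > 1
The larger the area under the blue curve, the better the classifier.
This area is called the AUC.
Here the AUC is 0.95, which is very high.
The diagonal broken line is where the two rates are equal at every threshold.
In terms of the earlier figures, this is what happens when the class distributions overlap completely.
In that case the classification result carries no information.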
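An ROC curve and its AUC can be computed directly from scores and labels. The following is a minimal NumPy sketch on made-up data; it sweeps every distinct score as a threshold and integrates with the trapezoid rule.

```python
import numpy as np

def roc_curve(scores, labels):
    """Return (false positive rate, true positive rate) over all thresholds."""
    # Start above every score so the curve begins at (0, 0).
    thresholds = np.concatenate(([np.inf], np.sort(np.unique(scores))[::-1]))
    tpr = np.array([np.mean(scores[labels] >= t) for t in thresholds])
    fpr = np.array([np.mean(scores[~labels] >= t) for t in thresholds])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoid rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Made-up scores (posterior probabilities) and true labels.
scores = np.array([0.02, 0.05, 0.10, 0.15, 0.25, 0.30, 0.60, 0.80])
labels = np.array([False, False, False, False, True, False, True, True])

fpr, tpr = roc_curve(scores, labels)
print("AUC =", auc(fpr, tpr))
```

For this toy data, 14 of the 15 (positive, negative) pairs are ranked correctly, so the AUC is 14/15.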

Slide 83

Slide 83 text

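4.4.3 Linear Discriminant Analysis for p > 1
In general, classification results can be laid out as in the table.
"+" is what we want to detect, the alternative hypothesis.
"−" is the opposite, the null hypothesis.
For the default data, "+" is default=Yes.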

Slide 84

Slide 84 text

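4.4.3 Linear Discriminant Analysis for p > 1
The next table shows the important statistics.
The denominators of False Pos. rate and True Pos. rate are the actual counts in each class.
The denominators of Pos. pred. value and Neg. pred. value are the predicted counts in each class.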

Slide 85

Slide 85 text

4.4.4 Quadratic Discriminant Analysis

Slide 86

Slide 86 text

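4.4.4 Quadratic Discriminant Analysis
LDA assumed that the observations follow a multivariate normal distribution with a class-specific mean vector and a covariance matrix common to all classes.
Quadratic discriminant analysis (QDA) assumes that the observations follow a multivariate normal distribution with a class-specific mean vector and a class-specific covariance matrix.
The difference is whether the covariance matrix is common (LDA) or class-specific (QDA).
As a formula: $X \sim N(\mu_k, \Sigma_k)$, where $\Sigma_k$ is the covariance matrix of class $k$.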

Slide 87

Slide 87 text

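4.4.4 Quadratic Discriminant Analysis
Under this assumption the Bayes classifier's $\delta_k(x)$ becomes $\delta_k(x) = -\frac{1}{2}(x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) - \frac{1}{2} \log |\Sigma_k| + \log \pi_k$.
QDA assigns an observation $X = x$ to the class $k$ with the largest $\delta_k(x)$.
Estimates are used for $\mu_k$, $\Sigma_k$, and $\pi_k$.
Unlike in LDA, $\delta_k(x)$ in QDA is a quadratic function of $x$.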
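The QDA discriminant $\delta_k(x) = -\frac{1}{2}(x - \mu_k)^T \Sigma_k^{-1}(x - \mu_k) - \frac{1}{2}\log|\Sigma_k| + \log \pi_k$ differs from LDA only in using a separate covariance matrix per class. A minimal sketch, with hypothetical parameters:

```python
import numpy as np

def qda_discriminant(x, mu_k, Sigma_k, pi_k):
    """delta_k(x) for QDA, with a class-specific covariance matrix."""
    diff = x - mu_k
    # slogdet returns (sign, log|Sigma_k|); Sigma_k is positive definite here.
    _, logdet = np.linalg.slogdet(Sigma_k)
    quad = diff @ np.linalg.solve(Sigma_k, diff)
    return float(-0.5 * quad - 0.5 * logdet + np.log(pi_k))

# Hypothetical two-class setup with opposite correlations (illustration only).
params = [
    (np.array([0.0, 0.0]), np.array([[1.0, 0.5], [0.5, 1.0]]), 0.5),
    (np.array([2.0, 2.0]), np.array([[1.0, -0.5], [-0.5, 1.0]]), 0.5),
]

def classify(x):
    """Return the index of the class with the largest discriminant."""
    scores = [qda_discriminant(x, mu, S, pi) for mu, S, pi in params]
    return int(np.argmax(scores))

print(classify(np.array([0.2, 0.1])), classify(np.array([2.1, 1.8])))
```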

Slide 88

Slide 88 text

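4.4.4 Quadratic Discriminant Analysis
QDA sounds more powerful, so is there still a place for LDA?
With $p$ predictors, LDA has to estimate $p(p + 1)/2$ covariance parameters.
QDA, on the other hand, has as many as $Kp(p + 1)/2$.
In return, QDA is more flexible than LDA.
Each has its advantages and disadvantages.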
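The parameter counts quoted above grow quickly with $p$. A small sketch that evaluates both formulas; the function names and the example values of $p$ and $K$ are only for illustration.

```python
def lda_cov_params(p):
    """Number of covariance parameters for one shared p x p matrix."""
    return p * (p + 1) // 2

def qda_cov_params(p, K):
    """Number of covariance parameters for K class-specific matrices."""
    return K * p * (p + 1) // 2

for p, K in [(2, 2), (10, 3)]:
    print(f"p={p}, K={K}: LDA {lda_cov_params(p)}, QDA {qda_cov_params(p, K)}")
```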

Slide 89

Slide 89 text

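4.4.4 Quadratic Discriminant Analysis
The purple broken line is the decision boundary of the Bayes classifier, the black dotted line is the LDA decision boundary, and the green solid line is the QDA decision boundary.
Left: $\Sigma_1 = \Sigma_2$. Right: $\Sigma_1 \ne \Sigma_2$.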

Slide 90

Slide 90 text

4.4.4 Quadratic Discriminant Analysis
In the left case LDA looks better.
In the right case QDA looks better.

Slide 91

Slide 91 text

4.5 A Comparison of Classification Methods

Slide 92

Slide 92 text

4.5 A Comparison of Classification Methods
This chapter explained logistic regression, LDA, and QDA.
Chapter 2 explained the k-nearest neighbors method (KNN).
We now go through the advantages and disadvantages of each.

Slide 93

Slide 93 text

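4.5 A Comparison of Classification Methods
First, the relationship between logistic regression and LDA.
With one predictor ($p = 1$) and two classes, it follows from the formula for $p_k(x)$ that the log-odds under LDA can be written as $\log\left(\frac{p_1(x)}{1 - p_1(x)}\right) = \log\left(\frac{p_1(x)}{p_2(x)}\right) = c_0 + c_1 x$.
$c_0$ and $c_1$ are functions of $\mu_1$, $\mu_2$, and $\sigma^2$.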

Slide 94

Slide 94 text

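4.5 A Comparison of Classification Methods
The log-odds in logistic regression was $\log\left(\frac{p_1}{1 - p_1}\right) = \beta_0 + \beta_1 x$.
Both are linear functions of $x$.
So logistic regression and LDA both produce a linear decision boundary.
The difference is whether the coefficients come from maximum likelihood or from the means and variance computed under a normal assumption.
The same holds for $p > 1$.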

Slide 95

Slide 95 text

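4.5 A Comparison of Classification Methods
You might think the two methods end up giving the same result, but sometimes the results differ.
LDA assumed that the observations follow a normal distribution with a covariance matrix common to all classes.
So when that assumption holds, LDA does better than logistic regression.
Conversely, when the assumption does not hold, logistic regression does better.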

Slide 96

Slide 96 text

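4.5 A Comparison of Classification Methods
KNN is completely different from the methods explained in this chapter.
To classify an observation $X = x$, it looks at the $K$ observations closest to $x$ and decides the class from them.
In other words, KNN is a non-parametric method.
So when the boundary is far from linear, it does better than LDA or logistic regression.
However, KNN does not tell us which predictors are important.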
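The KNN rule described above fits in a few lines. This is a minimal sketch using Euclidean distance and a majority vote; the training points are made up, and the function name `knn_predict` is hypothetical.

```python
import numpy as np
from collections import Counter

def knn_predict(x0, X_train, y_train, k):
    """Classify x0 by majority vote among its k nearest training points."""
    dist = np.linalg.norm(X_train - x0, axis=1)   # Euclidean distances
    nearest = np.argsort(dist)[:k]                # indices of the k closest
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]

# Made-up training data with two well-separated classes.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(np.array([0.5, 0.5]), X_train, y_train, k=3))
print(knn_predict(np.array([5.5, 5.5]), X_train, y_train, k=3))
```

No distributional assumption appears anywhere in the function, which is what makes the method non-parametric; the only tuning choice is k.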

Slide 97

Slide 97 text

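4.5 A Comparison of Classification Methods
QDA is something like a middle ground between KNN on one side and LDA and logistic regression on the other.
Because its decision boundary is quadratic, QDA is more flexible than the linear methods.
But it is not as flexible as KNN.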

Slide 98

Slide 98 text

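4.5 A Comparison of Classification Methods
The four methods were applied to data generated under six kinds of scenario.
Three have linear boundaries and the remaining three have nonlinear boundaries.
For each scenario, 100 data sets were generated and the error rates were computed.
KNN was tried in two ways: with $K = 1$ and with $K$ chosen by a method called CV (cross-validation).
CV is explained in Chapter 5.
Every scenario has $p = 2$, that is, two predictors.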

Slide 99

Slide 99 text

4.5 A Comparison of Classification Methods
Three of the scenarios have linear boundaries and the other three have nonlinear boundaries.

Slide 100

Slide 100 text

4.5 A Comparison of Classification Methods
Scenario 1
There are two classes with 20 observations each.
The observations in each class are uncorrelated normal random variables with different means.

Slide 101

Slide 101 text

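4.5 A Comparison of Classification Methods
Scenario 1
LDA worked best.
KNN's results were scattered.
QDA is more flexible than necessary, so it did worse than LDA.
Logistic regression does not make the normality assumption, so it was only very slightly worse than LDA.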

Slide 102

Slide 102 text

4.5 A Comparison of Classification Methods
Scenario 2
The setup is almost the same as Scenario 1.
However, the two predictors have a correlation of -0.5.
The relative performance is almost the same as in Scenario 1.

Slide 103

Slide 103 text

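4.5 A Comparison of Classification Methods
Scenario 3
$X_1$ and $X_2$ follow a t-distribution.
There are 50 observations in each class.
The t-distribution has almost the same shape as the normal distribution, but values far from the mean tend to be more frequent.
The boundary is a straight line, and logistic regression was well suited to it.
LDA assumes a normal distribution, which does not hold here.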

Slide 104

Slide 104 text

4.5 A Comparison of Classification Methods
Scenario 3
Both LDA and logistic regression did better than the other methods.
QDA got considerably worse because the data were not normally distributed.

Slide 105

Slide 105 text

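4.5 A Comparison of Classification Methods
Scenario 4
The correlation between the predictors is 0.5 in class 1 and -0.5 in class 2.
This corresponds to the QDA assumption.
The boundary is a quadratic curve.
QDA can indeed be seen to do better than the other methods.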

Slide 106

Slide 106 text

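4.5 A Comparison of Classification Methods
Scenario 5
The observations in each class are normal random variables with no correlation between the predictors.
The response was sampled from the logistic function using $X_1^2$, $X_2^2$, and $X_1 \times X_2$ as predictors.
As a result, the boundary is quadratic.
QDA was best, followed by KNN-CV.
The linear methods performed poorly.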

Slide 107

Slide 107 text

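4.5 A Comparison of Classification Methods
Scenario 6
The details are the same as in Scenario 5, but the response was sampled from a more complicated nonlinear function.
As a result, even QDA could not model the data adequately.
QDA was somewhat better than the linear methods, but the more flexible KNN-CV was best.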

Slide 108

Slide 108 text

4.5 A Comparison of Classification Methods
Scenario 6
KNN-1 was the worst.
This shows that even when the data follow a complicated nonlinear function, the results are poor if the value of $K$ is not chosen correctly.

Slide 109

Slide 109 text

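4.5 A Comparison of Classification Methods
The six examples above show that no method is best in every situation.
Each has its strengths.
When the decision boundary is linear, LDA and logistic regression are good.
When it is moderately nonlinear, QDA is good.
When it is nonlinear in a complicated way, KNN is good.
However, the value of $K$ should be chosen carefully.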