Slide 1

Slide 1 text

0 2 . .. 2 2 1 )152"1&. - 604'.,3%+ - 20193 16 $/-3 ($!4#2*4

Slide 2

Slide 2 text

0.
1. Decision Tree
   i. How trees are grown
   ii. Impurity
   iii. How trees are pruned
2. Random Forest
   i. Bootstrap Sampling
   ii. AdaBoost and Random Forest
3. Gradient Boosting

Slide 3

Slide 3 text

[Slide text not recoverable from extraction]

Slide 4

Slide 4 text

[Slide text garbled in extraction; the surviving terms are "Titanic" and "GitHub"]

Slide 5

Slide 5 text

1. Decision Tree

Slide 6

Slide 6 text

[Slide text not recoverable from extraction]

Slide 7

Slide 7 text

[Figure: example decision tree with the branch conditions x1 > 5 / x1 <= 5, x2 <= 0.6 / x2 > 0.6, and x3 = 0 / x3 = 1]

Slide 8

Slide 8 text

• Node
• Root node
• Terminal node (also: leaf node)
• Branch, sub tree

Slide 9

Slide 9 text


Slide 10

Slide 10 text

[Slide text not recoverable from extraction]

Slide 11

Slide 11 text


Slide 12

Slide 12 text

• CART (C&RT, Classification And Regression Tree)
• ID3 (Iterative Dichotomiser 3)
• C4.5
• CHAID (Chi-squared Automatic Interaction Detection)

Slide 13

Slide 13 text

CART

Slide 14

Slide 14 text

C4.5, CHAID

Slide 15

Slide 15 text

• Joint Probability: the probability that X and Y both occur, written $P(X, Y)$.
• Marginal probability: the probability of X alone, obtained by summing the joint probability over every value of Y: $P(X) = \sum_{i=1}^{n} P(X, Y_i)$.
• Conditional probability: the probability of X given that Y has occurred: $P(X \mid Y) = \frac{P(X, Y)}{P(Y)}$.
(Reference: to-kei.net)

Slide 16

Slide 16 text

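• Number of data points: $N$
• Node: $t$
• Class: $C_k$, $k \in \{1, 2, \dots, K\}$
• Number of data points in node $t$: $N(t)$
• Number of data points in class $C_k$: $N_k$
• Number of data points in node $t$ that belong to class $C_k$: $N_k(t)$
→ The probability that a data point in node $t$ belongs to class $C_k$ is $P(C_k \mid t) = \frac{N_k(t)}{N(t)}$, and node $t$ is assigned the class $\arg\max_k P(C_k \mid t)$.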

Slide 17

Slide 17 text

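Derivation of $P(C_k \mid t) = \frac{N_k(t)}{N(t)}$
• Prior probability of class $C_k$: $P(C_k) = \frac{N_k}{N}$
• Probability that a data point of class $C_k$ falls in node $t$: $P(t \mid C_k) = \frac{N_k(t)}{N_k}$
• Joint probability: $P(C_k, t) = P(C_k)\,P(t \mid C_k) = \frac{N_k}{N} \cdot \frac{N_k(t)}{N_k} = \frac{N_k(t)}{N}$
• Marginal probability of node $t$: $P(t) = \sum_{k=1}^{K} P(C_k, t) = \sum_{k=1}^{K} \frac{N_k(t)}{N} = \frac{N(t)}{N}$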

Slide 18

Slide 18 text

Derivation of $P(C_k \mid t) = \frac{N_k(t)}{N(t)}$ (continued)
• From the joint probability $P(C_k, t) = \frac{N_k(t)}{N}$ and the marginal probability $P(t) = \frac{N(t)}{N}$:
$P(C_k \mid t) = \frac{P(C_k, t)}{P(t)} = \frac{N_k(t) / N}{N(t) / N} = \frac{N_k(t)}{N(t)}$
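The derivation can be checked numerically. The sketch below is only an illustration under stated assumptions: the toy data set `DATA`, the class labels "a" and "b", and the helper names are all hypothetical and do not come from the slides. It computes the posterior once through the prior, joint, and marginal probabilities and once directly from the counts, and the two results agree.

```python
from fractions import Fraction

# Hypothetical toy data: one (class label, node id) pair per data point, N = 8.
DATA = [("a", 0), ("a", 0), ("b", 0), ("a", 0),
        ("a", 1), ("b", 1), ("b", 1), ("b", 1)]


def count(data, k=None, t=None):
    """Count data points, optionally restricted to class k and/or node t."""
    return sum(1 for c, node in data
               if (k is None or c == k) and (t is None or node == t))


def posterior_from_derivation(data, k, t):
    """P(C_k | t) computed as P(C_k, t) / P(t), step by step."""
    n = count(data)
    prior = Fraction(count(data, k=k), n)                            # P(C_k) = N_k / N
    likelihood = Fraction(count(data, k=k, t=t), count(data, k=k))   # P(t | C_k) = N_k(t) / N_k
    joint = prior * likelihood                                       # P(C_k, t) = N_k(t) / N
    classes = sorted({c for c, _ in data})
    marginal = sum(Fraction(count(data, k=j, t=t), n) for j in classes)  # P(t) = N(t) / N
    return joint / marginal


def posterior_from_counts(data, k, t):
    """P(C_k | t) computed directly as N_k(t) / N(t)."""
    return Fraction(count(data, k=k, t=t), count(data, t=t))


print(posterior_from_derivation(DATA, "a", 0))
print(posterior_from_counts(DATA, "a", 0))
```

Exact fractions are used instead of floats so that the two routes can be compared with `==` without rounding concerns.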

Slide 19

Slide 19 text

Impurity
sklearn.tree.DecisionTreeClassifier offers two impurity measures: entropy ("entropy") and the Gini coefficient ("gini").
• Entropy: $I(t) = -\sum_{k=1}^{K} P(C_k \mid t) \ln P(C_k \mid t)$
• Gini: $I(t) = \sum_{k=1}^{K} P(C_k \mid t)\,(1 - P(C_k \mid t))$
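As a small worked example of the two impurity formulas, the sketch below computes both from the per-class counts of a single node. The function names and the example counts are assumptions made for illustration; the natural logarithm is used to match the formula on the slide.

```python
import math


def class_probabilities(counts):
    """Turn per-class counts N_k(t) in a node into P(C_k | t) = N_k(t) / N(t)."""
    total = sum(counts)
    return [c / total for c in counts]


def entropy(counts):
    """Entropy impurity; classes with zero probability contribute nothing."""
    return -sum(p * math.log(p) for p in class_probabilities(counts) if p > 0)


def gini(counts):
    """Gini impurity: sum over classes of p * (1 - p)."""
    return sum(p * (1 - p) for p in class_probabilities(counts))


# A perfectly mixed two-class node versus a pure node.
print(entropy([5, 5]), gini([5, 5]))
print(entropy([10, 0]), gini([10, 0]))
```

Both measures are largest when the classes are evenly mixed and drop to zero for a pure node, which is why a split that lowers them is preferred.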

Slide 20

Slide 20 text

[Flow diagram; text not recoverable from extraction]

Slide 21

Slide 21 text

[Slide text not recoverable from extraction]

Slide 22

Slide 22 text

Parameters of sklearn.tree.DecisionTreeClassifier that limit how far a tree grows:
• max_depth (default=None) ... maximum depth of the tree
• min_samples_split (default=2) ... minimum number of samples a node needs in order to be split
• min_samples_leaf (default=1) ... minimum number of samples required in a leaf node
• max_leaf_nodes (default=None) ... maximum number of leaf nodes
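A minimal sketch of how these four parameters might be set together. The iris data set and the specific values are assumptions chosen only for illustration; they are not taken from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Illustrative data set (an assumption; any labeled tabular data would do).
X, y = load_iris(return_X_y=True)

clf = DecisionTreeClassifier(
    max_depth=3,          # never grow deeper than 3 levels
    min_samples_split=4,  # a node needs at least 4 samples to be split
    min_samples_leaf=2,   # every leaf keeps at least 2 samples
    max_leaf_nodes=6,     # stop once the tree has 6 leaves
    random_state=0,       # fixed seed so the result is reproducible
)
clf.fit(X, y)

print("depth:", clf.get_depth())
print("leaves:", clf.get_n_leaves())
```

Tightening any of these limits yields a smaller tree, trading some training accuracy for less overfitting.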

Slide 23

Slide 23 text

[Slide text garbled in extraction; the surviving terms are Graphviz and dtreeplt, two tools for visualizing decision trees]

Slide 24

Slide 24 text

• T. Hastie, R. Tibshirani and J. Friedman. “Elements of Statistical Learning”, Springer, 2009.
• A Complete Tutorial on Tree Based Modeling from Scratch (in R & Python) – Analytics Vidhya
• [Python] Graphviz / dtreeplt article – Qiita

Slide 25

Slide 25 text

2. Random Forest

Slide 26

Slide 26 text

[Slide text garbled in extraction; the surviving term is "Random Forest"]

Slide 27

Slide 27 text

Bootstrap Sampling
• Bagging and Random Forest train their models on Bootstrap Samples.
• Bootstrap Sampling is resampling with replacement.

Slide 28

Slide 28 text

[Figure: Bootstrap Sampling; the original data are resampled repeatedly to produce Bootstrap Samples 1, 2, ..., n]
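Resampling with replacement can be sketched in a few lines of NumPy. The helper name `bootstrap_sample`, the toy array, and the seed below are assumptions for illustration only.

```python
import numpy as np


def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly at random, with replacement."""
    n = len(data)
    idx = rng.integers(0, n, size=n)  # indices may repeat
    return data[idx], idx


rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
data = np.arange(10)            # hypothetical data set of 10 items

samples = [bootstrap_sample(data, rng)[0] for _ in range(3)]
for i, sample in enumerate(samples, start=1):
    print(f"Bootstrap Sample {i}:", sample)
```

Each sample has the same size as the original data, but because indices can repeat, some items usually appear several times and others not at all.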

Slide 29

Slide 29 text

Bagging
• Bootstrap and AGGregatING
• Train one model on each Bootstrap Sample → aggregate the models' predictions

Slide 30

Slide 30 text

[Figure: Bagging; the data are resampled several times and the results from the resampled sets are combined]
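A hand-rolled sketch of Bagging with decision trees, assuming majority voting as the aggregation step. The function names, the iris data set, and the number of models are hypothetical choices for illustration; scikit-learn also ships a ready-made `BaggingClassifier`, which is not used here so that each step stays visible.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def fit_bagging(X, y, n_models, rng):
    """Train one decision tree per Bootstrap Sample."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # resampling with replacement
        tree = DecisionTreeClassifier(random_state=0)
        models.append(tree.fit(X[idx], y[idx]))
    return models


def predict_bagging(models, X):
    """Aggregate the trees' predictions by majority vote."""
    votes = np.stack([m.predict(X) for m in models])  # shape: (n_models, n_samples)
    # Labels are assumed to be non-negative integers, as np.bincount requires.
    return np.array([np.bincount(column).argmax() for column in votes.T])


X, y = load_iris(return_X_y=True)
models = fit_bagging(X, y, n_models=25, rng=np.random.default_rng(0))
preds = predict_bagging(models, X)
print("training accuracy:", (preds == y).mean())
```

The accuracy printed here is measured on the training data, so it says little about generalization; it only confirms that the voting step works.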

Slide 31

Slide 31 text

1 00 . 21 0 1 1 00 . Random Forest • &!A9C.I/5K=6D-?$ /E@,$ ←RMUNBBootstrap SampleBCH<06.5K@, 6=,J(/ 7>+J9E B")/*1@IF8, • Random Forest %A,J(L+H.7EEHK9:2TUPSA'8J3?>") B, @L>0JG-A69 ←Bootstrap SamplingAG;=5K9#QVO4?A,J(/ @J

Slide 32

Slide 32 text

[Figure: Random Forest; each resampled data set is paired with a subset of the features X1, ..., X5, for example X1, X3, X5 or X1, X2, X3, X4]

Slide 33

Slide 33 text

Out-Of-Bag
• Random Forest can be evaluated on its Out-Of-Bag data, the data points that a given Bootstrap Sample leaves out.
• Reference on Out-Of-Bag: http://alfredplpl.hatenablog.com/entry/2013/12/24/225420

Slide 34

Slide 34 text

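sklearn's RandomForestClassifier: main parameters in addition to those of DecisionTreeClassifier
• n_estimators ... number of trees (default 10; scikit-learn 0.22 and later default to 100)
• bootstrap ... whether to use Bootstrap Sampling (default True)
• oob_score ... whether to score the model on Out-Of-Bag samples (default False)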
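A minimal sketch of these three parameters in use. The iris data set, the number of trees, and the seed are assumptions for illustration and do not come from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Illustrative data set (an assumption; any labeled tabular data would do).
X, y = load_iris(return_X_y=True)

forest = RandomForestClassifier(
    n_estimators=100,  # number of trees in the forest
    bootstrap=True,    # train each tree on a Bootstrap Sample
    oob_score=True,    # score the forest on Out-Of-Bag samples after fitting
    random_state=0,    # fixed seed so the result is reproducible
)
forest.fit(X, y)

print("trees:", len(forest.estimators_))
print("Out-Of-Bag score:", forest.oob_score_)
```

Setting `oob_score=True` requires `bootstrap=True`, because Out-Of-Bag samples only exist when each tree is trained on a resampled data set.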

Slide 35

Slide 35 text

• David S. Moore, et al, Bootstrap Method and Permutation Tests, “The Practice of Business Statistics: Using Data for Decisions”, ch.18, W. H. Freeman
• L. Breiman, and A. Cutler, “Random Forests”
• Bagging and Random Forest Ensemble Algorithms for Machine Learning – Machine Learning Mastery