Slide 1

Slide 1 text

Tree methods dataitgirls3 Instructor Sunmi Yoon

Slide 2

Slide 2 text

Decision Tree

Slide 3

Slide 3 text

Sex <= 0.5 | gini = 0.473 | samples = 891 | value = [549, 342] | class = Survived
  True:  Fare <= 26.269 | gini = 0.306 | samples = 577 | value = [468, 109] | class = Survived
    True:  gini = 0.226 | samples = 415 | value = [361, 54] | class = Survived
    False: gini = 0.448 | samples = 162 | value = [107, 55] | class = Survived
  False: Fare <= 48.2 | gini = 0.383 | samples = 314 | value = [81, 233] | class = Dead
    True:  gini = 0.447 | samples = 225 | value = [76, 149] | class = Dead
    False: gini = 0.106 | samples = 89 | value = [5, 84] | class = Dead

Slide 4

Slide 4 text

(Same decision tree figure as Slide 3, with the node types labeled.)
Root Node (the root)
Intermediate Node (a branch)
Terminal Node, Leaf (a leaf)

Slide 5

Slide 5 text

(Same decision tree figure as Slide 3, with "samples" and "value" highlighted.)
samples: how many data points sit at the current node.
value: which labels those data points carry.

Slide 6

Slide 6 text

(Same decision tree figure as Slide 3, with the impurity value highlighted.)
Which criterion was used to make the split (gini or entropy).
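To make the two criteria concrete, here is a small sketch that evaluates both on the class counts of the root node, value = [549, 342]. The function names are illustrative; the formulas are the standard Gini impurity and base-2 Shannon entropy.

```python
import math


def gini(counts):
    """Gini impurity of one node: 1 - sum of squared class proportions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def entropy(counts):
    """Shannon entropy (base 2) of one node; empty classes contribute nothing."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


root = [549, 342]  # "value" of the root node in the Slide 3 tree
print("gini   :", round(gini(root), 3))  # matches the 0.473 shown in the figure
print("entropy:", round(entropy(root), 3))
```

Both measures are 0 for a pure node and largest when the classes are evenly mixed, so either can serve as the splitting criterion.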

Slide 7

Slide 7 text

(Same decision tree figure as Slide 3, with "class" highlighted.)
How the data points that arrive at a Terminal Node will be classified.
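As a rough illustration of how a leaf assigns a class, the sketch below hand-copies the Slide 3 tree into nested dictionaries and returns the majority class of the leaf a row lands in. The dictionary layout and the `predict` helper are hypothetical; the class names are taken as printed in the figure (index 0 is "Survived", index 1 is "Dead"), and the True/False placement of the leaves assumes scikit-learn's convention that the left child is the True branch.

```python
# Class names in the order used by "value" in the figure (an assumption read off the slide).
CLASS_NAMES = ["Survived", "Dead"]

# The Slide 3 tree; leaves keep only their class counts ("value").
TREE = {
    "feature": "Sex", "threshold": 0.5,
    "true": {
        "feature": "Fare", "threshold": 26.269,
        "true": {"value": [361, 54]},
        "false": {"value": [107, 55]},
    },
    "false": {
        "feature": "Fare", "threshold": 48.2,
        "true": {"value": [76, 149]},
        "false": {"value": [5, 84]},
    },
}


def predict(node, row):
    """Walk from the root to a leaf, then return the leaf's majority class."""
    while "value" not in node:
        branch = "true" if row[node["feature"]] <= node["threshold"] else "false"
        node = node[branch]
    counts = node["value"]
    return CLASS_NAMES[counts.index(max(counts))]


print(predict(TREE, {"Sex": 0, "Fare": 10.0}))  # lands in the leaf with value [361, 54]
print(predict(TREE, {"Sex": 1, "Fare": 60.0}))  # lands in the leaf with value [5, 84]
```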

Slide 8

Slide 8 text

sklearn Code
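The code shown on this slide is not part of the transcript. A minimal sketch of fitting a decision tree with scikit-learn might look like the following. The eight toy rows, the 0/1 encoding of Sex, and `max_depth=2` are illustrative assumptions, not the instructor's data or settings.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy rows [Sex, Fare]; Sex is encoded as 0/1 (assumed encoding).
X = [[0, 7.25], [0, 8.05], [0, 13.0], [0, 52.0],
     [1, 7.9], [1, 26.0], [1, 71.3], [1, 80.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1]  # made-up labels, for illustration only

# criterion can be "gini" or "entropy"; max_depth limits how deep the tree grows.
clf = DecisionTreeClassifier(criterion="gini", max_depth=2, random_state=0)
clf.fit(X, y)

# Text rendering of the learned splits, similar in spirit to the Slide 3 figure.
print(export_text(clf, feature_names=["Sex", "Fare"]))
print(clf.predict([[1, 60.0]]))
```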

Slide 9

Slide 9 text

Impurity

Slide 10

Slide 10 text

Impurity

A decision tree learns by choosing splits that lower impurity (uncertainty). The resulting increase in purity is also called information gain. Today we study the Gini index, one of the impurity measures used for decision trees.
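A small sketch of "lowering impurity", using the first split of the Slide 3 tree: the impurity of the parent node is compared with the sample-weighted impurity of its two children. The helper names are hypothetical, and Gini is used as the impurity measure here; measuring the gain with entropy instead would work the same way.

```python
def gini(counts):
    """Gini impurity of one node from its class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def impurity_decrease(parent, children):
    """Parent impurity minus the sample-weighted impurity of the children."""
    n = sum(parent)
    weighted = sum(sum(child) / n * gini(child) for child in children)
    return gini(parent) - weighted


parent = [549, 342]                  # root node
children = [[468, 109], [81, 233]]   # the two nodes after splitting on Sex <= 0.5
print(round(impurity_decrease(parent, children), 3))
```

A positive result means the children are, on average, purer than the parent; the tree picks the split for which this decrease is largest.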

Slide 11

Slide 11 text

(Same decision tree figure as Slide 3.)

G = \sum_{i=1}^{d} R_i \left( 1 - \sum_{k=1}^{m} p_{ik}^2 \right)

Step 1. Compute gini = 0.473 by hand.
Step 2. Compute gini = 0.226 by hand.
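The two steps can be checked with a few lines of Python. This sketch assumes that R_i is the share of samples falling in region (node) i, that p_ik is the proportion of class k within that region, and that a single node corresponds to d = 1 with R_1 = 1; the slide does not define the symbols, so treat that reading as an assumption.

```python
def node_gini(counts):
    """1 - sum_k p_k^2 for one node, from its class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def split_gini(regions):
    """G = sum_i R_i * (1 - sum_k p_ik^2), with R_i = share of samples in region i."""
    n = sum(sum(r) for r in regions)
    return sum(sum(r) / n * node_gini(r) for r in regions)


# Step 1: root node, value = [549, 342]
print(round(node_gini([549, 342]), 3))  # 0.473

# Step 2: leftmost leaf, value = [361, 54]
print(round(node_gini([361, 54]), 3))   # 0.226

# Full formula applied to the two regions created by the split on Sex <= 0.5
print(round(split_gini([[468, 109], [81, 233]]), 3))
```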

Slide 12

Slide 12 text

https://imgur.com/n3MVwHW

Slide 13

Slide 13 text

Random Forest

Slide 14

Slide 14 text

Build many trees, each one "differently".
https://www.researchgate.net/figure/Architecture-of-the-random-forest-model_fig1_301638643

Slide 15

Slide 15 text

https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Seeing-the-Forest-for-the-Trees-An-Introduction-to-Random-Forest/ta-p/158062
bagging = bootstrap aggregating

Slide 16

Slide 16 text

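Bagging

Bagging is short for bootstrap aggregating: a method that combines (aggregates) base learners, each trained on slightly different training data obtained through bootstrapping. Bootstrapping means building a dataset of the same size as the original by sampling from the given training data with replacement. Training a random forest with bagging proceeds in three steps:

1. Generate N training datasets with the bootstrap method.
2. Train N base classifiers (trees).
3. Combine the base classifiers (trees) into a single classifier (the random forest), using averaging or majority voting.

Source: Korean Wikipedia, Random forest article, section on building a forest with bagging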
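The three steps can be sketched in plain Python. To keep the example short, the base learner here is a hypothetical one-threshold "stump" on a single feature rather than a full decision tree, and the eight data points are made up; only the bootstrap, train, and vote structure is the point.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (same size as the original)."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Hypothetical base learner: the single threshold on x with the fewest training errors."""
    best = None
    for x, _ in rows:
        left = [label for xi, label in rows if xi <= x]
        right = [label for xi, label in rows if xi > x]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(l != left_label for l in left) + sum(l != right_label for l in right)
        if best is None or errors < best[0]:
            best = (errors, x, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label


def fit_bagging(rows, n_estimators, seed=0):
    rng = random.Random(seed)
    samples = [bootstrap_sample(rows, rng) for _ in range(n_estimators)]  # step 1
    return [fit_stump(sample) for sample in samples]                      # step 2


def predict_bagging(models, x):
    votes = Counter(model(x) for model in models)                         # step 3
    return votes.most_common(1)[0][0]


# Made-up (x, label) pairs for illustration only.
data = [(5.0, 0), (7.0, 0), (8.0, 0), (12.0, 0),
        (30.0, 1), (45.0, 1), (60.0, 1), (80.0, 1)]
forest = fit_bagging(data, n_estimators=25)
print(predict_bagging(forest, 6.0), predict_bagging(forest, 70.0))
```

A random forest additionally restricts each split to a random subset of features, which this single-feature sketch does not show.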

Slide 17

Slide 17 text

sklearn Code
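The code shown on this slide is not part of the transcript. A minimal sketch of a random forest in scikit-learn could look like this, again on hypothetical toy rows; `n_estimators=50` and the 0/1 encoding of Sex are illustrative assumptions, not the instructor's settings.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy rows [Sex, Fare]; labels are made up for illustration.
X = [[0, 7.25], [0, 8.05], [0, 13.0], [0, 52.0],
     [1, 7.9], [1, 26.0], [1, 71.3], [1, 80.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

# bootstrap=True trains each tree on a resampled copy of the data (bagging).
rf = RandomForestClassifier(n_estimators=50, bootstrap=True, random_state=0)
rf.fit(X, y)

print(len(rf.estimators_))        # number of fitted trees in the forest
print(rf.predict([[1, 60.0]]))    # majority vote across the trees
print(rf.feature_importances_)    # relative contribution of Sex and Fare
```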

Slide 18

Slide 18 text

End