Tree Methods

Sunmi Yoon
November 04, 2019

Class material created to teach Decision Tree and Random Forest to the dataitgirls3 students.

Transcript

  1. [Figure: decision tree]

    Sex <= 0.5: gini = 0.473, samples = 891, value = [549, 342], class = Survived
      True: Fare <= 26.269: gini = 0.306, samples = 577, value = [468, 109], class = Survived
        gini = 0.226, samples = 415, value = [361, 54], class = Survived
        gini = 0.448, samples = 162, value = [107, 55], class = Survived
      False: Fare <= 48.2: gini = 0.383, samples = 314, value = [81, 233], class = Dead
        gini = 0.447, samples = 225, value = [76, 149], class = Dead
        gini = 0.106, samples = 89, value = [5, 84], class = Dead
  2. [Figure: same decision tree as slide 1, with the node types labeled]

    Root Node (root): the top node, Sex <= 0.5. Intermediate Node (branch): the two Fare nodes. Terminal Node, or Leaf (leaf): the four nodes with no further split.
  3. [Figure: same decision tree as slide 1]

    How many data points sit at the current position (samples), and which labels those data points have (value).
  4. [Figure: same decision tree as slide 1]

    Which criterion was used to make the split (gini or entropy).
  5. [Figure: same decision tree as slide 1]

    How the data points that arrive at a Terminal Node will be classified.
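To make the terminal-node rule concrete, here is a minimal sketch that walks the tree from the figure and labels a data point with the majority class of the leaf it reaches. The function names are hypothetical, the class names are kept in the order the figure prints them, and it assumes the left child is taken when a node's condition is true (as the True/False edge labels at the root suggest).

```python
# Class names in the order printed in the figure (index 0, index 1).
CLASS_NAMES = ["Survived", "Dead"]


def majority_class(value):
    """Return the class name with the largest count among a leaf's samples."""
    return CLASS_NAMES[value.index(max(value))]


def leaf_value(sex, fare):
    """Walk the depth-2 tree from the figure and return the leaf's class counts.

    Assumption: the left (first listed) child is the branch where the
    condition holds.
    """
    if sex <= 0.5:
        return [361, 54] if fare <= 26.269 else [107, 55]
    return [76, 149] if fare <= 48.2 else [5, 84]


print(majority_class(leaf_value(sex=0, fare=10.0)))   # Survived
print(majority_class(leaf_value(sex=1, fare=100.0)))  # Dead
```

Every leaf in the figure carries the label of its largest count, which is what `majority_class` reproduces.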
  6. Impurity

    A decision tree learns by lowering impurity (uncertainty). The resulting increase in purity is also called information gain. Today we study the Gini Index, one of the ways a decision tree measures impurity.
  7. [Figure: same decision tree as slide 1]

    $G = \sum_{i=1}^{d} R_i \left( 1 - \sum_{k=1}^{m} p_{ik}^2 \right)$

    Step 1. Compute gini = 0.473 by hand.
    Step 2. Compute gini = 0.226 by hand.
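The two exercises can be checked with a few lines of Python. This is a sketch under one assumption about the symbols, which the slide does not define: d is the number of regions produced by a split, R_i is the share of samples that fall in region i, m is the number of classes, and p_ik is the proportion of class k inside region i. For a single node (d = 1, R_1 = 1) the formula reduces to one minus the sum of squared class proportions. The function names are illustrative.

```python
def gini(counts):
    """Gini impurity of one node: 1 - sum_k p_k^2, from per-class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def split_gini(regions):
    """Weighted Gini of a split: sum_i R_i * gini(region i).

    R_i is the fraction of all samples that land in region i.
    """
    total = sum(sum(counts) for counts in regions)
    return sum(sum(counts) / total * gini(counts) for counts in regions)


# Step 1: the root node, value = [549, 342]
print(round(gini([549, 342]), 3))  # 0.473

# Step 2: the first leaf, value = [361, 54]
print(round(gini([361, 54]), 3))   # 0.226

# Weighted impurity after the root split on Sex <= 0.5
print(round(split_gini([[468, 109], [81, 233]]), 3))  # 0.333
```

The last line shows the impurity dropping from 0.473 at the root to about 0.333 after the first split, which is the decrease the tree is trying to achieve.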
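  8. Bagging

    Bagging is short for bootstrap aggregating: a method that combines (aggregates) base learners, each trained on slightly different training data obtained through bootstrapping. Bootstrapping is the process of sampling from the given training data with replacement to build a dataset of the same size as the original. Training a random forest with bagging proceeds in three steps:

    1. Generate N training datasets with the bootstrap method.
    2. Train N base learners (trees).
    3. Combine the base learners (trees) into a single classifier (the random forest), using averaging or majority voting.

    Source: Wikipedia, 랜덤 포레스트 (Random forest) > 배깅을 이용한 포레스트 구성 (Forest construction using bagging)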
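The three bagging steps can be sketched with the standard library alone. This is an illustration, not a random forest implementation: the base learner is a hypothetical one-feature threshold stump standing in for a full tree, the toy dataset is made up for the example, and all function names are assumptions.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Step 1: draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def train_stump(sample):
    """Step 2 (stand-in base learner): pick the threshold on a 1-D feature
    that misclassifies the fewest points of the sample."""
    best = None
    for threshold in sorted({x for x, _ in sample}):
        left = [y for x, y in sample if x <= threshold]
        right = [y for x, y in sample if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label


def bagging_fit(data, n_estimators, seed=0):
    """Steps 1 and 2: one bootstrap dataset and one base learner per estimator."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(n_estimators)]


def bagging_predict(learners, x):
    """Step 3: combine the base learners by majority vote."""
    votes = Counter(learner(x) for learner in learners)
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): label is 1 when the feature is >= 5.
data = [(x, int(x >= 5)) for x in range(10)]
learners = bagging_fit(data, n_estimators=25)
print(bagging_predict(learners, 1), bagging_predict(learners, 8))
```

Because each stump sees a different bootstrap dataset, individual thresholds vary, and the vote smooths out those differences. A random forest additionally restricts the features each tree may consider at a split, which this sketch leaves out.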
  9. End