Tree Methods
Sunmi Yoon
November 04, 2019
Technology
Class material created to teach Decision Tree and Random Forest to the dataitgirls3 students.
Transcript
Tree methods dataitgirls3 Instructor Sunmi Yoon
Decision Tree
[Figure: decision tree of depth 2]
Sex <= 0.5 | gini = 0.473 | samples = 891 | value = [549, 342] | class = Survived
  True:  Fare <= 26.269 | gini = 0.306 | samples = 577 | value = [468, 109] | class = Survived
    gini = 0.226 | samples = 415 | value = [361, 54] | class = Survived
    gini = 0.448 | samples = 162 | value = [107, 55] | class = Survived
  False: Fare <= 48.2 | gini = 0.383 | samples = 314 | value = [81, 233] | class = Dead
    gini = 0.447 | samples = 225 | value = [76, 149] | class = Dead
    gini = 0.106 | samples = 89 | value = [5, 84] | class = Dead
[Same tree, with the node types labeled]
Root Node (root): the first split, Sex <= 0.5
Intermediate Node (branch): the two Fare splits
Terminal Node, Leaf (leaf): the four nodes with no further split
[Same tree, highlighting samples and value]
How many data points sit at the current node (samples), and which labels those data points carry (value).
[Same tree, highlighting the gini field]
Which criterion was used to split the branches (gini or entropy).
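To make the two criteria concrete, here is a minimal sketch that computes both impurity measures from a node's class counts. The function names `gini` and `entropy` are illustrative and not from the slides; the counts are the root node's `value = [549, 342]`.

```python
from math import log2

def gini(counts):
    """Gini impurity of one node, given its per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def entropy(counts):
    """Shannon entropy (base 2) of one node; empty classes contribute 0."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

root_counts = [549, 342]  # value of the root node in the figure
print("gini   :", round(gini(root_counts), 3))
print("entropy:", round(entropy(root_counts), 3))
```

Both measures are 0 for a pure node and largest when the classes are evenly mixed. In scikit-learn the choice is made through the `criterion` parameter of `DecisionTreeClassifier`.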
[Same tree, highlighting the class field]
How the data points that reach a Terminal Node will be classified.
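As a sketch of how a data point is classified once it reaches a terminal node, the tree from the figure can be written by hand as nested dictionaries and walked from the root. The dictionary layout, the `predict` helper, and the assignment of leaves to branches (read off the order in which the figure lists them) are assumptions made for illustration. The prediction is the index of the majority class in the leaf's `value`; the figure prints index 0 as "Survived" and index 1 as "Dead".

```python
# Hand-written copy of the depth-2 tree in the figure (layout is hypothetical).
TREE = {
    "feature": "Sex", "threshold": 0.5,
    "true": {
        "feature": "Fare", "threshold": 26.269,
        "true": {"value": [361, 54]},
        "false": {"value": [107, 55]},
    },
    "false": {
        "feature": "Fare", "threshold": 48.2,
        "true": {"value": [76, 149]},
        "false": {"value": [5, 84]},
    },
}

def predict(node, row):
    """Follow the splits until a leaf, then return the majority class index."""
    while "value" not in node:
        branch = "true" if row[node["feature"]] <= node["threshold"] else "false"
        node = node[branch]
    counts = node["value"]
    return counts.index(max(counts))

print(predict(TREE, {"Sex": 0, "Fare": 10.0}))  # lands in the [361, 54] leaf
print(predict(TREE, {"Sex": 1, "Fare": 80.0}))  # lands in the [5, 84] leaf
```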
sklearn Code
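The code shown on this slide is not captured in the transcript. The following is a minimal sketch of how a depth-2 tree like the one above could be fit with scikit-learn. It is not the instructor's code: the data is synthetic, and the labeling rule, sample size, and variable names are assumptions made only so the example runs on its own.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the class dataset (hypothetical values).
rng = np.random.default_rng(0)
n = 200
sex = rng.integers(0, 2, size=n)           # encoded as 0 / 1
fare = rng.uniform(5, 100, size=n)
y = ((sex == 1) | (fare > 60)).astype(int)  # made-up labeling rule
X = np.column_stack([sex, fare])

# criterion can be "gini" or "entropy"; max_depth=2 matches the figure's depth.
clf = DecisionTreeClassifier(criterion="gini", max_depth=2, random_state=0)
clf.fit(X, y)

# Text rendering of the fitted splits.
report = export_text(clf, feature_names=["Sex", "Fare"])
print(report)
```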
Impurity
A decision tree learns by lowering impurity (uncertainty). The increase in purity that results is also called information gain. Today we study the Gini Index, one of the ways a decision tree measures impurity.
[Same tree, shown next to the Gini Index formula]

$$G = \sum_{i=1}^{d} R_i \left( 1 - \sum_{k=1}^{m} p_{ik}^{2} \right)$$

Step 1. Try computing gini = 0.473 yourself.
Step 2. Try computing gini = 0.226 yourself.
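The two steps can be checked with a short sketch. It reads the formula with R_i as the share of samples in region (node) i and p_ik as the share of class k within that region; the slide does not define the symbols, so this reading is an assumption. For a single node the outer sum has one term with R = 1. The helper names are illustrative.

```python
def gini(counts):
    """1 - sum_k p_k^2 for one node, given its per-class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def weighted_gini(regions):
    """G = sum_i R_i * gini(region i), with R_i the share of samples in region i."""
    n = sum(sum(counts) for counts in regions)
    return sum(sum(counts) / n * gini(counts) for counts in regions)

step1 = gini([549, 342])   # root node: value = [549, 342]
step2 = gini([361, 54])    # leaf with samples = 415
after_sex_split = weighted_gini([[468, 109], [81, 233]])

print(round(step1, 3))             # 0.473
print(round(step2, 3))             # 0.226
print(round(after_sex_split, 3))   # 0.333
```

The last value is the impurity after the Sex split, weighted over both child nodes. It is lower than the root's 0.473, which is the reduction in impurity the tree is looking for.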
https://imgur.com/n3MVwHW
Random Forest
Build many trees, each one 'differently'.
https://www.researchgate.net/figure/Architecture-of-the-random-forest-model_fig1_301638643
https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Seeing-the-Forest-for-the-Trees-An-Introduction-to-Random-Forest/ta-p/158062
bagging = bootstrap aggregating
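Bagging
Bagging is short for bootstrap aggregating: a method that combines (aggregating) base learners, each trained on slightly different training data obtained through the bootstrap. The bootstrap is the process of sampling from the given training data with replacement to build a dataset of the same size as the original.
Training a random forest with bagging proceeds in the following three steps.
1. Generate N training datasets with the bootstrap method.
2. Train N base learners (trees).
3. Combine the base learners (trees) into a single classifier (the random forest), using averaging or majority voting.
Wikipedia (Korean), Random Forest > Building a forest with bagging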
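The three steps above can be sketched in plain Python. To keep the example short, the base learner here is a hypothetical one-split "stump" on a single numeric feature rather than a full tree, and the toy data, function names, and seed are all assumptions made for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Step 1: draw len(data) rows with replacement (same size as the original)."""
    return [rng.choice(data) for _ in range(len(data))]

def train_stump(sample):
    """Step 2 (simplified): pick the threshold t, taken from the sample's own x
    values, with the fewest errors for the rule 'predict 1 if x > t'."""
    best_t, best_err = None, None
    for t, _ in sample:
        err = sum((x > t) != bool(y) for x, y in sample)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def bagging_fit(data, n_estimators, seed=0):
    """Train one stump per bootstrap dataset."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(n_estimators)]

def bagging_predict(thresholds, x):
    """Step 3: combine the base learners by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Toy data: label is 1 when x >= 50.
data = [(x, int(x >= 50)) for x in range(0, 100, 5)]
model = bagging_fit(data, n_estimators=25)
print(bagging_predict(model, 90), bagging_predict(model, 10))
```

Because each stump sees a different bootstrap dataset, the learned thresholds differ from one stump to the next, which is the sense in which the trees are built 'differently'.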
sklearn Code
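As with the decision tree slide, the code itself is missing from the transcript. A minimal sketch of the scikit-learn equivalent follows, again on synthetic data with a made-up labeling rule; it is an illustration under those assumptions, not the instructor's code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the class dataset (hypothetical values).
rng = np.random.default_rng(0)
n = 200
sex = rng.integers(0, 2, size=n)
fare = rng.uniform(5, 100, size=n)
y = ((sex == 1) | (fare > 60)).astype(int)  # made-up labeling rule
X = np.column_stack([sex, fare])

# bootstrap=True trains each tree on its own bootstrap sample (bagging);
# n_estimators is the number of trees that vote.
forest = RandomForestClassifier(n_estimators=50, bootstrap=True, random_state=0)
forest.fit(X, y)

print(len(forest.estimators_))    # number of fitted trees
print(forest.predict([[1, 80.0]]))
```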