Understanding Decision Trees and the C4.5 Algorithm
yafei002
January 08, 2017
Transcript
Understanding Decision Trees and the C4.5 Algorithm, by yafei002
How decision trees work
Choosing the split node: entropy is the expected value of information.
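The slide's formula did not survive in the transcript. As a minimal sketch, assuming the standard Shannon definition H = -Σ p_i · log2(p_i) over the class proportions p_i, entropy could be computed as follows. The function name `entropy` and the toy labels are illustrative, not taken from the deck.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum p * log2(p) over the classes that actually occur, then negate.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A 50/50 split is maximally uncertain for two classes.
print(entropy(["yes", "yes", "no", "no"]))  # 1.0
```

A node whose instances all share one class has zero entropy, which is why pure nodes need no further splitting.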
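Choosing the split node: the decision tree algorithm takes the split with the largest gain as the best split. Gain is the entropy of the parent node minus the weighted average entropy of the child nodes that the split produces.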
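A small sketch of picking the split by largest gain, assuming the usual definition Gain = H(parent) - Σ (|S_v| / |S|) · H(S_v). The helper names (`information_gain`, `best_attribute`) and the toy weather rows are hypothetical and only serve the illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction from splitting `labels` by the parallel list `values`."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    # Weighted average entropy of the child nodes.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels):
    """Name of the attribute (dict key) whose split has the largest gain."""
    return max(rows[0],
               key=lambda a: information_gain([r[a] for r in rows], labels))


# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rain", "windy": False},
    {"outlook": "rain", "windy": True},
]
labels = ["no", "no", "yes", "yes"]
print(best_attribute(rows, labels))  # outlook
```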
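Choosing the split node: information gain is biased toward attributes with many distinct values. The gain ratio corrects for this by taking the number of output nodes into account, and it is the criterion C4.5 uses.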
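To make the bias concrete, here is a sketch assuming the common formulation GainRatio = Gain / SplitInfo, where SplitInfo is the entropy of the attribute's own value distribution. The function names and the toy `ids` and `outlook` columns are assumptions for illustration. An ID-like attribute with one value per row gets full gain but is penalized by its large split information.

```python
import math
from collections import Counter, defaultdict


def entropy(items):
    """Shannon entropy, in bits, of a sequence of hashable items."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())


def information_gain(values, labels):
    """Entropy reduction from splitting `labels` by the parallel list `values`."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Gain divided by split information; 0.0 if the attribute has one value."""
    split_info = entropy(values)  # grows with the number of branches
    if split_info == 0:
        return 0.0
    return information_gain(values, labels) / split_info


labels = ["no", "no", "yes", "yes"]
ids = [1, 2, 3, 4]                            # unique per row, 4 branches
outlook = ["sunny", "sunny", "rain", "rain"]  # 2 branches

# Both attributes have gain 1.0, but the gain ratio prefers the 2-way split.
print(gain_ratio(ids, labels))      # 0.5
print(gain_ratio(outlook, labels))  # 1.0
```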
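Decision tree pruning
To keep the tree from growing too large and overfitting the training data, the generated decision tree needs to be pruned. C4.5 introduces pessimistic pruning.
Pessimistic pruning:
1. If a node covers N instances with E errors, its empirical error rate is (E + penalty) / N.
2. If a subtree has L leaf nodes that together cover ∑N instances with ∑E errors, its empirical error rate is (∑E + L × penalty) / ∑N.
3. Let J be the number of training errors after the subtree is replaced by its best leaf node. If J + penalty <= ∑E + L × penalty + one standard deviation of (∑E + L × penalty), the subtree is replaced by that best leaf node.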
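The replacement test in step 3 can be sketched as a single function. Two things here are assumptions rather than statements from the slides: the penalty defaults to 0.5 (a commonly used continuity correction), and the standard deviation is computed for the corrected subtree error count ∑E + L × penalty by treating it as a binomial count over the ∑N instances. The name `should_prune` and the example numbers are hypothetical.

```python
import math


def should_prune(leaf_errors, subtree_errors, num_leaves, num_instances,
                 penalty=0.5):
    """Return True if the subtree should be replaced by its best leaf.

    leaf_errors    -- J, training errors if the subtree becomes one leaf
    subtree_errors -- sum of E over the subtree's leaves
    num_leaves     -- L, number of leaves in the subtree
    num_instances  -- sum of N over the subtree's leaves (must be > 0)
    penalty        -- per-leaf correction term (assumed default: 0.5)
    """
    corrected_subtree = subtree_errors + penalty * num_leaves
    corrected_leaf = leaf_errors + penalty
    # Assumed binomial standard deviation of the corrected error count.
    variance = corrected_subtree * (num_instances - corrected_subtree) / num_instances
    std = math.sqrt(max(0.0, variance))
    return corrected_leaf <= corrected_subtree + std


# Subtree: 4 leaves, 10 errors on 100 instances, so the threshold is
# 12 + sqrt(12 * 88 / 100), roughly 15.25.
print(should_prune(14, 10, 4, 100))  # True: 14.5 is within the threshold
print(should_prune(20, 10, 4, 100))  # False: 20.5 exceeds it
```

The effect is that a single leaf is accepted even when it makes somewhat more training errors than the subtree, as long as the difference is within one standard deviation.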
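Decision tree pruning
Pruning is a single bottom-up traversal of the tree.
[Figure: an intermediate pruning step, showing node X with subtrees T1, T2, T3. Labels: "largest output of X" (T2), "best leaf node".]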
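The bottom-up pass can be sketched as a post-order traversal: children are pruned before their parent is tested. This is a simplified illustration, not the deck's or C4.5's full procedure. It only considers replacing a subtree with a leaf (not with its largest branch), and it assumes a penalty of 0.5 and a binomial standard deviation. The `Node` class and the example counts are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    counts: dict  # class label -> training instances reaching this node
    children: list = field(default_factory=list)

    def leaf_errors(self):
        """Training errors if this node predicted its majority class."""
        return sum(self.counts.values()) - max(self.counts.values())


def prune(node, penalty=0.5):
    """Prune in place; return (training errors, leaf count) of the result."""
    if not node.children:
        return node.leaf_errors(), 1
    errors, leaves = 0, 0
    for child in node.children:  # visit children first: bottom-up
        child_errors, child_leaves = prune(child, penalty)
        errors += child_errors
        leaves += child_leaves
    n = sum(node.counts.values())
    corrected = errors + penalty * leaves
    std = math.sqrt(max(0.0, corrected * (n - corrected) / n))
    j = node.leaf_errors()
    if j + penalty <= corrected + std:
        node.children = []  # replace the subtree with its best leaf
        return j, 1
    return errors, leaves


# Hypothetical tree: the split under `a` barely helps, the root split does.
a = Node({"yes": 48, "no": 2},
         [Node({"yes": 30, "no": 1}), Node({"yes": 18, "no": 1})])
b = Node({"yes": 2, "no": 48})
root = Node({"yes": 50, "no": 50}, [a, b])

print(prune(root))         # (4, 2)
print(len(a.children))     # 0: `a` collapsed into a leaf
print(len(root.children))  # 2: the root split is kept
```

Because each node is tested exactly once after its descendants have been settled, the whole tree is handled in one traversal.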