Application Research of Computers, Vol. 28, No. 10, October 2011
An Improved C4.5 Decision Tree Algorithm Based on Variable Precision Rough Sets*
LIU Xing-wen, WANG Dian-hong, CHEN Fen-xiong
(China University of Geosciences, Wuhan 430074, China)
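Abstract: To address the complex structure and limited classification accuracy of decision trees built by C4.5, this paper proposes an improved decision tree construction algorithm based on variable precision rough sets. The algorithm uses the approximate quality of classification as the heuristic function for selecting the attribute at each node. Compared with the information gain ratio, this criterion characterizes an attribute's overall contribution to classification more accurately and also suppresses noise to some extent. For the special case in which two or more attributes have equal approximate quality of classification, the paper also gives a method for choosing the best classification attribute as the node. Experimental results show that the decision trees built by the algorithm are better than those built by C4.5 in both classification accuracy and size.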
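As a rough illustration of the heuristic named in the abstract, the sketch below computes a β-approximate quality of classification for a single attribute and ranks attributes by it. It assumes the majority-inclusion form of variable precision rough sets, in which an equivalence class enters the β-positive region when its most frequent decision label covers at least a fraction β of the class, with β in (0.5, 1]. The function names, the toy data, and the default β = 0.8 are hypothetical and are not taken from the paper. Tied attributes are only reported; the paper's tie-breaking rule is not reproduced here.

```python
from collections import Counter, defaultdict


def approx_quality(rows, attr, label, beta=0.8):
    """Fraction of rows lying in the beta-positive region of `attr`.

    Rows are grouped into equivalence classes by their value of `attr`.
    A class counts as positive when its majority label covers at least
    `beta` of the class (assumed majority-inclusion definition).
    """
    if not 0.5 < beta <= 1.0:
        raise ValueError("beta is assumed to lie in (0.5, 1]")
    classes = defaultdict(list)
    for row in rows:
        classes[row[attr]].append(row[label])
    positive = 0
    for labels in classes.values():
        majority = Counter(labels).most_common(1)[0][1]
        if majority / len(labels) >= beta:
            positive += len(labels)
    return positive / len(rows)


def best_attributes(rows, attrs, label, beta=0.8):
    """Return every attribute that attains the highest score, plus all scores."""
    scores = {a: approx_quality(rows, a, label, beta) for a in attrs}
    top = max(scores.values())
    return [a for a in attrs if scores[a] == top], scores


# Hypothetical toy data: (outlook, windy, play)
data = [
    ("sunny", "no", "yes"), ("sunny", "no", "yes"), ("sunny", "yes", "yes"),
    ("sunny", "no", "yes"), ("sunny", "yes", "no"), ("rain", "yes", "no"),
    ("rain", "yes", "no"), ("rain", "no", "no"), ("rain", "no", "yes"),
    ("rain", "no", "yes"),
]
rows = [dict(zip(("outlook", "windy", "play"), r)) for r in data]

winners, scores = best_attributes(rows, ["outlook", "windy"], "play")
print(winners, scores)
```

With β = 1.0 the same data give a score of 0 for both attributes, because no equivalence class is perfectly pure. Relaxing β is what lets a class with a few mislabeled rows still count toward the positive region, which is the noise tolerance the abstract refers to.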
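Key words: data mining; decision tree; information gain ratio; C4.5 algorithm; rough set; variable precision rough set; approximate quality of classification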
CLC number: TP18    Document code: A    Article ID: 1001-3695(2011)10-3649-03
doi: …/j.issn.…
Improved C4.5 decision tree algorithm based on variable precision rough set
LIU Xing-wen, WANG Dian-hong, CHEN Fen-xiong
(China University of Geosciences, Wuhan 430074, China)
Abstract: Aiming at the problems of complexity and relatively low classification accuracy of decision trees constructed by the C4.5 algorithm, this paper proposed a new decision tree classification algorithm (VPRSC) based on the variable precision rough set (VPRS), which took the approximate quality of classification as the heuristic function in order to alleviate the effect of noisy data on choosing splitting attributes. It also gave a solution to the problem of how to choose the best attribute as the node when two or more attributes had the same value of approximate quality of classification. The experimental results prove that the size
K