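Document name: C4.5算法连续属性的改良应用【外文翻译】.doc (Improved Application of Continuous Attributes in the C4.5 Algorithm [Foreign-Literature Translation])

Format: doc   Pages: 9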

Uploaded by: 问道九霄, 2012/4/12. File size: 0 KB

Graduation Thesis (Design)
Foreign-Literature Translation
Original Foreign Text
Improved Use of Continuous Attributes in C4.5
J. R. Quinlan
Basser Department of Computer Science
University of Sydney, Sydney Australia 2006
Abstract
A reported weakness of C4.5 in domains with continuous attributes is addressed by modifying the formation and evaluation of tests on continuous attributes. An MDL-inspired penalty is applied to such tests, eliminating some of them from consideration and altering the relative desirability of all tests. Empirical trials show that the modifications lead to smaller decision trees with higher predictive accuracies. Results also confirm that a new version of C4.5 incorporating these changes is superior to recent approaches that use global discretization and that construct small trees with multi-interval splits.
1. Introduction
Most empirical learning systems are given a set of pre-classified cases, each described by a vector of attribute values, and construct from them a mapping from attribute values to classes. The attributes used to describe cases can be grouped into continuous attributes, whose values are numeric, and discrete attributes with unordered nominal values. For example, the description of a person might include weight in kilograms, whose value is a number, and color of eyes, whose value is one of `brown', `blue', etc.
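The distinction between the two attribute groups can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the paper: the `Case` class, the attribute names `weight_kg` and `eye_color`, the class labels, and the sample values. It shows one possible way to store pre-classified cases and to separate numeric attributes from nominal ones.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

# An attribute value is either numeric (continuous) or a nominal string (discrete).
AttributeValue = Union[float, str]


@dataclass
class Case:
    """One pre-classified case: attribute values plus a class label (hypothetical layout)."""
    attributes: Dict[str, AttributeValue]
    label: str


def continuous_attributes(cases: List[Case]) -> List[str]:
    """Return the names of attributes whose value is numeric in every case."""
    names = list(cases[0].attributes.keys())
    return [
        name
        for name in names
        if all(
            isinstance(case.attributes[name], (int, float))
            and not isinstance(case.attributes[name], bool)
            for case in cases
        )
    ]


def discrete_attributes(cases: List[Case]) -> List[str]:
    """Return the names of all attributes that are not continuous."""
    numeric = set(continuous_attributes(cases))
    return [name for name in cases[0].attributes if name not in numeric]


# Illustrative data only; the values and labels are made up.
cases = [
    Case({"weight_kg": 70.0, "eye_color": "brown"}, label="A"),
    Case({"weight_kg": 82.5, "eye_color": "blue"}, label="B"),
]
```

Calling `continuous_attributes(cases)` on this sample separates `weight_kg` from `eye_color`. A learner would then treat the first kind with threshold tests and the second with tests on nominal values.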
C4.5 (Quinlan, 1993) is one such system that learns decision-tree classif