Security Level: University Code: 10075
Classification No.: Student ID: 20091339
A Dissertation for the Degree of Master of Engineering
Support Vector Machine Based on Rough Set Instance Reduction
Candidate: Wang Tingting
Supervisors: Prof. Wang Xizhao
Associate Prof. Zhai Junhai
Academic Degree Applied for: Master of Engineering
Specialty: Computer Applied Technology
University: Hebei University
Date of Oral Examination: May 2012
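Abstract (Chinese version)
Support vector machines (SVMs) and rough set theory are active research topics in artificial
intelligence and machine learning. An SVM takes the principle of structural risk minimization
as its classification criterion and constructs the optimal separating hyperplane from the
support vectors that lie close to the class boundary. Only the support vectors contribute to
the trained classifier, yet solving the SVM requires the entire training set. Consequently,
when the training set is large, SVM training demands a large amount of storage, optimizes
slowly, and takes a long time.
To address this problem, this thesis proposes an SVM method based on rough set instance
reduction. The basic idea is that, because most support vectors lie near the class boundary,
the tolerance rough set method can be used to select the instances in the boundary region as
candidate support vectors. In addition, rough sets can remove redundant attributes from the
instance set through reduction while preserving classification ability. The proposed method
can therefore reduce attributes and instances at the same time. Specifically, tolerance rough
set theory is first applied to the instance set for attribute reduction and instance
reduction, and an SVM is then trained on the reduced instance set. The thesis also proposes an
attribute reduction algorithm based on instance selection. Experimental results confirm the
effectiveness of the method; for large databases in particular, the proposed algorithm
effectively reduces storage space and execution time.
Keywords: tolerance rough set; reduction; support vector machine; optimal separating hyperplane; statistical learning theory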
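The instance selection idea summarized in this abstract can be illustrated with a small sketch. It is not the algorithm of the thesis: it assumes, purely for illustration, that the tolerance relation is "Euclidean distance at most `tau`", and the names `tolerance_class`, `boundary_instances`, and `tau` are hypothetical. An instance is kept as a candidate support vector when its tolerance class contains more than one class label, which places it in the boundary region. Attribute reduction and the SVM training step itself are omitted.

```python
from math import dist  # Euclidean distance, Python 3.8+


def tolerance_class(i, X, tau):
    """Indices of all instances within distance tau of instance i (i included).

    Assumption: the tolerance relation is a plain distance threshold.
    """
    return [j for j, xj in enumerate(X) if dist(X[i], xj) <= tau]


def boundary_instances(X, y, tau):
    """Indices of instances whose tolerance class mixes several class labels.

    Such instances lie in the boundary region (upper minus lower
    approximation) of their class and serve as candidate support vectors.
    """
    selected = []
    for i in range(len(X)):
        labels = {y[j] for j in tolerance_class(i, X, tau)}
        if len(labels) > 1:
            selected.append(i)
    return selected


# Toy one-dimensional data: class 0 on the left, class 1 on the right.
X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
y = [0, 0, 0, 1, 1, 1]

candidates = boundary_instances(X, y, tau=1.0)
X_reduced = [X[i] for i in candidates]
y_reduced = [y[i] for i in candidates]
print(candidates)
```

On this toy data only the two instances adjacent to the class boundary are kept. The reduced pair `X_reduced`, `y_reduced` would then be handed to an SVM trainer in place of the full training set. The pairwise distance loop is quadratic in the number of instances, so a practical implementation would need a more efficient neighbor search.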
Abstract
Support vector machines (SVMs) and rough sets are active research topics in artificial
intelligence and machine learning. The SVM is a novel approach to pattern classification
rooted in statistical learning theory: it uses the principle of structural risk minimization
as its classification criterion and constructs the optimal separating hyperplane from the
support vectors near the class boundary. Only the support vectors contribute to the
classifier, yet solving the SVM requires the whole training set. When the training set is
very large, training therefore requires a great amount of memory and takes a long time to
find the optimal solution.
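For reference, the memory and time cost mentioned above can be read off the standard soft-margin SVM formulation. The notation below is an assumption introduced for illustration and does not appear in this abstract: $n$ training instances $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, a penalty parameter $C$, slack variables $\xi_i$, and a kernel $K$.

```latex
% Primal problem: one constraint per training instance.
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, \dots, n.

% Dual problem: the objective involves the n-by-n kernel matrix.
\max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
 - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
\quad \text{s.t.} \quad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0.

% Decision function: only instances with nonzero multipliers contribute.
f(x) = \operatorname{sign}\!\left( \sum_{i:\ \alpha_i > 0} \alpha_i y_i K(x_i, x) + b \right).
```

The instances with $\alpha_i > 0$ are the support vectors, so the trained classifier depends only on them. The optimization, however, ranges over all $n$ multipliers and an $n \times n$ kernel matrix, which is why training cost grows quickly with the size of the training set.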
To address the problem described above, this thesis presents a method named
support vector machine based on instance reduct