数据挖掘十大算法八:kNN.pdf (Top 10 Data Mining Algorithms, Part 8: kNN)
Chapter 8
kNN: k-Nearest Neighbors

© 2009 by Taylor & Francis Group, LLC

…approach: (i) the set of labeled objects to be used for evaluating a test object's class (this need not be the entire training set), (ii) a distance or similarity metric that can be used to compute the closeness of objects, (iii) the value of k, the number of nearest neighbors, and (iv) the method used to determine the class of the target object based on the classes and distances of the k nearest neighbors. In its simplest form, kNN can involve assigning an object the class of its nearest neighbor or of the majority of its nearest neighbors, but a variety of enhancements are possible and are discussed below.
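The four elements above can be sketched in a few lines of code. This is a minimal illustration, assuming Euclidean distance as the metric and a simple majority vote in step (iv); the function and variable names are illustrative, not from the chapter.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` from `train`, a list of (feature_vector, label) pairs."""
    # (i)-(ii) compute the distance from the query to every labeled object
    dists = [(math.dist(x, query), label) for x, label in train]
    # (iii) keep the k nearest neighbors
    neighbors = sorted(dists, key=lambda d: d[0])[:k]
    # (iv) majority vote over the neighbors' class labels
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

For example, with training points ((0,0),"a"), ((0,1),"a"), ((5,5),"b"), ((6,5),"b"), the query (1,0) has two "a" points among its three nearest neighbors and is therefore labeled "a".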
More generally, kNN is a special case of instance-based learning [1]. This includes
case-based reasoning [3], which deals with symbolic data. The kNN approach is also
an example of a lazy learning technique, that is, a technique that waits until the query
arrives to generalize beyond the training data.
Although kNN classification is easy to understand and implement, it performs well in many situations. In particular, a well-known result by Cover and Hart [6] shows that the classification error of the nearest neighbor rule is bounded above by twice the optimal Bayes error under certain reasonable assumptions. Furthermore, the error of the general kNN method asymptotically approaches that of the Bayes error and can be used to approximate it.
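For the two-class case, the Cover and Hart result mentioned above is commonly stated as follows (a standard formulation, not quoted from this chapter), where R* is the optimal Bayes error and R_NN is the asymptotic error of the 1-nearest-neighbor rule:

```latex
% Two-class Cover-Hart bound: the NN rule is at most twice the Bayes error
R^{*} \;\le\; R_{\mathrm{NN}} \;\le\; 2R^{*}\bigl(1 - R^{*}\bigr) \;\le\; 2R^{*}
```

The middle inequality is tighter than the familiar "twice the Bayes error" bound, which follows from it since 1 − R* ≤ 1.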
Also, because of its simplicity, kNN is easy to modify for more complicated classification problems. For instance, kNN is particularly well-suited for multimodal classes
as well as applications in which an object can have many class labels. To illustrate the
last point, for the ass