Chapter 2
K-Means

Joydeep Ghosh and Alexander

Contents
Introduction
The k-means Algorithm
Available Software
Examples
Advanced Topics
Summary
Exercises
References

Introduction

In this chapter, we describe the k-means algorithm, a straightforward and widely used clustering algorithm. Given a set of objects (records), the goal of clustering or segmentation is to divide these objects into groups or “clusters” such that objects within a group tend to be more similar to one another than to objects belonging to different groups. In other words, clustering algorithms place similar points in the same cluster while placing dissimilar points in different clusters. Note that, in contrast to supervised tasks such as regression or classification, where there is a notion of a target value or class label, the objects that form the inputs to a clustering procedure do not come with an associated target. Therefore, clustering is often referred to as unsupervised learning. Because there is no need for labeled data, unsupervised algorithms are suitable for many applications where labeled data is difficult to obtain. Unsupervised tasks such as clustering are also often used to explore and characterize the dataset before running a supervised learning task. Since clustering makes no use of class labels, some notion of similarity must be defined based on the attributes of the objects. The definition of similarity and the method by which points are clustered differ based on the clustering algorithm being applied. Thus, diff