Document name:

Introduction to Data Mining - Chapter 8. Cluster Analysis - Basic Concepts and Algorithms.pdf

8 Cluster Analysis: Basic Concepts and Algorithms

Cluster analysis divides data into groups (clusters) that are meaningful, useful,
or both. If meaningful groups are the goal, then the clusters should capture the
natural structure of the data. In some cases, however, cluster analysis is only a
useful starting point for other purposes, such as data summarization. Whether
for understanding or utility, cluster analysis has long played an important
role in a wide variety of fields: psychology and other social sciences, biology,
statistics, pattern recognition, information retrieval, machine learning, and
data mining.

There have been many applications of cluster analysis to practical problems.
We provide some specific examples, organized by whether the purpose
of the clustering is understanding or utility.

Clustering for Understanding

Classes, or conceptually meaningful groups
of objects that share common characteristics, play an important role in how
people analyze and describe the world. Indeed, human beings are skilled at
dividing objects into groups (clustering) and assigning particular objects to
these groups (classification). For example, even relatively young children can
quickly label the objects in a photograph as buildings, vehicles, people,
animals, plants, etc. In the context of understanding data, clusters are potential
classes and cluster analysis is the study of techniques for automatically finding
classes. The following are some examples:
• Biology. Biologists have spent many years creating a taxonomy
(hierarchical classification) of all living things: kingdom, phylum, class,
order, family, genus, and species. Thus, it is perhaps not surprising that
much of the early work in cluster analysis sought to create a discipline
of mathematical taxonomy that could automatically find such
classification structures. More recently, biologists have applied clustering to
analy