CLUSTERING AND CLASSIFICATION OF ANALYTICAL DATA
Clustering and Classification of Analytical Data

Barry K. Lavine
Clarkson University, Potsdam, USA

1 Introduction
2 Principal Component Analysis
    Variance-based Coordinate System
    Information Content of Principal Components
    Case Studies
3 Cluster Analysis
    Hierarchical Clustering
    Practical Considerations
    Case Studies
4 Pattern Recognition
    k-Nearest Neighbor
    Soft Independent Modeling by Class Analogy
    Feature Selection
    Case Studies
5 Software
6 Conclusion

peak in a chromatogram. Thus, each sample is considered as a point in an n-dimensional measurement space. The dimensionality of the space corresponds to the number of measurements that are available for each sample. A basic assumption is that the distance between pairs of points in this measurement space is inversely related to the degree of similarity between the corresponding samples. Points representing samples from one class will cluster in a limited region of the measurement space distant from the points corresponding to the other class. Pattern recognition (i.e. clustering and classification) is a set of methods for investigating data represented in this manner, in order to assess its overall structure, which is defined as the overall relationship of each sample to every other in the data set.

1 INTRODUCTION

Since the early 1980s, a major effort has been made to substantially improve the analytical methodology applied to the study of environmental samples. Instrumental techniques such as gas chromatography, high-performance liquid chromatography (HPLC) and X-ray fluorescence spectroscopy have dramatically increased the number of organic and inorganic compounds that can be identified and quantified, even at trace levels, in the environment. This capability, in turn, has allowed scientists to attack ever more complex problems, such as oil and fuel spill identification, but has al