Deterministic Annealing for Clustering,
Compression, Classification, Regression,
and Related Optimization Problems
KENNETH ROSE, MEMBER, IEEE
Invited Paper
The deterministic annealing approach to clustering and its extensions has demonstrated substantial performance improvement over standard supervised and unsupervised learning methods in a variety of important applications including compression, estimation, pattern recognition and classification, and statistical regression. The method offers three important features: 1) the ability to avoid many poor local optima; 2) applicability to many different structures/architectures; and 3) the ability to minimize the right cost function even when its gradients vanish almost everywhere, as in the case of the empirical classification error. It is derived within a probabilistic framework from basic information theoretic principles (maximum entropy and random coding). The application-specific cost is minimized subject to a constraint on the randomness (Shannon entropy) of the solution, which is gradually lowered. We emphasize intuition gained from analogy to statistical physics, where this is an annealing process that avoids many shallow local minima of the specified cost and, at the limit of zero “temperature,” produces a nonrandom (hard)

Keywords—Classification, clustering, compression, deterministic annealing, maximum entropy, optimization methods, regression, vector quantization.

I. INTRODUCTION

There are several ways to motivate and introduce the material described in this paper. Let us place it within the neural network perspective, and particularly that of learning. The area of neural networks has greatly benefited from its unique position at the crossroads of several diverse scientific and engineering disciplines including statistics and probability theory, physics, biology, control and signal processing, information theory, complexity theory, and psychology (see [45]). Neural networks have provided a fertile soil for the infusion (and occas