A Survey of Evolutionary Algorithms for
Data Mining and Knowledge Discovery
Alex A. Freitas
Postgraduate Program in Computer Science, Pontificia Universidade Catolica do Parana
Rua Imaculada Conceicao, 1155. Curitiba - PR. 80215-901. Brazil.
E-mail: ******@ Web page: .br/~alex
Abstract: This chapter discusses the use of evolutionary algorithms, particularly
genetic algorithms and genetic programming, in data mining and knowledge
discovery. We focus on the data mining task of classification. In addition, we
discuss some preprocessing and postprocessing steps of the knowledge discovery
process, focusing on attribute selection and pruning of an ensemble of classifiers.
We show how the requirements of data mining and knowledge discovery
influence the design of evolutionary algorithms. In particular, we discuss how
individual representation, genetic operators and fitness functions have to be
adapted for extracting high-level knowledge from data.
1. Introduction
The amount of data stored in databases continues to grow rapidly. Intuitively, this
large amount of stored data contains valuable hidden knowledge, which could be
used to improve the decision-making process of an organization. For instance,
data about previous sales might contain interesting relationships between
products and customers. The discovery of such relationships can be very useful to
increase the sales of a company. However, the number of human data analysts
grows at a much smaller rate than the amount of stored data. Thus, there is a clear
need for (semi-)automatic methods for extracting knowledge from data.
This need has led to the emergence of a field called data mining and
knowledge discovery [66]. This is an interdisciplinary field, using methods of
several research areas (especially machine learning and statistics) to extract
high-level knowledge from real-world data sets. Data mining is the core step of a
broader process, called knowledge discovery in databases, or knowledge
discovery, for