Data Mining with Constrained-Syntax Genetic Programming: Applications in
Medical Data Set
Celia C. Bojarczuk (a), Heitor S. Lopes (b), Alex A. Freitas (c)
(a) Electrotechnics Department & Bioinformatics Laboratory – CEFET-PR, Curitiba, PR, Brazil
(b) Electronics Department & Bioinformatics Laboratory – CEFET-PR, Curitiba, PR, Brazil
(c) Graduate Program in Computer Science – PUC-PR, Curitiba, PR, Brazil
Abstract
This work is intended to discover classification rules for
diagnosing certain pathologies. To discover these rules we
have developed a new constrained-syntax genetic
programming algorithm based on some concepts of data
mining, with particular emphasis on the discovery of
comprehensible knowledge. We compare the performance
of the proposed GP algorithm with a genetic algorithm
and with the very well-known decision-tree algorithm
C4.5.

Keywords:

A Brief Overview of Genetic Programming
Genetic programming is a powerful search method
inspired by natural selection [9]. The basic idea is to evolve
a population of "programs" that are candidate solutions to a
specific problem. A program (an individual of the
population) is usually represented in the form of a tree,
where the internal nodes are functions (operators) and the
leaf nodes are terminal symbols. Both the function set and
the terminal set must contain symbols appropriate for the
target problem. For instance, the function set can contain