This paper discusses a genetic-algorithm-based approach for selecting a small number of representative instances from a given data set in a pattern classification problem. The genetic algorithm also selects a small number of significant features, so that instances and features are selected simultaneously to find a compact data set. The selected instances and features are used as the reference set of a nearest neighbor classifier. Our goal is to improve the classification performance (i.e., generalization ability) of the nearest neighbor classifier by searching for an appropriate reference set. We first describe the implementation of our genetic algorithm for instance and feature selection. Next, we discuss the definition of its fitness function. Then we examine, through computer simulations on artificial and real-world data sets, the classification performance of nearest neighbor classifiers designed by our approach.
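The idea of simultaneous instance and feature selection can be illustrated with a minimal Python sketch. It is not the implementation examined in the paper: the abstract does not give the coding scheme, the fitness function, or the genetic operators, so everything below is an assumption made for illustration. Each candidate is assumed to be a binary string with one bit per feature followed by one bit per instance, the fitness is assumed to be a weighted sum that rewards correctly classified training patterns and penalizes the numbers of selected features and instances, and the weights `w_acc`, `w_feat`, `w_inst`, the function names, and the operators (binary tournament selection, uniform crossover, bit-flip mutation, elitism) are hypothetical choices.

```python
import random


def nn_predict(x, ref_X, ref_y, feat_mask):
    """Label of the reference pattern nearest to x, using only selected features."""
    best_label, best_dist = None, float("inf")
    for r, label in zip(ref_X, ref_y):
        # Squared Euclidean distance restricted to the selected features.
        d = sum((a - b) ** 2 for a, b, keep in zip(x, r, feat_mask) if keep)
        if d < best_dist:
            best_dist, best_label = d, label
    return best_label


def fitness(feat_mask, inst_mask, X, y, w_acc=5.0, w_feat=1.0, w_inst=1.0):
    """Assumed fitness: reward correct classifications, penalize reference set size."""
    ref_X = [x for x, keep in zip(X, inst_mask) if keep]
    ref_y = [t for t, keep in zip(y, inst_mask) if keep]
    if not ref_X or not any(feat_mask):
        return float("-inf")  # an empty reference set cannot classify anything
    # Simplification: a selected instance is its own nearest neighbor here.
    correct = sum(nn_predict(x, ref_X, ref_y, feat_mask) == t for x, t in zip(X, y))
    return w_acc * correct - w_feat * sum(feat_mask) - w_inst * sum(inst_mask)


def evolve(X, y, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a compact reference set with a simple generational GA."""
    rng = random.Random(seed)
    n_feat, n_inst = len(X[0]), len(X)
    length = n_feat + n_inst

    def score(chrom):
        return fitness(chrom[:n_feat], chrom[n_feat:], X, y)

    # One candidate keeps the full data set; the rest are random bit strings.
    pop = [[1] * length]
    pop += [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size - 1)]

    for _ in range(generations):
        children = [max(pop, key=score)[:]]  # elitism: carry the best forward
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            # Uniform crossover followed by bit-flip mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children

    best = max(pop, key=score)
    return best[:n_feat], best[n_feat:], score(best)
```

Calling `evolve(X, y)` on a list of numeric patterns `X` and class labels `y` returns a feature mask, an instance mask, and the fitness of the best candidate found. Because a selected instance always classifies itself correctly in this sketch, the accuracy term is optimistic for large reference sets; a fitness function intended to measure generalization ability would need to account for that.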