Recent advances in, and widespread use of, high-throughput technologies for biological research are producing enormous biological datasets distributed worldwide. Data mining techniques and machine learning methods provide useful tools for knowledge discovery in this field. The goal of this paper is to present the design of a pattern classifier for mining distributed biological datasets. The proposed classifier is built around a special class of computing model termed Fuzzy Cellular Automata (FCA). A concrete example of the effectiveness of this approach is provided by demonstrating its success on the gene identification problem. Extensive experimental results confirm the scalability of the FCA in handling distributed biological datasets. Application of the proposed model to the gene identification problem establishes the FCA as a classifier ideally suited for biological data mining in a distributed environment.