Analyzing DNA microarray data poses a serious challenge because the datasets contain a large number of features (genes) and a relatively small number of samples. Extracting features that have predictive capability for classifying these huge datasets demands appropriate approaches such as feature reduction and identification of an optimal set of genes. In this paper, along with conventional statistical methods such as filtering the dataset to reduce the number of features, one additional step is performed: evaluating, for each feature, the correlation between the classes. The proposed approach yields higher classification accuracy on both the Acute Lymphoblastic Leukemia (ALL) and High Grade Glioma cancer datasets than traditional statistical filtering methods alone.
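The two-stage idea described above (a statistical filter followed by a per-feature correlation check) can be illustrated with a minimal sketch. The abstract does not state which filter or which correlation measure is used, so everything here is an assumption made for illustration: a variance filter stands in for the statistical filtering step, the absolute Pearson correlation between each gene and a 0/1 class label stands in for the correlation step, and the function names `variance_filter`, `pearson`, and `select_genes` as well as the toy data are hypothetical.

```python
import math
from statistics import mean, pvariance


def variance_filter(X, keep):
    """Return the column indices of the `keep` highest-variance features.

    Assumption: variance is used only as a stand-in for the unspecified
    statistical filter mentioned in the abstract.
    """
    n_features = len(X[0])
    variances = [pvariance([row[j] for row in X]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:keep])


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    if sa == 0 or sb == 0:
        return 0.0
    return cov / (sa * sb)


def select_genes(X, y, n_filtered, n_selected):
    """Filter by variance, then rank the survivors by |correlation| with the label.

    Assumption: correlating each gene with the class label is one possible
    reading of "correlation between the classes for each feature".
    """
    kept = variance_filter(X, n_filtered)
    scores = {j: abs(pearson([row[j] for row in X], y)) for j in kept}
    ranked = sorted(kept, key=lambda j: scores[j], reverse=True)
    return ranked[:n_selected]


# Toy data: 6 samples x 4 genes with binary labels (hypothetical values,
# not taken from the ALL or glioma datasets).
X = [
    [1.0, 5.0, 0.1, 2.0],
    [1.1, 5.2, 0.1, 7.0],
    [0.9, 4.8, 0.1, 3.0],
    [3.0, 5.1, 0.1, 6.0],
    [3.2, 4.9, 0.1, 2.5],
    [2.9, 5.0, 0.1, 6.5],
]
y = [0, 0, 0, 1, 1, 1]

selected = select_genes(X, y, n_filtered=2, n_selected=1)
print(selected)
```

In this toy case the variance filter keeps genes 0 and 3. Gene 3 has the larger variance but tracks the class label only weakly, so the correlation step prefers gene 0. This shows how a correlation step can change the ranking that a variance filter alone would give.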