Much research on cancer dataset classification has been carried out with the aim of reducing the number of deaths caused by cancer. Cancer microarray datasets contain a large number of features, and using all of them is expensive in time, cost, and memory. It is therefore necessary to reduce the number of features through feature selection. The feature selection method should not only eliminate irrelevant features but also account for correlated genes, because ignoring correlated genes can discard important information about the cancer itself. To test whether feature selection gives higher accuracy, this research compares the classification accuracy obtained on the datasets without feature selection and with feature selection. CSVM-RFE is used as the feature selection method. For classification, SVM and KFCM are used with two kernel types: a Gaussian RBF kernel with σ = 0.05 and a polynomial kernel of degree 3. These methods are applied to three cancer datasets. The highest accuracy on the colon cancer dataset is 98.6%, obtained using SVM with the RBF kernel; the highest accuracy on the prostate cancer dataset is 99.2%, using SVM with the polynomial kernel; and the highest accuracy on the lymphoma cancer dataset is 99.1%, using SVM with the RBF kernel.
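As an illustration of the two kernels named above, the following minimal sketch evaluates a Gaussian RBF kernel with σ = 0.05 and a polynomial kernel of degree 3 on plain Python lists. The abstract does not state the exact kernel parametrization, so the forms exp(-||x - y||² / (2σ²)) and (x · y + c)³ are assumptions, as are the function names and the offset `coef0 = 1.0`; this is not the authors' implementation.

```python
import math


def rbf_kernel(x, y, sigma=0.05):
    """Gaussian RBF kernel. Assumed form: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """Polynomial kernel. Assumed form: (x . y + coef0)^degree.

    coef0 is a hypothetical offset; the abstract only gives the degree.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def gram_matrix(samples, kernel):
    """Pairwise kernel values for a list of samples (each a list of floats)."""
    return [[kernel(u, v) for v in samples] for u in samples]
```

With σ as small as 0.05, the RBF kernel value falls off very quickly as two expression profiles move apart, so the scaling of the input features strongly affects how similar two samples appear under this assumed form.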