Conference: 2014 Iranian Conference on Intelligent Systems

Feature subset selection and parameters optimization for support vector machine in breast cancer diagnosis

Abstract

Because breast cancer has a high death rate among women, detection plays a major role in treating this type of cancer, and early detection increases patients' chances of survival. The main tendency in feature extraction has been to represent the data in a different, lower-dimensional feature space, for instance by using principal component analysis (PCA). In this paper, we argue that selecting features solely by the largest eigenvalues is not appropriate, because the top components may not encode information useful for classification; features should instead be selected from all the components by feature selection methods. A Genetic Algorithm (GA) is therefore used to select the most favorable principal components instead of the classical method. We apply PCA for dimension reduction, genetic algorithms for feature selection, and support vector machines (SVM) for classification. The algorithm is evaluated on the Wisconsin Breast Cancer Dataset (WBCD), which is commonly used by researchers applying machine learning methods to breast cancer diagnosis. The performance of this approach is reported and compared with that of previously used methods. The approach affords optimal classification that minimizes the number of features while maximizing accuracy, sensitivity, specificity, and performance on receiver operating characteristic (ROC) curves. 10-fold cross-validation is used in the classification phase. The developed PCA+GA+SVM system obtains an average classification accuracy of 100% for a subset containing two features. This compares very favorably with previously reported results.
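
The idea of letting a GA choose principal components from all of them, rather than keeping only those with the largest eigenvalues, can be illustrated with a minimal sketch. It is not the authors' implementation: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), the names `ga_select` and `toy_fitness`, every parameter value, and the choice of nine components are assumptions made for illustration. In a real pipeline the fitness would presumably be the 10-fold cross-validated accuracy of an SVM trained on the selected components; here a toy score stands in so that the example runs with the standard library alone.

```python
import random
from typing import Callable, List, Sequence

Mask = List[int]  # 1 = keep this principal component, 0 = drop it
Fitness = Callable[[Sequence[int]], float]


def tournament(population: List[Mask], fitness: Fitness, rng: random.Random) -> Mask:
    """Draw two individuals at random and return the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def ga_select(n_components: int, fitness: Fitness, pop_size: int = 20,
              generations: int = 30, mutation_rate: float = 0.1,
              seed: int = 0) -> Mask:
    """Search binary masks over components (requires n_components >= 2)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_components)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            cut = rng.randrange(1, n_components)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: toggle each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Hypothetical per-component usefulness (assumed, not measured): the useful
# components are deliberately NOT the first ones, i.e. not the ones with the
# largest eigenvalues.
TOY_USEFULNESS = [0.0, 0.0, 0.5, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]


def toy_fitness(mask: Sequence[int]) -> float:
    """Stand-in for cross-validated accuracy, minus a penalty per component."""
    gain = sum(u for u, bit in zip(TOY_USEFULNESS, mask) if bit)
    return gain - 0.05 * sum(mask)


best = ga_select(9, toy_fitness)
print(best, toy_fitness(best))
```

The per-component penalty in `toy_fitness` mirrors the stated goal of minimizing the number of features while maximizing classification quality; how the two objectives are actually weighted in the paper is not given in the abstract.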
