Conference: IEEE Symposium Series on Computational Intelligence

A Cost-Sensitive Centroid-based Differential Evolution Classification Algorithm applied to Cancer Data Sets



Abstract

Data collected or generated for real-life applications, such as those in the medical domain and intrusion detection, are typically imbalanced. In an imbalanced data set, one class label (the minority class) has significantly fewer instances than the other class labels. Misclassifying the minority class can be costly in some circumstances, so extracting valuable information from this kind of data poses a challenge to the scientific community. In recent decades, researchers have proposed a centroid-based classification algorithm using differential evolution (CDE) to solve data classification problems. However, CDE performs poorly, especially when applied to imbalanced binary data sets. In this paper, we propose a cost-sensitive version of CDE based on a new objective function to overcome this drawback. We use four imbalanced cancer data sets: Breast, Lung, Uterus, and Stomach. We analyze the performance of the proposed version of CDE in predicting the survivability of cancer patients and compare it with the current variants of CDE and with five cost-sensitive machine learning algorithms. The experimental results demonstrate that the proposed version improves the performance of CDE on imbalanced binary data sets and outperforms the current variants of CDE on all data sets in terms of Area Under Curve (AUC) and G-mean.
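
The abstract does not state the new objective function, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a candidate solution is one centroid per class, that instances are assigned to the nearest centroid by Euclidean distance, and that a hypothetical per-class cost table (`costs`) penalizes minority-class errors more heavily. The function names, the toy data, and the cost values are all illustrative assumptions. G-mean is computed in its standard binary form, the square root of sensitivity times specificity.

```python
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def nearest_centroid(x: Point, centroids: Dict[int, Point]) -> int:
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


def cost_weighted_error(X: List[Point], y: List[int],
                        centroids: Dict[int, Point],
                        costs: Dict[int, float]) -> float:
    """Hypothetical cost-sensitive objective (an assumption, not the paper's):
    the average cost of misclassified instances, where costs[label] is the
    penalty for misclassifying an instance whose true class is `label`."""
    total = sum(costs[t] for x, t in zip(X, y)
                if nearest_centroid(x, centroids) != t)
    return total / len(y)


def g_mean(y_true: List[int], y_pred: List[int], positive: int = 1) -> float:
    """Binary G-mean: sqrt(sensitivity * specificity)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))


# Toy imbalanced data: four majority (0) and two minority (1) instances.
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (1.0, 1.0), (0.4, 0.4)]
y = [0, 0, 0, 0, 1, 1]
costs = {0: 1.0, 1: 5.0}  # illustrative: a minority error costs five times more

# Two candidate centroid sets, as an evolutionary search might propose.
candidate_a = {0: (0.15, 0.15), 1: (1.0, 1.0)}
candidate_b = {0: (0.1, 0.1), 1: (0.6, 0.6)}

pred_a = [nearest_centroid(x, candidate_a) for x in X]
pred_b = [nearest_centroid(x, candidate_b) for x in X]
```

On this toy data, `candidate_a` misses one of the two minority instances, so its cost-weighted error is higher and its G-mean lower than those of `candidate_b`, which classifies every instance correctly. A differential evolution loop would evolve such centroid vectors to minimize the objective; that loop is omitted here.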
