Engineering Applications of Artificial Intelligence

Improvement of Bagging performance for classification of imbalanced datasets using evolutionary multi-objective optimization



Abstract

Classification of imbalanced datasets, in which the samples belonging to one class far outnumber the samples of the other classes, has received much attention owing to its many real-world applications. The Bagging ensemble method, one of the most popular ensemble learning algorithms, can perform better on imbalanced problems when it is combined with undersampling. In Bagging, the diversity of the classifiers, the performance of the classifiers, an appropriate number of bags (classifiers), and balanced training sets for training the classifiers are the key factors for handling imbalanced problems successfully. In this paper, inspired by evolutionary undersampling (a recent undersampling method that searches for subsets of the majority-class samples) and taking these factors into account, i.e., diversity, classifier performance, number of classifiers, and balanced training sets, a multi-objective optimization undersampling method is proposed. The proposed method uses multi-objective evolutionary optimization to produce a set of diverse, well-performing and (near-)balanced bags. Accordingly, it makes it possible to generate diverse and well-performing classifiers and to determine the number of classifiers in the Bagging algorithm. Moreover, two different strategies are employed in the proposed method to improve diversity. To confirm the efficiency of the proposed method, its performance is measured on 33 imbalanced datasets using AUC and compared with 6 well-known ensemble learning algorithms. Non-parametric statistical analysis of these comparisons also demonstrates the superiority of the proposed method over the other techniques.
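The abstract describes the approach only at a high level. The minimal Python sketch below illustrates the general idea under stated assumptions: candidate bags are binary masks over the majority-class samples, each scored on classifier performance (AUC), how close the bag is to balanced, and how much it differs from already-kept bags. It is not the authors' algorithm; the helper names (make_imbalanced_data, evaluate_bag), the random-search mutation, and the weighted-sum scoring used in place of true Pareto ranking are illustrative assumptions only.

```python
# Hypothetical sketch of evolutionary multi-objective undersampling for Bagging.
# NOT the paper's algorithm: objectives and the simple search loop are assumed.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_imbalanced_data(n_maj=300, n_min=40, d=5):
    """Generate a toy imbalanced binary dataset (class 0 = majority)."""
    X_maj = rng.normal(0.0, 1.0, size=(n_maj, d))
    X_min = rng.normal(1.0, 1.0, size=(n_min, d))
    X = np.vstack([X_maj, X_min])
    y = np.hstack([np.zeros(n_maj), np.ones(n_min)])
    return X, y

def evaluate_bag(mask, X, y, X_val, y_val, kept_masks):
    """Return (performance, balance, diversity) objectives and the trained classifier."""
    maj_idx = np.flatnonzero(y == 0)[mask.astype(bool)]   # selected majority samples
    min_idx = np.flatnonzero(y == 1)                      # all minority samples
    idx = np.concatenate([maj_idx, min_idx])
    clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[idx], y[idx])
    perf = roc_auc_score(y_val, clf.predict_proba(X_val)[:, 1])        # classifier AUC
    balance = 1.0 - abs(mask.sum() - len(min_idx)) / len(mask)         # near 1 when balanced
    if kept_masks:                                                     # mean Hamming distance
        diversity = np.mean([np.mean(mask != m) for m in kept_masks])  # to already-kept bags
    else:
        diversity = 1.0
    return perf, balance, diversity, clf

X, y = make_imbalanced_data()
X_val, y_val = make_imbalanced_data(n_maj=150, n_min=20)
n_maj = int((y == 0).sum())

# Grow the ensemble bag by bag: for each bag, randomly mutate candidate masks and
# keep the one with the best trade-off (weighted sum stands in for Pareto ranking).
kept_masks, ensemble = [], []
for _ in range(5):
    best = None
    for _ in range(30):
        mask = (rng.random(n_maj) < 0.15).astype(int)     # subset of majority class
        perf, bal, div, clf = evaluate_bag(mask, X, y, X_val, y_val, kept_masks)
        score = perf + bal + div
        if best is None or score > best[0]:
            best = (score, mask, clf)
    kept_masks.append(best[1])
    ensemble.append(best[2])

# Bagging-style prediction: average the member probabilities.
proba = np.mean([clf.predict_proba(X_val)[:, 1] for clf in ensemble], axis=0)
print("ensemble AUC:", roc_auc_score(y_val, proba))
```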


