Journal: Pattern Recognition: The Journal of the Pattern Recognition Society

EUSBoost: Enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling

Abstract

Classification with imbalanced data-sets has become one of the most challenging problems in data mining. When one class is far more heavily represented than the other, both the learning and classification processes suffer undesirable effects, mainly with regard to the minority class. Tackling this problem requires accurate tools, and ensembles of classifiers have recently emerged as a possible solution. Among ensemble proposals, combining Bagging and Boosting with preprocessing techniques has proved able to enhance the classification of the minority class. In this paper, we develop a new ensemble construction algorithm (EUSBoost) based on RUSBoost, one of the simplest and most accurate ensembles, which combines random undersampling with the Boosting algorithm. Our methodology aims to improve on existing proposals by enhancing the performance of the base classifiers through the evolutionary undersampling approach. In addition, we promote diversity by favoring the use of different subsets of majority class instances to train each base classifier. Focusing on two-class highly imbalanced problems, we show, supported by appropriate statistical analysis, that EUSBoost is able to outperform state-of-the-art ensemble-based methods. We also analyze its advantages using kappa-error diagrams, which we adapt to the imbalanced scenario.
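
The random undersampling step that the abstract attributes to RUSBoost can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: it covers neither the Boosting weight updates nor the evolutionary selection that EUSBoost substitutes for random sampling. The function names, the 1:1 class ratio, and the toy labels are all hypothetical choices made for illustration.

```python
import random


def random_undersample(labels, minority_label, rng):
    """Return sorted indices of a balanced subset of the data.

    Keeps every minority instance and draws an equally sized random
    sample of majority instances (the 1:1 ratio is an assumption).
    """
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    k = min(len(minority), len(majority))
    kept_majority = rng.sample(majority, k)  # sampling without replacement
    return sorted(minority + kept_majority)


def subsets_for_ensemble(labels, minority_label, n_rounds, seed=0):
    """Draw one undersampled subset per base classifier.

    Each round draws a fresh majority sample, so the base classifiers
    tend to see different majority class instances.
    """
    rng = random.Random(seed)
    return [random_undersample(labels, minority_label, rng)
            for _ in range(n_rounds)]


# Toy, highly imbalanced label vector: 5 minority vs. 95 majority instances.
labels = [1] * 5 + [0] * 95
subsets = subsets_for_ensemble(labels, minority_label=1, n_rounds=3)
for round_index, subset in enumerate(subsets):
    print(round_index, subset)
```

In a full ensemble, each index subset would be used to train one base classifier. Drawing a fresh majority sample per round is one simple way to obtain the different majority class subsets the abstract mentions.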