Journal: Neurocomputing

A study on combining dynamic selection and data preprocessing for imbalance learning

Abstract

In practice, a classifier may have to learn from a dataset in which one class has far more instances than the others. Such imbalanced datasets require special attention because traditional classifiers generally favor the majority class. Ensemble classifiers have been reported to yield promising results in such cases. Most often, ensembles are designed around data-level preprocessing techniques that aim to balance class proportions by applying under-sampling and/or over-sampling. Most available studies concentrate on static ensembles designed for different preprocessing techniques. In contrast, dynamic ensembles have become popular thanks to their performance on ill-defined problems (small datasets). A dynamic ensemble includes a dynamic selection module that chooses the best ensemble for each test instance. This paper experimentally evaluates the claim that dynamic selection combined with a preprocessing technique can achieve higher performance than a static ensemble on imbalanced classification problems. For this evaluation, we collect 84 two-class and 26 multi-class datasets with varying degrees of class imbalance. We consider five variations of preprocessing methods and four dynamic selection methods, and we design an experimental framework that integrates preprocessing and dynamic selection. Our experiments show that the dynamic ensemble improves the F-measure and the G-mean compared with the static ensemble. Moreover, across different levels of imbalance, dynamic selection methods secure higher ranks than other alternatives. (c) 2018 Elsevier B.V. All rights reserved.
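
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' framework. It assumes random over-sampling as the preprocessing step, a bagged pool of toy nearest-centroid classifiers, and a local-accuracy rule that picks one classifier per test instance. All function names, the synthetic data, and the choice of selection rule are illustrative assumptions. The abstract does not name the five preprocessing variants or the four dynamic selection methods, so none of them is reproduced here. The selection set is left unbalanced for simplicity.

```python
import math
import random


def f_measure(y_true, y_pred, positive=1):
    """F-measure (F1) of the positive class, here the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls."""
    recalls = []
    for c in set(y_true):
        preds_for_c = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in preds_for_c) / len(preds_for_c))
    return math.prod(recalls) ** (1 / len(recalls))


def random_oversample(X, y, rng):
    """Duplicate random rows of smaller classes until every class matches the largest."""
    rows = {}
    for xi, yi in zip(X, y):
        rows.setdefault(yi, []).append(xi)
    target = max(len(r) for r in rows.values())
    Xb, yb = [], []
    for c, r in rows.items():
        Xb.extend(r + [rng.choice(r) for _ in range(target - len(r))])
        yb.extend([c] * target)
    return Xb, yb


def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_centroids(X, y):
    """Toy base classifier (an assumption): one centroid per class."""
    rows = {}
    for xi, yi in zip(X, y):
        rows.setdefault(yi, []).append(xi)
    return {c: [sum(col) / len(r) for col in zip(*r)] for c, r in rows.items()}


def predict(model, x):
    """Label of the nearest class centroid."""
    return min(model, key=lambda c: dist2(model[c], x))


def build_pool(X, y, n_models, rng):
    """Bagging pool; each bootstrap sample is balanced before training."""
    pool = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        Xb, yb = random_oversample([X[i] for i in idx], [y[i] for i in idx], rng)
        pool.append(fit_centroids(Xb, yb))
    return pool


def dynamic_predict(pool, X_dsel, y_dsel, x, k=5):
    """Use the model that is most accurate on the k selection-set points nearest to x."""
    region = sorted(range(len(X_dsel)), key=lambda i: dist2(X_dsel[i], x))[:k]

    def local_hits(model):
        return sum(predict(model, X_dsel[i]) == y_dsel[i] for i in region)

    return predict(max(pool, key=local_hits), x)


rng = random.Random(0)
# Synthetic 2-D data (hypothetical): 200 majority points (label 0), 20 minority (label 1).
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
X += [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(20)]
y = [0] * 200 + [1] * 20
order = list(range(len(X)))
rng.shuffle(order)
train, dsel, test = order[:110], order[110:165], order[165:]

pool = build_pool([X[i] for i in train], [y[i] for i in train], 10, rng)
X_dsel, y_dsel = [X[i] for i in dsel], [y[i] for i in dsel]
y_test = [y[i] for i in test]
y_pred = [dynamic_predict(pool, X_dsel, y_dsel, X[i]) for i in test]
print("F-measure:", round(f_measure(y_test, y_pred), 3))
print("G-mean:", round(g_mean(y_test, y_pred), 3))
```

A static ensemble would instead combine all pool members for every test instance, for example by majority vote. Replacing `dynamic_predict` with such a vote gives a baseline for comparing the two approaches on the same data.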
