Journal: IEEE Transactions on Knowledge and Data Engineering

Ensemble of Classifiers Based on Multiobjective Genetic Sampling for Imbalanced Data

Abstract

Imbalanced datasets may negatively impact the predictive performance of most classical classification algorithms. This problem, commonly found in real-world applications, is known in the machine learning domain as imbalanced learning. Most techniques for dealing with imbalanced learning have been proposed for, and applied only to, binary classification. When applied to multiclass tasks, their efficiency usually decreases and negative side effects may appear. This paper addresses these limitations by presenting a novel adaptive approach, E-MOSAIC (Ensemble of Classifiers based on MultiObjective Genetic Sampling for Imbalanced Classification). E-MOSAIC evolves a selection of samples extracted from the training dataset, which are treated as individuals of a multiobjective evolutionary algorithm (MOEA). The multiobjective process looks for the best combinations of instances capable of producing classifiers with high predictive accuracy in all classes. E-MOSAIC also incorporates two mechanisms to promote the diversity of these classifiers, which are combined into an ensemble specifically designed for imbalanced learning. Experiments were carried out using twenty imbalanced multiclass datasets. In these experiments, the predictive performance of E-MOSAIC is compared with state-of-the-art methods, including methods based on presampling, active learning, cost-sensitive learning, and boosting. According to the experimental results, the proposed method obtained the best predictive performance for the multiclass accuracy measures mAUC and G-mean.
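The abstract does not define G-mean or spell out the objectives of the evolutionary search, so the following is only an illustrative sketch under stated assumptions: G-mean is taken to be the geometric mean of the per-class recalls (a common multiclass definition), and each per-class recall is treated as one objective to maximize. The function names (`per_class_recall`, `g_mean`, `dominates`) and the toy labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted instances / instances of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in sorted(totals)}


def g_mean(y_true, y_pred):
    """Assumed definition: geometric mean of the per-class recalls."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return math.prod(recalls) ** (1.0 / len(recalls))


def dominates(a, b):
    """Pareto dominance (maximization): a is no worse in every objective
    and strictly better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


# Toy imbalanced problem: six instances of class 0, two each of classes 1 and 2.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
pred_a = [0, 0, 0, 0, 0, 0, 0, 0, 2, 2]  # never predicts minority class 1
pred_b = [0, 0, 0, 0, 1, 2, 1, 1, 2, 2]  # two majority errors, minorities all correct

# Both predictions get 8 of 10 instances right, yet their G-means differ sharply.
gm_a = g_mean(y_true, pred_a)
gm_b = g_mean(y_true, pred_b)

# Objective vectors (one recall per class) for the two hypothetical classifiers.
obj_a = tuple(per_class_recall(y_true, pred_a).values())
obj_b = tuple(per_class_recall(y_true, pred_b).values())
```

In this toy case the two predictions have the same overall accuracy, but the one that ignores a minority class gets a G-mean of zero. Neither objective vector dominates the other, which is the kind of trade-off a multiobjective search keeps on its Pareto front instead of collapsing into a single score.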