International Journal of Neural Systems

A Pareto-based Ensemble with Feature and Instance Selection for Learning from Multi-Class Imbalanced Datasets

Abstract

Imbalanced classification concerns problems in which instances are unevenly distributed among classes. When instances also fall in regions where classes overlap, modeling the problem correctly becomes harder still. Current solutions to both issues often focus on the binary case, because multi-class datasets require additional effort. In this research, we address these problems by combining feature selection and instance selection. Feature selection simplifies the overlapping areas, easing the generation of rules that distinguish among the classes. Selecting instances from all classes addresses the imbalance itself by finding the most appropriate class distribution for the learning task, and may also remove noise and difficult borderline examples. To obtain an optimal joint set of features and instances, we embed the search for both in a Multi-Objective Evolutionary Algorithm, using the C4.5 decision tree as the baseline classifier in this wrapper approach. The multi-objective scheme offers a double advantage: the search space becomes broader, and the resulting set of different solutions can be used to build an ensemble of classifiers. The proposal has been compared against several state-of-the-art solutions for imbalanced classification, showing excellent results on both binary and multi-class problems.
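The joint search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the chromosome encoding, the objectives, the evolutionary algorithm, or the ensemble combination rule, so the following are all assumptions made for illustration: a single binary mask covering features and then instances, objectives that are all maximized, and plurality voting among ensemble members. The helper names (`decode`, `dominates`, `pareto_front`, `majority_vote`) are hypothetical.

```python
from typing import List, Sequence, Tuple


def decode(chromosome: Sequence[int], n_features: int) -> Tuple[List[int], List[int]]:
    """Split one binary chromosome into selected feature and instance indices.

    Assumed layout: the first n_features bits mark features, the rest mark instances.
    """
    feature_bits = chromosome[:n_features]
    instance_bits = chromosome[n_features:]
    features = [i for i, bit in enumerate(feature_bits) if bit]
    instances = [i for i, bit in enumerate(instance_bits) if bit]
    return features, instances


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one.

    Assumes every objective is to be maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Sequence[Sequence[float]]) -> List[int]:
    """Return indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine predictions (one row per ensemble member) by plurality vote.

    Ties go to the smallest label; this combination rule is an assumption.
    """
    combined = []
    for votes in zip(*predictions):
        combined.append(max(sorted(set(votes)), key=votes.count))
    return combined
```

In a wrapper approach, each chromosome would be decoded, a classifier such as a C4.5-style decision tree would be trained on the selected instances and features, and its validation scores would become the objective vector passed to `pareto_front`. The classifiers trained from the non-dominated chromosomes would then form the ensemble.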