Journal: Information Fusion

Overfitting cautious selection of classifier ensembles with genetic algorithms

Abstract

Information fusion research has recently focused on the characteristics of the decision profiles of ensemble members in order to optimize performance. These characteristics are particularly important in the selection of ensemble members. However, although controlling overfitting is a recognized challenge in machine learning, much less work has been devoted to controlling overfitting in selection tasks. The objectives of this paper are: (1) to show that overfitting can be detected at the selection stage; and (2) to present strategies to control overfitting. Decision trees and k-nearest-neighbor classifiers are used to create homogeneous ensembles, while single- and multi-objective genetic algorithms are employed as search algorithms at the selection stage. Bagging and random subspace methods are used for ensemble generation. The classification error rate and a set of diversity measures are applied as search criteria. We show experimentally that the selection of classifier ensembles conducted by genetic algorithms is prone to overfitting, especially in the multi-objective case. The partial validation, backwarding and global validation strategies are tailored to the classifier ensemble selection problem and compared. This comparison shows that a global validation strategy should be applied to control overfitting in pattern recognition systems involving an ensemble member selection task. Furthermore, the study establishes that the global validation strategy can be used to measure the relationship between diversity and classification performance when diversity measures are employed as single-objective functions.
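The abstract does not spell out how the validation strategies work, so the following is only a minimal sketch of the general idea behind global validation, under stated assumptions: a genetic algorithm searches over bit masks of a classifier pool using the error on an optimization set, while every candidate of every generation is also scored on a separate validation set and the best validated candidate is kept in an archive. The function names (`make_pool`, `ga_select`, `majority_vote_error`), the synthetic noisy classifiers standing in for a bagging or random subspace pool, the binary labels, and the simple single-objective GA are all hypothetical choices for illustration, not the authors' implementation.

```python
import random


def majority_vote_error(mask, preds, labels):
    """Error rate of the majority vote of the classifiers switched on in mask."""
    members = [p for bit, p in zip(mask, preds) if bit]
    if not members:
        return 1.0  # assumption: an empty ensemble counts as always wrong
    errors = 0
    for i, y in enumerate(labels):
        votes_for_one = sum(p[i] for p in members)  # labels are 0 or 1
        predicted = 1 if 2 * votes_for_one > len(members) else 0  # ties go to 0
        errors += int(predicted != y)
    return errors / len(labels)


def make_pool(n_classifiers, labels, accuracy, rng):
    """Hypothetical stand-in for a trained pool: each synthetic classifier
    outputs the true label with probability `accuracy` on every data set."""
    pool = {name: [] for name in labels}
    for _ in range(n_classifiers):
        for name, ys in labels.items():
            pool[name].append([y if rng.random() < accuracy else 1 - y for y in ys])
    return pool


def ga_select(pool, labels, generations=15, pop_size=20, seed=0):
    """Single-objective GA over ensemble masks with a validation archive."""
    rng = random.Random(seed)
    n = len(pool["opt"])
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    archived, archived_err = None, float("inf")
    for gen in range(generations):
        # Fitness used by the search: error on the optimization set only.
        ranked = sorted(
            population,
            key=lambda m: majority_vote_error(m, pool["opt"], labels["opt"]),
        )
        # Validation step: score every candidate of this generation on the
        # validation set and archive the best one seen so far.
        for mask in ranked:
            err = majority_vote_error(mask, pool["val"], labels["val"])
            if err < archived_err:
                archived, archived_err = list(mask), err
        if gen == generations - 1:
            break
        # Keep the better half, refill with one-point crossover plus one bit flip.
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n)
            child[flip] = 1 - child[flip]
            children.append(child)
        population = parents + children
    # ranked[0] is what a search without validation would return.
    return ranked[0], archived, archived_err


rng = random.Random(42)
labels = {name: [rng.randint(0, 1) for _ in range(60)] for name in ("opt", "val")}
pool = make_pool(15, labels, accuracy=0.65, rng=rng)
best_opt, archived, archived_err = ga_select(pool, labels)
print("validation error, best on optimization set:",
      majority_vote_error(best_opt, pool["val"], labels["val"]))
print("validation error, archived solution:", archived_err)
```

Because the final population is also validated, the archived solution in this sketch can never have a higher validation error than the candidate that is best on the optimization set. Its error on the validation set is itself optimistically biased, so an independent test set would still be needed for an unbiased estimate.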
