
A statistical approach to improving accuracy in classifier ensembles.


Abstract

Popular ensemble classifier induction algorithms, such as bagging and boosting, construct the ensemble by optimizing component classifiers in isolation. The controllable degrees of freedom in an ensemble include the instance selection and feature selection for each component classifier. Because these degrees of freedom are uncoupled, the component classifiers are not built to optimize ensemble performance; rather, they are constructed by minimizing individual training loss. Recent work in the ensemble literature contradicts the notion that combining the best individually performing classifiers yields lower ensemble error rates. Zenobi et al. demonstrated that ensemble construction should consider a classifier's contribution to ensemble accuracy and diversity, even at the expense of individual classifier performance. To trade off individual accuracy against ensemble accuracy and diversity, a component classifier inducer requires knowledge of the choices made by the other ensemble members.

We introduce an approach, called DiSCO, that exercises direct control over the tradeoff between diversity and error by sharing ensemble-wide information on instance selection during training. A classifier's contribution to ensemble accuracy and diversity can be measured as it is constructed in isolation, but unless information is shared among its peers during training, nothing can be done to control that contribution. In this work, we explore a method for training the component classifiers collectively by sharing information about training-set selection. This allows our algorithm to build ensembles whose component classifiers select complementary error distributions that maximize diversity while directly minimizing ensemble error. Treating ensemble construction as an optimization problem, we explore approaches using local search, global search, and stochastic methods.

Using this approach, we improve ensemble classifier accuracy over bagging and boosting on a variety of data sets, particularly those whose classes are moderately overlapping. How to use diversity to build effective classifier teams remains an open question in ensemble classification research. We also provide a method that uses entropy as a measure of diversity to train an ensemble classifier.
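The abstract's closing sentence proposes entropy as a diversity measure. As a concrete illustration, here is a minimal Python sketch of one standard entropy-style measure (the Shannon entropy of each instance's class-vote distribution, averaged over instances); the name `vote_entropy` and this exact formulation are assumptions for illustration, not necessarily the dissertation's definition.

```python
import numpy as np

def vote_entropy(predictions, n_classes):
    """Average Shannon entropy of the per-instance class-vote distribution.

    predictions: (L, N) integer array -- one row per component classifier,
    one column per instance. Higher values mean the members disagree more,
    i.e. the ensemble is more diverse.
    """
    L, N = predictions.shape
    total = 0.0
    for j in range(N):
        counts = np.bincount(predictions[:, j], minlength=n_classes)
        p = counts / L          # empirical vote distribution for instance j
        p = p[p > 0]            # drop zero entries to avoid log(0)
        total -= np.sum(p * np.log2(p))
    return total / N

# Three classifiers, four instances, two classes:
preds = np.array([[0, 1, 0, 1],
                  [0, 1, 1, 0],
                  [0, 0, 1, 1]])
print(vote_entropy(preds, n_classes=2))   # ~0.69 bits on average
```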
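The abstract also frames ensemble construction as an optimization over per-member instance selections, with local search among the methods explored. The sketch below shows what such a hill-climbing search could look like under stated assumptions: the objective (a weighted tradeoff between ensemble error and vote-entropy diversity), the helper functions, and parameters such as `alpha` and `n_members` are all illustrative, not DiSCO itself.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def majority_error(members, X, y):
    """0-1 error of the ensemble's majority vote (assumes integer labels 0..k-1)."""
    votes = np.stack([m.predict(X) for m in members]).astype(int)
    maj = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
    return float(np.mean(maj != y))

def mean_vote_entropy(members, X, n_classes):
    """Diversity: average Shannon entropy of the per-instance vote distribution."""
    votes = np.stack([m.predict(X) for m in members]).astype(int)
    ent = 0.0
    for col in votes.T:
        p = np.bincount(col, minlength=n_classes) / len(members)
        p = p[p > 0]
        ent -= np.sum(p * np.log2(p))
    return ent / votes.shape[1]

def local_search_ensemble(X, y, n_members=5, iters=200, alpha=0.5, seed=0):
    """Hill-climb over per-member instance-selection masks.

    Assumed objective (illustration only):
        alpha * ensemble_error - (1 - alpha) * diversity,
    so improving moves either lower ensemble error or raise diversity.
    """
    rng = np.random.default_rng(seed)
    n, k = len(y), len(np.unique(y))
    masks = [rng.random(n) < 0.63 for _ in range(n_members)]  # bootstrap-like start

    def fit_all():
        # Refitting every member each move keeps the sketch simple; a real
        # implementation would refit only the member whose mask changed.
        return [DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[m], y[m])
                for m in masks]

    def score(members):
        return alpha * majority_error(members, X, y) \
             - (1 - alpha) * mean_vote_entropy(members, X, k)

    members = fit_all()
    best = score(members)
    for _ in range(iters):
        i, j = rng.integers(n_members), rng.integers(n)
        masks[i][j] = not masks[i][j]        # flip one instance in/out
        if masks[i].sum() < 2:               # keep each member trainable
            masks[i][j] = not masks[i][j]
            continue
        candidate = fit_all()
        s = score(candidate)
        if s < best:                         # accept improving moves only
            best, members = s, candidate
        else:
            masks[i][j] = not masks[i][j]    # revert the flip
    return members
```

With `alpha = 1` this degenerates into pure error minimization; lowering `alpha` pushes the search toward the complementary error distributions the abstract describes.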

Bibliographic details

  • Author: Holness, Gary F.
  • Affiliation: University of Massachusetts Amherst.
  • Degree grantor: University of Massachusetts Amherst.
  • Subject: Artificial Intelligence; Computer Science.
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 274 p.
  • Total pages: 274
  • Format: PDF
  • Language: English
  • Chinese Library Classification: Theory of artificial intelligence; Automation and computer technology

