
A statistical approach to improving accuracy in classifier ensembles.


Abstract

Popular ensemble classifier induction algorithms, such as bagging and boosting, construct the ensemble by optimizing component classifiers in isolation. The controllable degrees of freedom in an ensemble include the instance selection and feature selection for each component classifier. Because these degrees of freedom are uncoupled, the component classifiers are not built to optimize ensemble performance; rather, they are constructed by minimizing individual training loss. Recent work in the ensemble literature contradicts the notion that combining the best individually performing classifiers yields lower ensemble error rates. Zenobi et al. demonstrated that ensemble construction should consider a classifier's contribution to ensemble accuracy and diversity, even at the expense of individual classifier performance. To trade off individual accuracy against ensemble accuracy and diversity, a component classifier inducer requires knowledge of the choices made by the other ensemble members.

We introduce an approach, called DiSCO, that exercises direct control over the tradeoff between diversity and error by sharing ensemble-wide information on instance selection during training. A classifier's contribution to ensemble accuracy and diversity can be measured as it is constructed in isolation, but unless information is shared among its peers during training, nothing can be done to control that contribution. In this work, we explore a method for training the component classifiers collectively by sharing information about training-set selection. This allows our algorithm to build ensembles whose component classifiers select complementary error distributions that maximize diversity while directly minimizing ensemble error. Treating ensemble construction as an optimization problem, we explore approaches using local search, global search, and stochastic methods.

Using this approach, we improve ensemble classifier accuracy over bagging and boosting on a variety of data sets, particularly those whose classes are moderately overlapping. How to use diversity to build effective classifier teams remains an open question in ensemble classification research. We also provide a method that uses entropy as a measure of diversity to train an ensemble classifier.
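The abstract's closing sentence proposes entropy as a diversity measure. As a concrete illustration, here is a minimal Python sketch of one standard entropy-style measure (the Shannon entropy of each instance's class-vote distribution, averaged over instances); the name `vote_entropy` and this exact formulation are assumptions for illustration, not necessarily the dissertation's definition.

```python
import numpy as np

def vote_entropy(predictions, n_classes):
    """Average Shannon entropy of the per-instance class-vote distribution.

    predictions: (L, N) integer array -- one row per component classifier,
    one column per instance. Higher values mean the members disagree more,
    i.e. the ensemble is more diverse.
    """
    L, N = predictions.shape
    total = 0.0
    for j in range(N):
        counts = np.bincount(predictions[:, j], minlength=n_classes)
        p = counts / L          # empirical vote distribution for instance j
        p = p[p > 0]            # drop zero entries to avoid log(0)
        total -= np.sum(p * np.log2(p))
    return total / N

# Three classifiers, four instances, two classes:
preds = np.array([[0, 1, 0, 1],
                  [0, 1, 1, 0],
                  [0, 0, 1, 1]])
print(vote_entropy(preds, n_classes=2))   # ~0.69 bits on average
```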
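The abstract also frames ensemble construction as an optimization over per-member instance selections, with local search among the methods explored. The sketch below shows what such a hill-climbing search could look like under stated assumptions: the objective (a weighted tradeoff between ensemble error and vote-entropy diversity), the helper functions, and parameters such as `alpha` and `n_members` are all illustrative, not DiSCO itself.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def majority_error(members, X, y):
    """0-1 error of the ensemble's majority vote (assumes integer labels 0..k-1)."""
    votes = np.stack([m.predict(X) for m in members]).astype(int)
    maj = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
    return float(np.mean(maj != y))

def mean_vote_entropy(members, X, n_classes):
    """Diversity: average Shannon entropy of the per-instance vote distribution."""
    votes = np.stack([m.predict(X) for m in members]).astype(int)
    ent = 0.0
    for col in votes.T:
        p = np.bincount(col, minlength=n_classes) / len(members)
        p = p[p > 0]
        ent -= np.sum(p * np.log2(p))
    return ent / votes.shape[1]

def local_search_ensemble(X, y, n_members=5, iters=200, alpha=0.5, seed=0):
    """Hill-climb over per-member instance-selection masks.

    Assumed objective (illustration only):
        alpha * ensemble_error - (1 - alpha) * diversity,
    so improving moves either lower ensemble error or raise diversity.
    """
    rng = np.random.default_rng(seed)
    n, k = len(y), len(np.unique(y))
    masks = [rng.random(n) < 0.63 for _ in range(n_members)]  # bootstrap-like start

    def fit_all():
        # Refitting every member each move keeps the sketch simple; a real
        # implementation would refit only the member whose mask changed.
        return [DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[m], y[m])
                for m in masks]

    def score(members):
        return alpha * majority_error(members, X, y) \
             - (1 - alpha) * mean_vote_entropy(members, X, k)

    members = fit_all()
    best = score(members)
    for _ in range(iters):
        i, j = rng.integers(n_members), rng.integers(n)
        masks[i][j] = not masks[i][j]        # flip one instance in/out
        if masks[i].sum() < 2:               # keep each member trainable
            masks[i][j] = not masks[i][j]
            continue
        candidate = fit_all()
        s = score(candidate)
        if s < best:                         # accept improving moves only
            best, members = s, candidate
        else:
            masks[i][j] = not masks[i][j]    # revert the flip
    return members
```

With `alpha = 1` this degenerates into pure error minimization; lowering `alpha` pushes the search toward the complementary error distributions the abstract describes.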

Bibliographic details

  • Author: Holness, Gary F.
  • Affiliation: University of Massachusetts Amherst.
  • Degree grantor: University of Massachusetts Amherst.
  • Subject: Artificial Intelligence; Computer Science.
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 274 p.
  • Total pages: 274
  • Format: PDF
  • Language: English
  • Chinese Library Classification: Theory of artificial intelligence; Automation and computer technology

