
Stacking Bagged and Boosted Forests for Effective Automated Classification



Abstract

Random Forest (RF) is one of the most successful strategies for automated classification tasks. Motivated by the RF success, recently proposed RF-based classification approaches leverage the central RF idea of aggregating a large number of low-correlated trees, which are inherently parallelizable and provide exceptional generalization capabilities. In this context, this work brings several new contributions to this line of research. First, we propose a new RF-based strategy (BERT) that applies the boosting technique in bags of extremely randomized trees. Second, we empirically demonstrate that this new strategy, as well as the recently proposed BROOF and LazyNN_RF classifiers, do complement each other, motivating us to stack them to produce an even more effective classifier. To our knowledge, this is the first strategy to effectively combine the three main ensemble strategies: stacking, bagging (the cornerstone of RFs) and boosting. Finally, we exploit the efficient and unbiased stacking strategy based on out-of-bag (OOB) samples to considerably speed up the very costly training process of the stacking procedure. Our experiments on several datasets covering two high-dimensional and noisy domains of topic and sentiment classification provide strong evidence in favor of the benefits of our RF-based solutions.
We show that BERT is among the top performers in the vast majority of analyzed cases, while retaining the unique benefits of RF classifiers (explainability, parallelization, ease of parameterization). We also show that stacking only the recently proposed RF-based classifiers and BERT using our OOB-based strategy is not only significantly faster than recently proposed stacking strategies (up to six times) but also much more effective, with gains of up to 21% and 17% in MacroF1 and MicroF1, respectively, over the best base method, and of 5% and 6% over a stacking of traditional methods, performing no worse than a complete stacking of methods at a much lower computational effort.
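The OOB-based stacking idea mentioned in the abstract can be sketched as follows: each base forest's out-of-bag predictions serve as (nearly) unbiased meta-features, so the stacker needs no extra cross-validation pass over the training data. The base learners and meta-learner here are generic stand-ins, not the BROOF/LazyNN_RF/BERT stack evaluated in the paper:

```python
# Minimal sketch of OOB-based stacking (base and meta learners are
# assumptions; the paper stacks its own RF-based classifiers).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=800, n_features=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Base forests; oob_score=True exposes out-of-bag class probabilities.
bases = [
    RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0),
    ExtraTreesClassifier(n_estimators=200, bootstrap=True, oob_score=True,
                         random_state=0),
]
for base in bases:
    base.fit(X_tr, y_tr)

# OOB predictions are made by trees that never saw the sample, so they
# can train the stacker directly -- no separate cross-validation fold loop.
meta_train = np.hstack([b.oob_decision_function_ for b in bases])
stacker = LogisticRegression().fit(meta_train, y_tr)

meta_test = np.hstack([b.predict_proba(X_te) for b in bases])
stack_accuracy = stacker.score(meta_test, y_te)
```

The speedup the abstract reports comes from exactly this shortcut: a conventional stacker must refit every base classifier across k folds to obtain unbiased meta-features, whereas the OOB predictions fall out of a single training pass per base forest.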

Bibliographic Details

  • Source
    ACM SIGIR FORUM | 2017, issue cd | pp. 105-114 | 10 pages
  • Author Affiliations

    Federal University of Minas Gerais Computer Science Department Av. Antonio Carlos 6627 - ICEx Belo Horizonte, Brazil

  • Indexing Information
  • Format: PDF
  • Language: eng
  • CLC Classification
  • Keywords

    Classification; Ensemble; Bagging; Boosting; Stacking;

