Journal: Information Systems

Improving random forests by neighborhood projection for effective text classification


Abstract

In this article, we propose a lazy version of the traditional random forest (RF) classifier, called LazyNN_RF, designed specifically for high-dimensional, noisy classification tasks. The LazyNN_RF "localized" training projection is composed of the training examples that best resemble the example to be classified, obtained through a nearest-neighborhood projection of the training set. This projection filters out irrelevant data and thereby avoids some of the drawbacks of traditional random forests, such as overfitting due to very complex trees, especially on high-dimensional noisy datasets. In sum, our main contributions are: (i) the proposal and implementation of a novel lazy learner, based on the random forest classifier and nearest-neighborhood projection of the training set, that excels in automatic text classification tasks; and (ii) a thorough and detailed experimental analysis that sheds light on the behavior, effectiveness and feasibility of our solution. Through an extensive experimental evaluation covering two text classification domains and a large set of baseline algorithms, we show that our approach is highly effective and feasible, making it a strong candidate for automatic text classification tasks when compared to state-of-the-art classifiers. © 2018 Elsevier Ltd. All rights reserved.
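
The lazy, neighborhood-projected training described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `lazy_nn_rf_predict`, the parameters `k`, `n_trees` and `seed`, and the choice of cosine distance are all assumptions made for illustration, built on scikit-learn's `NearestNeighbors` and `RandomForestClassifier`. For each example to classify, the sketch keeps only its `k` nearest training examples and fits a small forest on that local subset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import NearestNeighbors


def lazy_nn_rf_predict(X_train, y_train, X_test, k=30, n_trees=50, seed=0):
    """Classify each test row with a forest trained only on its k nearest training rows.

    Hypothetical helper: name, defaults and distance metric are assumptions,
    not taken from the paper.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    X_test = np.asarray(X_test, dtype=float)
    k = min(k, len(X_train))

    # Assumption: cosine distance, a common choice for text vectors.
    index = NearestNeighbors(n_neighbors=k, metric="cosine").fit(X_train)
    _, neighbor_ids = index.kneighbors(X_test)

    predictions = []
    for x, ids in zip(X_test, neighbor_ids):
        local_labels = y_train[ids]
        # If every neighbor shares one label, no forest is needed.
        if len(np.unique(local_labels)) == 1:
            predictions.append(local_labels[0])
            continue
        # Train a fresh forest on the projected (local) training set only.
        forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        forest.fit(X_train[ids], local_labels)
        predictions.append(forest.predict(x.reshape(1, -1))[0])
    return np.array(predictions)
```

Because a separate forest is fitted per test example, the cost is paid at prediction time rather than at training time, which is the usual trade-off of a lazy learner. In this sketch `k` controls how aggressively the training set is filtered.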
