Journal: Advances in Data Analysis and Classification

Ensemble of optimal trees, random forest and random projection ensemble classification

Abstract

The predictive performance of a random forest ensemble is strongly associated with the strength of the individual trees and their diversity. An ensemble of a small number of accurate and diverse trees will also reduce the computational burden, provided that prediction accuracy is not compromised. We investigate the idea of integrating trees that are both accurate and diverse. For this purpose, we use the out-of-bag observations from the training bootstrap samples as a validation sample to choose the best trees based on their individual performance, and then assess these trees for diversity using the Brier score on an independent validation sample. Starting from the best tree, a tree is selected for the final ensemble if its addition to the forest reduces the error of the trees that have already been added. Unlike random projection ensemble classification, our approach does not use an implicit dimension reduction for each tree. A total of 35 benchmark problems on classification and regression are used to assess the performance of the proposed method and to compare it with random forest, random projection ensemble, node harvest, support vector machine, kNN, and classification and regression tree. We compute unexplained variances or classification error rates for all the methods on the corresponding data sets. Our experiments reveal that the size of the ensemble is reduced significantly and that better results are obtained in most of the cases. Results of a simulation study are also given, in which four tree-style scenarios are considered to generate data sets with several structures.
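
The two-stage selection described in the abstract (rank trees by out-of-bag performance, then add a tree only if it lowers the ensemble's Brier score on a validation sample) might be sketched roughly as follows. This is an illustrative sketch for binary classification only, not the authors' implementation: the function names, the `top_m` cut-off, and the toy numbers are all assumptions introduced here, and each tree is represented simply by its out-of-bag error and its predicted probabilities on the validation sample.

```python
from typing import List, Sequence


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted P(y=1) and the 0/1 label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def ensemble_probs(tree_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-tree probabilities, observation by observation."""
    n_trees = len(tree_probs)
    return [sum(column) / n_trees for column in zip(*tree_probs)]


def select_trees(
    oob_errors: Sequence[float],
    val_probs: Sequence[Sequence[float]],
    val_labels: Sequence[int],
    top_m: int,
) -> List[int]:
    """Return the indices of the trees kept in the final ensemble.

    oob_errors[i]   : out-of-bag error of tree i (lower is better)
    val_probs[i][j] : tree i's predicted P(y=1) for validation observation j
    val_labels[j]   : 0/1 label of validation observation j
    top_m           : hypothetical cap on how many top-ranked trees to consider
    """
    # Stage 1: rank trees by individual out-of-bag error, keep the best top_m.
    ranked = sorted(range(len(oob_errors)), key=lambda i: oob_errors[i])[:top_m]

    # Stage 2: start from the best tree; keep a further tree only if adding it
    # lowers the Brier score of the ensemble selected so far.
    selected = [ranked[0]]
    best = brier_score(val_probs[ranked[0]], val_labels)
    for i in ranked[1:]:
        candidate = selected + [i]
        score = brier_score(
            ensemble_probs([val_probs[k] for k in candidate]), val_labels
        )
        if score < best:
            selected, best = candidate, score
    return selected


# Toy data (made up for illustration): four trees, four validation observations.
toy_labels = [1, 0, 1, 0]
toy_oob = [0.10, 0.12, 0.15, 0.40]
toy_probs = [
    [0.9, 0.2, 0.6, 0.4],  # best tree by out-of-bag error
    [0.9, 0.2, 0.6, 0.4],  # identical to tree 0, so it adds no diversity
    [0.6, 0.4, 0.9, 0.1],  # errs on different observations than tree 0
    [0.3, 0.8, 0.4, 0.7],  # weak tree, excluded by the top_m cut-off
]
print(select_trees(toy_oob, toy_probs, toy_labels, top_m=3))  # [0, 2]
```

In the toy run, the duplicate tree is rejected because averaging it in leaves the Brier score unchanged, while the tree that errs on different observations is kept. The strict `<` comparison is a design choice of this sketch; the abstract only states that a tree is added when it reduces the error.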
