How Many Trees in a Random Forest?

Abstract

Random Forest is a computationally efficient technique that can operate quickly over large datasets. It has been used in many recent research projects and real-world applications in diverse domains. However, the associated literature provides almost no guidance on how many trees should be used to compose a Random Forest. The research reported here analyzes whether there is an optimal number of trees within a Random Forest, i.e., a threshold beyond which increasing the number of trees brings no significant performance gain and only increases the computational cost. Our main conclusions are that growing the number of trees does not always make a forest perform significantly better than earlier forests with fewer trees, and that doubling the number of trees is worthless. It is also possible to state that there is a threshold beyond which there is no significant gain, unless a huge computational environment is available. In addition, an experimental relationship was found for the AUC gain when doubling the number of trees in any forest. Furthermore, as the number of trees grows, the full set of attributes tends to be used within a Random Forest, which may not be interesting in the biomedical domain. Additionally, the dataset density-based metrics proposed here probably capture some aspects of the VC dimension on decision trees; low-density datasets may require large-capacity machines, whilst the opposite also seems to be true.
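The idea of a threshold beyond which doubling the forest stops paying off can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the function names, the fixed tolerance `tol` (standing in for a proper significance test), and the AUC values in `example` are all assumptions invented for the sketch.

```python
def doubling_gains(auc_by_trees):
    """List (n, 2n, AUC gain) for every doubling n -> 2n present in the input."""
    gains = []
    for n in sorted(auc_by_trees):
        if 2 * n in auc_by_trees:
            gains.append((n, 2 * n, auc_by_trees[2 * n] - auc_by_trees[n]))
    return gains


def plateau_threshold(auc_by_trees, tol=0.005):
    """Smallest n after which every measured doubling gains less than tol.

    Returns None when the last measured doubling still gains at least tol.
    The tolerance is a hypothetical stand-in for a statistical test.
    """
    threshold = None
    # Walk backwards from the largest forests while the gains stay negligible.
    for n, _, gain in reversed(doubling_gains(auc_by_trees)):
        if gain < tol:
            threshold = n
        else:
            break
    return threshold


# Hypothetical AUC values, made up for illustration; NOT results from the paper.
example = {2: 0.80, 4: 0.85, 8: 0.88, 16: 0.90, 32: 0.91, 64: 0.912, 128: 0.913}

print(plateau_threshold(example))  # → 32
```

In practice the dictionary would be filled with AUC values measured on held-out data for forests of 2, 4, 8, ... trees, and the tolerance would be replaced by whatever significance criterion suits the application.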