Journal: Bioinformatics

Enriched random forests


Abstract

Although the random forest classification procedure works well in datasets with many features, when the number of features is huge and the percentage of truly informative features is small, such as with DNA microarray data, its performance tends to decline significantly. In such instances, the procedure can be improved by reducing the contribution of trees whose nodes are populated by non-informative features. To some extent, this can be achieved by prefiltering, but we propose a novel, yet simple, adjustment that has demonstrably superior performance: choose the eligible subsets at each node by weighted random sampling instead of simple random sampling, with the weights tilted in favor of the informative features. This results in an enriched random forest. We illustrate the superior performance of this procedure in several actual microarray datasets.
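The node-level adjustment described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the weights are derived, so the weight values, the helper names `weighted_feature_subset` and `mean_informative`, and the choice of Efraimidis-Spirakis keys for weighted sampling without replacement are all assumptions made for this example.

```python
import random


def weighted_feature_subset(weights, k, rng):
    """Draw up to k distinct feature indices, favoring features with larger weights.

    Uses Efraimidis-Spirakis keys: each feature with weight w > 0 gets the key
    u ** (1 / w) with u uniform on [0, 1), and the k largest keys are kept.
    Features with weight 0 are never eligible.
    """
    keyed = [
        (rng.random() ** (1.0 / w), idx)
        for idx, w in enumerate(weights)
        if w > 0
    ]
    keyed.sort(reverse=True)
    return [idx for _, idx in keyed[:k]]


def mean_informative(weights, informative, k, trials, rng):
    """Average number of informative features per eligible subset."""
    total = 0
    for _ in range(trials):
        subset = weighted_feature_subset(weights, k, rng)
        total += sum(1 for idx in subset if idx in informative)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    n_features, k = 1000, 31        # k is roughly sqrt(n_features)
    informative = set(range(10))    # hypothetical: 1% of features are informative
    uniform = [1.0] * n_features    # equal weights = simple random sampling
    # Assumed weights: informative features are 20 times as likely to be drawn.
    tilted = [20.0 if i in informative else 1.0 for i in range(n_features)]
    print("uniform :", mean_informative(uniform, informative, k, 500, rng))
    print("weighted:", mean_informative(tilted, informative, k, 500, rng))
```

With equal weights the function reduces to simple random sampling, so the two printed averages compare the standard eligible subsets against the weighted ones. In a full forest this draw would replace the uniform feature draw at every node split; in practice the weights would have to be estimated from the data rather than known in advance as they are here.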
