
A bias correction algorithm for the Gini variable importance measure in classification trees


Abstract

This article considers a measure of variable importance frequently used in variable-selection methods based on decision trees and tree-based ensemble models. These models include CART, random forests, and gradient boosting machines. The measure of variable importance is defined as the total heterogeneity reduction produced by a given covariate on the response variable when the sample space is recursively partitioned. Despite its popularity, some authors have shown that this measure is biased, to the extent that under certain conditions it can seriously distort variable selection. Here we present a simple and effective method for bias correction, focusing on the easily generalizable case of the Gini index as a measure of heterogeneity.
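
To make the definition above concrete, here is a minimal sketch of the uncorrected measure: the Gini index of a node, the heterogeneity reduction produced by one binary split, and the total reduction credited to one covariate. It does not implement the bias correction proposed in the article, which the abstract does not describe. The function names `gini`, `gini_reduction`, and `importance` are hypothetical, and weighting each split by its share of the sample is an assumption borrowed from common CART practice.

```python
from collections import Counter


def gini(labels):
    """Gini heterogeneity index 1 - sum_k p_k^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_reduction(parent, left, right):
    """Heterogeneity reduction obtained by splitting `parent` into two children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


def importance(splits, n_total):
    """Total reduction credited to one covariate (uncorrected measure).

    `splits` is a list of (parent, left, right) label lists, one entry per
    node where the tree split on that covariate. Each reduction is weighted
    by the node's share of the sample (an assumption, see the text above).
    """
    return sum(
        (len(parent) / n_total) * gini_reduction(parent, left, right)
        for parent, left, right in splits
    )


# Toy example: one perfect split of a balanced two-class node.
parent = ["a", "a", "b", "b"]
left, right = ["a", "a"], ["b", "b"]
total = importance([(parent, left, right)], n_total=len(parent))
```

In this toy case the parent node has Gini index 0.5 and both children are pure, so the single split accounts for the full reduction of 0.5.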
