JMLR: Workshop and Conference Proceedings

Binary Partitions with Approximate Minimum Impurity



Abstract

The problem of splitting attributes is one of the main steps in the construction of decision trees. In order to decide the best split, impurity measures such as Entropy and Gini are widely used. In practice, decision-tree inducers use heuristics for finding splits with small impurity when they consider nominal attributes with a large number of distinct values. However, there are no known guarantees for the quality of the splits obtained by these heuristics. To fill this gap, we propose two new splitting procedures that provably achieve near-optimal impurity. We also report experiments that provide evidence that the proposed methods are interesting candidates to be employed in splitting nominal attributes with many values during decision tree/random forest induction.
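To make the notion of split impurity concrete, the following is a minimal Python sketch, not the procedures proposed in the paper. It assumes the impurity of a binary partition is the size-weighted average of the Gini (or entropy) impurity of its two sides, which is one common convention. The function names `partition_impurity` and `best_partition_bruteforce` are hypothetical. The exhaustive search is included only as a baseline: it inspects every one of the 2^(n-1) - 1 binary partitions of n distinct values, which suggests why heuristics are used when a nominal attribute has many values.

```python
from collections import Counter
from itertools import combinations
from math import log2


def gini(counts):
    """Gini impurity of a list of class counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts)


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)


def partition_impurity(values, labels, left_values, measure=gini):
    """Size-weighted impurity of the split 'value in left_values' vs. the rest.

    Assumption: the weighting is normalized by the total sample count.
    """
    left, right = Counter(), Counter()
    for v, y in zip(values, labels):
        (left if v in left_values else right)[y] += 1
    n = len(labels)
    n_left, n_right = sum(left.values()), sum(right.values())
    return (n_left / n) * measure(list(left.values())) + \
           (n_right / n) * measure(list(right.values()))


def best_partition_bruteforce(values, labels, measure=gini):
    """Exhaustive baseline: try every binary partition of the distinct values."""
    distinct = sorted(set(values))
    first, rest = distinct[0], distinct[1:]
    best_score, best_left = float("inf"), None
    # Keep `first` on the left side so mirrored partitions are not repeated.
    for k in range(len(rest) + 1):
        for combo in combinations(rest, k):
            left_values = {first, *combo}
            if len(left_values) == len(distinct):
                continue  # the right side would be empty
            score = partition_impurity(values, labels, left_values, measure)
            if score < best_score:
                best_score, best_left = score, left_values
    return best_score, best_left


# Toy data: one nominal attribute with three distinct values, two classes.
values = ["a", "a", "b", "b", "c", "c"]
labels = [0, 0, 1, 1, 0, 1]
score, left_values = best_partition_bruteforce(values, labels)
print(score, left_values)
```

Passing `measure=entropy` switches the impurity measure without changing the search.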
