
A clustering-based decision tree induction algorithm

Abstract

Decision tree induction algorithms are well-known techniques for assigning objects to predefined categories in a transparent fashion. Most of them rely on a greedy top-down recursive strategy for growing the tree, and on pruning techniques to avoid overfitting. Even though such a strategy has been quite successful in many problems, it falls short in several others. For instance, there are cases in which the hyper-rectangular surfaces generated by these algorithms can only map the problem description after several successive partitions, which results in a large and incomprehensible tree. Hence, we propose a new decision tree induction algorithm based on clustering, which seeks to provide more accurate models and/or shorter descriptions that are more comprehensible to the end user. We do not base our performance analysis solely on a straightforward comparison of our proposed algorithm with baseline methods. Instead, we propose a data-dependent analysis to look for evidence that may explain in which situations our algorithm outperforms a well-known decision tree induction algorithm.
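
The limitation mentioned in the abstract can be illustrated with a small sketch. The code below is not the clustering-based algorithm proposed in the paper, whose details the abstract does not give. It is a generic greedy top-down inducer with axis-parallel tests and Gini impurity, written only for illustration. Pruning is omitted, and all names (`gini`, `best_split`, `grow`, `count_leaves`) and the toy grid data are assumptions. It compares the tree grown for a class boundary aligned with one axis against the tree grown for a diagonal boundary.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Greedy search for the axis-parallel test with the lowest weighted Gini.

    Returns (score, feature, threshold), or None if no split is possible.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def grow(rows, labels):
    """Top-down recursive induction without pruning.

    A leaf is a class label; an internal node is (feature, threshold, left, right).
    """
    if len(set(labels)) == 1:
        return labels[0]
    split = best_split(rows, labels)
    if split is None:  # identical points with mixed labels: majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        grow([r for r, _ in left], [y for _, y in left]),
        grow([r for r, _ in right], [y for _, y in right]),
    )


def count_leaves(tree):
    """Number of leaves, used here as a rough proxy for tree size."""
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])


# Toy data (an assumption for illustration): a 10 x 10 grid in the unit square.
grid = [(i / 10, j / 10) for i in range(10) for j in range(10)]
axis_labels = [int(x > 0.45) for x, _ in grid]  # boundary parallel to an axis
diag_labels = [int(x > y) for x, y in grid]     # diagonal boundary

axis_tree = grow(grid, axis_labels)
diag_tree = grow(grid, diag_labels)
print("leaves, axis-aligned boundary:", count_leaves(axis_tree))
print("leaves, diagonal boundary:", count_leaves(diag_tree))
```

The axis-aligned boundary is captured by a single test, whereas the diagonal boundary can only be approximated by a staircase of axis-parallel partitions, so the second tree has noticeably more leaves. This is the kind of growth in tree size that the abstract describes.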
