Journal: Cybernetics, IEEE Transactions on

Segment Based Decision Tree Induction With Continuous Valued Attributes

Abstract

A key issue in decision tree (DT) induction with continuous-valued attributes is designing an effective strategy for splitting nodes. The traditional approach adopts the candidate cut point (CCP) with the highest discriminative ability, as evaluated by a frequency-based heuristic measure. However, such methods ignore the class permutation of the examples in a node and cannot distinguish CCPs with the same or similar frequency information, so they may fail to induce a better and smaller tree. This paper proposes a new concept, the segment of examples, to differentiate CCPs with the same frequency information. It then develops a new hybrid scheme for splitting DT nodes that combines the two heuristic measures, frequency and segment. The relationship between frequency and the expected number of segments, with the number of segments treated as a random variable, is also given. Experimental comparisons demonstrate that the proposed scheme both improves generalization capability and reduces the size of the tree.
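The idea in the abstract can be illustrated with a small Python sketch. It rests on assumptions that the abstract does not confirm: a "segment" is taken to be a maximal run of consecutive examples with the same class label after sorting by an attribute, information gain stands in for the frequency-based measure, and the hybrid rule shown (highest gain first, fewest segments in the child nodes as a tie-break) is a hypothetical stand-in, not the paper's actual scheme. The names entropy, count_segments, best_cut and choose_split are likewise illustrative.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def count_segments(labels):
    """Number of maximal runs of identical labels (assumed meaning of 'segment')."""
    if not labels:
        return 0
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def best_cut(values, labels):
    """Return (cut, gain, segments) for one attribute, or None if no cut exists.

    Candidate cut points are midpoints between adjacent distinct values.
    Ranking is by information gain, then by fewer segments in the two
    child nodes (hypothetical tie-break, not taken from the paper).
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    n = len(ys)
    base = entropy(ys)
    best_key, best = None, None
    for i in range(1, n):
        if xs[i - 1] == xs[i]:
            continue  # no cut point between equal attribute values
        left, right = ys[:i], ys[i:]
        gain = base - len(left) / n * entropy(left) - len(right) / n * entropy(right)
        segs = count_segments(left) + count_segments(right)
        key = (round(gain, 12), -segs)  # rounding absorbs float noise in ties
        if best_key is None or key > best_key:
            best_key, best = key, ((xs[i - 1] + xs[i]) / 2, gain, segs)
    return best


def choose_split(columns, labels):
    """Pick the attribute whose best cut ranks highest by (gain, fewer segments)."""
    scored = []
    for name, values in columns.items():
        result = best_cut(values, labels)
        if result is not None:
            cut, gain, segs = result
            scored.append(((round(gain, 12), -segs), name, cut, gain, segs))
    if not scored:
        return None
    _, name, cut, gain, segs = max(scored, key=lambda s: s[0])
    return name, cut, gain, segs


if __name__ == "__main__":
    # Toy data: both attributes offer a cut with the same class frequencies
    # in the children, so information gain alone cannot separate them.
    labels = ["A", "A", "B", "B", "B", "A"]
    columns = {"x1": [1, 2, 3, 5, 6, 4], "x2": [1, 2, 3, 4, 5, 6]}
    print(choose_split(columns, labels))
```

In this toy data the best cuts on x1 and x2 have identical information gain, because the children contain the same class counts. The child nodes produced by x2 contain fewer label runs, so the tie-break selects x2. A purely frequency-based measure would have no basis for preferring either.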
