
A framework for bottom-up induction of oblique decision trees


Abstract

Decision-tree induction algorithms are widely used in knowledge discovery and data mining, especially in scenarios where model comprehensibility is desired. A variation of the traditional univariate approach is the so-called oblique decision tree, which allows multivariate tests in its non-terminal nodes. Oblique decision trees can model decision boundaries that are oblique to the attribute axes, whereas univariate trees can only perform axis-parallel splits. The vast majority of oblique and univariate decision-tree induction algorithms employ a top-down strategy for growing the tree, relying on an impurity-based measure for splitting nodes. In this paper, we propose BUTIF, a novel Bottom-Up Oblique Decision-Tree Induction Framework. BUTIF does not rely on an impurity measure for dividing nodes, since the data resulting from each split is known a priori. For generating the initial leaves of the tree and the splitting hyperplanes in its internal nodes, BUTIF allows the adoption of distinct clustering algorithms and binary classifiers, respectively. It is also capable of performing embedded feature selection, which may reduce the number of features in each hyperplane, thus improving model comprehensibility. Unlike virtually every top-down decision-tree induction algorithm, BUTIF does not require a subsequent pruning procedure to avoid overfitting, since its bottom-up construction does not overgrow the tree. We compare distinct instances of BUTIF to traditional univariate and oblique decision-tree induction algorithms. Empirical results show the effectiveness of the proposed framework.
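
The abstract describes the bottom-up procedure only at a high level: cluster the training data to obtain initial leaves, then build internal nodes whose oblique splits are learned by a binary classifier. The sketch below illustrates that general idea in Python; it is not the authors' BUTIF implementation, and its specific choices (k-means for the initial leaves, logistic regression for the hyperplanes, merging the two groups with the closest centroids) are assumptions made purely for illustration.

```python
# Minimal sketch of bottom-up oblique decision-tree induction, loosely
# inspired by the abstract above. NOT the authors' BUTIF: clustering
# method, classifier, and merge criterion are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression


class Node:
    def __init__(self, label=None, clf=None, left=None, right=None):
        self.label = label  # class label if this node is a leaf
        self.clf = clf      # linear (oblique) split if this node is internal
        self.left = left
        self.right = right

    def predict_one(self, x):
        if self.clf is None:
            return self.label
        side = self.clf.predict(x.reshape(1, -1))[0]
        return (self.left if side == 0 else self.right).predict_one(x)


def build_bottom_up(X, y, n_leaves=6):
    # 1) Generate the initial leaves by clustering; each leaf is labelled
    #    with the majority class of its cluster.
    km = KMeans(n_clusters=n_leaves, n_init=10, random_state=0).fit(X)
    groups = []
    for c in range(n_leaves):
        idx = km.labels_ == c
        label = np.bincount(y[idx]).argmax()
        groups.append((X[idx], Node(label=label), km.cluster_centers_[c]))

    # 2) Repeatedly merge the two groups with the closest centroids,
    #    fitting a linear classifier as the oblique splitting hyperplane.
    while len(groups) > 1:
        dists = [(np.linalg.norm(groups[i][2] - groups[j][2]), i, j)
                 for i in range(len(groups)) for j in range(i + 1, len(groups))]
        _, i, j = min(dists)
        Xi, node_i, _ = groups[i]
        Xj, node_j, _ = groups[j]
        X_pair = np.vstack([Xi, Xj])
        side = np.r_[np.zeros(len(Xi), dtype=int), np.ones(len(Xj), dtype=int)]
        clf = LogisticRegression(max_iter=1000).fit(X_pair, side)
        parent = Node(clf=clf, left=node_i, right=node_j)
        groups = [g for k, g in enumerate(groups) if k not in (i, j)]
        groups.append((X_pair, parent, X_pair.mean(axis=0)))

    return groups[0][1]  # root of the bottom-up tree


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    tree = build_bottom_up(X, y)
    preds = np.array([tree.predict_one(x) for x in X])
    print("training accuracy:", (preds == y).mean())
```

Because the class membership of each merged group is fixed before the split is learned, no impurity measure is needed at the internal nodes, which mirrors the property highlighted in the abstract.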

Bibliographic Information

  • Source
    Neurocomputing | 2014, No. 5 | pp. 3-12 | 10 pages
  • Author Affiliations

    Faculdade de Informatica, Pontificia Universidade Catolica do Rio Grande do Sul - Av. Ipiranga, 6681, 90619-900, Porto Alegre-RS, Brazil;

    Instituto de Ciencias Matematicas e de Computacao - ICMC, Universidade de Sao Paulo - Campus de Sao Carlos, Caixa Postal 668, 13560-970 Sao Carlos-SP, Brazil;

    Instituto de Ciencias Matematicas e de Computacao - ICMC, Universidade de Sao Paulo - Campus de Sao Carlos, Caixa Postal 668, 13560-970 Sao Carlos-SP, Brazil;

    Instituto de Ciencias Matematicas e de Computacao - ICMC, Universidade de Sao Paulo - Campus de Sao Carlos, Caixa Postal 668, 13560-970 Sao Carlos-SP, Brazil;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Format: PDF
  • Language: English (eng)
  • Chinese Library Classification
  • Keywords

    Oblique decision trees; Bottom-up induction; Clustering;
