
An empirical comparison of cost-sensitive decision tree induction algorithms


Abstract

Decision tree induction is a widely used technique for learning from data that first emerged in the 1980s. In recent years, several authors have noted that in practice, accuracy alone is not adequate, and it has become increasingly important to take into account the cost of misclassifying the data. Several authors have developed techniques to induce cost-sensitive decision trees. Many studies include pairwise comparisons of algorithms, but a comparison covering many methods has not been conducted in earlier work. This paper aims to remedy this situation by investigating different cost-sensitive decision tree induction algorithms. A survey has identified 30 cost-sensitive decision tree algorithms, which can be organized into 10 categories. A representative sample of these algorithms has been implemented and an empirical evaluation has been carried out. In addition, an accuracy-based look-ahead algorithm has been extended to a new cost-sensitive look-ahead algorithm and also evaluated. The main outcome of the evaluation is that an algorithm based on genetic algorithms, known as Inexpensive Classification with Expensive Tests, performed better over the whole range of experiments, thus showing that to make a decision tree cost-sensitive, it is better to include all the different types of costs, that is, the cost of obtaining the data and misclassification costs, in the induction of the decision tree.
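The two cost types named in the abstract can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the paper's method and not the Inexpensive Classification with Expensive Tests algorithm: it scores an already-built tree by charging the cost of each test along an example's path plus the misclassification cost at the leaf. The names `Leaf`, `Node`, `example_cost`, `average_cost`, and all attribute names and cost figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    label: str  # class predicted at this leaf


@dataclass
class Node:
    attribute: str               # attribute (test) evaluated at this node
    children: Dict[str, "Tree"]  # attribute value -> subtree


Tree = Union[Leaf, Node]


def example_cost(tree: Tree, example: Dict[str, str], true_label: str,
                 test_costs: Dict[str, float],
                 misclass_costs: Dict[Tuple[str, str], float]) -> float:
    """Cost of classifying one example: tests paid on the path plus the
    misclassification cost at the leaf. Assumes each test is paid once."""
    total = 0.0
    paid = set()
    node = tree
    while isinstance(node, Node):
        if node.attribute not in paid:
            total += test_costs[node.attribute]
            paid.add(node.attribute)
        node = node.children[example[node.attribute]]
    # misclass_costs is keyed by (true label, predicted label)
    return total + misclass_costs[(true_label, node.label)]


def average_cost(tree: Tree, examples: List[Dict[str, str]], labels: List[str],
                 test_costs: Dict[str, float],
                 misclass_costs: Dict[Tuple[str, str], float]) -> float:
    """Mean total cost per example over a labelled data set."""
    costs = [example_cost(tree, x, y, test_costs, misclass_costs)
             for x, y in zip(examples, labels)]
    return sum(costs) / len(costs)


# Hypothetical data: a cheap test first, an expensive test only if needed.
tree = Node("blood_test", {
    "pos": Leaf("sick"),
    "neg": Node("scan", {"pos": Leaf("sick"), "neg": Leaf("healthy")}),
})
test_costs = {"blood_test": 5.0, "scan": 50.0}
misclass_costs = {
    ("sick", "sick"): 0.0, ("healthy", "healthy"): 0.0,
    ("sick", "healthy"): 200.0,  # missed case, assumed most expensive
    ("healthy", "sick"): 20.0,   # false alarm
}
examples = [
    {"blood_test": "pos", "scan": "pos"},
    {"blood_test": "neg", "scan": "neg"},
    {"blood_test": "neg", "scan": "neg"},
    {"blood_test": "pos", "scan": "neg"},
]
labels = ["sick", "healthy", "sick", "healthy"]

print(average_cost(tree, examples, labels, test_costs, misclass_costs))
```

A cost-sensitive induction algorithm would aim to build trees that keep a measure like this low, rather than only maximizing accuracy; how each surveyed algorithm does so is not described in this record.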

Bibliographic details

  • Source
    Expert Systems | 2011, Issue 3 | pp. 227-268 | 42 pages
  • Authors

    Susan Lomax; Sunil Vadera;

  • Affiliations

    Data Mining & Pattern Recognition Research Centre, School of Computing, Science and Engineering, University of Salford, Salford M5 4WT, UK;

    Data Mining & Pattern Recognition Research Centre, School of Computing, Science and Engineering, University of Salford, Salford M5 4WT, UK;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification
  • Keywords

    cost-sensitive learning; decision trees; data mining;

