Procedia Computer Science

Adapted pruning scheme for the framework of imbalanced data-sets

Abstract

Learning from imbalanced data is attracting increasing interest from the machine learning community, mainly because of the large number of real-world applications affected by class imbalance. Adapting standard decision trees to imbalanced data is one of the many approaches developed to address this problem. Such adaptations have been proposed from three perspectives: the splitting criterion, the assignment rule, and pruning. In this paper, we focus on the pruning of decision trees. We propose an adaptation of the standard pruning algorithm MCCP to address the skewed-data problem. Our contribution operates at two levels: it adapts the metric used to select which nodes are pruned first, and it changes the evaluation measure used to select the best decision tree on the pruning set. Our goal is to show that, contrary to the popular belief in the literature that decision tree pruning is useless for imbalanced data, an adaptive pruning technique for imbalanced situations is more efficient and more accurate on the minority class. Twelve binary-class data sets with different imbalance ratios are used to test the performance of the proposed method. Experimental results show that the proposed post-pruning approach can improve the performance of decision trees on imbalanced data in terms of recent evaluation measures that are appropriate for imbalanced classification.
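
The second level of the contribution, changing the measure used to pick the best tree on the pruning set, can be illustrated with a minimal sketch. The abstract does not name the measures the authors use, so the G-mean below is an assumption chosen only because it is a common imbalance-aware measure. The pruning set, the candidate subtrees, and all function names are hypothetical.

```python
from math import sqrt

def accuracy(y_true, y_pred):
    """Fraction of pruning-set examples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of minority-class recall and majority-class recall.

    Assumption: used here as a stand-in for an imbalance-aware measure;
    the abstract does not say which measure the paper adopts.
    """
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    n_pos = sum(t == positive for t in y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        return 0.0
    return sqrt((tp / n_pos) * (tn / n_neg))

def select_subtree(y_true, candidates, measure):
    """Return the name of the candidate subtree with the highest score.

    `candidates` maps a subtree name to its predictions on the pruning set.
    Ties go to the candidate listed first.
    """
    return max(candidates, key=lambda name: measure(y_true, candidates[name]))

# Hypothetical pruning set: 2 minority examples (label 1), 8 majority (label 0).
y_prune = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Hypothetical predictions from two subtrees in a pruning sequence.
candidates = {
    "heavily_pruned": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # predicts majority only
    "lightly_pruned": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],  # finds both minority cases
}

by_accuracy = select_subtree(y_prune, candidates, accuracy)
by_g_mean = select_subtree(y_prune, candidates, g_mean)
print(by_accuracy, by_g_mean)
```

In this toy case accuracy prefers the subtree that never predicts the minority class, while the G-mean prefers the one that recovers both minority examples. This is the kind of difference a change of evaluation measure is meant to expose. It does not reproduce the paper's node-selection metric, which the abstract does not specify.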
