Conference: Wireless and Optical Communication Conference

Decision tree rule-based feature selection for large-scale imbalanced data

Abstract

Class imbalance arises in many real-world applications, e.g., fault diagnosis, text categorization, and fraud detection. Feature selection becomes a major challenge when the imbalanced dataset is also large-scale. To address it, this work proposes a feature selection approach based on a decision tree rule. The effectiveness of the proposed approach is verified by classifying a large-scale dataset from Santander Bank. The results show that our approach achieves a higher Area Under the Curve (AUC) with less computational time. We also compare it with filter-based feature selection approaches, namely Chi-Square and F-statistic. The results show that it outperforms them but requires slightly more computational effort.
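
The abstract does not spell out the decision tree rule itself, so the following is only a minimal sketch of one plausible reading: grow a shallow tree with class weights that balance the minority and majority classes, then keep every feature that appears in at least one split. The function names (`weighted_gini`, `best_split`, `split_features`, `select_features`), the Gini criterion, the depth limit, and the toy grid dataset are all assumptions made for illustration. They are not taken from the paper or from the Santander data.

```python
from itertools import product

def weighted_gini(y, w):
    """Gini impurity of binary labels y under per-sample weights w."""
    total = sum(w)
    if total == 0:
        return 0.0
    p = sum(wi for yi, wi in zip(y, w) if yi == 1) / total
    return 2.0 * p * (1.0 - p)

def best_split(X, y, w):
    """Return (gain, feature, threshold) for the best split, or None if no split helps."""
    parent, total = weighted_gini(y, w), sum(w)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # midpoint between neighbouring values
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            child = 0.0
            for idx in (left, right):
                ws = [w[i] for i in idx]
                child += sum(ws) / total * weighted_gini([y[i] for i in idx], ws)
            gain = parent - child
            # Ignore gains that are only floating-point noise.
            if gain > 1e-12 and (best is None or gain > best[0]):
                best = (gain, j, t)
    return best

def split_features(X, y, w, depth, used):
    """Grow a tree recursively and record every feature index used in a split rule."""
    if depth == 0 or len(set(y)) < 2:
        return used
    found = best_split(X, y, w)
    if found is None:
        return used
    _, j, t = found
    used.add(j)
    for keep in (lambda v: v <= t, lambda v: v > t):
        idx = [i for i, row in enumerate(X) if keep(row[j])]
        split_features([X[i] for i in idx], [y[i] for i in idx],
                       [w[i] for i in idx], depth - 1, used)
    return used

def select_features(X, y, max_depth=3):
    """Assumed rule: balance the classes by weight, keep features that appear in any split.

    Assumes both classes are present in y.
    """
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    w = [0.5 / n_pos if yi == 1 else 0.5 / n_neg for yi in y]
    return sorted(split_features(X, y, w, max_depth, set()))

# Toy imbalanced data (hypothetical): a full grid over four features with values 0..3.
# Only features 0 and 1 determine the label; 16 of 256 rows are positive.
X = [list(row) for row in product(range(4), repeat=4)]
y = [1 if row[0] == 3 and row[1] == 3 else 0 for row in X]

selected = select_features(X, y)
X_reduced = [[row[j] for j in selected] for row in X]
print(selected)
```

On this toy grid the tree splits only on features 0 and 1, so the two uninformative columns are dropped before any classifier is trained. The sketch does not reproduce the AUC or timing comparison against Chi-Square and F-statistic reported in the abstract.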
