Conference: Information Technology and Mechatronics Engineering Conference

Development and Design of General Data Mining System

Abstract

In this paper, we focus on top-down discretization methods and propose a new supervised discretization method based on class-feature correlation, which we quantify by defining a class-feature contingency factor. The proposed method takes the distribution of all samples into consideration to generate an ideal discretization scheme. It maintains a high interdependence between the target class and the discretized attribute while avoiding overfitting. An empirical evaluation of seven discretization algorithms on real-world UCI datasets shows that the new algorithm yields a better discretization scheme, one that improves the accuracy of decision tree classification. Our approach also achieves promising results in terms of discretization execution time and the number of generated rules.
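To make the idea of top-down supervised discretization concrete, here is a minimal sketch of greedy top-down splitting of a single numeric attribute. The abstract does not define the class-feature contingency factor, so this sketch substitutes the well-known CAIM (class-attribute interdependence maximization) criterion as a stand-in; it is not the paper's method. The function names `caim` and `top_down_discretize` are hypothetical, and the stopping rule is a simplified version of the one used by CAIM.

```python
from collections import Counter


def caim(values, labels, cuts):
    """CAIM score of the scheme defined by the inner cut points `cuts`.

    Stand-in criterion (assumption): the paper's contingency factor
    is not given in the abstract, so CAIM is used for illustration.
    """
    bounds = [float("-inf")] + sorted(cuts) + [float("inf")]
    total = 0.0
    n_intervals = 0
    for lo, hi in zip(bounds, bounds[1:]):
        # Class counts of the samples falling into the interval (lo, hi].
        counts = Counter(y for x, y in zip(values, labels) if lo < x <= hi)
        size = sum(counts.values())
        n_intervals += 1
        if size:
            # Squared count of the dominant class over the interval size.
            total += max(counts.values()) ** 2 / size
    return total / n_intervals


def top_down_discretize(values, labels):
    """Start from one interval and greedily add the best cut point."""
    distinct = sorted(set(values))
    # Candidate boundaries: midpoints between adjacent distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    n_classes = len(set(labels))
    cuts = []
    best = caim(values, labels, cuts)
    while candidates:
        score, cut = max(
            (caim(values, labels, cuts + [c]), c) for c in candidates
        )
        n_intervals = len(cuts) + 1
        # Stop once the score no longer improves, unless there are
        # still fewer intervals than classes (simplified CAIM rule).
        if score <= best and n_intervals >= n_classes:
            break
        cuts.append(cut)
        candidates.remove(cut)
        best = score
    return sorted(cuts)


if __name__ == "__main__":
    xs = [1, 2, 3, 10, 11, 12]
    ys = ["a", "a", "a", "b", "b", "b"]
    print(top_down_discretize(xs, ys))
```

On the toy data in the usage example, the two classes are cleanly separated, so the search settles on a single boundary between 3 and 10. Any other interdependence measure could be swapped in for `caim` without changing the top-down search loop.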