Journal: Expert Systems with Applications

Application of Chi-square discretization algorithms to ensemble classification methods


Abstract

Classification is one of the central tasks in data mining and machine learning. Classification performance depends on many factors, including the characteristics of the data. Some algorithms are known to work better with discrete data, whereas most real-world data contain continuous variables. For such algorithms, the continuous variables must first be converted to corresponding discrete variables, a process called discretization. In this paper, four Chi-square based supervised discretization algorithms were used: ChiMerge (ChiM), Chi2, Extended Chi2 (ExtChi2), and Modified Chi2 (ModChi2). In the literature, the performance of these algorithms is usually tested with decision tree and Naïve Bayes classifiers. In this study, by contrast, four sets of data discretized by these algorithms were classified with ensemble methods. Classification accuracies for these data sets were obtained using stratified 10-fold cross-validation. The classification performance of the methods on the original and discretized data sets is presented comparatively. According to the results, classification performance on the discretized data was better than on the original data.
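To make the discretization step concrete, the following is a minimal sketch of a ChiMerge-style bottom-up merge. It is not the paper's implementation. The function names `chi2_adjacent` and `chimerge`, the default threshold of 2.706 (the 0.90 critical value of the Chi-square distribution with one degree of freedom, which suits a two-class problem), and the optional `max_intervals` cap are all assumptions made for illustration. Terms whose expected count is zero are simply skipped here, which is a simplification.

```python
from collections import Counter


def chi2_adjacent(a, b):
    """Chi-square statistic for two adjacent intervals.

    `a` and `b` are Counters mapping class label -> number of samples.
    """
    classes = set(a) | set(b)
    row_a, row_b = sum(a.values()), sum(b.values())
    total = row_a + row_b
    stat = 0.0
    for c in classes:
        col = a[c] + b[c]  # samples of class c across both intervals
        for observed, row in ((a[c], row_a), (b[c], row_b)):
            expected = row * col / total
            if expected > 0:  # assumption: skip empty cells
                stat += (observed - expected) ** 2 / expected
    return stat


def chimerge(values, labels, threshold=2.706, max_intervals=None):
    """Return the sorted lower bounds of the merged intervals.

    Hypothetical parameters: `threshold` is the Chi-square value below
    which adjacent intervals are considered similar enough to merge;
    `max_intervals` optionally forces merging to continue.
    """
    # Start with one interval per distinct value.
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    lows = sorted(counts)
    hists = [counts[v] for v in lows]

    while len(hists) > 1:
        stats = [chi2_adjacent(hists[i], hists[i + 1])
                 for i in range(len(hists) - 1)]
        i = min(range(len(stats)), key=stats.__getitem__)
        too_many = max_intervals is not None and len(hists) > max_intervals
        if stats[i] >= threshold and not too_many:
            break  # every adjacent pair differs significantly
        # Merge the most similar adjacent pair.
        hists[i] = hists[i] + hists[i + 1]
        del hists[i + 1], lows[i + 1]
    return lows


cuts = chimerge([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 0, 0, 1, 1, 1, 1])
```

With the returned lower bounds, a continuous value `x` can be mapped to an interval index with `bisect.bisect_right(cuts, x) - 1`. Chi2, Extended Chi2, and Modified Chi2 build on this merge loop with their own stopping criteria, which are not sketched here.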
