International Conference on Knowledge and Smart Technology

Study of discretization methods in classification



Abstract

Classification is one of the important tasks in Data Mining or Knowledge Discovery, with prolific applications. Satisfactory classification also depends on the characteristics of the dataset. Numerical and nominal attributes commonly occur in datasets, and classification performance may be aided by discretizing the numerical attributes. At present, several discretization methods and numerous techniques for implementing classifiers exist. This study has three main objectives. The first is to study the effectiveness of discretizing attributes; the second is to compare the efficiency of eight discretization methods: ChiMerge, Chi2, Modified Chi2, Extended Chi2, Class-Attribute Interdependence Maximization (CAIM), Class-Attribute Contingency Coefficient (CACC), Autonomous Discretization Algorithm (Ameva), and the Minimum Description Length Principle (MDLP). Finally, the study investigates the suitability of the eight discretization methods when applied to five commonly used classifiers: Neural Network, K-Nearest Neighbour (K-NN), Naive Bayes, C4.5, and Support Vector Machine (SVM).
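To illustrate the idea of discretizing numerical attributes before classification, the following is a minimal sketch, not the paper's experimental setup. It uses scikit-learn's unsupervised equal-frequency KBinsDiscretizer as a stand-in for the supervised methods studied (ChiMerge, MDLP, CAIM, etc.), the Iris data as a placeholder dataset, and Naive Bayes as one of the five classifiers considered; all of these choices are assumptions made for the example.

```python
# Sketch: compare a Naive Bayes classifier on raw numerical attributes
# versus the same attributes discretized into ordinal bins.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.naive_bayes import GaussianNB, CategoricalNB
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Baseline: Gaussian Naive Bayes on the raw numerical attributes.
nb_raw = GaussianNB().fit(X_train, y_train)
acc_raw = accuracy_score(y_test, nb_raw.predict(X_test))

# Discretize each numerical attribute into 5 equal-frequency bins,
# fitting the bin edges on the training split only.
disc = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="quantile")
X_train_d = disc.fit_transform(X_train)
X_test_d = disc.transform(X_test)

# Categorical Naive Bayes over the discretized attributes.
nb_disc = CategoricalNB(min_categories=5).fit(X_train_d, y_train)
acc_disc = accuracy_score(y_test, nb_disc.predict(X_test_d))

print(f"accuracy raw: {acc_raw:.3f}  discretized: {acc_disc:.3f}")
```

The supervised methods compared in the paper choose cut points using the class labels (e.g., chi-square tests or entropy/MDLP criteria) rather than fixed quantiles, which is what distinguishes them from the simple binning shown here.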


