International Conference on Knowledge and Smart Technology

Study of discretization methods in classification



Abstract

Classification is one of the important tasks in Data Mining or Knowledge Discovery, with numerous applications. Satisfactory classification also depends on the characteristics of the dataset. Numerical and nominal attributes commonly occur in datasets, and classification performance may be aided by discretizing the numerical attributes. At present, several discretization methods and numerous techniques for implementing classifiers exist. This study has three main objectives. The first is to study the effectiveness of discretizing attributes, and the second is to compare the efficiency of eight discretization methods: ChiMerge, Chi2, Modified Chi2, Extended Chi2, Class-Attribute Interdependence Maximization (CAIM), Class-Attribute Contingency Coefficient (CACC), Autonomous Discretization Algorithm (Ameva), and Minimum Description Length Principle (MDLP). Finally, the study investigates the suitability of the eight discretization methods when applied to five commonly used classifiers: Neural Network, K Nearest Neighbour (K-NN), Naive Bayes, C4.5, and Support Vector Machine (SVM).
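The paper's own implementations of the eight methods are not reproduced here. As a minimal, illustrative sketch of the kind of bottom-up, chi-square-based discretization the study compares, the following implements ChiMerge (Kerber, 1992) for a single numerical attribute. The function name `chimerge`, the toy data, and the cut-point convention are assumptions made for illustration, not the authors' code.

```python
import numpy as np
from scipy.stats import chi2 as chi2_dist

def chimerge(values, labels, alpha=0.05, min_intervals=2):
    """Bottom-up ChiMerge discretization of one numeric feature (sketch).

    values : 1-D array of numeric attribute values
    labels : 1-D array of class labels
    Returns approximate cut points (lower bound of each interval after the first).
    """
    classes = np.unique(labels)
    # Start with one interval per distinct value; count class frequencies per interval.
    points = np.unique(values)
    counts = np.array([[np.sum((values == p) & (labels == c)) for c in classes]
                       for p in points], dtype=float)
    # Merging stops once all adjacent pairs exceed this chi-square threshold.
    threshold = chi2_dist.ppf(1 - alpha, df=len(classes) - 1)

    def chi2_pair(a, b):
        # Chi-square statistic for the 2 x n_classes contingency table of two intervals.
        obs = np.vstack([a, b])
        row = obs.sum(axis=1, keepdims=True)
        col = obs.sum(axis=0, keepdims=True)
        exp = row * col / obs.sum()
        exp[exp == 0] = 0.1  # ChiMerge convention for empty expected cells
        return ((obs - exp) ** 2 / exp).sum()

    while len(counts) > min_intervals:
        stats = [chi2_pair(counts[i], counts[i + 1]) for i in range(len(counts) - 1)]
        i = int(np.argmin(stats))
        if stats[i] > threshold:      # all adjacent intervals differ significantly
            break
        # Merge the most similar adjacent pair of intervals.
        counts[i] += counts[i + 1]
        counts = np.delete(counts, i + 1, axis=0)
        points = np.delete(points, i + 1)

    return points[1:]

# Toy usage: a single feature whose class flips around 0.3.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = (x > 0.3).astype(int)
print(chimerge(x, y))   # expected to recover roughly one cut point near 0.3
```

Chi2, Modified Chi2, and Extended Chi2 build on this same merging scheme with automatically adjusted significance levels, whereas MDLP uses an entropy-based splitting criterion with a minimum-description-length stopping rule.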
