Arabian Journal for Science and Engineering. Section A, Sciences

A Method of Classification Performance Improvement Via a Strategy of Clustering-Based Data Elimination Integrated with k-Fold Cross-Validation

Abstract

Non-system errors that occur during data entry or data collection create noisy data that reduce the success of classification systems. To eliminate such data, a classification system was developed with a new data reduction method named MKMA-RAC, which consists of a modified k-means algorithm that uses Relief algorithm coefficients. The main theme of this article is the elimination of noisy data and its consistent application to the classification system using the k-fold cross-validation method. In the developed system, the training data were freed from noisy data by integrating the support vector machine (SVM), linear discriminant analysis (LDA) and decision tree classifiers with MKMA-RAC-based data reduction in every fold. The data reduction process was not applied to the test data. The datasets used in the proposed method were the Hepatitis, Liver Disorders, SPECT images and Statlog (Heart) datasets taken from the UCI database. Classification performance values obtained with tenfold CV, both with and without the proposed method, are given for these datasets. For the Hepatitis, Liver Disorders, SPECT images and Statlog (Heart) datasets, the classification success rates of the proposed system were 96.88%, 74.56%, 87.24% and 90.00% with the SVM classifier; 94.91%, 69.05%, 82.38% and 88.52% with the LDA classifier; and 96.25%, 77.73%, 88.77% and 89.63% with the decision tree classifier, respectively. The test results show that the proposed system generally achieved higher classification performance than other results in the literature. The performance is therefore very encouraging for pattern recognition applications.
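The evaluation protocol described in the abstract (reduce only the training portion of each fold, leave the test fold untouched) can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `drop_far_points` is a hypothetical placeholder that simply discards the points farthest from their class centroid and is not MKMA-RAC, the nearest-centroid classifier stands in for SVM, LDA or a decision tree, and all function names and the toy data are invented for the example.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def class_centroids(X, y):
    """Mean feature vector of each class label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def drop_far_points(X, y, keep_fraction=0.9):
    """Placeholder noise filter (NOT MKMA-RAC): per class, keep the
    keep_fraction of points closest to that class's centroid."""
    cents = class_centroids(X, y)
    kept = []
    for c in cents:
        members = [i for i, label in enumerate(y) if label == c]
        members.sort(key=lambda i: math.dist(X[i], cents[c]))
        n_keep = max(1, math.ceil(keep_fraction * len(members)))
        kept.extend(members[:n_keep])
    kept.sort()
    return [X[i] for i in kept], [y[i] for i in kept]


def predict(cents, row):
    """Nearest-centroid rule, standing in for SVM/LDA/decision tree."""
    return min(cents, key=lambda c: math.dist(row, cents[c]))


def cross_validate(X, y, k=10, reduce_fn=None, seed=0):
    """Mean accuracy over k folds; reduce_fn only ever sees training data."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        if reduce_fn is not None:
            X_tr, y_tr = reduce_fn(X_tr, y_tr)  # test fold is left untouched
        cents = class_centroids(X_tr, y_tr)
        hits = sum(predict(cents, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


# Toy data (invented for illustration): two Gaussian classes in 2-D.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
X += [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(30)]
y = [0] * 30 + [1] * 30

acc_plain = cross_validate(X, y, k=10)
acc_reduced = cross_validate(X, y, k=10, reduce_fn=drop_far_points)
print(acc_plain, acc_reduced)
```

Running the reduction inside the fold loop, rather than once on the whole dataset, keeps the held-out fold out of the filtering step, which matches the abstract's statement that data reduction was not applied to the test data.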