A Method of Classification Performance Improvement Via a Strategy of Clustering-Based Data Elimination Integrated with k-Fold Cross-Validation

Onur Inan; Mustafa Serter Uzer


Abstract

Non-system errors that occur during data entry or data collection create noisy data that reduce the success of classification systems. To eliminate these data, a classification system with a new data reduction method was developed; the method, named MKMA-RAC, is a modified k-means algorithm that uses relief algorithm coefficients. The main theme of this article is the elimination of noisy data and its consistent application to the classification system using the k-fold cross-validation method. In the developed system, the training data in every fold are freed from noisy data by integrating MKMA-RAC-based data reduction with the support vector machine (SVM), linear discriminant analysis (LDA) and decision tree classifiers. The data reduction process was not applied to the test data. The datasets used in the proposed method were the Hepatitis, Liver Disorders, SPECT images and Statlog (Heart) datasets taken from the UCI database. Classification performance values obtained with and without the proposed method under tenfold CV are given for these datasets. For the Hepatitis, Liver Disorders, SPECT images and Statlog (Heart) datasets, the classification success rates of the proposed system were 96.88%, 74.56%, 87.24% and 90.00% with the SVM classifier, 94.91%, 69.05%, 82.38% and 88.52% with the LDA classifier, and 96.25%, 77.73%, 88.77% and 89.63% with the decision tree classifier, respectively. The test results show that the proposed system generally achieved higher classification performance than other results in the literature. Therefore, the performance is very encouraging for pattern recognition applications.
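
The per-fold workflow described in the abstract (reduce only the training data inside each fold, leave the test fold untouched) can be illustrated with a minimal Python sketch. This is not the authors' MKMA-RAC: the abstract does not specify how the relief coefficients modify k-means, so the sketch assumes plain k-means and treats a training sample as noisy when its label disagrees with the majority label of its cluster. A nearest-centroid classifier stands in for the SVM, LDA and decision tree classifiers, and every function name here is hypothetical.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (assumption: no relief weighting); returns a cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def drop_cluster_minorities(X, y, k=2):
    """Stand-in noise filter: keep samples whose label matches their cluster's majority label."""
    assign = kmeans(X, k)
    majority = {}
    for c in set(assign):
        labels = [lab for lab, a in zip(y, assign) if a == c]
        majority[c] = Counter(labels).most_common(1)[0][0]
    keep = [i for i, a in enumerate(assign) if y[i] == majority[a]]
    return [X[i] for i in keep], [y[i] for i in keep]

def nearest_centroid_fit(X, y):
    """Stand-in classifier: one mean vector per class."""
    centroids = {}
    for lab in set(y):
        rows = [x for x, l in zip(X, y) if l == lab]
        centroids[lab] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def nearest_centroid_predict(centroids, x):
    return min(centroids, key=lambda lab: sq_dist(x, centroids[lab]))

def cross_validate_with_elimination(X, y, n_folds=10, k=2, seed=0):
    """k-fold CV where the noise filter runs on each training split only."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr, y_tr = drop_cluster_minorities(
            [X[i] for i in train_idx], [y[i] for i in train_idx], k
        )
        model = nearest_centroid_fit(X_tr, y_tr)
        # The test fold is scored as-is: no data reduction is applied to it.
        correct += sum(nearest_centroid_predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)

# Synthetic two-class data with two deliberately flipped labels (not a UCI dataset).
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50
y[3], y[60] = 1, 0
acc = cross_validate_with_elimination(X, y)
print(f"tenfold accuracy with per-fold elimination: {acc:.2f}")
```

Running the filter inside the loop, rather than once on the full dataset, keeps information from the test fold out of the data reduction step, which is the point the abstract makes about applying elimination consistently with k-fold cross-validation.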

Bibliographic record

Source: Arabian Journal for Science and Engineering. Section A, Sciences | 2021, Issue 2 | pp. 1199-1212 | 14 pages
Authors: Onur Inan; Mustafa Serter Uzer
Affiliations:

Computer Engineering, Faculty of Engineering and Architecture, Necmettin Erbakan University, Konya, Turkey

Electronics and Automation, Selcuk University Ilgın Vocational School, Konya, Turkey

Indexed in: Science Citation Index (SCI), USA
Original format: PDF
Language of text: English
Keywords: Clustering-based data elimination; Relief; Medical dataset classification
Added to database: 2022-08-18 21:04:47

Similar literature

1. Performance improvement of empirical models for estimation of global solar radiation in India: A k-fold cross-validation approach [J]. Saud Sheikh, Jamil Basharat, Upadhyay Yogesh. Sustainable Energy Technologies and Assessments, 2020, Issue Auga

2. K-fold cross-validation for improving medical classification accuracy and model selection in k-nearest neighbors classifiers [J]. Zhao M. Basic & Clinical Pharmacology & Toxicology, 2016, Issue Suppla1

3. Gully erosion susceptibility mapping (GESM) using machine learning methods optimized by the multi-collinearity analysis and K-fold cross-validation [J]. Omid Ghorbanzadeh, Hejar Shahabi, Fahimeh Mirchooli. Geomatics, Natural Hazards & Risk, 2020, Issue 1

4. Assessment of Data Augmentation Strategies Toward Performance Improvement of Abnormality Classification in Chest Radiographs [C]. Prasanth Ganesan, Sivaramakrishnan Rajaraman, Rodney Long. Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2019

5. Feature Selection and Classification for High-Dimensional Biological Data Under Cross-Validation Framework [D]. Zhong, Yi. 2018

6. Effectiveness of the Diagnose-Intervene-Verify-Adjust (DIVA) model for integrated primary healthcare planning and performance improvement: an embedded mixed methods evaluation in Kaduna state, Nigeria [O]. Ejemai Amaize Eboreime, Nonhlanhla Nxumalo, Rohit Ramaswamy. 2019

7. A new GIS-based data mining technique using an adaptive neuro-fuzzy inference system (ANFIS) and k-fold cross-validation approach for land subsidence susceptibility mapping [O]. Omid Ghorbanzadeh, Hashem Rostamzadeh, Thomas Blaschke. 2018
