Conference: IEEE International Conference on Computer and Communication Systems

Application of the Modified Imputation Method to Missing Data to Increase Classification Performance

Abstract

Incomplete or missing data diminishes the effectiveness of statistical results and may cause biased estimates, which in turn lead to unsound judgments. Inefficiency and impediments in data treatment and analysis, which are among the problems associated with missing values, may affect the supervised learning process and reduce the classification accuracy and performance of the prediction model in a data mining task. This study applied the modified imputation method, which was previously tested with well-known imputation algorithms, to widely used classification techniques, namely Naive Bayes, One-R, k-Nearest Neighbor (kNN), C4.5, and Support Vector Machine (SVM), using open data sets from the UCI Repository. Classification performance was examined before and after imputation in terms of precision, accuracy, and Receiver Operating Characteristic (ROC), using the Weka tool. The results show that applying the modified imputation method to the data sets during preprocessing improved classification performance compared with the same data sets left with missing values.
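
The abstract does not describe the modified imputation method itself, so the Python sketch below substitutes plain column-mean imputation as a stand-in. It also computes accuracy and precision, two of the metrics the study reports. The function names and toy values are hypothetical, and the study used Weka rather than Python, so this only illustrates the kind of preprocessing and scoring involved.

```python
from statistics import mean


def impute_mean(rows):
    """Fill None entries with the mean of the observed values in the same column.

    Stand-in for the paper's modified imputation method, which the abstract
    does not specify. Assumes every column has at least one observed value.
    """
    columns = list(zip(*rows))
    fills = [mean(v for v in col if v is not None) for col in columns]
    return [
        [fills[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, positive=1):
    """True positives divided by all predicted positives (0.0 if none)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if (tp + fp) else 0.0


# Made-up toy data: None marks a missing value.
toy_rows = [[1.0, None], [3.0, 4.0], [None, 8.0]]
imputed_rows = impute_mean(toy_rows)

# Made-up labels and predictions, only to exercise the metric functions.
toy_true = [1, 0, 1, 1]
toy_pred = [1, 1, 1, 0]
toy_accuracy = accuracy(toy_true, toy_pred)
toy_precision = precision(toy_true, toy_pred)
```

In a before-and-after comparison of the kind the abstract describes, the same classifier would be scored once on the data with missing values and once on the imputed data, and the two sets of metrics compared.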