Conference: 2012 IEEE International Conference on Computational Intelligence & Computing Research

A novel index measure imputation algorithm for missing data values: A machine learning approach


Abstract

Missing data in real-world datasets plays a significant role in real-time data mining, and the problem becomes more complex in large databases. Missing values influence both the features and the class attributes of a dataset, and thus the predictive accuracy of classifiers. Over the last decade, many researchers have proposed techniques for handling missing attribute values in databases with homogeneous and/or numeric attributes. In this work, we propose a new index measure for the imputation of missing attribute values; it computes the similarity between any two typical elements in the dataset. It can be applied to any dataset, whether its attributes are nominal, real-valued, or both. The proposed algorithm is evaluated through extensive experiments and compared with the KNNI, SVMI, WKNNI, KMI and FKMI algorithms. The results show that the proposed algorithm achieves better classification accuracy than the existing imputation algorithms, and that our decision tree algorithm employs highly accurate decision rules.
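The abstract does not define the proposed index measure, so the following is only a minimal sketch of the general idea it describes: imputing a missing value from the records most similar to the incomplete one, on data that mixes nominal and real attributes. As a stand-in for the paper's measure, it assumes a Gower-style similarity (range-normalized distance for numeric attributes, exact match for nominal ones). The names `similarity`, `column_ranges`, `impute`, the `MISSING` marker and the parameter `k` are all hypothetical and do not come from the paper.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing value (not from the paper)


def column_ranges(rows):
    """Range (max - min) of each numeric column; 0.0 for nominal columns."""
    ranges = []
    for j in range(len(rows[0])):
        nums = [r[j] for r in rows if isinstance(r[j], (int, float))]
        ranges.append(max(nums) - min(nums) if nums else 0.0)
    return ranges


def similarity(a, b, ranges):
    """Gower-style similarity in [0, 1], a stand-in for the paper's index measure.

    Only attributes observed in both records contribute to the score.
    """
    score, used = 0.0, 0
    for j, (x, y) in enumerate(zip(a, b)):
        if x is MISSING or y is MISSING:
            continue
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            span = ranges[j]
            score += 1.0 - (abs(x - y) / span if span else 0.0)
        else:
            score += 1.0 if x == y else 0.0
        used += 1
    return score / used if used else 0.0


def impute(rows, k=2):
    """Fill each missing cell from the k most similar records that have it.

    Numeric cells receive the donors' mean, nominal cells the donors' mode.
    The input rows are left unchanged; a completed copy is returned.
    """
    ranges = column_ranges(rows)
    completed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            donors = [r for t, r in enumerate(rows)
                      if t != i and r[j] is not MISSING]
            donors.sort(key=lambda r: similarity(row, r, ranges), reverse=True)
            nearest = [r[j] for r in donors[:k]]
            if not nearest:
                continue  # no record observes this attribute; leave the gap
            if all(isinstance(v, (int, float)) for v in nearest):
                completed[i][j] = sum(nearest) / len(nearest)
            else:
                completed[i][j] = Counter(nearest).most_common(1)[0][0]
    return completed
```

Calling `impute(rows, k=2)` on a list of equal-length records returns a completed copy. Swapping in a different `similarity` function is the only change needed to try another measure, which is roughly the role the abstract assigns to its index measure.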
