Venue: IEEE International Conference on Computer and Communications

Research on Data filling Algorithm Based on Improved k-means and Information Entropy

Abstract

Human error, equipment failure, and other factors can cause an industrial Internet platform to produce records with missing values. To fill in these missing values, this paper adopts a data filling method based on improved k-means and information entropy. First, the missing values are pre-filled with the mean or the mode. Then, the Euclidean distance in k-means clustering is replaced with the Mahalanobis distance to cluster the data, and within each cluster the similarity between each incomplete record and every complete record is calculated. Finally, following the KNN idea, the k complete records most similar to each incomplete record are found, information entropy is used to calculate weight coefficients for these k records, and their corresponding attribute values are weighted to fill in the missing attributes. Experimental results show that the proposed algorithm achieves better filling precision than the k-means and KNN algorithms.
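The filling stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the clustering step is omitted (the rows passed in are assumed to already belong to one cluster), only numeric attributes with mean pre-filling are handled, and all function names are hypothetical. The abstract does not state the entropy formula, so `information_weights` substitutes a self-information score, -ln(p_i), of each neighbour's share of the total distance; this is an assumption that only preserves the property that closer records get larger weights.

```python
import math

def prefill(rows):
    """Replace each None with the mean of the observed values in its column."""
    means = []
    for col in zip(*rows):
        seen = [v for v in col if v is not None]
        means.append(sum(seen) / len(seen))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def covariance(rows):
    """Sample covariance matrix of fully numeric rows."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / (n - 1)
             for b in range(d)] for a in range(d)]

def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting.
    Assumes the matrix is non-singular."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z

def mahalanobis(x, y, cov):
    """sqrt((x - y)^T cov^{-1} (x - y)), computed without forming the inverse."""
    diff = [a - b for a, b in zip(x, y)]
    return math.sqrt(max(0.0, sum(d * z for d, z in zip(diff, solve(cov, diff)))))

def information_weights(distances, eps=1e-12):
    """ASSUMED weighting, not taken from the paper: score each neighbour by
    -ln(its share of the total distance), then normalise to sum to 1."""
    if len(distances) == 1:
        return [1.0]
    shifted = [d + eps for d in distances]  # avoids log(0) for exact matches
    total = sum(shifted)
    info = [-math.log(d / total) for d in shifted]
    info_sum = sum(info)
    return [i / info_sum for i in info]

def fill_cluster(rows, k=3):
    """Fill None entries in rows that are assumed to form a single cluster."""
    filled = prefill(rows)                  # step 1: mean pre-fill
    cov = covariance(filled)
    complete = [i for i, r in enumerate(rows) if None not in r]
    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        if None not in row:
            continue
        # k complete records closest to the pre-filled incomplete record
        nearest = sorted((mahalanobis(filled[i], filled[c], cov), c)
                         for c in complete)[:k]
        weights = information_weights([d for d, _ in nearest])
        for j, v in enumerate(row):
            if v is None:                   # weighted fill of the missing attribute
                result[i][j] = sum(w * rows[c][j]
                                   for w, (_, c) in zip(weights, nearest))
    return result

rows = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [2.5, None]]
filled_rows = fill_cluster(rows, k=3)
```

Because the weights are positive and sum to 1, each filled value is a convex combination of the neighbours' values for that attribute. A full pipeline would run Mahalanobis-based k-means first and call `fill_cluster` once per cluster.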
