Research on Data filling Algorithm Based on Improved k-means and Information Entropy

Abstract

Human error, equipment failure, and other factors can cause an industrial Internet platform to produce records with missing values. To fill in the missing data, this paper adopts a data filling method based on improved k-means and information entropy. First, we pre-fill the missing values with the mean or the mode. Next, we replace the Euclidean distance in k-means clustering with the Mahalanobis distance to cluster the data and, within each cluster, calculate the similarity between each incomplete record and every complete record. Finally, following the KNN idea, we find the k complete records most similar to each incomplete record, use information entropy to calculate weight coefficients for those k records, and take the weighted combination of their corresponding attributes to fill in the missing attributes. Experimental results show that the proposed data filling algorithm achieves better filling precision than the k-means and KNN algorithms.
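The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `mahalanobis_sq` and `fill_missing` are hypothetical, only numeric attributes are handled (so the pre-fill uses the mean, not the mode), a single global covariance matrix is assumed for the Mahalanobis distance, and, because the abstract does not give the entropy formula, the neighbor weights use a stand-in based on the self-information -log(p_i) of each neighbor's normalized distance.

```python
import numpy as np


def mahalanobis_sq(X, centers, VI):
    """Squared Mahalanobis distance from every row of X to every center."""
    diff = X[:, None, :] - centers[None, :, :]            # shape (n, c, d)
    d2 = np.einsum("ncd,de,nce->nc", diff, VI, diff)      # shape (n, c)
    return np.maximum(d2, 0.0)                            # guard tiny negatives


def fill_missing(X, n_clusters=2, k=3, n_iter=20, seed=0):
    """Fill NaN entries of a numeric matrix (rows = records)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Step 1: pre-fill with column means (numeric attributes only).
    filled = np.where(missing, np.nanmean(X, axis=0), X)

    # Step 2: Lloyd-style k-means with Mahalanobis distance.
    # Assumption: one global covariance, pseudo-inverted for stability.
    VI = np.linalg.pinv(np.atleast_2d(np.cov(filled, rowvar=False)))
    rng = np.random.default_rng(seed)
    centers = filled[rng.choice(len(filled), n_clusters, replace=False)]
    for _ in range(n_iter):
        labels = mahalanobis_sq(filled, centers, VI).argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = filled[labels == c].mean(axis=0)

    # Steps 3-4: within each cluster, pick the k nearest complete records
    # and combine their attribute values with information-based weights.
    complete = ~missing.any(axis=1)
    out = filled.copy()
    eps = 1e-12
    for i in np.flatnonzero(~complete):
        cand = np.flatnonzero(complete & (labels == labels[i]))
        if cand.size == 0:               # no complete record in this cluster
            cand = np.flatnonzero(complete)
        if cand.size == 0:               # no complete record at all
            continue                     # keep the mean pre-fill
        d = np.sqrt(mahalanobis_sq(filled[i:i + 1], filled[cand], VI)[0])
        order = np.argsort(d)[:k]
        nn, d = cand[order], d[order]

        # Stand-in weighting (assumption, not the paper's formula):
        # p_i is the normalized distance; closer neighbors have smaller
        # p_i and therefore larger self-information -log(p_i).
        p = (d + eps) / (d + eps).sum()
        w = -np.log(p)
        if w.sum() > 0:
            w = w / w.sum()
        else:                            # single neighbor: p = 1, w = 0
            w = np.full(len(nn), 1.0 / len(nn))

        out[i, missing[i]] = w @ filled[nn][:, missing[i]]
    return out
```

Calling `fill_missing(X, n_clusters=2, k=2)` on a matrix with `np.nan` in the missing cells returns a copy in which observed values are unchanged and each missing value is a convex combination of the corresponding values from complete records. Because the weights are non-negative and sum to one, a filled value never falls outside the range of the neighbors' values.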

Bibliographic details

Source
IEEE International Conference on Computer and Communications | 2018 | pp. 1774-1778 | 5 pages
Authors
Xiaofei Gong; Jie Zhang; Yijie Shi
Original format: PDF
Keywords
Clustering algorithms; Filling; Information entropy; Euclidean distance; Uncertainty; Internet; Telecommunications

Similar literature

1. Urban flooding risk assessment based on an integrated k-means cluster algorithm and improved entropy weight method in the region of Haikou, China [J]. Xu Hongshi, Ma Chao, Lian Jijian. Journal of Hydrology. 2018
2. Data design and analysis based on cloud computing and improved K-Means algorithm [J]. Wu Chunqiong, Yu Rongrui, Yan Bingwen. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology. 2020, No. 4Pta1
3. SMK-means: An Improved Mini Batch K-means Algorithm Based on Mapreduce with Big Data [J]. Xiao Bo, Wang Zhen, Liu Qi. Computers, Materials & Continua. 2018, No. 3
4. Research on Data filling Algorithm Based on Improved k-means and Information Entropy [C]. Xiaofei Gong, Jie Zhang, Yijie Shi. IEEE International Conference on Computer and Communications. 2018
5. The K-MM clustering algorithm based on K-means and K-medoids in data mining [D]. Li, Yihao. 2011
6. Big-Data-Mining-Based Improved K-Means Algorithm for Energy Use Analysis of Coal-Fired Power Plant Units: A Case Study [O]. Binghan Liu, Zhongguang Fu, Pengkai Wang. 2018
7. An Improved Parameter less Data Clustering Technique based on Maximum Distance of Data and Lioyd k-means Algorithm [O]. Mohd Wan Maseri Binti Wan, Beg A.H., Herawan Tutut. 2012
