An Optimized k-means Algorithm Based on Information Entropy

MEILING LIU; BEIXIAN ZHANG; XI LI; WEIDONG TANG; GANGQIANG ZHANG

Abstract

Clustering, in which data objects are divided into groups, is widely used in data mining and pattern recognition applications. The k-means algorithm is one of the most classical clustering algorithms. Because it selects the initial cluster centers at random, its clustering results are unstable. To solve this problem, an optimized algorithm for selecting the initial centers is proposed, based on a dispersion degree that is defined in terms of entropy. All objects are first grouped into one big cluster, and the object with the maximum dispersion degree and the object with the minimum dispersion degree are selected from it as the initial cluster centers. The other objects in the biggest cluster are then assigned to whichever of these initial clusters is nearest. This partition process is repeated until the number of clusters equals the specified value k. Finally, the k partitioned clusters and their cluster centers are passed to the k-means algorithm as initial clusters and centers. Several experiments on real data sets evaluate the proposed algorithm against the traditional k-means algorithm and the max-min distance clustering algorithm. The experimental results show that the improved k-means algorithm is stable in selecting the initial clustering, because it selects unique initial cluster centers. The experiments also verify the effectiveness and feasibility of the optimized algorithm, which reduces the number of iterations and yields more stable clustering results and higher accuracy.
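The initialization procedure described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the formula for the dispersion degree, so `dispersion_degree` below is an assumed stand-in (the Shannon entropy of each object's normalized distances to the other objects in its cluster). The function names `dispersion_degree`, `entropy_init`, and `kmeans` are hypothetical, and using the mean of each partitioned cluster as its initial center is likewise an assumption.

```python
import numpy as np


def dispersion_degree(points):
    """ASSUMED stand-in for the paper's dispersion degree: the entropy of
    each object's normalized distances to the other objects."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    total = dist.sum(axis=1, keepdims=True)
    total[total == 0] = 1.0  # all objects identical: avoid division by zero
    p = dist / total
    # Treat 0 * log(0) as 0 by only taking the log where p > 0.
    logp = np.log(p, where=p > 0, out=np.zeros_like(p))
    return -(p * logp).sum(axis=1)


def entropy_init(points, k):
    """Repeatedly split the largest cluster in two until k clusters exist,
    then return the cluster means as initial centers (an assumption)."""
    points = np.asarray(points, dtype=float)
    clusters = [np.arange(len(points))]
    while len(clusters) < k:
        largest = max(range(len(clusters)), key=lambda i: len(clusters[i]))
        members = clusters.pop(largest)
        if len(members) < 2:
            raise ValueError("not enough objects to form k clusters")
        d = dispersion_degree(points[members])
        hi, lo = int(np.argmax(d)), int(np.argmin(d))
        if hi == lo:  # every object has the same dispersion degree
            lo = (hi + 1) % len(members)
        seeds = points[members[[hi, lo]]]
        dist = np.linalg.norm(points[members][:, None, :] - seeds[None, :, :], axis=2)
        nearest = dist.argmin(axis=1)
        nearest[hi], nearest[lo] = 0, 1  # each seed stays in its own cluster
        clusters.append(members[nearest == 0])
        clusters.append(members[nearest == 1])
    return np.array([points[c].mean(axis=0) for c in clusters])


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    points = np.asarray(points, dtype=float)
    centers = np.array(centers, dtype=float)
    for _ in range(max_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


if __name__ == "__main__":
    data = np.array([[0, 0], [0, 1], [1, 0],
                     [10, 10], [10, 11], [11, 10],
                     [20, 0], [20, 1], [21, 0]], dtype=float)
    init = entropy_init(data, k=3)
    labels, centers = kmeans(data, init)
    print(init)
    print(labels)
```

Because the sketch contains no random choices, repeated runs on the same data produce the same initial centers, which is the stability property the abstract attributes to the proposed method. Whether this stand-in reproduces the paper's reported accuracy or iteration counts is not something the sketch can claim.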

Bibliographic details

Source
The Computer Journal | 2021, Issue 7 | pp. 1130-1143 | 14 pages
Authors
MEILING LIU; BEIXIAN ZHANG; XI LI; WEIDONG TANG; GANGQIANG ZHANG
Author affiliations

College of Artificial Intelligence and Key Laboratory of Software Engineering, Guangxi University for Nationalities, Nanning 530006, China; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin 541004, China;

Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin 541004, China;

College of Artificial Intelligence and Key Laboratory of Software Engineering, Guangxi University for Nationalities, Nanning 530006, China;

College of Artificial Intelligence and Key Laboratory of Software Engineering, Guangxi University for Nationalities, Nanning 530006, China;

College of Artificial Intelligence and Key Laboratory of Software Engineering, Guangxi University for Nationalities, Nanning 530006, China;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Chinese Library Classification
Keywords
information entropy; dispersion degree; k-means; clustering; clustering center;

Similar literature

1. Comparison of K-Means and Fuzzy C-Means Algorithms on Simplification of 3D Point Cloud Based on Entropy Estimation [J]. Abdelaaziz Mahdaoui, Aziz Bouazi, Abdallah Marhraoui Hsaini. Advances in Science, Technology and Engineering Systems, 2017, Issue 5.
2. Urban flooding risk assessment based on an integrated k-means cluster algorithm and improved entropy weight method in the region of Haikou, China [J]. Xu Hongshi, Ma Chao, Lian Jijian. Journal of Hydrology, 2018.
3. Image segmentation algorithm based on dynamic particle swarm optimization and K-means clustering [J]. Wei Xiaoqiong, Yin E. Zhang. International Journal of Computers & Applications, 2020, Issue 7a8.
4. Entropy-Based Feature Selection for Data Clustering Using k-Means and k-Medoids Algorithms [C]. Moni Kishore Dhar, S. M. Nahid Hasan, Tahsin Rahaman Otushi. International Conference on Research in Computational Intelligence and Communication Networks, 2020.
5. Algorithm optimizations in genomic analysis using entropic dissection [D]. Danks, Jacob R. 2015.
6. A Novel Hybrid Meta-Heuristic Algorithm Based on the Cross-Entropy Method and Firefly Algorithm for Global Optimization [O]. Guocheng Li, Pei Liu, Chengyi Le. 2019.
7. A Network Intrusion Detection Model Based on K-means Algorithm and Information Entropy [O]. Gao Meng, Li Dan, Wang Ni-hong. 2014.
