A Novel Approach of Data Clustering Using An Improved Particle Swarm Optimization Based K–Means Clustering Algorithm

Abstract

This paper presents a Modified Particle Swarm Optimization (MfPSO) based K-Means algorithm for clustering multidimensional data. A poor choice of initial cluster centers can degrade the K-Means clustering result and leave the algorithm stuck at a local minimum. To address these problems, the proposed MfPSO is used to generate the cluster centers for a dataset. The inertia weight plays a vital role in balancing global search and local search in PSO. In the proposed algorithm, the inertia weight is modified to improve the convergence speed and the global search capability. The cluster centers generated by MfPSO are then used as the initial cluster centers of the K-Means algorithm. Quantitative results show that the proposed algorithm produces better results and overcomes the local minima problem. The proposed algorithm is compared extensively with the conventional K-Means algorithm and with chaotic descending inertia weight based PSO (CDIW PSO) on four well-known datasets. Its superiority is established visually and quantitatively on the basis of two standard cluster evaluation criteria; computational time; the mean and standard deviation of fitness; two statistical significance tests, the ANOVA test and the t-test; and best fitness curves and convergence curves for different levels of clustering.
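The two-stage pipeline described in the abstract (a PSO search for cluster centers, followed by K-Means seeded with those centers) can be sketched roughly as follows. This is a minimal illustration, not the authors' MfPSO: the abstract does not give the modified inertia weight formula, so the sketch substitutes a plain linearly decreasing inertia weight as a stand-in. The function names (`sse`, `pso_centers`, `kmeans`), the sum-of-squared-errors fitness, and all parameter values (`n_particles`, `w_max`, `w_min`, `c1`, `c2`) are assumptions chosen for illustration.

```python
import numpy as np

def sse(data, centers):
    """Sum of squared distances from each point to its nearest center."""
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def pso_centers(data, k, n_particles=20, n_iter=50,
                w_max=0.9, w_min=0.4, c1=2.0, c2=2.0, seed=0):
    """PSO search for k cluster centers; each particle encodes k centers."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    # Initialise every particle from k randomly chosen data points.
    pos = np.stack([data[rng.choice(n, k, replace=False)]
                    for _ in range(n_particles)]).astype(float)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([sse(data, p) for p in pos])
    g = pbest_fit.argmin()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for t in range(n_iter):
        # ASSUMPTION: linearly decreasing inertia weight, used here as a
        # stand-in because the paper's modified weight is not given.
        w = w_max - (w_max - w_min) * t / max(n_iter - 1, 1)
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        fit = np.array([sse(data, p) for p in pos])
        improved = fit < pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmin()
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest

def kmeans(data, init_centers, n_iter=100):
    """Lloyd's K-Means starting from the given initial centers."""
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = data[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute labels so they match the returned centers.
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)
```

Under these assumptions, a call such as `centers, labels = kmeans(data, pso_centers(data, k=3))` runs the PSO stage first and then refines its best particle with K-Means. Because each Lloyd iteration cannot increase the sum of squared errors, the refined centers are never worse on that measure than the PSO seed.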