Conference: IEEE International Meeting on Power, Electronics and Computing

Performance Analysis of K-Means Seeding Algorithms

Abstract

K-Means is one of the most widely used clustering algorithms. However, because its optimization process is based on a greedy iterative gradient descent, K-Means is sensitive to the initial set of centers. It has been shown that a bad initial set of centroids can reduce the quality of the resulting clusters. Numerous initialization methods have therefore been developed to prevent poor K-Means clustering performance. Nonetheless, these initialization methods are usually validated using the Sum of Squared Errors (SSE) as the quality measure. In this study, we evaluate three state-of-the-art initialization methods with three different quality measures: SSE, the Silhouette Coefficient, and the Adjusted Rand Index. The analysis is carried out on seventeen benchmarks. We provide new insight into the performance of initialization methods that are traditionally overlooked; our results show a high correlation between different initialization methods and fitness functions. These results may help to optimize K-Means, with low effort, for topological structures beyond those covered by optimizing SSE.
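The kind of comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' experimental setup: the abstract does not name the three initialization methods or the seventeen benchmarks, so the sketch assumes the two seeding options built into scikit-learn (`"random"` and `"k-means++"`) and a synthetic dataset from `make_blobs`. The helper name `evaluate_seeding` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score


def evaluate_seeding(X, y_true, k, init, n_runs=10):
    """Run K-Means n_runs times with one seeding method; return mean scores.

    Hypothetical helper for illustration only.
    """
    sse, sil, ari = [], [], []
    for seed in range(n_runs):
        # n_init=1 so that each run reflects a single initial set of centers.
        model = KMeans(n_clusters=k, init=init, n_init=1, random_state=seed)
        labels = model.fit_predict(X)
        sse.append(model.inertia_)                       # Sum of Squared Errors
        sil.append(silhouette_score(X, labels))          # Silhouette Coefficient
        ari.append(adjusted_rand_score(y_true, labels))  # Adjusted Rand Index
    return {
        "sse": float(np.mean(sse)),
        "silhouette": float(np.mean(sil)),
        "ari": float(np.mean(ari)),
    }


# Synthetic stand-in data (an assumption; not one of the paper's benchmarks).
X, y = make_blobs(n_samples=300, centers=4, cluster_std=1.0, random_state=0)

results = {
    init: evaluate_seeding(X, y, k=4, init=init)
    for init in ("random", "k-means++")
}
for name, scores in results.items():
    print(name, scores)
```

SSE and the Silhouette Coefficient are internal measures computed from the data and the predicted labels alone, whereas the Adjusted Rand Index needs ground-truth labels, so a comparison like this only applies to labeled benchmarks.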
