Conference: IEEE International Meeting on Power, Electronics and Computing

Performance Analysis of K-Means Seeding Algorithms

Abstract

K-Means is one of the most widely used clustering algorithms. However, because its optimization process is based on a greedy iterated gradient descent, K-Means is sensitive to the initial set of centers. It has been shown that a bad initial set of centroids can reduce cluster quality. Therefore, numerous initialization methods have been developed to prevent poor K-Means clustering performance. Nonetheless, these initialization methods are usually validated using only the Sum of Squared Errors (SSE) as the quality measure. In this study, we evaluate three state-of-the-art initialization methods with three different quality measures: SSE, the Silhouette Coefficient, and the Adjusted Rand Index. The analysis is carried out on seventeen benchmarks. We provide new insight into the performance of initialization methods that are traditionally overlooked; our results describe a high correlation between different initialization methods and fitness functions. These results may help to optimize K-Means, with low effort, for other topological structures beyond those covered by optimizing SSE.
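To make the evaluation setup concrete, here is a minimal sketch of how two seeding strategies might be compared on SSE and the Adjusted Rand Index. The abstract does not name the three initialization methods studied, so uniform random seeding and k-means++ are used purely as illustrative assumptions. The synthetic blob data, the function names, and the run count are likewise hypothetical, and the Silhouette Coefficient is omitted for brevity.

```python
import random
from collections import Counter
from math import comb

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def random_seeding(points, k, rng):
    # Assumed baseline: pick k distinct data points uniformly at random.
    return rng.sample(points, k)

def kmeanspp_seeding(points, k, rng):
    # k-means++ style seeding: each new center is drawn with probability
    # proportional to its squared distance from the nearest chosen center.
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

def assign(points, centers):
    # Label each point with the index of its nearest center.
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def lloyd(points, centers, max_iter=100):
    # Standard Lloyd iterations; an empty cluster keeps its old center.
    centers = list(centers)
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

def sse(points, centers, labels):
    # Sum of squared distances from each point to its assigned center.
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))

def adjusted_rand_index(true, pred):
    # Pair-counting ARI computed from the contingency table.
    n = len(true)
    pairs = sum(comb(c, 2) for c in Counter(zip(true, pred)).values())
    a = sum(comb(c, 2) for c in Counter(true).values())
    b = sum(comb(c, 2) for c in Counter(pred).values())
    expected = a * b / comb(n, 2)
    max_index = (a + b) / 2
    if max_index == expected:
        return 1.0
    return (pairs - expected) / (max_index - expected)

def make_blobs(rng, means, n_per=50, std=0.5):
    # Hypothetical 2-D benchmark: one Gaussian blob per ground-truth class.
    points, truth = [], []
    for label, (mx, my) in enumerate(means):
        for _ in range(n_per):
            points.append((rng.gauss(mx, std), rng.gauss(my, std)))
            truth.append(label)
    return points, truth

def compare(seed=0, runs=20):
    # Run each seeding method several times; collect (SSE, ARI) per run.
    rng = random.Random(seed)
    points, truth = make_blobs(rng, [(0, 0), (5, 5), (0, 5)])
    results = {}
    for name, seeding in [("random", random_seeding),
                          ("k-means++", kmeanspp_seeding)]:
        scores = []
        for _ in range(runs):
            centers, labels = lloyd(points, seeding(points, 3, rng))
            scores.append((sse(points, centers, labels),
                           adjusted_rand_index(truth, labels)))
        results[name] = scores
    return results

if __name__ == "__main__":
    for name, scores in compare().items():
        mean_sse = sum(s for s, _ in scores) / len(scores)
        mean_ari = sum(a for _, a in scores) / len(scores)
        print(f"{name}: mean SSE={mean_sse:.2f}, mean ARI={mean_ari:.3f}")
```

Reporting an internal measure (SSE) next to an external one (ARI) for each run is the point of the sketch: a seeding method that ranks well on one measure need not rank the same way on another.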