Conference: International Symposium on Ubiquitous Networking

Initial Centroid Selection Method for an Enhanced K-means Clustering Algorithm

Abstract

Clustering is an important method for discovering structures and patterns in high-dimensional data and grouping similar observations together. K-means is one of the most popular clustering algorithms. It groups observations by minimizing the distances between observations within a group and maximizing the distances between groups. One of the first steps in this algorithm is centroid selection, in which the k initial centroids are chosen randomly, calculated, or given by the user. Existing k-means algorithms use the 'k-means++' option for this selection. In this paper, we suggest an enhanced version of k-means clustering that minimizes the runtime of the algorithm by using the 'Ndarray' option. Experiments have shown that if the initial choice of centroids is close to the final centers, the algorithm converges quickly. We therefore propose a new method for choosing good starting centroids, which reduces the execution time by ≈80% on average for the UCI, Shape and Miscellaneous datasets.
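The claim that a starting point close to the final centers converges faster can be illustrated with a minimal sketch. This is not the paper's selection method, which the abstract does not describe. It is a plain Lloyd's k-means that takes the initial centroids as an explicit argument, which is what passing an array (rather than the string 'k-means++') to the `init` parameter of scikit-learn's `KMeans` does; that the abstract's 'Ndarray' option refers to this parameter is an assumption. The function name `lloyd_kmeans`, the toy points, and both sets of starting centroids are hypothetical, and the iteration count is used here only as a rough proxy for runtime.

```python
import math


def lloyd_kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd's k-means; returns (centroids, number of iterations run)."""
    centroids = [tuple(c) for c in init_centroids]
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged: no centroid moved
            return centroids, iteration
        centroids = new_centroids
    return centroids, max_iter


# Hypothetical toy data: two well-separated square groups of four points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]

# A start close to the final centers versus a poor start inside one group.
near_centroids, near_iters = lloyd_kmeans(points, [(0.4, 0.6), (10.6, 10.4)])
far_centroids, far_iters = lloyd_kmeans(points, [(0.0, 0.0), (0.0, 1.0)])
```

On this toy data both runs end at the same centroids, but the run that starts near them needs fewer iterations. The gap is small here because the data set is tiny; it says nothing about the ≈80% figure reported in the abstract.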
