首页> 外文OA文献 >Analysis of global k-means, an incremental heuristic for minimum sum-of-squares clustering
【2h】

Analysis of global k-means, an incremental heuristic for minimum sum-of-squares clustering

机译:全局k均值分析,最小平方和聚类的增量启发法

摘要

The global k-means heuristic is a recently proposed (Likas, Vlassis and Verbeek, 2003) incremental approach for minimum sum-of-squares clustering of a set X of N points of Rd into M clusters. For k = 2,3,..., M - 1 it considers the best-known set of k - 1 centroids previously obtained, adds a new cluster center at each point of X in turn and applies k-means to each set of k centroids so-obtained, keeping the best k-partition found. We show that global k-means cannot be guaranteed to find the optimum partition for any M ≥ 2 and d ≥ 1; moreover, the same holds for all M ≥ 3 if the new cluster center is chosen anywhere in Rd instead of belonging to X. The empirical performance of global k-means is also evaluated by comparing the values it obtains with those obtained for three data sets with N ≤ 150 which are solved optimally, as well as with values obtained by the recent j-means heuristic and extensions thereof for three larger data sets with N ≤ 3038.
机译:全局k均值启发式是最近提出的一种增量方法(Likas,Vlassis和Verbeek,2003年),用于将R的N个点的X个集合最小化为M个聚类。对于k = 2,3,...,M-1,它考虑先前获得的最著名的k-1个质心集,依次在X的每个点处添加一个新的聚类中心,并将k均值应用于每个这样获得了k个质心,从而保持了找到的最佳k分区。我们证明,对于任何M≥2和d≥1,不能保证全局k均值找到最佳划分;此外,如果在Rd的任意位置选择新的聚类中心而不属于X,则对于所有M≥3都适用。对于k均值的经验性能,还通过将其获得的值与针对三个数据集获得的值进行比较来评估N≤150的最优解,以及最近的j均值启发式及其扩展得到的值,用于三个更大的N≤3038的数据集。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号