Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads

Abstract

The problem of determining “the right number of clusters” in K-Means has attracted considerable interest, especially in recent years. Cluster intermix appears to be the factor that most affects clustering results. This paper proposes an experimental setting for comparing different approaches on data generated from Gaussian clusters, with controlled parameters of between- and within-cluster spread used to model cluster intermix. The setting allows centroid recovery to be evaluated on a par with the conventional evaluation of cluster recovery. The subjects of our interest are two versions of the “intelligent” K-Means method, iK-Means, which find the “right” number of clusters by extracting “anomalous patterns” from the data one by one. We compare them with seven other methods, including Hartigan’s rule, averaged Silhouette width, and the Gap statistic, under different between- and within-cluster spread-shape conditions. The results of our experiments show several consistent patterns; for example, Hartigan’s rule reproduces the right K best, but not the clusters or their centroids. This leads us to propose an adjusted version of iK-Means, which performs well in the current experimental setting.
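To make the setting concrete, the following is a minimal sketch of one of the compared methods, Hartigan's rule, applied to synthetic Gaussian clusters. It is an illustration under stated assumptions and not the paper's experimental code. The function names (`generate_clusters`, `kmeans`, `best_w`, `hartigan_k`) and all parameter values are hypothetical. The generator draws spherical clusters only, with one standard deviation for the centroids (between-cluster spread) and one for the points around each centroid (within-cluster spread), which simplifies the spread-shape conditions mentioned in the abstract. The rule is taken in its commonly cited form: with W_K the within-cluster sum of squares of the best K-Means partition found, choose the smallest K for which (W_K / W_{K+1} - 1)(N - K - 1) does not exceed 10.

```python
import random

def generate_clusters(k, n_per_cluster, dim, between_spread, within_spread, seed=0):
    """Spherical Gaussian clusters (a simplifying assumption).
    Centroids ~ N(0, between_spread^2 I); points ~ N(centroid, within_spread^2 I)."""
    rng = random.Random(seed)
    centroids = [[rng.gauss(0.0, between_spread) for _ in range(dim)]
                 for _ in range(k)]
    points = [[rng.gauss(c[j], within_spread) for j in range(dim)]
              for c in centroids for _ in range(n_per_cluster)]
    return points, centroids

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd iterations started from k randomly chosen data points.
    Returns (centroids, W), W being the summed squared distance of each
    point to its nearest centroid."""
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # An empty group keeps its previous centroid.
        updated = [[sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
                   for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    w = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, w

def best_w(points, k, restarts=5, seed=0):
    """Smallest W over several random restarts of K-Means."""
    rng = random.Random(seed)
    return min(kmeans(points, k, rng)[1] for _ in range(restarts))

def hartigan_k(points, k_max=6, threshold=10.0):
    """Smallest K whose Hartigan index is at most the threshold;
    falls back to k_max if no K qualifies."""
    n = len(points)
    w = {k: best_w(points, k) for k in range(1, k_max + 2)}
    for k in range(1, k_max + 1):
        h = (w[k] / w[k + 1] - 1.0) * (n - k - 1)
        if h <= threshold:
            return k
    return k_max

# Usage: three clusters with a large between-cluster and a small within-cluster spread.
points, true_centroids = generate_clusters(3, 50, 2, between_spread=10.0, within_spread=1.0)
print("K chosen by Hartigan's rule:", hartigan_k(points))
```

Raising `within_spread` relative to `between_spread` increases cluster intermix, which is the condition under which the abstract reports that the methods differ most.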