Initialization Dependence of Clustering Algorithms


Abstract

It is well known that the clusters produced by a clustering algorithm depend on the chosen initial centers. In this paper we present a measure of the degree to which a given clustering algorithm depends on the choice of initial centers for a given data set. This measure is calculated for four well-known offline clustering algorithms (k-means Forgy, k-means Hartigan, k-means Lloyd and fuzzy c-means) on five benchmark data sets. The measure is also calculated for ECM, an online algorithm that does not require the number of initial centers as input, but whose resulting clusters can depend on the order in which the input arrives. Our main finding is that this initialization dependence measure can also be used to determine the optimal number of clusters.
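The abstract does not define the measure itself, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes one plausible proxy: run Lloyd-style k-means from several random initializations and average the pairwise disagreement (one minus the Rand index) between the resulting partitions. The function names `lloyd_kmeans`, `rand_index` and `init_dependence`, and the parameters `runs` and `seed`, are hypothetical.

```python
import random
from itertools import combinations


def lloyd_kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd iteration; initial centers are k data points drawn at random."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center (squared Euclidean).
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def init_dependence(points, k, runs=10, seed=0):
    """Assumed proxy, NOT the paper's measure: mean pairwise (1 - Rand index) over runs."""
    rng = random.Random(seed)
    labelings = [lloyd_kmeans(points, k, rng) for _ in range(runs)]
    scores = [1.0 - rand_index(a, b) for a, b in combinations(labelings, 2)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    data_rng = random.Random(42)
    # Three synthetic Gaussian blobs, used purely as demonstration data.
    data = [
        (data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
        for cx, cy in [(0, 0), (5, 5), (0, 5)]
        for _ in range(30)
    ]
    for k in range(1, 7):
        print(k, round(init_dependence(data, k), 4))
```

Under this proxy, a score of 0 means every run produced the same partition, and larger values mean the result varies more with the initial centers. Scanning the score over several values of k, as the demo loop does, is one way to explore the abstract's claim that such a measure can inform the choice of the number of clusters; how the paper actually makes that choice is not stated in the abstract.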
