International Conference on Machine Learning and Cybernetics

An efficient validity index method for datasets with complex-shaped clusters

Abstract

This paper presents VDOGK, a variation of the validity index method VDO, for estimating the optimal number of clusters in datasets with concave or elongated clusters. The new index partitions the dataset with Gustafson-Kessel FCM, which reduces the sensitivity of standard FCM to the geometric shape of clusters. The index combines a dispersion measure and an overlap measure: the dispersion measure estimates overall cluster compactness, and the overlap measure estimates the total degree of ambiguity of data belonging to any pair of clusters in the dataset. A good clustering result is expected to make both measures small. Examples of synthetic datasets comprising concave, elongated, spherical, and/or elliptical clusters are presented. Experimental results on synthetic datasets and on real datasets from the UCI Machine Learning Repository show that the proposed VDOGK correctly estimated the number of clusters for all nine tested datasets, whereas VDO was correct only for three real datasets.
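
The abstract does not give the formulas for the two measures, so the following Python sketch only illustrates the general idea of scoring a fuzzy partition by "dispersion plus overlap". The functions `dispersion`, `overlap`, and `illustrative_index` are hypothetical stand-ins, not the definitions used by VDO or VDOGK, and the membership matrices are written by hand rather than produced by Gustafson-Kessel FCM.

```python
import numpy as np
from itertools import combinations


def dispersion(X, U, m=2.0):
    """Illustrative compactness term (an assumption, not the paper's formula).

    Fuzzy-weighted mean squared distance of points to their cluster centers,
    divided by the total variance of the data so the value is scale-free.
    """
    W = U ** m                                        # fuzzy weights, shape (n, c)
    centers = (W.T @ X) / W.sum(axis=0)[:, None]      # weighted centers, shape (c, d)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # (n, c)
    within = (W * d2).sum() / W.sum()
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=1).mean()
    return within / total


def overlap(U):
    """Illustrative ambiguity term (an assumption, not the paper's formula).

    For every pair of clusters, a point contributes the smaller of its two
    memberships; the sum over all pairs is averaged over the points.
    """
    n, c = U.shape
    pair_sum = sum(
        np.minimum(U[:, i], U[:, j]).sum() for i, j in combinations(range(c), 2)
    )
    return pair_sum / n


def illustrative_index(X, U, m=2.0):
    """Smaller is better: both terms should be small for a good partition."""
    return dispersion(X, U, m) + overlap(U)


# Two well-separated groups of two points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])

# Hand-written memberships: one nearly crisp, one maximally ambiguous.
U_good = np.array([[0.95, 0.05], [0.95, 0.05], [0.05, 0.95], [0.05, 0.95]])
U_bad = np.full((4, 2), 0.5)

score_good = illustrative_index(X, U_good)
score_bad = illustrative_index(X, U_bad)
print(score_good < score_bad)
```

In a setting like the one the abstract describes, such a score would be computed for each candidate number of clusters and the candidate with the smallest value would be kept; how VDOGK actually weights and normalizes the two measures is not stated here.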
