Use of symmetry and stability for data clustering

Abstract

An important consideration in clustering is the determination of an algorithm appropriate for partitioning a given data set. Thereafter, identification of the correct model order and determination of the corresponding partitioning need to be performed. In this paper, the effectiveness of the recently developed symmetry-based cluster validity index named Sym-index, which provides a measure of "symmetricity" of the different partitionings of a data set, is first shown to address all the above-mentioned issues, viz., identifying the appropriate clustering algorithm, determining the proper model order, and evolving the proper partitioning, as long as the clusters possess the property of symmetry. Results demonstrating the superiority of the proposed cluster validity measure in determining the proper clustering technique as well as the appropriate model order, as compared to five other recently proposed measures, namely PS-index, I-index, CS-index, the well-known XB-index, and a stability-based index, are provided for several clustering methods, viz., two recently developed genetic algorithm based clustering techniques, the average linkage clustering algorithm, the self-organizing map, and the expectation maximization clustering algorithm. Five artificial data sets and three real-life data sets are considered for this purpose. In the second part of the paper, a new measure of stability of clustering solutions over different bootstrap samples of a data set is proposed. Thereafter, a multiobjective optimization based clustering technique is developed which optimizes both Sym-index and the measure of stability simultaneously to automatically determine the appropriate number of clusters and the appropriate partitioning of data sets having symmetrically shaped clusters. Results on five artificial and five real-life data sets show that the proposed technique is well-suited to detect the number of clusters from data sets having point symmetric clusters.
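The abstract does not define the proposed stability measure, so the following is only a generic sketch of the underlying idea: cluster several bootstrap samples and check how consistently the resulting partitions label the original points. Everything in it is an assumption made for illustration and not the paper's method: plain k-means stands in for the clustering algorithm, the Rand index stands in for the agreement score, and the names `kmeans`, `rand_index`, and `bootstrap_stability` are hypothetical.

```python
import random
from itertools import combinations

def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])))

def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means (an assumed stand-in); returns the k centers."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def rand_index(a, b):
    """Fraction of point pairs that two labelings treat the same way."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def bootstrap_stability(points, k, n_boot=10, seed=0):
    """Mean pairwise Rand index between partitions from bootstrap samples.

    Illustrative measure only; not the measure proposed in the paper.
    """
    rng = random.Random(seed)
    labelings = []
    for _ in range(n_boot):
        # Resample with replacement, cluster the sample, then label
        # every original point by its nearest learned center.
        sample = [rng.choice(points) for _ in points]
        centers = kmeans(sample, k, rng)
        labelings.append([nearest(p, centers) for p in points])
    scores = [rand_index(a, b) for a, b in combinations(labelings, 2)]
    return sum(scores) / len(scores)

# Two well-separated synthetic blobs (made-up data for the demo).
blob_a = [(i * 0.1, (i % 3) * 0.1) for i in range(15)]
blob_b = [(10 + i * 0.1, 10 + (i % 3) * 0.1) for i in range(15)]
stability_k2 = bootstrap_stability(blob_a + blob_b, k=2)
```

A score near 1 means the bootstrap partitions agree with one another. In a multiobjective setting such as the one the abstract describes, a score of this kind would be one objective alongside a validity index, but how the paper combines the two is not stated here.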

Bibliographic record

  • Source
    Evolutionary Intelligence | 2010, Issue 4 | pp. 103-122 | 20 pages
  • Author affiliations

    Image Processing and Modeling, Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg,Heidelberg, Germany;

    Department of Theoretical Bioinformatics, DKFZ (Deutsches Krebsforschungszentrum, German Cancer Research Center), Heidelberg, Germany;

  • Original format: PDF
  • Language of text: eng
  • Keywords

    clustering; multiobjective optimization (MOO); symmetry; stability;
  • Date indexed: 2022-08-18 02:12:20