Journal: Artificial Intelligence Review: An International Science and Engineering Journal

Clustering ensemble selection considering quality and diversity


Abstract

A partition that a stability measure judges to be poor may still contain one or more high-quality clusters, which are then discarded along with the rest of the partition. Motivated by this limitation of partition-level evaluation, researchers have turned to defining measures that evaluate individual clusters. Many stability measures, such as Normalized Mutual Information (NMI), have been proposed to validate a partition, and the cluster-level measures defined so far are also based on NMI. This paper discusses the drawback of this commonly used approach and proposes a criterion, called Edited Normalized Mutual Information (ENMI), for assessing the association between a cluster and a partition. The ENMI criterion compensates for the drawback of the common NMI measure. A clustering ensemble method based on aggregating a subset of primary clusters is also proposed. The method uses the average ENMI as a fitness measure for selecting clusters: those that satisfy a predefined threshold on this measure participate in the final ensemble. The chosen clusters are combined using three classes of consensus functions. The first class is co-association based. Because the Evidence Accumulation Clustering (EAC) method cannot derive the co-association matrix from a subset of clusters, Extended EAC (EEAC) is employed to construct the co-association matrix from the chosen subset. The second class is based on hypergraph partitioning algorithms. The third class treats the chosen clusters as a new feature space and applies a simple clustering algorithm to extract the consensus partitioning. Empirical studies show that the proposed method outperforms other well-known ensembles.
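The selection and co-association steps described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the abstract: the per-cluster scores are placeholder numbers standing in for average ENMI (whose formula is not given here), the co-association entry for two points is assumed to be the number of selected clusters containing both, divided by the larger of their individual appearance counts, and the final step is a plain threshold-and-connected-components cut used as a stand-in for the paper's co-association based consensus functions. All function names are hypothetical.

```python
from itertools import combinations


def select_clusters(clusters, scores, threshold):
    """Keep the clusters whose precomputed fitness score meets the threshold."""
    return [c for c, s in zip(clusters, scores) if s >= threshold]


def co_association(selected, n_points):
    """Build a co-association matrix from an arbitrary subset of clusters.

    Assumed normalization (not stated in the abstract): the number of
    selected clusters shared by points i and j, divided by the larger of
    the two points' appearance counts.
    """
    appear = [0] * n_points
    shared = [[0] * n_points for _ in range(n_points)]
    for cluster in selected:
        members = sorted(set(cluster))
        for i in members:
            appear[i] += 1
        for i, j in combinations(members, 2):
            shared[i][j] += 1
            shared[j][i] += 1
    matrix = [[0.0] * n_points for _ in range(n_points)]
    for i in range(n_points):
        for j in range(n_points):
            if i == j:
                matrix[i][j] = 1.0 if appear[i] else 0.0
            else:
                denom = max(appear[i], appear[j])
                matrix[i][j] = shared[i][j] / denom if denom else 0.0
    return matrix


def consensus_labels(matrix, cut):
    """Stand-in consensus: link points whose co-association is >= cut and
    return connected-component labels, numbered by first appearance."""
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] >= cut:
                parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy data: clusters pooled from three partitions of six points (indices 0..5).
clusters = [
    [0, 1, 2], [3, 4, 5],          # partition 1
    [0, 1], [2, 3], [4, 5],        # partition 2
    [0, 1, 2, 3], [4, 5],          # partition 3
]
# Placeholder scores standing in for average ENMI (hypothetical values).
scores = [0.9, 0.8, 0.7, 0.2, 0.85, 0.3, 0.9]

selected = select_clusters(clusters, scores, threshold=0.5)
matrix = co_association(selected, n_points=6)
labels = consensus_labels(matrix, cut=0.5)
print(labels)
```

In this toy run the two low-scoring clusters are dropped, so point 3 appears in only one selected cluster. Dividing by per-point appearance counts, rather than by a fixed number of partitions, is what keeps the entries comparable when points are covered unevenly by the selected subset.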
