Cooperative clustering


Abstract

Data clustering plays an important role in many disciplines, including data mining, machine learning, bioinformatics, and pattern recognition, wherever the inherent grouping structure of data must be learned in an unsupervised manner. Many clustering approaches have been proposed in the literature, each with a different quality/complexity tradeoff. Each clustering algorithm works well within its own domain, and none is optimal for all datasets of differing properties, sizes, structures, and distributions. This paper presents a novel cooperative clustering (CC) model in which multiple clustering techniques cooperate to increase the homogeneity of objects within the clusters. The CC model handles datasets with different properties by means of two data structures: a histogram representation of the pair-wise similarities and a cooperative contingency graph. These two data structures are designed to find the matching sub-clusters between different clusterings and to obtain the final set of clusters through a coherent merging process. The cooperative model is consistent and scalable in terms of the number of adopted clustering approaches. Experimental results show that the cooperative clustering model outperforms the individual clustering algorithms on a number of gene expression and text document datasets.
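The abstract does not give the construction of the two data structures, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that a "sub-cluster" is the set of objects sharing the same pair of labels under two clusterings, and that the histogram counts pair-wise similarities in equal-width bins over [0, 1]. The function names `sub_clusters` and `similarity_histogram`, the toy data, and the similarity measure are all hypothetical.

```python
from collections import defaultdict

def sub_clusters(labels_a, labels_b):
    """Group object indices by the pair (label in clustering A, label in clustering B)."""
    groups = defaultdict(list)
    for idx, (a, b) in enumerate(zip(labels_a, labels_b)):
        groups[(a, b)].append(idx)
    return dict(groups)

def similarity_histogram(members, sim, bins=5):
    """Count pair-wise similarities (assumed to lie in [0, 1]) per equal-width bin."""
    counts = [0] * bins
    for i, u in enumerate(members):
        for v in members[i + 1:]:
            k = min(int(sim(u, v) * bins), bins - 1)  # clamp sim == 1.0 into the last bin
            counts[k] += 1
    return counts

# Toy data (assumed): six 1-D points and two clusterings that partly disagree.
points = [0.0, 0.1, 0.9, 1.0, 1.1, 0.2]
labels_a = [0, 0, 0, 1, 1, 1]
labels_b = [0, 0, 1, 1, 1, 0]

def sim(u, v):
    """Hypothetical similarity in (0, 1]: closer points score higher."""
    return 1.0 / (1.0 + abs(points[u] - points[v]))

groups = sub_clusters(labels_a, labels_b)
histograms = {key: similarity_histogram(m, sim) for key, m in groups.items()}
```

Sub-clusters on which both clusterings agree, such as `(0, 0)` here, concentrate their histogram mass in the high-similarity bins. A merging step could use such histograms to decide which sub-clusters to combine, but the actual merging criterion is not stated in the abstract.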
