CGC: A Flexible and Robust Approach to Integrating Co-Regularized Multi-Domain Graph for Clustering



Abstract

Multi-view graph clustering aims to enhance clustering performance by integrating heterogeneous information collected in different domains. Each domain provides a different view of the data instances. Leveraging cross-domain information has been shown to be an effective way to achieve better clustering results. Despite this success, existing multi-view graph clustering methods usually assume that different views are available for the same set of instances, so that instances in different domains can be treated as having a strict one-to-one relationship. In many real-life applications, however, a data instance in one domain may correspond to multiple instances in another domain. Moreover, relationships between instances in different domains may carry weights based on prior (partial) knowledge. In this paper, we propose a flexible and robust framework, CGC (Co-regularized Graph Clustering), based on non-negative matrix factorization (NMF), to tackle these challenges. CGC has several advantages over existing methods. First, it supports many-to-many cross-domain instance relationships. Second, it incorporates weights on cross-domain relationships. Third, it allows partial cross-domain mappings, so that graphs in different domains may have different sizes. Finally, it reports the extent to which the cross-domain instance relationships violate the in-domain clustering structure, and thus enables users to re-evaluate the consistency of those relationships. We develop an efficient optimization method that is guaranteed to find the globally optimal solution under a given confidence requirement. The proposed method can automatically identify noisy domains and assign smaller weights to them, which helps to obtain an optimal graph partition for the focused domain. Extensive experimental results on UCI benchmark data sets, newsgroup data sets, and biological interaction networks demonstrate the effectiveness of our approach.
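The cross-domain ideas in the abstract (weighted, many-to-many, partial mappings, and a per-instance measure of how much a mapping disagrees with the in-domain clustering) can be illustrated with a small NumPy sketch. This is not taken from the paper: the function name `cross_domain_residual`, the row normalisation of the relationship matrix, and the squared-residual score are all assumptions chosen for illustration, and the membership matrices are written by hand rather than produced by NMF.

```python
import numpy as np

def cross_domain_residual(S, H_src, H_dst):
    """Score how well mapped source memberships agree with destination ones.

    S     : (n_dst, n_src) non-negative relationship weights. A row may have
            several non-zero entries (many-to-many); an all-zero row means the
            destination instance has no known counterpart (partial mapping).
    H_src : (n_src, k) non-negative cluster memberships in the source domain.
    H_dst : (n_dst, k) non-negative cluster memberships in the destination domain.

    Returns (residual, mapped): the squared residual per destination instance
    and a boolean mask of the instances that have at least one mapping.
    """
    S = np.asarray(S, dtype=float)
    row_sums = S.sum(axis=1, keepdims=True)
    mapped = row_sums[:, 0] > 0

    # Assumption: normalise each mapped row so its weights sum to one.
    S_norm = np.zeros_like(S, dtype=float)
    S_norm[mapped] = S[mapped] / row_sums[mapped]

    # Project source memberships into the destination domain and compare.
    diff = S_norm[mapped] @ H_src - H_dst[mapped]
    residual = np.zeros(S.shape[0])
    residual[mapped] = (diff ** 2).sum(axis=1)
    return residual, mapped

# Hypothetical toy data: 3 source instances, 4 destination instances, k = 2.
H_src = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
H_dst = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
S = np.array([
    [0.5, 0.5, 0.0],  # maps to two source instances in the same cluster
    [0.0, 0.0, 2.0],  # single weighted mapping
    [3.0, 0.0, 1.0],  # mostly mapped to a source instance in the other cluster
    [0.0, 0.0, 0.0],  # no known mapping
])

residual, mapped = cross_domain_residual(S, H_src, H_dst)
print(residual)  # the third instance has the only non-zero residual
print(mapped)
```

In this toy setting, a large residual flags a relationship that conflicts with the in-domain clusters, which is the kind of signal the abstract says users can inspect to re-evaluate a mapping. Unmapped instances are skipped rather than penalised, which is one simple way to let the two graphs differ in size.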
