Bipartite Isoperimetric Graph Partitioning for Data Co-clustering

Abstract

Data co-clustering refers to the problem of simultaneously clustering two data types. Typically, the data is stored in a contingency or co-occurrence matrix C, where the rows and columns of the matrix represent the data types to be co-clustered. An entry C_ij of the matrix signifies the relation between the data type represented by row i and the one represented by column j. Co-clustering is the problem of deriving sub-matrices from the larger data matrix by simultaneously clustering its rows and columns. In this paper, we present a novel graph-theoretic approach to data co-clustering. The two data types are modeled as the two sets of vertices of a weighted bipartite graph. We then propose the Isoperimetric Co-clustering Algorithm (ICA), a new method for partitioning the bipartite graph. ICA requires only the solution of a sparse system of linear equations, rather than the eigenvalue or SVD problem solved in the popular spectral co-clustering approach. Our theoretical analysis and extensive experiments on publicly available datasets demonstrate the advantages of ICA over other approaches in terms of the quality, efficiency, and stability of the bipartite graph partitioning.
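To make the "linear system instead of eigenproblem" idea concrete, here is a minimal sketch of a two-way co-clustering step. It is not the authors' implementation: it follows the general isoperimetric partitioning recipe (fix one "ground" vertex, solve the reduced Laplacian system L0 x0 = d0, then threshold x at the cut with the lowest isoperimetric ratio), and the details are assumptions. In particular, the function names `solve_dense` and `isoperimetric_cocluster`, the choice of the maximum-degree vertex as ground, and the dense Gaussian elimination (used only to keep the example dependency-free; a sparse solver would be the natural choice on real data) are all illustrative. The bipartite graph is assumed to be connected.

```python
def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: is the graph connected?")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def isoperimetric_cocluster(C, ground=None):
    """Split rows and columns of a non-negative matrix C into two co-clusters.

    Returns (row_labels, col_labels, ratio), where ratio is the
    isoperimetric ratio cut(S) / min(vol(S), vol(complement)) of the split.
    """
    m, n = len(C), len(C[0])
    N = m + n
    # Weighted bipartite graph: vertices 0..m-1 are rows, m..N-1 are columns.
    W = [[0.0] * N for _ in range(N)]
    for i in range(m):
        for j in range(n):
            W[i][m + j] = W[m + j][i] = float(C[i][j])
    d = [sum(row) for row in W]  # weighted degrees

    # Assumption: ground the maximum-degree vertex unless one is given.
    g = max(range(N), key=lambda v: d[v]) if ground is None else ground
    keep = [v for v in range(N) if v != g]

    # Reduced Laplacian L0 = D - W with the ground row and column removed.
    L0 = [[(d[u] if u == v else 0.0) - W[u][v] for v in keep] for u in keep]
    x0 = solve_dense(L0, [d[u] for u in keep])

    x = [0.0] * N  # the ground vertex keeps potential 0
    for v, val in zip(keep, x0):
        x[v] = val

    # Sweep all thresholds of x and keep the cut with the lowest ratio.
    order = sorted(range(N), key=lambda v: x[v])
    total = sum(d)
    best_ratio, best_S = None, None
    for k in range(1, N):
        S = set(order[:k])
        cut = sum(W[u][v] for u in S for v in range(N) if v not in S)
        vol = sum(d[u] for u in S)
        ratio = cut / min(vol, total - vol)
        if best_ratio is None or ratio < best_ratio:
            best_ratio, best_S = ratio, S

    labels = [0 if v in best_S else 1 for v in range(N)]
    return labels[:m], labels[m:], best_ratio


# Toy co-occurrence matrix with two blocks joined by one weak entry.
C = [[5, 4, 0, 0],
     [4, 5, 0, 0],
     [0, 0, 3, 4],
     [0, 1, 4, 3]]
rows, cols, ratio = isoperimetric_cocluster(C)
print(rows, cols)  # [0, 0, 1, 1] [0, 0, 1, 1]
```

On this toy matrix the sweep cuts only the weak entry linking the two blocks, so rows 0-1 are grouped with columns 0-1 and rows 2-3 with columns 2-3. Applying the same step recursively to each side would be one way to obtain more than two co-clusters, though that extension is also an assumption here.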