
Clustering Large Attributed Graphs: A Balance between Structural and Attribute Similarities


Abstract

Social networks, sensor networks, biological networks, and many other information networks can be modeled as a large graph. Graph vertices represent entities, and graph edges represent their relationships or interactions. In many large graphs, one or more attributes are usually associated with every vertex to describe its properties. In many application domains, graph clustering techniques are very useful for detecting densely connected groups in a large graph as well as for understanding and visualizing a large graph. The goal of graph clustering is to partition the vertices of a large graph into different clusters based on various criteria such as vertex connectivity or neighborhood similarity. Many existing graph clustering methods focus mainly on the topological structure, but largely ignore the vertex properties, which are often heterogeneous. In this article, we propose a novel graph clustering algorithm, SA-Cluster, which achieves a good balance between structural and attribute similarities through a unified distance measure. Our method partitions a large graph associated with attributes into k clusters so that each cluster contains a densely connected subgraph with homogeneous attribute values. An effective method is proposed to automatically learn the degree of contribution of structural similarity and attribute similarity. Theoretical analysis is provided to show that SA-Cluster converges quickly through iterative cluster refinement. Some optimization techniques on matrix computation are proposed to further improve the efficiency of SA-Cluster on large graphs. Extensive experimental results demonstrate the effectiveness of SA-Cluster through comparisons with state-of-the-art graph clustering and summarization methods.
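The abstract does not say how the unified distance measure is built. As a rough illustration only, the sketch below assumes one possible construction: each attribute value becomes an extra vertex linked to the vertices that carry it, and a short random walk over this augmented graph scores how close two vertices are. The function names, the parameters `steps`, `restart`, and `attr_weight`, and the toy graph are all hypothetical. The iterative cluster refinement and the automatic weight learning mentioned in the abstract are not implemented here; vertices are simply assigned to the nearest of two hand-picked seed vertices.

```python
import numpy as np


def unified_proximity(adj, attrs, steps=4, restart=0.2, attr_weight=1.0):
    """Random-walk proximity on an attribute-augmented graph (illustrative)."""
    n = adj.shape[0]
    # One extra vertex per distinct (attribute index, value) pair.
    pairs = sorted({(j, v) for row in attrs for j, v in enumerate(row)})
    index = {pair: n + i for i, pair in enumerate(pairs)}
    size = n + len(pairs)

    weights = np.zeros((size, size))
    weights[:n, :n] = adj  # structural edges
    for v, row in enumerate(attrs):
        for j, value in enumerate(row):
            a = index[(j, value)]
            weights[v, a] = weights[a, v] = attr_weight  # attribute edges

    # Row-normalise edge weights into transition probabilities.
    transition = weights / weights.sum(axis=1, keepdims=True)

    # Sum the walk probabilities over 1..steps hops, discounting long walks.
    proximity = np.zeros((size, size))
    power = np.eye(size)
    for hop in range(1, steps + 1):
        power = power @ transition
        proximity += restart * (1.0 - restart) ** hop * power
    return proximity[:n, :n]  # keep only the original vertices


def assign_to_seeds(proximity, seeds):
    """Label each vertex with the position of its most proximate seed."""
    return np.argmax(proximity[:, seeds], axis=1)


# Toy graph: two triangles joined by the edge 2-3, one attribute per vertex.
adj = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)
attrs = [["a"], ["a"], ["a"], ["b"], ["b"], ["b"]]

prox = unified_proximity(adj, attrs)
labels = assign_to_seeds(prox, seeds=[0, 5])
print(labels.tolist())
```

On this toy graph the two triangles end up in separate clusters. Raising `attr_weight` makes shared attribute values count for more relative to edges, which is the kind of trade-off the abstract says SA-Cluster learns automatically rather than taking as a fixed input.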

Publication details

  • Source
    ACM Transactions on Knowledge Discovery from Data | 2011, Issue 2 | pp. 12.1-12.33 | 33 pages
  • Author affiliations

    Department of Systems Engineering and Engineering Management, William M. W. Mong Engineering Building, Room 609, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong;

    Department of Systems Engineering and Engineering Management, William M. W. Mong Engineering Building, Room 609, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong;

    Department of Systems Engineering and Engineering Management, William M. W. Mong Engineering Building, Room 609, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong;

  • Original format: PDF
  • Language of text: English
  • Keywords

    graph clustering; structural proximity; attribute similarity;

