首页> 外文会议>ACM SIGKDD international conference on Knowledge discovery and data mining >A spectral method to separate disconnected and nearly-disconnected web graph components
【24h】

A spectral method to separate disconnected and nearly-disconnected web graph components

机译:分离断开和几乎断开的Web图形组件的频谱方法

获取原文

摘要

Separation of connected components from a graph with disconnected graph components mostly use breadth-first search (BFS) or depth-first search (DFS) graph algorithms. Here we propose a new algebraic method to separate disconnected and nearly-disconnected components. This method is based on spectral graph partitioning, following a key observation that disconnected components will show up, after properly sorted, as step-function like curve in the lowest eigenvectors of the Laplacian matrix of the graph. Following an perturbative analysis framework, we systematically analyzed the graph structures, first on the disconnected subgraph case, and second on the effects of adding edges sparsely connecting different subgraphs as a perturbation. Several new results are derived, providing insights to spectral methods and related clustering objective function. Examples are given illustrating the concepts and results our methods. Comparing to the standard graph algorithms, this method has the same O(‖E ‖ + ‖V‖log(‖V‖)) complexity, but is easier to implement (using readily available eigensolvers). Further more the method can easily identify articulation points and bridges on nearly-disconnected graphs. Segmentation of a real example of Web graph for query amazon is given. We found that each disconnected or nearly-disconnected components forms a cluster on a clear topic.
机译:从具有断开的图形组件的图形中分离连接的组件时,大多使用广度优先搜索(BFS)或深度优先搜索(DFS)图算法。在这里,我们提出了一种新的代数方法来分离断开和几乎断开的组件。该方法基于光谱图分区,这是在对关键点进行观察之后得出的结论,即正确分类后,断开连接的分量将在图的拉普拉斯矩阵的最低特征向量中显示为阶跃函数,如曲线。按照摄动分析框架,我们系统地分析了图的结构,首先是在断开子图的情况下,其次是在添加边缘稀疏地连接不同子图作为摄动的影响上。得出了几个新结果,为光谱方法和相关的聚类目标函数提供了见识。给出了一些例子,说明了我们的方法的概念和结果。与标准图形算法相比,此方法具有相同的 O (‖ E ‖ +‖ V ‖ log(‖ V ‖)),但更易于实现(使用现成的特征求解器)。此外,该方法可以轻松识别几乎断开的图形上的关节点和桥。给出了查询 amazon 的Web图的真实示例的分割。我们发现,每个断开连接或几乎断开连接的组件都构成一个明确主题的集群。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号