Conference: Web Information System and Application Conference

Extracting Dense Bipartite Graph Block in Web Community Discovery

Abstract

Communities are an important structure in the Web, and discovering them is a challenging task. Many studies find communities by exhaustively extracting dense subgraphs. The pioneering works [1], [2] use a CBG (Complete Bipartite Graph) as the signature of a community core and discover many implicit communities. However, the CBG is too strict and excludes many possible community structures. Therefore, a DBG (Dense Bipartite Graph) is chosen as the signature instead. For instance, Reddy et al. [3] proposed a degree-based (α, β) density, while Gibson et al. [4] and Dourisboure et al. [5] use a ratio-based γ-dense function to quantify the density of a DBG. In this paper, we analyze these two density measures and point out that at low density the structure of the bipartite graph may be unreasonable because of the existence of cutting nodes. For this reason, we introduce the DBGB (Dense Bipartite Graph Block). We then employ a two-step expansion to construct the bipartite graph, which decreases the number of unnecessary nodes and edges. To obtain an optimal bipartite structure, we propose the max DBGB and design an extraction algorithm. The new method is tested on 4 datasets collected by a Web crawler, and dense cores are extracted. We check 200 randomly sampled cores, and 89 percent of them make sense. We also apply Dourisboure's method to one of the datasets at different scales, and the extracted cores contain many cutting nodes. The experimental results therefore show that our method is effective.
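The concern about cutting nodes can be illustrated with a small sketch. The code below is not the paper's algorithm. It assumes one plausible reading of the two density measures (degree-based: every fan has at least `alpha` out-links and every center at least `beta` in-links; ratio-based: edges divided by the number of possible fan-center pairs) and treats a cutting node as an articulation point of the undirected bipartite graph. All function names and the toy graph are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets for a bipartite graph given as (fan, center) pairs."""
    adj = defaultdict(set)
    for fan, center in edges:
        adj[fan].add(center)
        adj[center].add(fan)
    return adj


def is_degree_dense(edges, alpha, beta):
    """Assumed degree-based test: each fan has >= alpha links, each center >= beta."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for fan, center in set(edges):
        out_deg[fan] += 1
        in_deg[center] += 1
    return (all(d >= alpha for d in out_deg.values())
            and all(d >= beta for d in in_deg.values()))


def edge_ratio(edges):
    """Assumed ratio-based measure: |E| / (|fans| * |centers|)."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    fans = {f for f, _ in edge_set}
    centers = {c for _, c in edge_set}
    return len(edge_set) / (len(fans) * len(centers))


def cutting_nodes(edges):
    """Articulation points found with a depth-first search (Tarjan's low-link method)."""
    adj = build_adjacency(edges)
    order, low, cuts = {}, {}, set()

    def dfs(node, parent):
        order[node] = low[node] = len(order)
        children = 0
        for nxt in adj[node]:
            if nxt == parent:
                continue
            if nxt in order:
                # Back edge: node can reach an earlier-visited node.
                low[node] = min(low[node], order[nxt])
            else:
                children += 1
                dfs(nxt, node)
                low[node] = min(low[node], low[nxt])
                # A non-root node is a cut node if some child subtree
                # cannot reach above it.
                if parent is not None and low[nxt] >= order[node]:
                    cuts.add(node)
        # The DFS root is a cut node only if it has more than one child.
        if parent is None and children > 1:
            cuts.add(node)

    for node in list(adj):
        if node not in order:
            dfs(node, None)
    return cuts


# Toy graph: two complete 2x2 bipartite cores that share only center "c2".
toy_edges = [
    ("f1", "c1"), ("f1", "c2"), ("f2", "c1"), ("f2", "c2"),
    ("f3", "c2"), ("f3", "c3"), ("f4", "c2"), ("f4", "c3"),
]
print(is_degree_dense(toy_edges, alpha=2, beta=2))
print(round(edge_ratio(toy_edges), 3))
print(cutting_nodes(toy_edges))
```

In this toy graph both density checks look healthy, yet removing `c2` splits the graph into two unrelated cores, which is the kind of structure the abstract describes as unreasonable.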