
Mining Networks with Shared Items


Abstract

Recent advances in data processing have enabled the generation of large and complex graphs. Many researchers have developed techniques to investigate informative structures within these graphs. However, the vertices and edges of most real-world graphs are associated with features, and only a few studies have considered the combination of graph structure and features. In this paper, we specifically examine a large graph in which each vertex has associated items. From this graph, we extract subgraphs with common itemsets, which we call itemset-sharing subgraphs (ISSes). The problem has various potential applications, such as the detection of gene networks affected by drugs or the discovery of popular research areas of contributing researchers. We propose an efficient algorithm to enumerate ISSes in large graphs. This algorithm enumerates ISSes with two efficient data structures: a DFS itemset tree and a visited itemset table. In practice, the combination of these two structures enables us to compute optimal solutions efficiently. We demonstrate the efficiency of our algorithm in mining ISSes from synthetic graphs with more than one million edges. We also present experiments performed using two real biological networks and a citation network. The experiments show that our algorithm can find interesting patterns in real datasets.
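
To make the notion of a subgraph with a common itemset concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes that, for one fixed query itemset, a candidate subgraph is a connected component of the graph induced by the vertices whose items include that itemset. The abstract does not give the formal ISS definition or the details of the DFS itemset tree and visited itemset table, so the function name `components_sharing_itemset`, the adjacency-list input format, and the toy data are all hypothetical.

```python
from collections import deque


def components_sharing_itemset(adjacency, items, itemset):
    """Return the connected components of the subgraph induced by the
    vertices whose item sets contain every item in `itemset`.

    Naive illustration only: it handles a single, given itemset and does
    not enumerate itemsets the way the paper's algorithm is said to.
    """
    itemset = frozenset(itemset)
    # Keep only the vertices that carry all items of the query itemset.
    eligible = {v for v in adjacency if itemset <= items.get(v, frozenset())}
    seen = set()
    components = []
    for start in sorted(eligible):
        if start in seen:
            continue
        # Breadth-first search restricted to eligible vertices.
        seen.add(start)
        queue = deque([start])
        component = {start}
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w in eligible and w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components


# Toy graph (hypothetical data): the path 1-2-3-4-5 with items on each vertex.
adjacency = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
items = {
    1: {"a", "b"},
    2: {"a", "b", "c"},
    3: {"c"},
    4: {"a", "b"},
    5: {"a", "b", "d"},
}
result = components_sharing_itemset(adjacency, items, {"a", "b"})
print(result)
```

In this toy graph, vertex 3 lacks the items `a` and `b`, so it splits the path into two separate groups of vertices sharing `{a, b}`. Repeating such a search for every possible itemset would be exponential in the number of items, which is presumably the cost that the paper's two data structures are designed to avoid.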
