Parallelizing maximal clique and k-plex enumeration over graph data

Zhuo Wang; Qun Chen; Boyi Hou; Bo Suo; Zhanhuai Li; Wei Pan; Zachary G. Ives

首页> 外文期刊>Journal of Parallel and Distributed Computing >Parallelizing maximal clique and k-plex enumeration over graph data

【24h】

Parallelizing maximal clique and k-plex enumeration over graph data

机译：图数据的并行最大集团和k重枚举

获取原文

获取原文并翻译 | 示例

开具论文收录证明 >>

页面导航

摘要
著录项
引文网络
相似文献
相关主题

摘要

In a wide variety of emerging data-intensive applications, such as social network analysis, Web document clustering, entity resolution, and detection of consistently co-expressed genes in systems biology, the detection of dense subgraphs cliques and k-plex is an essential component. Unfortunately, these problems are NP-Complete and thus computationally intensive at scale — hence there is a need to come up with techniques for distributing the computation across multiple machines such that the computation, which is too time-consuming on a single machine, can be efficiently performed on a machine cluster given that it is large enough. In this paper, we first propose a new approach for maximal clique and k-plex enumeration, which identifies dense subgraphs by binary graph partitioning. Given a connected graph G = (V, E), it has a space complexity of O(|E|) and a time complexity of O(|E|μ(G)), where μ(G) represents the number of different cliques (k-plexes) existing in G. It recursively divides a graph until each task is sufficiently small to be processed in parallel. We then develop parallel solutions and demonstrate how graph partitioning can enable effective load balancing. Finally, we evaluate the performance of the proposed approach on real and synthetic graph data and show that it performs considerably better than existing approaches in both centralized and parallel settings. In the parallel setting, it can achieve the speedups of up to lOx over existing approaches on large graphs. Our parallel algorithms are primarily implemented and evaluated on MapReduce, a popular shared-nothing parallel framework, but can easily generalize to other shared-nothing or shared-memory parallel frameworks. The work presented in this paper is an extension of our preliminary work on the approach of binary graph partitioning for maximal clique enumeration. In this work, we extend the proposed approach to handle maximal k-plex detection as well.

机译：在各种新兴的数据密集型应用程序中，例如社交网络分析，Web文档聚类，实体解析以及系统生物学中始终表达的共同表达基因的检测中，密集子图集团和k-plex的检测是必不可少的组件。不幸的是，这些问题是NP-Complete的，因此在规模上计算量很大-因此，需要提出一种将计算分布在多台计算机上的技术，这样一来，在一台计算机上的计算就很费时了。只要足够大，就可以在机器集群上高效地执行。在本文中，我们首先提出了一种用于最大集团和k-plex枚举的新方法，该方法通过二元图划分来识别密集子图。给定一个连通图G =（V，E），它的空间复杂度为O（| E |），时间复杂度为O（| E |μ（G）），其中μ（G）表示相异数它递归地划分图，直到每个任务足够小以至于可以并行处理。然后，我们开发并行解决方案，并演示图分区如何实现有效的负载平衡。最后，我们评估了该方法在真实和合成图形数据上的性能，并表明，在集中式和并行设置下，该方法的性能均比现有方法好得多。在并行设置中，与大图上的现有方法相比，它可以实现高达10倍的加速。我们的并行算法主要在MapReduce上实现和评估，MapReduce是一种流行的无共享并行框架，但是可以轻松地推广到其他无共享或共享内存并行框架。本文介绍的工作是对我们的最大工作集团枚举二进制图分区方法的初步工作的扩展。在这项工作中，我们将提出的方法扩展为也可以处理最大k重检测。

著录项

来源
《Journal of Parallel and Distributed Computing》 |2017年第8期|79-91|共13页
作者
Zhuo Wang; Qun Chen; Boyi Hou; Bo Suo; Zhanhuai Li; Wei Pan; Zachary G. Ives;
展开▼
作者单位

School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an, 710072, PR China;

School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an, 710072, PR China;

School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an, 710072, PR China;

School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an, 710072, PR China;

School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an, 710072, PR China;

School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an, 710072, PR China;

Computer and Information Science Department, University of Pennsylvania, Philadelphia, PA 19104-6389, USA;

展开▼
收录信息
原文格式 PDF
正文语种 eng
中图分类
关键词
Maximal clique enumeration; Maximal k-plex enumeration; Parallel graph processing; MapReduce;

机译：最大集团枚举;最大k-plex枚举;并行图处理;MapReduce;

相似文献

外文文献
中文文献
专利

1. A memory efficient maximal clique enumeration method for sparse graphs with a parallel implementation [J] . Yu Ting, Liu Mengchi Parallel Computing . 2019,第SEPa期

机译：具有并行实现的稀疏图的内存有效最大派系枚举方法
2. A memory efficient maximal clique enumeration method for sparse graphs with a parallel implementation [J] . Yu Ting, Liu Mengchi Parallel Computing . 2019,第Sepa期

机译：具有并行实现的稀疏图的记忆有效的最大Clique枚举方法
3. Efficient Maximal Clique Enumeration Over Graph Data [J] . Boyi Hou, Zhuo Wang, Qun Chen, Data Science and Engineering . 2016,第4期

机译：图形数据的有效最大集团枚举
4. Parallelizing Maximal Clique Enumeration Over Graph Data [C] . Qun Chen, Chao Fang, Zhuo Wang, International conference on database systems for advanced applications . 2016

机译：图形数据的并行最大集团枚举
5. Strategies for designing coordinated geographic visualization software for enumerated data: A component-based approach. [D] . Hardisty, Frank. 2003

机译：为枚举数据设计协调的地理可视化软件的策略：一种基于组件的方法。
6. On Finding and Enumerating Maximal and Maximum k-Partite Cliques in k-Partite Graphs [O] . Charles A. Phillips, Kai Wang, Erich J. Baker, -1

机译：关于在k零件图中查找和枚举最大和最大k零件团的问题
7. Efficient Maximal Clique Enumeration Over Graph Data [O] . 2016

机译：图形数据的有效最大集团枚举

Parallelizing maximal clique and k-plex enumeration over graph data

摘要

著录项

引文网络

相似文献

相关主题

期刊订阅