首页> 外文期刊>Parallel and Distributed Systems, IEEE Transactions on >Engineering Parallel Algorithms for Community Detection in Massive Networks
【24h】

Engineering Parallel Algorithms for Community Detection in Massive Networks

机译:大规模网络中社区检测的工程并行算法

获取原文
获取原文并翻译 | 示例
           

摘要

The amount of graph-structured data has recently experienced an enormous growth in many applications. To transform such data into useful information, fast analytics algorithms and software tools are necessary. One common graph analytics kernel is disjoint community detection (or graph clustering). Despite extensive research on heuristic solvers for this task, only few parallel codes exist, although parallelism will be necessary to scale to the data volume of real-world applications. We address the deficit in computing capability by a flexible and extensible community detection framework with shared-memory parallelism. Within this framework we design and implement efficient parallel community detection heuristics: A parallel label propagation scheme; the first large-scale parallelization of the well-known Louvain method, as well as an extension of the method adding refinement; and an ensemble scheme combining the above. In extensive experiments driven by the algorithm engineering paradigm, we identify the most successful parameters and combinations of these algorithms. We also compare our implementations with state-of-the-art competitors. The processing rate of our fastest algorithm often reaches 50 M edges/second. We recommend the parallel Louvain method and our variant with refinement as both qualitatively strong and fast. Our methods are suitable for massive data sets with billions of edges. (A preliminary version of this paper appeared in .)
机译:最近,在许多应用程序中,图结构化数据的数量经历了巨大的增长。为了将此类数据转换为有用的信息,需要快速的分析算法和软件工具。一种常见的图分析内核是不相交的社区检测(或图聚类)。尽管针对此任务的启发式求解器进行了广泛的研究,但只有很少的并行代码存在,尽管要根据实际应用的数据量进行扩展,并行性是必需的。我们通过具有共享内存并行性的灵活且可扩展的社区检测框架来解决计算能力的不足。在此框架内,我们设计并实现了有效的并行社区检测启发式方法:众所周知的Louvain方法的首次大规模并行化,以及对该方法的扩展,增加了细化;以及结合以上所述的整体方案。在由算法工程范式驱动的大量实验中,我们确定了最成功的参数以及这些算法的组合。我们还将实施与最先进的竞争对手进行比较。我们最快的算法的处理速度通常达到50 M边/秒。我们建议使用并行的Louvain方法以及我们的改进方法,因为它在质量上既强大又快速。我们的方法适用于具有数十亿条边的海量数据集。 (本文的初步版本出现在。)

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号