Conference: International Symposium on Computer Architecture and High Performance Computing
Accelerating Graph Analytics on CPU-FPGA Heterogeneous Platform

Abstract

Hardware accelerators for graph analytics have attracted growing interest. The vertex-centric and edge-centric paradigms are widely used to design graph analytics accelerators. However, both have notable drawbacks: the vertex-centric paradigm requires random memory accesses to traverse edges, and the edge-centric paradigm results in redundant edge traversals. In this paper, we explore the tradeoffs between the vertex-centric and edge-centric paradigms and propose a hybrid algorithm that dynamically selects between them during execution. We introduce the notion of active vertex ratio, based on which we develop a simple but efficient paradigm selection approach. We develop a hybrid data structure to concurrently support the vertex-centric and edge-centric paradigms. Based on the hybrid data structure, we propose a graph partitioning scheme to increase parallelism and enable efficient parallel computation on heterogeneous platforms. In each iteration, we use our paradigm selection approach to select the appropriate paradigm for each partition. Further, we map our hybrid algorithm onto a state-of-the-art heterogeneous platform that integrates a multi-core CPU and a Field-Programmable Gate Array (FPGA) in a cache-coherent fashion. We use our design methodology to accelerate two fundamental graph algorithms, breadth-first search (BFS) and single-source shortest path (SSSP). Experimental results show that our CPU-FPGA co-processing achieves up to 1.5× (1.9×) speedup for BFS (SSSP) compared with optimized baseline designs. Compared with state-of-the-art FPGA-based designs, our design achieves up to 4.0× (4.2×) throughput improvement for BFS (SSSP). Compared with a state-of-the-art multi-core design, our design demonstrates up to 1.5× (1.8×) speedup for BFS (SSSP).
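The per-iteration paradigm selection described in the abstract can be illustrated with a minimal software sketch for BFS. This is not the paper's implementation. It assumes the active vertex ratio is the number of active (frontier) vertices divided by the total number of vertices, and the `threshold` value, the function name `hybrid_bfs`, and the plain edge-list-plus-adjacency-list layout are all hypothetical choices made for illustration. Graph partitioning and the CPU-FPGA mapping are omitted.

```python
from typing import List, Tuple


def hybrid_bfs(
    num_vertices: int,
    edges: List[Tuple[int, int]],
    source: int,
    threshold: float = 0.05,  # assumed cutoff, not a value from the paper
) -> Tuple[List[int], List[str]]:
    """BFS that picks a traversal paradigm in every iteration.

    Returns the BFS level of each vertex (-1 if unreachable) and the
    paradigm chosen in each iteration.
    """
    # Assumed hybrid layout: keep the raw edge list (for edge-centric
    # streaming) next to adjacency lists (for vertex-centric traversal).
    adj: List[List[int]] = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)

    level = [-1] * num_vertices
    level[source] = 0
    active = {source}
    depth = 0
    chosen: List[str] = []

    while active:
        # Assumed definition: fraction of vertices active in this iteration.
        ratio = len(active) / num_vertices
        next_active = set()

        if ratio < threshold:
            # Vertex-centric: visit only the out-edges of active vertices.
            # Few edges are touched, but the accesses are irregular.
            chosen.append("vertex")
            for u in active:
                for v in adj[u]:
                    if level[v] == -1:
                        level[v] = depth + 1
                        next_active.add(v)
        else:
            # Edge-centric: stream every edge sequentially and skip those
            # whose source is inactive (the redundant traversals).
            chosen.append("edge")
            for u, v in edges:
                if u in active and level[v] == -1:
                    level[v] = depth + 1
                    next_active.add(v)

        active = next_active
        depth += 1

    return level, chosen


if __name__ == "__main__":
    example_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
    levels, paradigms = hybrid_bfs(5, example_edges, source=0, threshold=0.3)
    print(levels)
    print(paradigms)
```

Both branches compute the same BFS levels. Only the access pattern differs, which is why the selection can be made independently in each iteration without affecting correctness.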
