Conference: International Conference on Algorithms and Architectures for Parallel Processing

Towards a Deep-Pipelined Architecture for Accelerating Deep GCN on a Multi-FPGA Platform



Abstract

Convolutional neural networks (CNNs) have achieved great success in learning features from Euclidean-structured data, but many learning tasks require dealing with graph data. In these scenarios, where CNNs cannot operate, graph convolutional networks (GCNs) have shown appealing performance and attracted increasing attention in recent years. However, according to our research, the computational complexity and storage overhead of these networks have also increased, making them challenging to accelerate on a single FPGA. Accordingly, in this work we focus on accelerating a deep GCN (DAGCN) on a CPU-multi-FPGA platform by proposing a deep-pipelined acceleration scheme. To fully exploit the parallelism in DAGCN, we propose a graph convolutional neural accelerator (GCNAR) that integrates multiple 1-D systolic arrays. We also adopt an existing CSR-based partitioning scheme for large-scale matrix-vector multiplication in the design of GCNAR, which improves its computational efficiency. Moreover, we develop performance and resource evaluation models to determine the design parameters that maximize accelerator throughput. Evaluation on real-world graph datasets demonstrates that our FPGA-based solution achieves performance comparable to state-of-the-art GCN accelerators. In addition, compared with CPU and GPU solutions, our accelerator achieves improvements of 196 times and 115 times, respectively, in processing latency for graph classification.
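
As a rough illustration of the kind of CSR-based partitioning the abstract mentions, the sketch below splits a sparse matrix-vector product into contiguous row blocks with roughly equal numbers of nonzeros, so that each block could in principle be handed to a separate processing unit. It assumes that CSR denotes the compressed sparse row format. It is not the partitioning scheme adopted in GCNAR, which the abstract does not describe; the function names (`dense_to_csr`, `partition_rows`, `spmv_block`, `spmv_partitioned`) and the nonzero-balancing heuristic are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# CSR triple: (values, column indices, row pointers). Layout is the standard
# compressed-sparse-row one; all names here are illustrative assumptions.
CSR = Tuple[List[float], List[int], List[int]]


def dense_to_csr(dense: Sequence[Sequence[float]]) -> CSR:
    """Convert a dense row-major matrix to CSR."""
    values: List[float] = []
    cols: List[int] = []
    row_ptr: List[int] = [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(float(v))
                cols.append(j)
        row_ptr.append(len(values))
    return values, cols, row_ptr


def partition_rows(row_ptr: Sequence[int], num_parts: int) -> List[Tuple[int, int]]:
    """Split rows into contiguous [start, end) blocks with similar nonzero counts.

    Assumed heuristic: cut where the cumulative nonzero count first reaches
    p/num_parts of the total. Blocks may be empty if num_parts exceeds the rows.
    """
    n_rows = len(row_ptr) - 1
    total_nnz = row_ptr[-1]
    bounds = [0]
    for p in range(1, num_parts):
        target = total_nnz * p / num_parts
        r = bounds[-1]
        while r < n_rows and row_ptr[r] < target:
            r += 1
        bounds.append(r)
    bounds.append(n_rows)
    return [(bounds[i], bounds[i + 1]) for i in range(num_parts)]


def spmv_block(csr: CSR, x: Sequence[float], start: int, end: int) -> List[float]:
    """Compute the output entries y[start:end] of y = A @ x for one row block."""
    values, cols, row_ptr = csr
    out = []
    for r in range(start, end):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[cols[k]]
        out.append(acc)
    return out


def spmv_partitioned(csr: CSR, x: Sequence[float], num_parts: int) -> List[float]:
    """Run each block independently (sequentially here) and concatenate results."""
    y: List[float] = []
    for start, end in partition_rows(csr[2], num_parts):
        y.extend(spmv_block(csr, x, start, end))
    return y


if __name__ == "__main__":
    A = [[1, 0, 2], [0, 0, 3], [4, 5, 0], [0, 0, 6]]
    print(spmv_partitioned(dense_to_csr(A), [1, 2, 3], num_parts=2))
```

Because the blocks touch disjoint output rows, they need no synchronization with each other; balancing by nonzeros rather than by row count is one simple way to keep per-block work similar when row densities vary.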
