IEEE International Symposium on High Performance Computer Architecture

GCNAX: A Flexible and Energy-efficient Accelerator for Graph Convolutional Neural Networks


Abstract

Graph convolutional neural networks (GCNs) have emerged as an effective approach to extend deep learning to graph data analytics. Given that graphs are usually irregular, as nodes in a graph may have a varying number of neighbors, processing GCNs efficiently poses a significant challenge for the underlying hardware. Although specialized GCN accelerators have been proposed to deliver better performance than generic processors, prior accelerators not only under-utilize the compute engine but also impose redundant data accesses that reduce throughput and energy efficiency. Therefore, optimizing the overall flow of data between compute engines and memory, i.e., the GCN dataflow, to maximize utilization and minimize data movement is crucial for efficient GCN processing. In this paper, we propose a flexible and optimized dataflow for GCNs that simultaneously improves resource utilization and reduces data movement. This is realized by fully exploring the design space of GCN dataflows and evaluating the number of execution cycles and DRAM accesses through an analysis framework. Unlike prior GCN dataflows, which employ rigid loop orders and loop fusion strategies, the proposed dataflow can reconfigure the loop order and loop fusion strategy to adapt to different GCN configurations, resulting in much improved efficiency. We then introduce a novel accelerator architecture called GCNAX, which tailors the compute engine, buffer structure, and buffer size to the proposed dataflow. Evaluated on five real-world graph datasets, our simulation results show that GCNAX reduces DRAM accesses by a factor of 8.1× and 2.4×, while achieving 8.9× and 1.6× speedup and 9.5× and 2.3× energy savings on average over HyGCN and AWB-GCN, respectively.
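The dataflow choices discussed above revolve around the chained matrix multiplication at the heart of a GCN layer, X' = Â·X·W, which can be executed in two orders with very different costs. The sketch below is a minimal illustration with made-up sizes, not the paper's implementation; Â is kept dense for brevity, although adjacency matrices of real graphs are sparse:

```python
import numpy as np

# A GCN layer computes X' = A_hat @ X @ W (before the nonlinearity), where
# A_hat is the normalized adjacency, X the node features, W the weights.
# Hypothetical small sizes for illustration only.
np.random.seed(0)
n, f_in, f_out = 6, 8, 4              # nodes, input features, output features
A_hat = np.random.rand(n, n)          # normalized adjacency (dense here for brevity)
X = np.random.rand(n, f_in)           # node feature matrix
W = np.random.rand(f_in, f_out)       # layer weight matrix

# Two valid execution orders for the chained matmul:
out_combine_first = A_hat @ (X @ W)   # combination (X @ W) first, then aggregation
out_aggregate_first = (A_hat @ X) @ W # aggregation (A_hat @ X) first, then combination

# Both orders produce the same result...
assert np.allclose(out_combine_first, out_aggregate_first)

# ...but their multiply-accumulate counts differ with the layer shape:
macs_combine_first = n * f_in * f_out + n * n * f_out    # 336 for these sizes
macs_aggregate_first = n * n * f_in + n * f_in * f_out   # 480 for these sizes
```

Because the cheaper order depends on the graph size and feature dimensions of each layer, a dataflow that can reconfigure its loop order, as the abstract describes, avoids committing to an order that is wasteful for some GCN configurations.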
