A comprehensive reconfigurable computing approach to memory wall problem of large graph computation

Wang Xu; Zhu Yongxin; Huang Linan

Abstract

Graph computation problems that exhibit irregular memory access patterns are known to perform poorly on multiprocessor architectures. Although recent studies use FPGA technology to tackle the memory wall problem of graph computation by adopting a massively multi-threaded architecture, performance still falls far short of optimal memory performance because of long memory access latency. In this paper, we propose a comprehensive reconfigurable computing approach to address the memory wall problem. First, we present an extended edge-streaming model with massive partitions that provides better load balance while taking advantage of the streaming bandwidth of external memory when processing large graphs. Second, we propose a two-level shuffle network architecture that significantly reduces the on-chip memory requirement while providing high processing throughput that matches the bandwidth of the external memory. Third, we introduce a compact storage design based on graph compression schemes and propose the corresponding encoding and decoding hardware to reduce the data volume transferred between the processing engines and external memory. We validate the effectiveness of the proposed architecture by implementing three frequently used graph algorithms on an ML605 board, showing up to a 3.85x improvement in performance-to-bandwidth ratio over previously published FPGA-based implementations. (C) 2016 Elsevier B.V. All rights reserved.
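The edge-streaming idea mentioned in the abstract can be illustrated with a small software sketch. The code below is not the authors' hardware design; it is a generic edge-centric scatter/gather breadth-first search, assuming vertices are numbered 0..n-1 and split into equal-sized intervals. The names `partition_edges` and `streaming_bfs` and the `num_partitions` parameter are hypothetical. Each iteration scans every partition's edge list sequentially (the access pattern that suits external-memory streaming) and buckets the resulting updates by destination partition, which plays a role loosely analogous to a shuffle step.

```python
from collections import defaultdict


def partition_edges(edges, num_vertices, num_partitions):
    """Group edges by the vertex interval that contains their source vertex."""
    size = -(-num_vertices // num_partitions)  # ceiling division: vertices per partition
    parts = defaultdict(list)
    for src, dst in edges:
        parts[src // size].append((src, dst))
    return parts, size


def streaming_bfs(edges, num_vertices, root, num_partitions=4):
    """Level-synchronous BFS that only scans edge lists sequentially (illustrative sketch)."""
    parts, size = partition_edges(edges, num_vertices, num_partitions)
    level = [-1] * num_vertices  # -1 marks an unvisited vertex
    level[root] = 0
    depth = 0
    changed = True
    while changed:
        changed = False
        updates = defaultdict(list)
        # Scatter: stream each partition's edges in order and emit an update
        # for every edge whose source sits on the current frontier.
        for p in range(num_partitions):
            for src, dst in parts.get(p, []):
                if level[src] == depth:
                    updates[dst // size].append(dst)  # bucket by destination partition
        # Gather: apply the updates one destination partition at a time.
        for p in range(num_partitions):
            for dst in updates.get(p, []):
                if level[dst] == -1:
                    level[dst] = depth + 1
                    changed = True
        depth += 1
    return level


if __name__ == "__main__":
    demo_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
    print(streaming_bfs(demo_edges, num_vertices=6, root=0))
```

In this sketch the number of partitions only changes how edges and updates are grouped, not the result; in a hardware setting the partition count would instead be chosen against on-chip memory capacity and load balance, which is the trade-off the abstract refers to.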


Bibliographic details

Source
Journal of Systems Architecture, 2016 (issue not listed), 11 pages
Authors
Wang Xu; Zhu Yongxin; Huang Linan
Original format: PDF
Language: English
Chinese Library Classification: Microcomputers
Keywords
Embedded architecture; FPGA; Graph computing; Hardware implementation; Memory wall problem

