Journal: ACM Transactions on Reconfigurable Technology and Systems

Memory Interface Design for 3D Stencil Kernels on a Massively Parallel Memory System

Abstract

Massively parallel memory systems are designed to deliver high bandwidth at relatively low clock speeds for memory-intensive applications implemented on programmable logic. For example, the Convey HC-1 provides 1,024 DRAM banks to each of four FPGAs through a full crossbar, presenting a peak bandwidth of 76.8 GB/s to the user logic. Such highly parallel memory systems suffer from high latency, and their effective bandwidth is highly sensitive to access ordering. To achieve high performance, the user must use a customized memory interface that combines scheduling, latency hiding, and data reuse. In this article, we describe the design of a custom memory interface for 3D stencil kernels on the Convey HC-1 that incorporates these features. Experimental results show that, compared to a naive memory interface, the proposed memory interface achieves a runtime speedup of 2.2x for a 6-point stencil and 9.5x for a 27-point stencil.
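
As a rough illustration of why data reuse matters for these kernels, the sketch below applies a 6-point and a 27-point stencil to a small 3D grid with no reuse at all and counts the memory reads issued per output point. It is not the interface described in the article. The averaging weights, the function names (`naive_stencil`, `reads_per_output`), and the grid contents are assumptions chosen only for illustration.

```python
# Hypothetical sketch: a naive 3D stencil that re-reads every neighbor
# from "memory" for every output point. Names and weights are assumed.

# 6-point stencil: the six face neighbors of a grid point.
SIX_POINT = [(-1, 0, 0), (1, 0, 0), (0, -1, 0),
             (0, 1, 0), (0, 0, -1), (0, 0, 1)]

# 27-point stencil: the full 3x3x3 cube around (and including) the point.
TWENTY_SEVEN_POINT = [(dz, dy, dx)
                      for dz in (-1, 0, 1)
                      for dy in (-1, 0, 1)
                      for dx in (-1, 0, 1)]


def naive_stencil(grid, offsets):
    """Average the neighbors given by `offsets` at every interior point.

    Returns (output_grid, number_of_grid_reads). Boundary points are
    left at 0.0 for simplicity.
    """
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    reads = 0
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                total = 0.0
                for dz, dy, dx in offsets:
                    total += grid[z + dz][y + dy][x + dx]
                    reads += 1  # one memory access per neighbor, no reuse
                out[z][y][x] = total / len(offsets)
    return out, reads


def reads_per_output(n, offsets):
    """Reads issued per interior output point on an n x n x n grid."""
    grid = [[[float(z + y + x) for x in range(n)]
             for y in range(n)] for z in range(n)]
    _, reads = naive_stencil(grid, offsets)
    return reads / (n - 2) ** 3


if __name__ == "__main__":
    print(reads_per_output(6, SIX_POINT))
    print(reads_per_output(6, TWENTY_SEVEN_POINT))
```

Without reuse, every output point costs one read per stencil neighbor, so the 27-point kernel issues 4.5 times as many reads as the 6-point kernel. With ideal on-chip buffering, each input value would need to be fetched from memory only once.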