A Memory Interface for Multi-Purpose Multi-Stream Accelerators


Abstract

Power and programming challenges make heterogeneous multi-cores composed of cores and ASICs an attractive alternative to homogeneous multi-cores. Recently, multi-purpose loop-based generated accelerators have emerged as an especially attractive accelerator option. They have several assets: short design time (automatic generation), flexibility (multi-purpose) with low configuration and routing overhead (unlike FPGAs), computational performance (operations are directly mapped to hardware), and a focus on memory throughput by leveraging loop constructs. However, with multiple streams, the memory behavior of such accelerators can become at least as complex as that of superscalar processors, while they still need to retain the memory ordering predictability and throughput efficiency of DMAs. In this article, we show how to design a memory interface for multi-purpose accelerators that combines the ordering predictability of DMAs, retains key efficiency features of memory systems for complex processors, and requires only a fraction of their cost by leveraging the properties of stream references. We evaluate the approach with a synthesizable version of the memory interface for an example 9-task generated loop-based accelerator.
