Journal: IEEE Transactions on Parallel and Distributed Systems

Conflict-Free Loop Mapping for Coarse-Grained Reconfigurable Architecture with Multi-Bank Memory

Abstract

Coarse-grained reconfigurable architecture (CGRA) is a promising architecture that combines high performance, high power efficiency, and flexibility. The computation-intensive parts of an application (e.g., loops) are often mapped onto a CGRA for acceleration. Because such loops demand highly parallel data access, CGRAs with multi-bank memory have been proposed to improve parallelism. For a CGRA with multi-bank memory, a joint solution that simultaneously considers memory partitioning and modulo scheduling is proposed to achieve a valid mapping with better performance. In this solution, modulo scheduling and operator scheduling are used to achieve a valid loop mapping and a valid data placement without any memory access conflicts. Avoiding the pipeline stalls caused by conflicts greatly improves the performance of the loop mapping. Experimental results on benchmarks from Livermore, Polybench, and Mediabench show that our approach improves the performance of loops on CGRA to 1.89×, 1.49×, and 1.37× that of REGIMap, HTDM, and REGIMap with memory partitioning, respectively, at the cost of an acceptable increase in compilation time.
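To make the idea of a memory access conflict under modulo scheduling concrete, here is a minimal Python sketch. It is not the algorithm from the paper. It assumes a simple cyclic partitioning (bank = address mod number of banks), single-ported banks, and loop accesses of the form `a[i + offset]`. The names `find_bank_conflicts`, `ops`, `ii`, and `num_banks` are hypothetical. Two memory operations can collide only if they occupy the same modulo slot (their schedule times are congruent modulo the initiation interval) and their addresses fall in the same bank in the cycle they share.

```python
from itertools import combinations


def find_bank_conflicts(ops, ii, num_banks):
    """Return pairs of memory operations that hit the same bank in the same cycle.

    ops: list of (name, schedule_time, offset); in iteration i the operation
         accesses address i + offset (an assumed unit-stride access pattern).
    ii: initiation interval of the modulo schedule.
    num_banks: number of banks under the assumed cyclic mapping addr % num_banks.
    """
    conflicts = []
    for (n1, t1, c1), (n2, t2, c2) in combinations(ops, 2):
        if (t1 - t2) % ii != 0:
            continue  # different modulo slots never execute in the same cycle
        # Iteration i of op1 runs at cycle t1 + i*ii; op2 runs in that same
        # cycle for iteration i + shift.
        shift = (t1 - t2) // ii
        # Addresses in the shared cycle are i + c1 and i + shift + c2; they map
        # to the same bank when their difference is a multiple of num_banks.
        if (c1 - (c2 + shift)) % num_banks == 0:
            conflicts.append((n1, n2))
    return conflicts


# Two loads issued in the same slot, accessing a[i] and a[i + 2].
ops = [("load_x", 0, 0), ("load_y", 0, 2)]
two_banks = find_bank_conflicts(ops, ii=1, num_banks=2)
four_banks = find_bank_conflicts(ops, ii=1, num_banks=4)
print(two_banks, four_banks)
```

In this toy case the two loads collide with two banks but not with four, which shows why the choice of partitioning and the choice of schedule interact: changing either the bank count or the slot of one load can remove the conflict.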
