IEEE Transactions on Parallel and Distributed Systems

Exploiting Parallelism of Imperfect Nested Loops on Coarse-Grained Reconfigurable Architectures


Abstract

Coarse-grained reconfigurable architecture (CGRA) is a promising parallel computing platform that provides high performance, high power efficiency, and flexibility. However, for imperfect nested loops, existing loop mapping methods often result in low execution performance and poor hardware utilization. To tackle this problem, this paper makes three contributions: 1) a highly effective and general approach to mapping imperfect loops onto CGRAs; 2) a global optimization strategy to search for the optimal initiation intervals (IIs); 3) a powerful kernel compression method to reduce oversized kernels. Experimental results show that our approach reduces the total computing latency by 20.5, 58.5, and 73.2 percent compared to state-of-the-art approaches on 2×2, 4×4, and 8×8 CGRAs, respectively. Moreover, the compilation time and configuration context size are acceptable in practice.
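For readers unfamiliar with the terminology, the following is a minimal illustrative sketch and is not taken from the paper. It shows an imperfect nested loop, in which some statements belong to the outer loop only, together with the standard latency estimate for a software-pipelined loop with a given initiation interval. The function names `matvec_imperfect` and `pipelined_latency` and their parameters are hypothetical and chosen only for this example.

```python
def matvec_imperfect(a, x):
    """Matrix-vector product written as an imperfect nested loop.

    The loop nest is "imperfect" because the outer loop body contains
    statements that sit outside the inner loop.
    """
    y = []
    for row in a:                      # outer loop
        acc = 0                        # outer-only statement before the inner loop
        for a_ij, x_j in zip(row, x):  # inner loop
            acc += a_ij * x_j
        y.append(acc)                  # outer-only statement after the inner loop
    return y


def pipelined_latency(iterations, ii, schedule_length):
    """Cycle count of a software-pipelined loop (textbook estimate).

    A new iteration starts every `ii` cycles, and each iteration takes
    `schedule_length` cycles, so the last one finishes at
    (iterations - 1) * ii + schedule_length.
    """
    if iterations <= 0:
        return 0
    return (iterations - 1) * ii + schedule_length
```

Under this estimate, a smaller initiation interval shortens total latency roughly in proportion to the iteration count, which is why the choice of IIs matters for the overall result.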
