Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

CRIMSON: Compute-Intensive Loop Acceleration by Randomized Iterative Modulo Scheduling and Optimized Mapping on CGRAs


Abstract

Coarse-grain reconfigurable arrays (CGRAs) are emerging accelerators that promise low-power acceleration of compute-intensive loops in applications. The acceleration a CGRA achieves depends on how efficiently the CGRA compiler maps those loops onto the CGRA architecture. Because the CGRA mapping problem is NP-complete, it is solved in two steps: scheduling and mapping. The scheduling algorithm allocates timeslots to the nodes of the data flow graph, and the mapping algorithm maps the scheduled nodes onto the processing elements of the CGRA. On a mapping failure, the initiation interval (II) is increased and a new schedule is obtained for the increased II. Most previous mapping techniques use the iterative modulo scheduling (IMS) algorithm to find a schedule for a given II. Since IMS generates a resource-constrained as-soon-as-possible (ASAP) schedule, even with an increased II it tends to generate a similar schedule that is not mappable. Therefore, IMS does not explore the schedule space effectively. To address these issues, this article proposes CRIMSON (compute-intensive loop acceleration by randomized IMS and optimized mapping), a technique that generates random modulo schedules by exploring the schedule space, thereby creating different modulo schedules at a given II and at an increased II. CRIMSON also employs a novel conservative test after scheduling to prune valid schedules that are not mappable. In a study of the top 24 performance-critical loops (those that run for more than 7% of application time) from MiBench, Rodinia, and Parboil, previous state-of-the-art approaches that use IMS, such as RAMP and GraphMinor, could not map five and seven loops, respectively, on a 4 × 4 CGRA, whereas CRIMSON was able to map them all. For loops mapped by the previous approaches, CRIMSON achieved a comparable II.
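
The schedule-then-map flow described in the abstract can be illustrated with a small Python sketch. This is not the CRIMSON algorithm or its conservative test, whose details the abstract does not give; it is a toy under stated assumptions: every operation has unit latency, the data flow graph has no loop-carried dependencies, the only resource limit is the number of processing elements per modulo slot, and `try_map` is a hypothetical caller-supplied callback standing in for the real mapper. All function and parameter names are illustrative.

```python
import math
import random
from graphlib import TopologicalSorter  # Python 3.9+


def random_modulo_schedule(preds, ii, num_pes, rng):
    """Give each node a timeslot; return {node: time}, or None if stuck.

    preds maps a node to the list of its predecessors (assumed acyclic).
    At most num_pes nodes may share one modulo slot (time % ii).
    """
    usage = [0] * ii          # nodes already placed in each modulo slot
    time_of = {}
    for node in TopologicalSorter(preds).static_order():
        # Unit latency (assumption): a node starts after all predecessors.
        earliest = max((time_of[p] + 1 for p in preds.get(node, ())), default=0)
        # A window of ii consecutive times touches every modulo slot once.
        candidates = [t for t in range(earliest, earliest + ii)
                      if usage[t % ii] < num_pes]
        if not candidates:
            return None
        # Random choice instead of always taking the earliest (ASAP) slot.
        t = rng.choice(candidates)
        time_of[node] = t
        usage[t % ii] += 1
    return time_of


def schedule_and_map(preds, num_pes, try_map, max_ii=16, tries_per_ii=20, seed=0):
    """Try several random schedules per II; raise II when none can be mapped.

    try_map(schedule, ii) -> bool is a hypothetical stand-in for the mapper.
    Returns (ii, schedule), or None if max_ii is exceeded.
    """
    rng = random.Random(seed)
    nodes = set(preds) | {p for ps in preds.values() for p in ps}
    # Resource-constrained lower bound on II: ceil(#nodes / #PEs).
    ii = max(1, math.ceil(len(nodes) / num_pes))
    while ii <= max_ii:
        for _ in range(tries_per_ii):
            sched = random_modulo_schedule(preds, ii, num_pes, rng)
            if sched is not None and try_map(sched, ii):
                return ii, sched
        ii += 1  # mapping failed for every schedule tried: increase II
    return None


if __name__ == "__main__":
    # Small example DFG: a feeds b and c, which feed d, which feeds e.
    dfg = {"b": ["a"], "c": ["a"], "d": ["b", "c"], "e": ["d"]}
    print(schedule_and_map(dfg, num_pes=2, try_map=lambda s, ii: True))
```

Drawing the slot at random rather than always taking the earliest one is what lets repeated attempts at the same II produce different schedules; an ASAP choice would return the same schedule every time.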
