Asia and South Pacific Design Automation Conference

Efficient mapping of CDFG onto coarse-grained reconfigurable array architectures

Abstract

As the IoT era approaches, flexible, low-power accelerators have become essential to meeting aggressive energy-efficiency targets. Over the last few decades, Coarse-Grained Reconfigurable Arrays (CGRAs) have demonstrated high energy efficiency as accelerators, especially for high-performance streaming applications. Existing CGRAs mostly rely on partial and full predication techniques to support conditional branches, but inefficient architecture and mapping support for control flow restricts them to accelerating only inner loop bodies or loops transformed specifically for the target CGRA. This paper proposes a novel CGRA architecture that supports jump and conditional jump instructions, together with a lightweight global synchronization mechanism, to enable complete Control Data Flow Graph (CDFG) mapping in an ultra-low-power environment. The architecture is coupled with a complete design flow that efficiently maps applications with heavy control flow, starting from a generic C language description. The proposed mapping approach reduces the wasteful instruction issues incurred by conventional predication, providing average energy improvements of 1.44× and 1.6× over state-of-the-art partial and full predication techniques, respectively. Moreover, when executing applications with heavy control flow, the proposed method achieves an average speed-up of up to 21× and an energy improvement of up to 50.42× relative to sequential execution on a low-power embedded CPU, demonstrating its suitability for next-generation IoT applications.
