Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Selective Flexibility: Creating Domain-Specific Reconfigurable Arrays

Abstract

Historically, hardware acceleration technologies have either been application-specific, therefore lacking in flexibility, or fully programmable, thereby suffering from notable inefficiencies on an application-by-application basis. To address the growing need for domain-specific acceleration technologies, this paper describes a design methodology (i) to automatically generate a domain-specific coarse-grained array from a set of representative applications and (ii) to introduce limited forms of architectural generality to increase the likelihood that additional applications can be successfully mapped onto it. In particular, coarse-grained arrays generated using our approach are intended to be integrated into customizable processors that use application-specific instruction set extensions to accelerate performance and reduce energy; rather than implementing these extensions using application-specific integrated circuit (ASIC) logic, which lacks flexibility, they can be synthesized onto our reconfigurable array instead, allowing the processor to be used for a variety of applications in related domains. Results show that our array is around $2\times$ slower and $15\times$ larger than an ultimately efficient ASIC implementation, and thus far more efficient than field-programmable gate arrays (FPGAs), which are known to be 3–4$\times$ slower and 20–40$\times$ larger. Additionally, we estimate that our array is usually around $2\times$ larger and $2\times$ slower than an accelerator synthesized using traditional datapath merging, which has, if any, very limited flexibility beyond the design set of DFGs.
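
As a rough illustration of the comparison in the abstract, the quoted overhead factors can be converted into figures relative to FPGAs. This is only a sketch: it assumes the array and FPGA factors are measured against the same ASIC baseline and can be divided directly, which the abstract implies but does not state. The dictionary names and the `advantage` helper are hypothetical and do not come from the paper.

```python
# Overhead factors relative to an ASIC baseline (ASIC = 1.0),
# taken from the figures quoted in the abstract.
ARRAY = {"delay": 2.0, "area": 15.0}       # domain-specific array
FPGA_LOW = {"delay": 3.0, "area": 20.0}    # lower end of the FPGA range
FPGA_HIGH = {"delay": 4.0, "area": 40.0}   # upper end of the FPGA range


def advantage(array, fpga):
    """Return how many times faster/smaller the array is than the FPGA.

    Hypothetical helper: simply divides the FPGA overhead by the array
    overhead, assuming both share the same ASIC baseline.
    """
    return {metric: fpga[metric] / array[metric] for metric in array}


low = advantage(ARRAY, FPGA_LOW)
high = advantage(ARRAY, FPGA_HIGH)

for metric in ("delay", "area"):
    print(f"{metric}: {low[metric]:.2f}x to {high[metric]:.2f}x better than FPGA")
```

Under these assumptions the array comes out between 1.5 and 2 times faster than an FPGA, and between roughly 1.3 and 2.7 times smaller.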