IEEE 17th International Symposium on High Performance Computer Architecture

Dynamically Specialized Datapaths for energy efficient computing

Abstract

Due to limits in technology scaling, energy efficiency of logic devices is decreasing in successive generations. To provide continued performance improvements without increasing power, regardless of the sequential or parallel nature of the application, microarchitectural energy efficiency must improve. We propose Dynamically Specialized Datapaths to improve the energy efficiency of general purpose programmable processors. The key insights of this work are the following. First, applications execute in phases and these phases can be determined by creating a path-tree of basic blocks rooted at the innermost loop. Second, specialized datapaths corresponding to these path-trees, which we refer to as DySER blocks, can be constructed by interconnecting a set of heterogeneous computation units with a circuit-switched network. These blocks can be easily integrated with a processor pipeline. A synthesized RTL implementation using an industry 55nm technology library shows a 64-functional-unit DySER block occupies approximately the same area as a 64 KB single-ported SRAM and can execute at 2 GHz. We extend the GCC compiler to identify path-trees and map code to DySER, and evaluate the PARSEC, SPEC and Parboil benchmark suites. Our results show that in most cases two DySER blocks can achieve the same performance (within 5%) as having a specialized hardware module for each path-tree. A 64-FU DySER block can cover 12% to 100% of the dynamically executed instruction stream. When integrated with a dual-issue out-of-order processor, two DySER blocks provide a geometric mean speedup of 2.1X (1.15X to 10X), a geometric mean energy reduction of 40% (up to 70%), and a 60% energy reduction if no performance improvement is required.
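
The abstract does not spell out how a path-tree is built, so the following is only a rough sketch of one plausible reading: unfold the control-flow graph of an innermost loop into a tree of basic blocks rooted at the loop header, stopping at the back edge and at loop exits. The names `PathTreeNode`, `build_path_tree`, and `paths`, along with the toy CFG, are hypothetical and are not taken from the paper or from its GCC extension.

```python
from dataclasses import dataclass, field


@dataclass
class PathTreeNode:
    """One basic block in the unfolded tree (hypothetical structure)."""
    block: str
    children: list = field(default_factory=list)


def build_path_tree(cfg, header, loop_blocks):
    """Unfold the loop body's CFG into a tree rooted at `header`.

    cfg maps a block name to its successor block names.
    loop_blocks is the set of blocks belonging to the innermost loop.
    """
    def expand(block, on_path):
        node = PathTreeNode(block)
        for succ in cfg.get(block, []):
            # Skip the back edge, loop exits, and blocks already on this path.
            if succ == header or succ not in loop_blocks or succ in on_path:
                continue
            node.children.append(expand(succ, on_path | {succ}))
        return node

    return expand(header, frozenset({header}))


def paths(node, prefix=()):
    """List every root-to-leaf path as a tuple of block names."""
    prefix = prefix + (node.block,)
    if not node.children:
        return [prefix]
    result = []
    for child in node.children:
        result.extend(paths(child, prefix))
    return result


# Toy CFG (an assumption for illustration): an if/else diamond inside a loop.
cfg = {
    "header": ["then", "else"],
    "then": ["latch"],
    "else": ["latch"],
    "latch": ["header", "exit"],
    "exit": [],
}
tree = build_path_tree(cfg, "header", {"header", "then", "else", "latch"})
for p in paths(tree):
    print(" -> ".join(p))
```

In this toy case the join block `latch` appears once per path, which is what turns the loop body's graph into a tree. Whether the paper duplicates join blocks this way is not stated in the abstract.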
