Journal: Computer architecture news

Synchroscalar: A Multiple Clock Domain, Power-Aware, Tile-Based Embedded Processor



Abstract

We present Synchroscalar, a tile-based architecture for embedded processing that is designed to provide the flexibility of DSPs while approaching the power efficiency of ASICs. We achieve this goal by providing high parallelism and voltage scaling while minimizing control and communication costs. Specifically, Synchroscalar uses columns of processor tiles organized into statically-assigned frequency-voltage domains to minimize power consumption. Furthermore, while columns use SIMD control to minimize overhead, data-dependent computations can be supported by extremely flexible statically-scheduled communication between columns. We provide a detailed evaluation of Synchroscalar including SPICE simulation, wire and device models, synthesis of key components, cycle-level simulation, and compiler- and hand-optimized signal processing applications. We find that the goal of meeting, not exceeding, performance targets with data-parallel applications leads to designs that depart significantly from our intuitions derived from general-purpose microprocessor design. In particular, synchronous design and substantial global interconnect are desirable in the low-frequency, low-power domain. This global interconnect supports parallelization and reduces processor idle time, both of which are critical to energy-efficient implementations of high-bandwidth signal processing. Overall, Synchroscalar provides programmability while achieving power efficiencies within 8-30X of known ASIC implementations, which is 10-60X better than conventional DSPs. In addition, frequency-voltage scaling in Synchroscalar provides power savings of 3-32% across our application suite.
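The trade-off behind "high parallelism and voltage scaling" can be illustrated with the standard first-order CMOS dynamic power relation, P = C * V^2 * f. The sketch below is not taken from the paper: the function names, the capacitance value, and the two operating points are hypothetical, chosen only to show why several slow, low-voltage tiles can match the throughput of one fast tile at lower power. It assumes perfect parallelization and ignores leakage, interconnect, and communication costs, which the abstract identifies as important in practice.

```python
def dynamic_power(c_eff, vdd, freq_hz):
    """First-order CMOS dynamic power estimate: P = C * V^2 * f."""
    return c_eff * vdd ** 2 * freq_hz


def column_power(tiles, c_eff_per_tile, vdd, freq_hz):
    """Power of one column whose tiles share a single voltage and frequency."""
    return tiles * dynamic_power(c_eff_per_tile, vdd, freq_hz)


# Hypothetical values for illustration only (not measurements from the paper).
C_EFF = 1e-9  # effective switched capacitance per tile, in farads

# One tile at a high frequency and the higher voltage that frequency needs.
baseline = column_power(tiles=1, c_eff_per_tile=C_EFF, vdd=1.2, freq_hz=400e6)

# Four tiles at a quarter of the frequency, which permits a lower voltage.
# Aggregate throughput matches the baseline if the work parallelizes perfectly.
parallel = column_power(tiles=4, c_eff_per_tile=C_EFF, vdd=0.8, freq_hz=100e6)

print(f"baseline: {baseline:.3f} W, parallel: {parallel:.3f} W")
print(f"power ratio: {parallel / baseline:.2f}")
```

Because tile count times frequency is the same in both configurations, the ratio reduces to (0.8 / 1.2)^2, roughly 0.44. In this simplified model the entire saving comes from the lower supply voltage.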
