IEEE Transactions on Parallel and Distributed Systems

TLIA: Efficient Reconfigurable Architecture for Control-Intensive Kernels with Triggered-Long-Instructions



Abstract

Coarse-Grained Reconfigurable Architectures (CGRAs), which provide high performance, low power, and flexibility, are viewed as a promising trend in computing. CGRAs are mostly employed to process compute-intensive kernels because they handle control flow inefficiently. Various methods have been proposed to alleviate this problem, and triggered instructions are one of the state-of-the-art techniques. In this paper, a reconfigurable architecture called Triggered-Long-Instruction Architecture (TLIA) is proposed to enhance triggered instructions with the parallel condition method. In the proposed architecture, a triggered instruction set is employed on the processing elements (PEs), which eliminates both over-serialized execution and branch instructions. Meanwhile, each PE has an improved data path with three ALUs, inspired by the parallel condition method, which increases the parallelism inside each control flow by parallelizing predicate computations and predicated operations. Moreover, multiple triggered instructions, which may have internal control dependences, can be executed on the PEs in parallel. The instruction-issue strategy is implemented in hardware and verified on an FPGA. Experimental results show that, compared with the equivalent Triggered Instruction Architecture (TIA), performance is improved by 20.9 to 140.0 percent, area is reduced by 24.5 percent, and power is reduced by 32.5 percent.
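To make the idea of triggered instructions more concrete, the following is a minimal Python sketch of a single PE with no program counter and no branch instructions: each instruction carries a trigger (a guard over predicate bits) and fires whenever that guard holds. It models only the baseline triggered-instruction idea, not TLIA's three-ALU data path or its long instructions. All names here (`TriggeredInstruction`, `run_pe`, the predicate bits `p_cmp`, `p_gt`, `p_lt`, and the subtraction-based GCD kernel) are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]  # data registers and predicate bits of one PE (assumed layout)


@dataclass
class TriggeredInstruction:
    name: str
    trigger: Callable[[State], bool]  # guard evaluated over predicate bits
    action: Callable[[State], None]   # data-path operation plus predicate updates


def run_pe(program: List[TriggeredInstruction], state: State,
           max_cycles: int = 100) -> List[str]:
    """Fire one ready instruction per cycle until no trigger holds."""
    trace = []
    for _ in range(max_cycles):
        ready = [ins for ins in program if ins.trigger(state)]
        if not ready:
            break                      # nothing is triggered: the PE is idle
        chosen = ready[0]              # assumption: fixed-priority selection
        chosen.action(state)
        trace.append(chosen.name)
    return trace


def compare(s: State) -> None:
    # Predicate computation: writes the bits that trigger the next instruction.
    s["p_gt"] = int(s["a"] > s["b"])
    s["p_lt"] = int(s["a"] < s["b"])
    s["p_cmp"] = 0


def sub_a(s: State) -> None:
    # Predicated operation for the a > b path; requests a new comparison.
    s["a"] -= s["b"]
    s["p_gt"] = 0
    s["p_cmp"] = 1


def sub_b(s: State) -> None:
    # Predicated operation for the a < b path; requests a new comparison.
    s["b"] -= s["a"]
    s["p_lt"] = 0
    s["p_cmp"] = 1


PROGRAM = [
    TriggeredInstruction("compare", lambda s: s["p_cmp"] == 1, compare),
    TriggeredInstruction("sub_a", lambda s: s["p_gt"] == 1, sub_a),
    TriggeredInstruction("sub_b", lambda s: s["p_lt"] == 1, sub_b),
]

# Control-intensive toy kernel: GCD by repeated subtraction.
state = {"a": 12, "b": 18, "p_cmp": 1, "p_gt": 0, "p_lt": 0}
trace = run_pe(PROGRAM, state)
```

In this sketch, every predicated subtraction has to wait one cycle for the preceding `compare`. That serialization between predicate computation and predicated operations is the kind of overhead the abstract says TLIA reduces by executing them in parallel.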
