Instruction scheduling for a clustered VLIW processor with a word-interleaved cache

Enric Gibert; Jesus Sanchez; Antonio Gonzalez

Abstract

Clustering is a common technique for overcoming the wire-delay problem that arises as process technology evolves. Fully distributed architectures, in which the register file, the functional units and the data cache are all partitioned, are particularly effective at dealing with these constraints and are, moreover, very scalable. This paper presents effective instruction scheduling techniques for a clustered VLIW processor with a word-interleaved cache. These techniques rely on (i) loop unrolling and variable alignment to increase the fraction of local accesses, (ii) a latency assignment process to schedule memory instructions with an appropriate latency, and (iii) different heuristics to assign memory instructions to clusters. Memory consistency is guaranteed by constraining the assignment of memory instructions to clusters. The use of Attraction Buffers is also introduced. An Attraction Buffer is a hardware mechanism that allows some data replication in order to increase the number of local accesses and, consequently, reduce stall time. Performance results for the Mediabench benchmark suite demonstrate the effectiveness of the presented techniques and mechanisms. The scheduling techniques increase the number of local accesses by more than 25%, while Attraction Buffers reduce stall time by more than 30%. Finally, depending on the scheduling heuristic, IPC results for such an architecture are 10% and 5% better than those of a clustered VLIW processor with a centralized/unified data cache.
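
The effect of loop unrolling and variable alignment on local accesses can be illustrated with a small model. This is only a sketch under stated assumptions, not the authors' scheduler: the 4-byte word size, the 4 clusters, the modulo mapping of consecutive words to clusters, and the names `home_cluster` and `local_fraction` are all hypothetical choices made for illustration.

```python
WORD_BYTES = 4     # assumed word size (not stated in the abstract)
NUM_CLUSTERS = 4   # assumed number of clusters (not stated in the abstract)


def home_cluster(addr, word_bytes=WORD_BYTES, num_clusters=NUM_CLUSTERS):
    """Cluster whose cache module holds the word at byte address `addr`,
    assuming consecutive words are interleaved across clusters."""
    return (addr // word_bytes) % num_clusters


def local_fraction(base, stride_bytes, iterations, unroll, assignment):
    """Fraction of dynamic accesses that are local.

    The loop body is unrolled `unroll` times, so the strided access
    base + i * stride_bytes becomes `unroll` static memory instructions.
    assignment[k] is the cluster that static copy k is scheduled on.
    """
    local = 0
    for i in range(iterations):
        copy = i % unroll  # which static copy executes iteration i
        if home_cluster(base + i * stride_bytes) == assignment[copy]:
            local += 1
    return local / iterations


if __name__ == "__main__":
    # No unrolling: one static load walks over all four cache modules.
    print(local_fraction(0, 4, 16, 1, [0]))
    # Unrolled by the number of clusters, array base aligned to cluster 0.
    print(local_fraction(0, 4, 16, 4, [0, 1, 2, 3]))
    # Same unrolling, but the base starts on cluster 2 (misaligned).
    print(local_fraction(8, 4, 16, 4, [0, 1, 2, 3]))
```

In this model, a word-strided access that is not unrolled is local only a quarter of the time, because a single static instruction visits every cache module in turn. After unrolling by the number of clusters, each static copy always touches the same module, so the scheduler can place it on that cluster. This only works when the base address's home cluster is known, which is the role alignment plays in the sketch.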

Bibliographic details

Source
Concurrency and Computation, 2006, No. 11, pp. 1391-1411 (21 pages)
Authors
Enric Gibert; Jesus Sanchez; Antonio Gonzalez
Author affiliation

Departament d'Arquitectura de Computadors, Universitat Politecnica de Catalunya, Modul C6-E208, Campus Nord, Jordi Girona 1-3, 08034 Barcelona, Spain

Record information
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
instruction scheduling; modulo scheduling; clustered VLIW processor; distributed cache; partitioned cache

Similar literature

1. Instruction scheduling with k-successor tree for clustered VLIW processors [J]. Xuemeng Zhang, Hui Wu, Jingling Xue. Design Automation for Embedded Systems, 2013, No. 2.
2. MULHI Cache: An Instruction Cache Mechanism for VLIW Processors [J]. Takuya Nakaike, Takayuki Abe, Nobuyuki Ooba. 情報処理学会論文誌, 1999, No. 5.
3. Instruction scheduling and transformation for a VLIW unified reduced instruction set computer/digital signal processor processor with shared register architecture [J]. Cheng-Yu Lee, Min-Chin Hung, Rong-Guey Chang. Concurrency and Computation: Practice and Experience, 2014, No. 1.
4. Effective instruction scheduling techniques for an interleaved cache clustered VLIW processor [C]. Enric Gibert, Jesus Sanchez, Antonio Gonzalez. Annual ACM/IEEE International Symposium on Microarchitecture, 2002.
5. Bulldog: A Compiler for VLIW Architectures (Parallel Computing, Reduced-Instruction-Set, Trace Scheduling, Scientific) [D]. John R. Ellis, 1985.
6. Optimizing Instruction Scheduling and Register Allocation for Register-File-Connected Clustered VLIW Architectures [O]. Haijing Tang, Xu Yang, Siye Wang, 2013.
7. Effective instruction scheduling techniques for an interleaved cache clustered VLIW processor [O]. Enric Gibert Codina, Jesús Sánchez Navarro, Antonio María González Colás, 2002.
