Instruction level parallelism through microthreading - A scalable approach to chip multiprocessors

Bousias K; Hasasneh N; Jesshope C

Abstract

Most microprocessor chips today use an out-of-order instruction execution mechanism. This mechanism allows superscalar processors to extract reasonably high levels of instruction level parallelism (ILP). The most significant problem with this approach is a large instruction window and the logic to support instruction issue from it. This includes generating wake-up signals to waiting instructions and a selection mechanism for issuing them. Wide issue width also requires a large multi-ported register file, so that each instruction can read and write its operands simultaneously. Neither structure scales well with issue width, leading to poor performance relative to the gates used. Furthermore, to obtain this ILP, the execution of instructions must proceed speculatively. An alternative, which avoids this complexity in instruction issue and eliminates speculative execution, is the microthreaded model. This model fragments sequential code at compile time and executes the fragments out of order while maintaining in-order execution within the fragments. The only constraints on the execution of fragments are the dependencies between them, which are managed in a distributed and scalable manner using synchronizing registers. The fragments of code are called microthreads and they capture ILP and loop concurrency. Fragments can be interleaved on a single processor to give tolerance to latency in operands or distributed to many processors to achieve speedup. The implementation of this model is fully scalable. It supports distributed instruction issue and a fully scalable register file, which implements a distributed, shared-register model of communication and synchronization between multiple processors on a single chip. This paper introduces the model, compares it with current approaches and presents an analysis of some of the implementation issues. It also presents results showing scalable performance with issue width over several orders of magnitude, from the same binary code.

Bibliographic details

Source
The Computer Journal, 2006, No. 2, pp. 211-233 (23 pages)
Authors
Bousias K; Hasasneh N; Jesshope C
Author affiliations

Univ Amsterdam, Dept Comp Sci, NL-1012 WX Amsterdam, Netherlands;

Univ Hull, Dept Elect Engn, Kingston Upon Hull HU6 7RX, N Humberside, England;

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
concurrency; CMP; microthreads; code fragments; ARCHITECTURE; PROCESSORS;

