Scalability aspects of instruction distribution algorithms for clustered processors

Aneesh Aggarwal; Franklin M.

Abstract

The evolving submicron technology makes it particularly attractive to use decentralized designs. A common form of decentralization adopted in processors is to partition the execution core into multiple clusters. Each cluster has a small instruction window and a set of functional units. A number of algorithms have been proposed for distributing instructions among the clusters. The first part of this paper analyzes, qualitatively as well as quantitatively, how various hardware parameters affect the performance of different instruction distribution algorithms. The parameters studied are the type of cluster interconnect, the fetch size, the cluster issue width, the cluster window size, and the number of clusters. The study shows that the relative performance of the algorithms is very sensitive to these hardware parameters, and that the algorithms that perform relatively better with four or fewer clusters are generally not the best ones for a larger number of clusters. This is important because, with an imminent increase in the transistor budget, more clusters are expected to be integrated on a single chip. The second part of the paper investigates alternative interconnects that provide scalable performance as the number of clusters is increased. In particular, it investigates two hierarchical interconnects, a single ring of crossbars and multiple rings of crossbars, as well as instruction distribution algorithms that take advantage of these interconnects. The study shows that these new interconnects, with the appropriate distribution techniques, achieve an IPC (instructions per cycle) that is 15-20 percent better than the most scalable existing configuration and is within 2 percent of that achieved by a hypothetical ideal processor having a 1-cycle latency crossbar interconnect. These results confirm the utility and applicability of hierarchical interconnects and hierarchical distribution algorithms in clustered processors.
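To make the setting concrete, here is a minimal toy sketch of the kind of decision an instruction distribution algorithm faces on a single ring of crossbars. It is not taken from the paper. The latency model (one cycle per crossbar traversal, one cycle per ring hop, bidirectional ring), the greedy placement rule, and all names (`Cluster`, `ring_hops`, `comm_latency`, `choose_cluster`) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """One cluster with a small instruction window (hypothetical model)."""
    window_size: int
    occupied: int = 0

    def has_room(self) -> bool:
        return self.occupied < self.window_size


def ring_hops(src_group: int, dst_group: int, num_groups: int) -> int:
    """Hops between two crossbar groups, assuming a bidirectional ring."""
    d = abs(src_group - dst_group) % num_groups
    return min(d, num_groups - d)


def comm_latency(src: int, dst: int, group_size: int, num_groups: int,
                 xbar_latency: int = 1, hop_latency: int = 1) -> int:
    """Assumed cycles to forward a value from cluster src to cluster dst."""
    if src == dst:
        return 0
    src_g, dst_g = src // group_size, dst // group_size
    if src_g == dst_g:
        # Both clusters hang off the same crossbar.
        return xbar_latency
    # Different crossbars: one crossbar traversal plus the ring hops.
    return xbar_latency + hop_latency * ring_hops(src_g, dst_g, num_groups)


def choose_cluster(producers, clusters, group_size, num_groups):
    """Greedy rule (an assumption, not the paper's algorithm): among clusters
    with window space, pick the one with the lowest total latency from the
    producer clusters, breaking ties by lower occupancy, then lower index."""
    candidates = [i for i, c in enumerate(clusters) if c.has_room()]
    if not candidates:
        return None  # every window is full, so distribution would stall

    def cost(i):
        latency = sum(comm_latency(p, i, group_size, num_groups)
                      for p in producers)
        return (latency, clusters[i].occupied, i)

    return min(candidates, key=cost)
```

For example, with eight clusters arranged as four crossbars of two clusters each, `choose_cluster([0], clusters, 2, 4)` keeps a consumer next to its producer in cluster 0 while that window has room, and falls back to cluster 1 on the same crossbar once cluster 0 is full. The rule trades communication latency against load balance in the simplest possible way; the algorithms compared in the paper may weigh these factors quite differently.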

Bibliographic details

Source
IEEE Transactions on Parallel and Distributed Systems | 2005, Issue 10 | pp. 944-955 | 12 pages
Authors
Aneesh Aggarwal; Franklin M.
Original format: PDF
Text language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
instruction sets; multiprocessor interconnection networks; parallel architectures; pipeline processing; resource allocation; clustered processor architecture; hierarchical interconnect; instruction distribution algorithm; instructions per cycle; interconnection a;
