
Architecture and design of AlphaServer GS320


Abstract

This paper describes the architecture and implementation of the AlphaServer GS320, a cache-coherent non-uniform memory access multiprocessor developed at Compaq. The AlphaServer GS320 architecture is specifically targeted at medium-scale multiprocessing with 32 to 64 processors. Each node in the design consists of four Alpha 21264 processors, up to 32 GB of coherent memory, and an aggressive I/O subsystem. The current implementation supports up to 8 such nodes for a total of 32 processors. While snoopy-based designs have been stretched to medium-scale multiprocessors by some vendors, providing sufficient snoop bandwidth remains a major challenge, especially in systems with aggressive processors. At the same time, directory protocols targeted at larger-scale designs lead to a number of inherent inefficiencies relative to snoopy designs. A key goal of the AlphaServer GS320 architecture has been to achieve the best of both worlds, partly by exploiting the bounded scale of the target systems.

This paper focuses on the unique design features used in the AlphaServer GS320 to efficiently implement coherence and consistency. The guiding principle for our directory-based protocol is to address correctness issues related to rare protocol races without burdening the common transaction flows. Our protocol exhibits lower occupancy and lower message counts compared to previous designs, and provides more efficient handling of 3-hop transactions. Furthermore, our design naturally lends itself to elegant solutions for deadlock, livelock, starvation, and fairness. The AlphaServer GS320 architecture also incorporates a couple of innovative techniques that extend previous approaches for efficiently implementing memory consistency models. These techniques allow us to generate commit events (which are used for ordering purposes) well in advance of formulating the reply to a transaction. Furthermore, the separation of the commit event allows time-critical replies to bypass inbound requests without violating ordering properties. Even though our design specifically targets medium-scale servers, many of the same techniques can be applied to larger-scale directory-based and smaller-scale snoopy-based designs. Finally, we evaluate the performance impact of some of the above optimizations and present a few competitive benchmark results.
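To make the abstract's terminology concrete, the sketch below works through the stated system size (four processors per node, up to 8 nodes) and counts the messages on the critical path of a read miss in a generic, textbook-style directory protocol. It is not the GS320 protocol, whose details the abstract does not give; the function names, the message labels, and the assumption that requester, home, and owner are three distinct nodes are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

PROCESSORS_PER_NODE = 4  # stated in the abstract
MAX_NODES = 8            # current implementation, per the abstract

Message = Tuple[int, int, str]  # (source node, destination node, label)


def total_processors(nodes: int = MAX_NODES) -> int:
    """Processor count for a system built from `nodes` four-processor nodes."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError("node count must be between 1 and 8")
    return nodes * PROCESSORS_PER_NODE


def read_miss_path(requester: int, home: int, owner: Optional[int],
                   forwarding: bool = True) -> List[Message]:
    """Critical-path messages for a read miss (generic directory scheme).

    Assumes requester, home, and owner are distinct nodes (hypothetical
    simplification). owner=None means the home memory holds a clean copy.
    """
    path: List[Message] = [(requester, home, "read request")]
    if owner is None:
        # Clean at home: the home node replies directly (2 hops).
        path.append((home, requester, "data reply"))
    elif forwarding:
        # Dirty at a remote owner: the home forwards the request and the
        # owner replies straight to the requester (3 hops).
        path.append((home, owner, "forwarded request"))
        path.append((owner, requester, "data reply"))
    else:
        # Strict request-reply alternative: data returns through the home
        # node, which adds a fourth hop.
        path.append((home, owner, "recall"))
        path.append((owner, home, "data writeback"))
        path.append((home, requester, "data reply"))
    return path


if __name__ == "__main__":
    print(total_processors())
    for msg in read_miss_path(requester=0, home=1, owner=2):
        print(msg)
```

In this toy model a 3-hop transaction is simply the case where the line is dirty at a third node; the abstract's claim of "more efficient handling of 3-hop transactions" refers to optimizing that path, which this sketch only counts and does not model.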
