
Register write specialization register read specialization

Abstract

With the continuous shrinking of transistor size, processor designers face new difficulties in achieving high clock frequencies. The register file read time, the wake-up and selection logic traversal delay, and the bypass network transit delay, together with their respective power consumptions, are major obstacles in the design of wide-issue superscalar processors.

In this paper, we show that breaking a rule that has so far been applied in the design of all superscalar processors reduces these difficulties. Currently used general-purpose ISAs feature a single logical register file (and generally a floating-point register file). Up to now, all superscalar processors have allowed any general-purpose functional unit to read and write any physical general-purpose register.

First, we propose Register Write Specialization, i.e., forcing distinct groups of functional units to write only to distinct subsets of the physical register file, thus limiting the number of write ports on each individual register. Register Write Specialization significantly reduces the access time, power consumption, and silicon area of the register file without impairing performance.

Second, we propose combining Register Write Specialization with Register Read Specialization (WSRS) for clustered superscalar processors. This limits the number of read ports on each individual register and simplifies both the wake-up logic and the bypass network. With an 8-way 4-cluster WSRS architecture, the complexities of the wake-up logic entry and bypass point are equivalent to those of a conventional 4-way issue processor. WSRS architectures need more physical registers. Nevertheless, a WSRS architecture allows a dramatic reduction of the total silicon area devoted to the physical register file (by a factor of four to six). Its power consumption is more than halved and its read access time is shortened by one third.

Some extra hardware and/or a few extra pipeline stages are needed for register renaming. The WSRS architecture also constrains the policy for allocating instructions to clusters. However, the performance of an 8-way 4-cluster WSRS architecture is comparable to that of a conventional 8-way 4-cluster superscalar processor.
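To make the write-specialization idea concrete, here is a minimal Python sketch of a physical register file split into one subset per cluster, where a destination register can only be allocated from the subset that the executing cluster is allowed to write. It is an illustration under stated assumptions, not the paper's renaming mechanism: the class `WriteSpecializedRegFile`, the helper `write_ports_per_register`, the subset sizes, and the assumption that each issue slot produces at most one result per cycle are all hypothetical.

```python
class WriteSpecializedRegFile:
    """Toy model: physical registers are partitioned into one write subset per cluster."""

    def __init__(self, num_clusters=4, regs_per_subset=4):
        self.num_clusters = num_clusters
        self.regs_per_subset = regs_per_subset
        # Subset c owns registers [c * regs_per_subset, (c + 1) * regs_per_subset).
        self.free_lists = [
            list(range(c * regs_per_subset, (c + 1) * regs_per_subset))
            for c in range(num_clusters)
        ]

    def subset_of(self, phys_reg):
        """Return the index of the subset (and writing cluster) that owns phys_reg."""
        return phys_reg // self.regs_per_subset

    def allocate(self, cluster):
        """Pick a free destination register from the subset written by `cluster`.

        Returns None when that subset is exhausted; in this toy model the
        caller would then have to stall or choose another cluster.
        """
        free = self.free_lists[cluster]
        return free.pop(0) if free else None

    def release(self, phys_reg):
        """Return a register to the free list of the subset that owns it."""
        self.free_lists[self.subset_of(phys_reg)].append(phys_reg)


def write_ports_per_register(issue_width, num_clusters, specialized):
    """Write ports needed on each register in this simplified model.

    Assumption (hypothetical): one result per issue slot per cycle, with
    issue slots split evenly across clusters.
    """
    return issue_width // num_clusters if specialized else issue_width


rf = WriteSpecializedRegFile(num_clusters=4, regs_per_subset=4)
dest = rf.allocate(cluster=2)
print("allocated", dest, "from subset", rf.subset_of(dest))
print("write ports, conventional:", write_ports_per_register(8, 4, specialized=False))
print("write ports, specialized: ", write_ports_per_register(8, 4, specialized=True))
```

Because the destination subset depends on the cluster that will execute the instruction, the cluster has to be chosen before the register is allocated. That ordering is one way to read the abstract's remark that renaming needs extra hardware or pipeline stages and that the architecture constrains the cluster allocation policy.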
