
Complexity Effective Bypass Networks


Abstract

Superscalar processors depend heavily on broadcast-based bypass networks to improve performance by exploiting more instruction-level parallelism. However, increasing clock speeds and shrinking technology make broadcasting slower and more difficult to implement, especially for wide-issue and deeply pipelined processors. High-latency bypass networks delay the execution of dependent instructions, which can result in significant performance loss.

In this paper, we first perform a detailed analysis of the performance impact of delays in the execution of dependent instructions caused by high-latency bypass networks. We found that this impact varies with the data dependences present in a program and with the types of instructions that make up the program code. We also found that the impact varies significantly with the hardware configuration, and that with a high-latency bypass network, the amount of processor hardware critical for near-maximal performance is considerably reduced. We then propose Single FU bypass networks to reduce the bypass network latency, where results from a functional unit (FU) are forwarded only to itself. The new bypass network design is based on the observations that an instruction's result is mostly required by just one other instruction and that the operands of many instructions come from a single other instruction. The new bypass network significantly reduces the data forwarding latency while incurring only a small impact (about 2% for most of the SPEC2K benchmarks) on the instructions per cycle (IPC) count. In return, the reduced bypass latency can potentially allow a higher clock speed. Single FU bypass networks are also much more scalable than broadcast-based bypass networks for wider and more deeply pipelined future microprocessors.
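The scalability claim can be illustrated with a rough count of forwarding paths. The sketch below is not taken from the paper: it assumes a simplified model in which every FU has a fixed number of operand inputs, a broadcast network drives each FU result to every operand input of every FU, and a Single FU network drives each result only to the producing FU's own inputs. The function names and the default of two operands per FU are hypothetical, and the model ignores pipeline depth, wire length, and how consumers on other FUs obtain a value.

```python
def broadcast_paths(num_fus: int, operands_per_fu: int = 2) -> int:
    """Forwarding paths when each FU result reaches every operand input of every FU.

    Assumed model: num_fus producers x num_fus consumers x operand inputs.
    """
    return num_fus * num_fus * operands_per_fu


def single_fu_paths(num_fus: int, operands_per_fu: int = 2) -> int:
    """Forwarding paths when each FU result reaches only its own operand inputs.

    Assumed model: one producer per FU x that FU's operand inputs.
    """
    return num_fus * operands_per_fu


if __name__ == "__main__":
    # Compare growth as the number of functional units increases.
    for n in (2, 4, 8, 16):
        print(n, broadcast_paths(n), single_fu_paths(n))
```

Under these assumptions the broadcast count grows quadratically with the number of FUs while the Single FU count grows linearly, which is one plausible reading of why the abstract describes the design as more scalable for wider processors.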
