Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems

Impact of Die-to-Die and Within-Die Parameter Variations on the Clock Frequency and Throughput of Multi-Core Processors



Abstract

A statistical performance simulator is developed to explore the impact of parameter variations on the maximum clock frequency (FMAX) and throughput distributions of multi-core processors in a future 22 nm technology. The simulator captures the effects of die-to-die (D2D) and within-die (WID) transistor and interconnect parameter variations on critical path delays in a die. A key component of the simulator is an analytical multi-core processor throughput model, which enables computationally efficient and accurate throughput calculations, as compared with cycle-accurate performance simulators, for single-threaded and highly parallel multi-threaded (MT) workloads. Based on microarchitecture designs from previous microprocessors, three multi-core processors with either small, medium, or large cores are projected for the 22 nm technology generation to investigate a range of design options. These three multi-core processors are optimized for maximum throughput within a constant die area. A traditional single-core processor is also scaled to the 22 nm technology to provide a baseline comparison. The salient contributions from this paper are: 1) product-level variation analysis for multi-core processors must focus on throughput, rather than just FMAX, and 2) multi-core processors are more variation tolerant than single-core processors due to the larger impact of memory latency and bandwidth on throughput. To elucidate these two points, statistical simulations indicate that multi-core and single-core processors with an equivalent total core area have similar FMAX distributions (mean degradation of 9% and standard deviation of 5%) for MT applications. In contrast to single-core processors, memory latency and bandwidth constraints significantly limit the throughput dependency on FMAX in multi-core processors, thus reducing the throughput mean degradation and standard deviation by approximately 50% for the small and medium core designs and by approximately 30% for the large core design. This improvement in the throughput distribution indicates that multi-core processors could significantly reduce the product design and process development complexities due to parameter variations as compared to single-core processors, enabling faster time to market for high-performance microprocessor products.
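
The reasoning in the abstract, that memory stalls weaken the link between FMAX and throughput, can be illustrated with a small Monte Carlo sketch. This is not the paper's simulator or its throughput model: the function names, the Gaussian D2D/WID delay model, the toy throughput formula, and every parameter value below are assumptions chosen only for illustration.

```python
import random
import statistics


def sample_die_fmax(rng, n_paths=100, sigma_d2d=0.05, sigma_wid=0.05):
    """Return the normalized FMAX of one die (assumed toy model).

    Every critical path has nominal delay 1.0. The D2D shift is shared by
    all paths on the die; a WID shift is drawn independently per path.
    The slowest path sets the clock period.
    """
    d2d = rng.gauss(0.0, sigma_d2d)
    worst_delay = max(
        1.0 + d2d + rng.gauss(0.0, sigma_wid) for _ in range(n_paths)
    )
    return 1.0 / worst_delay


def throughput(fmax, n_cores, cpi_core, misses_per_instr, mem_latency, bw_limit):
    """Toy analytical throughput in instructions per unit time (assumed form).

    Compute time per instruction scales with 1/fmax, while memory stall
    time is fixed in absolute time. A bandwidth ceiling caps the total.
    """
    time_per_instr = cpi_core / fmax + misses_per_instr * mem_latency
    return min(n_cores / time_per_instr, bw_limit)


def spread(values):
    """Standard deviation normalized by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


def simulate(n_dies=2000, seed=0):
    """Compare the spread of FMAX with the spread of throughput."""
    rng = random.Random(seed)
    fmax = [sample_die_fmax(rng) for _ in range(n_dies)]
    no_cap = float("inf")
    # Compute-bound case: no memory stalls, throughput tracks FMAX.
    compute_bound = [throughput(f, 1, 1.0, 0.0, 0.0, no_cap) for f in fmax]
    # Memory-limited case: stall time is comparable to compute time.
    memory_limited = [throughput(f, 16, 1.0, 0.01, 100.0, no_cap) for f in fmax]
    return spread(fmax), spread(compute_bound), spread(memory_limited)


if __name__ == "__main__":
    f_sp, c_sp, m_sp = simulate()
    print(f"FMAX spread:                  {f_sp:.4f}")
    print(f"compute-bound throughput:     {c_sp:.4f}")
    print(f"memory-limited throughput:    {m_sp:.4f}")
```

In this sketch the compute-bound throughput inherits the full FMAX spread, while the memory-limited throughput varies less because only the compute portion of the time per instruction scales with the clock. The size of the reduction depends entirely on the assumed stall parameters, so it should not be read as reproducing the percentages reported in the abstract.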
