Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems

Analyzing Potential Throughput Improvement of Power- and Thermal-Constrained Multicore Processors by Exploiting DVFS and PCPG


Abstract

Process variability from a range of sources is growing as technology is scaled below 65 nm, increasing variations of transistor delay and leakage current both within a die and across dies. This, in turn, negatively impacts maximum operating frequency and total power consumption of processors. Meanwhile, manufacturers have integrated more cores in a single die to improve the throughput of processors running highly-parallel workloads. However, many existing workloads do not have high enough parallelism to exploit multiple cores in a processor. First, in this paper, we maximize the throughput of power- and thermal-constrained multicore processors using per-core power gating and dynamic voltage/frequency scaling. When we do not have enough parallelism to effectively use all cores, we turn off some cores using per-core power gates that are already available in commercial multicore processors. This provides extra power and thermal headroom, and allows active cores to run faster through voltage/frequency scaling within power, thermal, and voltage scaling limits. Our analysis using a 32 nm predictive technology model demonstrates that jointly optimizing the number of active cores and maximum operating frequency can improve the throughput of a 16-core processor running workloads with limited parallelism by up to 14%. Second, we extend our throughput analysis and optimization to consider the impact of within-die spatial process variations that lead to considerable core-to-core frequency and leakage power variations in multicore processors. Our analysis shows that exploiting core-to-core frequency variations can improve the throughput of a 16-core processor by up to 57%.
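
The joint choice of active core count and operating point described in the abstract can be illustrated with a toy search. This is only a sketch under stated assumptions, not the paper's model: the Amdahl-style throughput formula, the power formula (dynamic power proportional to V^2 f plus leakage linear in V), the four voltage/frequency points, the chip power budget, and the per-core power cap standing in for a thermal limit are all hypothetical values chosen for illustration. Power-gated cores are assumed to draw no power, and process variation is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    voltage: float   # supply voltage in volts (hypothetical value)
    freq_ghz: float  # clock frequency in GHz (hypothetical value)


# Hypothetical DVFS table; not taken from the paper.
POINTS = [
    OperatingPoint(0.8, 2.0),
    OperatingPoint(0.9, 2.5),
    OperatingPoint(1.0, 3.0),
    OperatingPoint(1.1, 3.5),
]


def core_power(op, c_dyn=1.0, leak_per_volt=0.5):
    """Toy per-core power in watts: dynamic C*V^2*f plus leakage linear in V."""
    return c_dyn * op.voltage ** 2 * op.freq_ghz + leak_per_volt * op.voltage


def throughput(n_active, op, parallel_fraction):
    """Amdahl-style relative throughput of n_active identical cores."""
    serial_fraction = 1.0 - parallel_fraction
    return op.freq_ghz / (serial_fraction + parallel_fraction / n_active)


def best_configuration(total_cores, points, parallel_fraction,
                       chip_budget_w, per_core_cap_w):
    """Exhaustively search (active cores, operating point) pairs.

    Returns (throughput, n_active, point) for the best feasible pair,
    or None if no pair satisfies both limits.
    """
    best = None
    for n_active in range(1, total_cores + 1):
        for op in points:
            watts = core_power(op)
            # Per-core cap is a crude proxy for a thermal limit (assumption).
            if watts > per_core_cap_w:
                continue
            # Gated cores are assumed to draw zero power.
            if n_active * watts > chip_budget_w:
                continue
            score = throughput(n_active, op, parallel_fraction)
            if best is None or score > best[0]:
                best = (score, n_active, op)
    return best


if __name__ == "__main__":
    for fraction in (0.5, 1.0):
        score, n_active, op = best_configuration(
            16, POINTS, fraction, chip_budget_w=27.0, per_core_cap_w=5.0)
        print(f"parallel fraction {fraction}: {n_active} cores at "
              f"{op.freq_ghz} GHz, relative throughput {score:.2f}")
```

With these made-up numbers, the search selects 5 active cores at 3.5 GHz for a parallel fraction of 0.5 and all 16 cores at 2.0 GHz for a fully parallel workload, which mirrors the qualitative point of the abstract: when parallelism is limited, gating some cores frees power headroom that the remaining cores can spend on a higher voltage/frequency point.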
