Published in: Computer architecture news

Thread Criticality Predictors for Dynamic Performance, Power, and Resource Management in Chip Multiprocessors

Abstract

With the shift towards chip multiprocessors (CMPs), exploiting and managing parallelism has become a central problem in computer systems. Many issues of parallelism management boil down to discerning which running threads or processes are critical, or slowest, versus which are non-critical. If one can accurately predict critical threads in a parallel program, then one can respond in a variety of ways. Possibilities include running the critical thread at a faster clock rate, performing load balancing techniques to offload work onto currently non-critical threads, or giving the critical thread more on-chip resources to execute faster.

This paper proposes and evaluates simple but effective thread criticality predictors for parallel applications. We show that accurate predictors can be built using counters that are typically already available on-chip. Our predictor, based on memory hierarchy statistics, identifies thread criticality with an average accuracy of 93% across a range of architectures.

We also demonstrate two applications of our predictor. First, we show how Intel's Threading Building Blocks (TBB) parallel runtime system can benefit from task stealing techniques that use our criticality predictor to reduce load imbalance. Using criticality prediction to guide TBB's task-stealing decisions improves performance by 13-32% for TBB-based PARSEC benchmarks running on a 32-core CMP. As a second application, criticality prediction guides dynamic energy optimizations in barrier-based applications. By running the predicted critical thread at the full clock rate and frequency-scaling non-critical threads, this approach achieves average energy savings of 15% while negligibly degrading performance for SPLASH-2 and PARSEC benchmarks.
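
The abstract does not give the predictor's formula, so the following is only a rough sketch of how a predictor "based on memory hierarchy statistics" might look. It assumes per-thread cache-miss counts are available and combines them with weights; the weights, the names `ThreadCounters`, `criticality_score`, `predict_critical`, and `frequency_plan`, and the proportional frequency rule are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical weights: rough relative cost of each miss type.
# These values are illustrative assumptions, not figures from the paper.
L1_MISS_WEIGHT = 10
L2_MISS_WEIGHT = 200


@dataclass
class ThreadCounters:
    thread_id: int
    l1_misses: int  # misses observed in the current interval
    l2_misses: int


def criticality_score(c: ThreadCounters) -> int:
    """Weighted miss count; a higher score is assumed to mean a slower thread."""
    return c.l1_misses * L1_MISS_WEIGHT + c.l2_misses * L2_MISS_WEIGHT


def predict_critical(counters: list[ThreadCounters]) -> int:
    """Return the id of the thread with the highest score."""
    return max(counters, key=criticality_score).thread_id


def frequency_plan(counters, f_max=1.0, f_min=0.5):
    """Assumed policy: the top-scoring thread runs at f_max; the others are
    scaled in proportion to their score, but never below f_min."""
    top = max(criticality_score(c) for c in counters)
    plan = {}
    for c in counters:
        if top == 0:
            # No misses observed anywhere: keep every thread at full speed.
            plan[c.thread_id] = f_max
        else:
            ratio = criticality_score(c) / top
            plan[c.thread_id] = max(f_min, f_max * ratio)
    return plan


# Made-up counter values for three threads.
sample = [
    ThreadCounters(0, l1_misses=100, l2_misses=5),
    ThreadCounters(1, l1_misses=50, l2_misses=20),
    ThreadCounters(2, l1_misses=10, l2_misses=1),
]
print(predict_critical(sample))
print(frequency_plan(sample))
```

Under these assumptions, a task-stealing runtime could prefer the thread returned by `predict_critical` as the victim to steal from, while a barrier-based application could apply `frequency_plan` to slow the threads expected to reach the barrier early.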