Journal: Computer architecture news

Criticality Stacks: Identifying Critical Threads in Parallel Programs using Synchronization Behavior


Abstract

Analyzing multi-threaded programs is quite challenging, but is necessary to obtain good multicore performance while saving energy. Due to synchronization, certain threads make others wait, because they hold a lock or have yet to reach a barrier. We call these critical threads, i.e., threads whose performance is determinative of program performance as a whole. Identifying these threads can reveal numerous optimization opportunities, for the software developer and for hardware. In this paper, we propose a new metric for assessing thread criticality, which combines both how much time a thread is performing useful work and how many co-running threads are waiting. We show how thread criticality can be calculated online with modest hardware additions and with low overhead. We use our metric to create criticality stacks that break total execution time into each thread's criticality component, allowing for easy visual analysis of parallel imbalance. To validate our criticality metric, and demonstrate it is better than previous metrics, we scale the frequency of the most critical thread and show it achieves the largest performance improvement. We then demonstrate the broad applicability of criticality stacks by using them to perform three types of optimizations: (1) program analysis to remove parallel bottlenecks, (2) dynamically identifying the most critical thread and accelerating it using frequency scaling to improve performance, and (3) showing that accelerating only the most critical thread allows for targeted energy reduction.
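The abstract does not state the metric's formula, so the following is only a minimal sketch of one metric with the properties it describes: each interval of execution time is divided equally among the threads doing useful work in that interval, so a thread accumulates more criticality the longer it runs and the more co-running threads are waiting on it. The function name `criticality_stack`, the trace format, and the example numbers are all assumptions made for illustration, and the sketch assumes at least one thread is running in every interval.

```python
from collections import defaultdict


def criticality_stack(intervals):
    """Compute a per-thread criticality value from an execution trace.

    `intervals` is a list of (duration, running_thread_ids) pairs, where
    running_thread_ids is the set of threads doing useful work (not
    waiting on a lock or barrier) during that interval. This trace
    format is hypothetical, not taken from the paper.
    """
    criticality = defaultdict(float)
    for duration, running in intervals:
        if not running:
            # Assumption: some thread is always running; skip otherwise.
            continue
        # Fewer running threads means more threads are waiting, so each
        # running thread receives a larger share of this interval.
        share = duration / len(running)
        for thread_id in running:
            criticality[thread_id] += share
    return dict(criticality)


# Hypothetical 4-thread trace: all threads run, then two, then one.
trace = [
    (4.0, {0, 1, 2, 3}),
    (2.0, {0, 1}),
    (6.0, {0}),
]
stack = criticality_stack(trace)
total_time = sum(duration for duration, _ in trace)
most_critical = max(stack, key=stack.get)
```

In this sketch the per-thread values sum to the total execution time, which is what lets them be drawn as a stack; thread 0 dominates the stack because it runs alone while the other three threads wait.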
