CODE TRANSFORMATIONS FOR ENHANCING THE PERFORMANCE OF SPECULATIVELY PARALLEL THREADS

SHENGYUE WANG; PEN-CHUNG YEW; ANTONIA ZHAI


Abstract

As technology advances, microprocessors that integrate multiple cores on a single chip are becoming increasingly common. Using these processors to improve the performance of a single program remains a challenge. For general-purpose applications, efficient parallel execution is especially difficult to achieve because of complex control flow and ambiguous data dependences. Thread-level speculation and transactional memory are two hardware mechanisms that can optimistically parallelize potentially dependent threads. However, a compiler that performs detailed performance trade-off analysis is essential for generating efficient parallel programs for such hardware. This compiler must take into account the cost of intra-thread as well as inter-thread value communication. Moreover, because complex, input-dependent control flow and data dependence patterns are ubiquitous in general-purpose applications, no single technique can optimize all program patterns. In this paper, we propose three optimization techniques to improve thread performance: (ⅰ) scheduling instructions and generating recovery code to reduce the critical forwarding path introduced by synchronizing memory-resident values; (ⅱ) identifying reduction variables and transforming the code to minimize serialized execution; and (ⅲ) dynamically merging consecutive iterations of a loop to avoid stalls due to unbalanced workload. A detailed evaluation of the proposed techniques shows that each one improves a subset of the SPEC2000 benchmarks, but none improves all of them. On average, the proposed optimizations improve performance by 7% for the set of SPEC2000 benchmarks that have already been optimized for register-resident value communication.
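The reduction transformation named in (ⅱ) can be illustrated with a minimal sketch. This is not the paper's compiler pass; it assumes a simple integer sum reduction, simulates the threads one after another in a single process, and uses hypothetical names (`sum_sequential`, `sum_with_private_partials`, `num_threads`) chosen only for illustration. The idea shown is that a shared accumulator forces every iteration to wait for the previous one, while per-thread private accumulators leave only a short final merge step serialized.

```python
from typing import List, Sequence


def sum_sequential(values: Sequence[int]) -> int:
    """Original loop: every iteration reads and writes the shared `total`."""
    total = 0
    for v in values:
        total += v  # cross-iteration dependence on `total`
    return total


def sum_with_private_partials(values: Sequence[int], num_threads: int) -> int:
    """Transformed loop: each (simulated) thread accumulates privately.

    Assumption: num_threads >= 1. Threads are simulated sequentially here;
    the point is only that no iteration touches another thread's partial.
    """
    partials: List[int] = [0] * num_threads
    for tid in range(num_threads):
        # Thread `tid` handles iterations tid, tid + num_threads, ...
        for i in range(tid, len(values), num_threads):
            partials[tid] += values[i]
    # Only this short merge step has to run serially.
    total = 0
    for p in partials:
        total += p
    return total
```

Both functions return the same result for integer inputs because integer addition is associative and commutative. For floating-point reductions, reordering the additions can change the rounded result, so a real compiler would have to account for that.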


Bibliographic record

Source
Journal of Circuits, Systems, and Computers | 2012, Issue 2 | pp. 1240008.1-1240008.23 | 23 pages
Authors
SHENGYUE WANG; PEN-CHUNG YEW; ANTONIA ZHAI
Author affiliations

Oracle Corporation, Santa Clara, California 95054, USA;

Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota 55455, USA;

Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota 55455, USA

Indexing information
Original format: PDF
Language of text: English
Chinese Library Classification
Keywords
thread-level speculation; multicore systems; compiler optimizations; parallelizing compiler


