Complementing user-level coarse-grain parallelism with implicit speculative parallelism

Abstract

Multi-core and many-core systems are the norm in contemporary processor technology and are expected to remain so for the foreseeable future. Programs using parallel programming primitives like PThreads or OpenMP often exploit coarse-grain parallelism, because it offers a good trade-off between programming effort and performance gain. Some parallel applications show limited or no scaling beyond a certain number of cores. Given the abundant number of cores expected in future many-cores, several cores would remain idle in such cases while execution performance stagnates. This paper proposes using cores that do not contribute to performance improvement for running implicit fine-grain speculative threads. In particular, we present a many-core architecture and protocol that allow applications with coarse-grain explicit parallelism to further exploit implicit speculative parallelism within each thread. Implicit speculative parallelism frees the programmer from the additional effort to explicitly partition the work into finer and properly synchronized tasks. Our results show that, for a many-core comprising 128 cores supporting implicit speculative parallelism in clusters of 2 or 4 cores, performance improves on top of the highest scalability point by 41% on average for the 4-core cluster and by 27% on average for the 2-core cluster. These performance improvements come with an energy consumption that is close to, and sometimes better than, the baseline. This approach often leads to better performance and energy efficiency compared to existing alternatives such as Core Fusion and Frequency Boosting. We also investigate the trade-offs between explicit and implicit threads as input dataset sizes vary. Finally, we present a dynamic mechanism to choose the number of explicit and implicit threads, which performs within 6% of the static oracle selection of threads.
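The core-budgeting idea in the abstract can be illustrated with a small sketch. It assumes, as a simplification not stated in the abstract, that each explicit thread occupies one cluster, so a configuration needs explicit threads times cluster size cores. The function names `feasible_configs` and `oracle_choice`, the candidate thread counts, and the runtime model are all hypothetical and are not taken from the paper; the sketch only mimics a static oracle picking the fastest configuration from given measurements.

```python
from itertools import product


def feasible_configs(total_cores, explicit_options, cluster_sizes):
    """Return (explicit_threads, cluster_size) pairs that fit the core budget.

    Assumption (not from the paper): each explicit thread occupies one
    cluster, so a configuration uses explicit_threads * cluster_size cores.
    A cluster size of 1 means no implicit speculative threads.
    """
    return [
        (explicit, cluster)
        for explicit, cluster in product(explicit_options, cluster_sizes)
        if explicit * cluster <= total_cores
    ]


def oracle_choice(configs, runtime):
    """Pick the configuration with the lowest runtime (a static oracle)."""
    return min(configs, key=lambda cfg: runtime[cfg])


def toy_runtime(cfg, scaling_limit=32):
    """Made-up runtime model for illustration only.

    Explicit threads stop helping past scaling_limit; each extra core in a
    cluster is assumed to add a 20% speedup to its explicit thread.
    """
    explicit, cluster = cfg
    useful_threads = min(explicit, scaling_limit)
    return 100.0 / (useful_threads * (1 + 0.2 * (cluster - 1)))


if __name__ == "__main__":
    configs = feasible_configs(128, [32, 64, 128], [1, 2, 4])
    runtime = {cfg: toy_runtime(cfg) for cfg in configs}
    print(oracle_choice(configs, runtime))
```

Under this toy model, an application that stops scaling at 32 explicit threads is best served by giving each of those threads a 4-core cluster, which is the kind of trade-off the dynamic mechanism in the abstract is meant to find at run time.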

Bibliographic details

Source
IEEE/ACM International Symposium on Microarchitecture | 2011 | pp. 284-295 | 12 pages
Authors
Nikolas Ioannou; Marcelo Cintra;
Original format: PDF
Keywords
Informatics; Servers; Lead; Writing; Acceleration; Hardware; Routing;
