Venue: ACM SIGPLAN Conference on Programming Language Design and Implementation

Effective Parallelization of Loops in the Presence of I/O Operations


Abstract

Software-based thread-level parallelization has been widely studied for exploiting data parallelism in purely computational loops to improve program performance on multiprocessors. However, none of the previous efforts deal with efficient parallelization of hybrid loops, i.e., loops that contain a mix of computation and I/O operations. In this paper, we propose a set of techniques for efficiently parallelizing hybrid loops. Our techniques apply DOALL parallelism to hybrid loops by breaking the cross-iteration dependences caused by I/O operations. We also support speculative execution of I/O operations to enable speculative parallelization of hybrid loops. Helper threading is used to reduce the I/O bus contention caused by the improved parallelism. We provide an easy-to-use programming model for exploiting parallelism in loops with I/O operations. Parallelizing hybrid loops using our model requires few modifications to the code. We have developed a prototype implementation of our programming model. We have evaluated our implementation on a 24-core machine using eight applications, including a widely-used genomic sequence assembler and a multi-player game server, and others from PARSEC and SPEC CPU2000 benchmark suites. The hybrid loops in these applications take 23%-99% of the total execution time on our 24-core machine. The parallelized applications achieve speedups of 3.0×-12.8× with hybrid loop parallelization over the sequential versions of the same applications. Compared to the versions of applications where only computation loops are parallelized, hybrid loop parallelization improves the application performance by 68% on average.
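The idea of breaking an I/O-induced cross-iteration dependence can be illustrated with a small sketch. This is not the paper's programming model or implementation; it is a minimal Python illustration under stated assumptions: the input is a file of fixed-size records, `RECORD_SIZE`, `process`, `sequential_loop`, and `doall_loop` are hypothetical names, and `os.pread` (available on Unix-like systems) stands in for position-independent I/O. In the sequential loop every `read()` advances a shared file position, so each iteration depends on the one before it. Computing the offset from the iteration index removes that dependence, which lets the iterations run as a DOALL loop.

```python
import os
from concurrent.futures import ThreadPoolExecutor

RECORD_SIZE = 8  # hypothetical fixed record width, in bytes


def process(record: bytes) -> bytes:
    """Stand-in for the computation part of a hybrid-loop iteration."""
    return record.upper()


def sequential_loop(path: str, n: int) -> list[bytes]:
    """Hybrid loop as usually written: I/O and computation interleaved."""
    out = []
    with open(path, "rb") as f:
        for _ in range(n):
            # read() advances the shared file position, so iteration i
            # depends on iteration i-1 (a cross-iteration dependence).
            out.append(process(f.read(RECORD_SIZE)))
    return out


def doall_loop(path: str, n: int, workers: int = 4) -> list[bytes]:
    """Same loop with the file-position dependence removed."""
    fd = os.open(path, os.O_RDONLY)
    try:
        def body(i: int) -> bytes:
            # os.pread takes an explicit offset and leaves the shared
            # file position untouched, so iterations are independent.
            return process(os.pread(fd, RECORD_SIZE, i * RECORD_SIZE))

        with ThreadPoolExecutor(max_workers=workers) as pool:
            # map() yields results in iteration order, so the output
            # keeps the order the sequential loop would have produced.
            return list(pool.map(body, range(n)))
    finally:
        os.close(fd)
```

Calling `doall_loop(path, n)` on a file holding `n` records of `RECORD_SIZE` bytes should return the same list as `sequential_loop(path, n)`. The sketch only demonstrates the dependence-breaking step; it makes no claim about speedup, and it does not cover the speculative I/O or helper-threading techniques mentioned in the abstract.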
