Journal: IEEE Transactions on Parallel and Distributed Systems

Fusion of loops for parallelism and locality


Abstract

Loop fusion improves data locality and reduces synchronization in data-parallel applications. However, loop fusion is not always legal. Even when it is legal, fusion may introduce loop-carried dependences that prevent parallelism. In addition, cache conflicts in fused loops can cause performance losses. In this paper, we present new techniques to: (1) allow fusion of loop nests in the presence of fusion-preventing dependences, (2) maintain parallelism and allow the parallel execution of fused loops with minimal synchronization, and (3) eliminate cache conflicts in fused loops. We describe algorithms for implementing these techniques in compilers. The techniques are evaluated on a 56-processor KSR2 multiprocessor and on a 16-processor Convex SPP-1000 multiprocessor. The results demonstrate performance improvements for both kernels and complete applications. The results also indicate that careful evaluation of the profitability of fusion is necessary as more processors are used.
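To make "fusion-preventing dependence" concrete, here is a minimal sketch in Python. It is an illustration only: the abstract does not describe the authors' algorithms, so the function names (`unfused`, `naive_fused`, `shifted_fused`) and the loop-shifting fix shown here are assumptions chosen for the example, not the paper's method. The second loop reads `b[i + 1]`, so fusing the two loops directly makes it read a value the first loop has not yet written. Delaying the second loop's body by one iteration restores the original results.

```python
def unfused(a):
    """Reference version: two separate loops over the same index range."""
    n = len(a)
    b = [0] * n
    c = [0] * (n - 1)
    for i in range(n):           # loop 1 produces b
        b[i] = a[i] + 1
    for i in range(n - 1):       # loop 2 consumes b[i + 1]
        c[i] = b[i + 1] * 2
    return c


def naive_fused(a):
    """Illegal fusion: iteration i reads b[i + 1] before it is written."""
    n = len(a)
    b = [0] * n
    c = [0] * (n - 1)
    for i in range(n):
        b[i] = a[i] + 1
        if i < n - 1:
            c[i] = b[i + 1] * 2  # b[i + 1] still holds its initial 0
    return c


def shifted_fused(a):
    """Fusion after shifting loop 2 by one iteration (hypothetical fix)."""
    n = len(a)
    b = [0] * n
    c = [0] * (n - 1)
    for i in range(n):
        b[i] = a[i] + 1
        if i >= 1:
            c[i - 1] = b[i] * 2  # iteration i - 1 of the original loop 2
    return c


data = [1, 2, 3, 4]
print(unfused(data))        # results of the original two-loop program
print(naive_fused(data))    # differs: the dependence was violated
print(shifted_fused(data))  # matches the unfused results
```

In the shifted version each iteration reads only the `b[i]` it has just written, so the fused loop carries no dependence on `b` between iterations and both arrays are touched in a single pass, which is the locality benefit that motivates fusion.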
