
Dynamic and speculative polyhedral parallelization of loop nests using binary code patterns



Abstract

Speculative parallelization is a classic strategy for automatically parallelizing codes that cannot be handled at compile-time due to the use of dynamic data and control structures. Another motivation for speculation is to adapt the code to the current execution context by selecting an efficient parallel schedule at run-time. However, since this parallelization scheme requires on-the-fly semantics verification, it is in general difficult to perform advanced transformations for optimization and parallelism extraction. We propose a framework dedicated to speculative parallelization of scientific nested loop kernels, able to transform the code at run-time by re-scheduling the iterations to exhibit parallelism and data locality. The run-time process includes a transformation selection guided by profiling phases on short samples, using an instrumented version of the code. During this phase, the accessed memory addresses are interpolated to build a predictor of the forthcoming accesses. The collected addresses are also used to compute on-the-fly dependence distance vectors by tracking accesses to common addresses. Interpolating functions and distance vectors are then employed in dynamic dependence analysis and in selecting a parallelizing transformation that, if the prediction is correct, does not induce any rollback during execution. In order to ensure that the rollback time overhead stays low, the code is executed in successive slices of the outermost original loop of the nest. Each slice can be either a parallelized version, a sequential original version, or an instrumented version. Moreover, such slicing of the execution provides the opportunity to transform the code differently to adapt to the observed execution phases. Parallel code generation is achieved at almost no cost by using binary code patterns that are generated at compile-time and that are simply patched at run-time to produce the transformed code.
The framework has been implemented with extensions of the LLVM compiler and an x86-64 runtime system. Significant speed-ups are shown on a set of benchmarks that could not have been handled efficiently by a compiler.
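
The profiling step described in the abstract (interpolating accessed addresses and deriving dependence distance vectors from accesses to common addresses) can be illustrated with a small sketch. This is not the framework's implementation: the function names `fit_affine`, `distance_vectors`, `parallel_loops`, and `build_trace` are hypothetical, the loop nest is assumed to be two-deep, the address predictor is assumed to be affine in the loop indices, and the sample kernel `A[i][j] = A[i-1][j] + 1` with its base address and element size is invented for illustration.

```python
def fit_affine(samples):
    """Fit addr = a*i + b*j + c from (i, j, addr) samples.

    Assumes the samples include the lexicographically smallest iteration
    and its two unit neighbours. Returns None if any sample disagrees,
    i.e. the access is not affine over the profiled slice.
    """
    table = {(i, j): addr for i, j, addr in samples}
    i0, j0 = min(table)
    base = table[(i0, j0)]
    if (i0 + 1, j0) not in table or (i0, j0 + 1) not in table:
        return None
    a = table[(i0 + 1, j0)] - base   # stride along the outer loop
    b = table[(i0, j0 + 1)] - base   # stride along the inner loop
    c = base - a * i0 - b * j0
    if all(a * i + b * j + c == addr for (i, j), addr in table.items()):
        return a, b, c
    return None


def distance_vectors(trace):
    """Collect dependence distance vectors from an access trace.

    trace: list of (iteration_tuple, address, is_write) in execution order.
    A dependence needs at least one write among the two accesses.
    """
    last_write, last_read, distances = {}, {}, set()

    def record(src, dst):
        d = tuple(x - y for x, y in zip(dst, src))
        if any(d):                     # ignore same-iteration pairs
            distances.add(d)

    for it, addr, is_write in trace:
        if addr in last_write:         # read-after-write or write-after-write
            record(last_write[addr], it)
        if is_write:
            if addr in last_read:      # write-after-read
                record(last_read[addr], it)
            last_write[addr] = it
        else:
            last_read[addr] = it
    return distances


def parallel_loops(distances, depth):
    """Loop levels that carry no observed dependence.

    A vector is carried by the level of its first nonzero component.
    """
    carrying = {next(k for k, x in enumerate(d) if x) for d in distances}
    return [k for k in range(depth) if k not in carrying]


def build_trace(base=1000, row_bytes=32, elem_bytes=8):
    """Hypothetical instrumented run of: A[i][j] = A[i-1][j] + 1."""
    trace = []
    for i in range(1, 4):
        for j in range(4):
            read = base + row_bytes * (i - 1) + elem_bytes * j
            write = base + row_bytes * i + elem_bytes * j
            trace.append(((i, j), read, False))
            trace.append(((i, j), write, True))
    return trace
```

In this sketch the write accesses interpolate to an affine function of `(i, j)`, the only observed distance vector is carried by the outer loop, and so the inner loop is a candidate for parallel execution. As the abstract notes, such a choice is speculative: it holds only as long as the predictor stays correct on later slices.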
