Published in: IEEE International Conference on Application-specific Systems, Architectures and Processors

Symbolic parallelization of loop programs for massively parallel processor arrays

Abstract

In this paper, we present a first solution to the previously unsolved problem of jointly tiling and scheduling a given loop nest with uniform data dependencies symbolically. This problem arises for loop programs whose iterations are to be scheduled optimally on a processor array whose size is unknown at compile time. We show that it is nevertheless possible to derive parameterized latency-optimal schedules statically by proposing two new program transformations. In the first step, the iteration space is tiled symbolically into orthotopes of parameterized extents. In the second step, the resulting tiled program is scheduled symbolically. Here, we show that the maximal number of potentially optimal schedules is bounded from above by $2^n n!$, where n is the dimension of the loop nest; the actual number of optimal schedule candidates is, however, much smaller than this bound. At run time, once the size of the processor array becomes known, simple comparisons of latency-determining expressions decide which of these schedules is activated dynamically, and the corresponding program configuration is executed on the resulting processor array, so that no further run-time optimization or expensive recompilation is needed.
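
The run-time selection step described in the abstract can be illustrated with a minimal Python sketch. It evaluates the stated upper bound (read here as 2^n · n!) for small n and then picks, among precomputed candidate schedules, the one whose latency expression is smallest for a given array size. The two latency expressions, the candidate names `row_major` and `col_major`, and the functions `max_candidate_schedules` and `select_schedule` are all hypothetical and chosen only for illustration; they are not the expressions derived in the paper.

```python
from math import factorial
from typing import Callable, Dict, Tuple


def max_candidate_schedules(n: int) -> int:
    """Upper bound 2^n * n! on the number of potentially optimal schedules."""
    if n < 0:
        raise ValueError("loop nest dimension must be non-negative")
    return 2 ** n * factorial(n)


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for positive b."""
    return -(-a // b)


# ASSUMPTION: made-up latency expressions for a 2-D loop nest of size
# N1 x N2 mapped onto a P1 x P2 processor array. Each tile has
# ceil(N1/P1) x ceil(N2/P2) iterations; the two candidates differ only in
# which dimension is scanned first. These are NOT taken from the paper.
LatencyFn = Callable[[int, int, int, int], int]

CANDIDATES: Dict[str, LatencyFn] = {
    "row_major": lambda N1, N2, P1, P2: (
        ceil_div(N1, P1) * ceil_div(N2, P2)
        + (P1 - 1)
        + (P2 - 1) * ceil_div(N1, P1)
    ),
    "col_major": lambda N1, N2, P1, P2: (
        ceil_div(N1, P1) * ceil_div(N2, P2)
        + (P2 - 1)
        + (P1 - 1) * ceil_div(N2, P2)
    ),
}


def select_schedule(N1: int, N2: int, P1: int, P2: int) -> Tuple[str, int]:
    """Return (name, latency) of the candidate with the smallest latency.

    Only cheap comparisons of already-derived expressions happen here,
    mirroring the idea that no re-optimization is needed at run time.
    """
    evaluated = ((name, fn(N1, N2, P1, P2)) for name, fn in CANDIDATES.items())
    return min(evaluated, key=lambda item: item[1])


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, max_candidate_schedules(n))
    print(select_schedule(8, 64, 4, 2))
    print(select_schedule(64, 8, 2, 4))
```

In this sketch the same pair of candidates yields a different winner depending on the array shape, which is the reason for keeping several parameterized schedules and deciding among them only once the processor array size is known.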