Conference: IPPS/SPDP'99 Workshops

Efficient Program Partitioning Based on Compiler Controlled Communication

Abstract

In this paper, we present an efficient framework for intraprocedural, performance-based program partitioning of sequential loop nests. Because of the limitations of static dependence analysis, especially in the inter-procedural sense, many loop nests are identified as sequential, yet the task parallelism available among them could potentially be exploited. Since this parallelism is quite limited, performance-based program analysis and partitioning, which carefully analyzes the interaction between the loop nests and the underlying architectural characteristics, must be undertaken to use it effectively. We propose a compiler-driven approach that configures the underlying architecture to support a given communication mechanism. We then devise an iterative program partitioning algorithm that generates efficient partitions by analyzing the interaction between the effective cost of communication and the corresponding partitions. We model this problem as one of partitioning a directed acyclic task graph in which each node is identified with a sequential loop nest and the edges denote the precedences and communication between the nodes, corresponding to data transfer between loop nests. We introduce the concept of behavioral edges between edges and nodes in the task graph to capture the interactions between computation and communication through parametric functions. We present an efficient iterative partitioning algorithm that uses the behavioral-edge-augmented PDG to incrementally compute and improve the schedule. A significant performance improvement is demonstrated by applying our framework to some applications that exhibit this type of parallelism.
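The general idea of iteratively partitioning a task graph under a parametric communication cost can be illustrated with a small sketch. This is not the algorithm from the paper: the greedy move-one-node heuristic, the linear cost model, and every name here (`linear_comm_cost`, `makespan`, `iterative_partition`, the four-node example graph) are assumptions made purely for illustration, and behavioral edges are approximated only by the single cost function applied to edges that cross partitions.

```python
from collections import defaultdict

def linear_comm_cost(volume, startup=0.5, per_unit=0.5):
    # Hypothetical parametric cost: fixed startup plus a per-unit transfer term.
    return startup + per_unit * volume

def topo_order(nodes, edges):
    # Kahn's algorithm over the task DAG; edges maps (src, dst) -> data volume.
    indeg = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for u, v in edges:
        indeg[v] += 1
        succ[u].append(v)
    ready = sorted(n for n in nodes if indeg[n] == 0)
    order = []
    while ready:
        u = ready.pop(0)
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order

def makespan(compute, edges, assign, comm_cost):
    # Finish time of a simple list schedule where node n runs on assign[n].
    preds = defaultdict(list)
    for (u, v), volume in edges.items():
        preds[v].append((u, volume))
    proc_free = defaultdict(float)
    finish = {}
    for v in topo_order(compute, edges):
        ready = 0.0
        for u, volume in preds[v]:
            # Communication is paid only when the edge crosses partitions.
            delay = 0.0 if assign[u] == assign[v] else comm_cost(volume)
            ready = max(ready, finish[u] + delay)
        start = max(ready, proc_free[assign[v]])
        finish[v] = start + compute[v]
        proc_free[assign[v]] = finish[v]
    return max(finish.values())

def iterative_partition(compute, edges, num_procs, comm_cost):
    # Start with everything on processor 0, then keep any single-node move
    # that shortens the schedule, until no move helps (a local optimum).
    assign = {n: 0 for n in compute}
    best = makespan(compute, edges, assign, comm_cost)
    improved = True
    while improved:
        improved = False
        for n in compute:
            for p in range(num_procs):
                if p == assign[n]:
                    continue
                old = assign[n]
                assign[n] = p
                cost = makespan(compute, edges, assign, comm_cost)
                if cost < best:
                    best, improved = cost, True
                else:
                    assign[n] = old
    return assign, best

# Illustrative diamond-shaped graph: each node stands for a sequential loop nest.
compute = {"A": 2, "B": 4, "C": 4, "D": 2}
edges = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}

serial = makespan(compute, edges, {n: 0 for n in compute}, linear_comm_cost)
assign, best = iterative_partition(compute, edges, 2, linear_comm_cost)
print(serial, best, assign)
```

In this toy graph the two middle loop nests can overlap, so the search trades the added communication delay against the shorter critical path. A greedy search of this kind only reaches a local optimum, and the result depends on the cost parameters chosen.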