Conference: Parallel and Distributed Processing

Efficient Program Partitioning Based on Compiler Controlled Communication


Abstract

In this paper, we present an efficient framework for intraprocedural, performance-based program partitioning of sequential loop nests. Due to the limitations of static dependence analysis, especially in the inter-procedural sense, many loop nests are identified as sequential, yet the task parallelism available among them could potentially be exploited. Since this available parallelism is quite limited, performance-based program analysis and partitioning, which carefully analyzes the interaction between the loop nests and the underlying architectural characteristics, must be undertaken to use this parallelism effectively. We propose a compiler-driven approach that configures the underlying architecture to support a given communication mechanism. We then devise an iterative program partitioning algorithm that generates an efficient program partitioning by analyzing the interaction between the effective cost of communication and the corresponding partitions. We model this problem as one of partitioning a directed acyclic task graph in which each node is identified with a sequential loop nest, and the edges denote the precedences and communication between the nodes, corresponding to data transfer between loop nests. We introduce the concept of behavioral edges between edges and nodes in the task graph for capturing the interactions between computation and communication through parametric functions. We present an efficient iterative partitioning algorithm that uses the behavioral-edge-augmented PDG to incrementally compute and improve the schedule. A significant performance improvement is demonstrated by using our framework on some applications that exhibit this type of parallelism.
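
The task-graph formulation in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it is a minimal greedy refinement written under assumptions, in which each node carries a compute cost, each edge carries a data volume, and a cross-processor edge pays a delay given by a simple parametric function. The names `makespan`, `improve_partition`, and `comm_cost`, along with all the numbers, are hypothetical. Behavioral edges are not modeled beyond that single cost function.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def makespan(compute, edges, assign, comm_cost):
    """Schedule length of a task DAG for a given node-to-processor assignment."""
    preds = {n: [] for n in compute}
    for (u, v), volume in edges.items():
        preds[v].append((u, volume))
    order = TopologicalSorter(
        {n: [u for u, _ in preds[n]] for n in compute}
    ).static_order()
    finish, proc_ready = {}, {}
    for n in order:
        p = assign[n]
        ready = proc_ready.get(p, 0.0)
        for u, volume in preds[n]:
            # Communication is paid only when producer and consumer
            # sit on different processors (assumed cost model).
            delay = comm_cost(volume) if assign[u] != p else 0.0
            ready = max(ready, finish[u] + delay)
        finish[n] = ready + compute[n]
        proc_ready[p] = finish[n]
    return max(finish.values())


def improve_partition(compute, edges, num_procs, comm_cost):
    """Greedy iterative refinement: move one node at a time while it helps."""
    assign = {n: 0 for n in compute}  # start fully sequential
    best = makespan(compute, edges, assign, comm_cost)
    improved = True
    while improved:
        improved = False
        for n in compute:
            keep = assign[n]
            for p in range(num_procs):
                if p == keep:
                    continue
                assign[n] = p
                cost = makespan(compute, edges, assign, comm_cost)
                if cost < best:
                    best, keep, improved = cost, p, True
            assign[n] = keep  # commit the best placement found for n
    return assign, best


# Hypothetical diamond-shaped graph of four loop nests.
compute = {"A": 2.0, "B": 5.0, "C": 5.0, "D": 2.0}
edges = {("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "D"): 1.0, ("C", "D"): 1.0}
comm_cost = lambda volume: 1.0 + 0.5 * volume  # assumed latency + per-unit cost

sequential = makespan(compute, edges, {n: 0 for n in compute}, comm_cost)
assign, best = improve_partition(compute, edges, 2, comm_cost)
print(sequential, best, assign)
```

In this toy instance the two middle loop nests end up on different processors, and the schedule shortens only because the communication delay is smaller than the work it overlaps. A higher `comm_cost` would leave the sequential assignment unchanged, which is the kind of cost-versus-partition interaction the abstract describes.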
