Journal of Parallel and Distributed Computing

Topology and computational-power aware tile mapping of perfectly nested loops with dependencies on distributed systems


Abstract

Nested loops are a main source of parallelism in many scientific applications. Partitioning the iteration space of nested loops with data dependencies into tiles and assigning the tiles to processing nodes for parallel execution is essential for achieving high performance. Although most previous work has focused on tiling for fully connected homogeneous distributed systems, some studies have addressed tiling for partially connected distributed systems. In this paper, we address the parallelization of perfectly nested loops with dependencies on partially connected heterogeneous distributed systems and present a topology and computational-power aware tile mapping. This work takes into account not only each node's computational power when tiling the iteration space of nested loops but also the network topology when mapping tiles to processing nodes. This approach allows the parallel execution time to be minimized by improving load balancing and minimizing communication costs. We demonstrate the performance of the proposed method by comparing it with computational-power aware tile mapping and topology aware tile mapping. The experimental results show that the proposed method improves the parallel execution time by up to 62% and 28% compared with the computational-power aware tile mapping and the topology aware tile mapping, respectively. (C) 2019 Elsevier Inc. All rights reserved.
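
To make the idea of computational-power aware partitioning concrete, the following is a minimal sketch, not the method from the paper. It assumes that the iteration space has already been cut into equally sized tiles and that each node has a known relative power; the function name `proportional_tile_counts` and the largest-remainder rounding are illustrative choices. Network topology and data dependencies, which the paper also accounts for, are ignored here.

```python
def proportional_tile_counts(num_tiles, powers):
    """Split num_tiles among nodes in proportion to their relative power.

    Hypothetical helper for illustration only. Uses largest-remainder
    rounding so that the per-node counts always sum to num_tiles.
    """
    if num_tiles < 0 or not powers or any(p <= 0 for p in powers):
        raise ValueError("need num_tiles >= 0 and positive node powers")
    total = sum(powers)
    # Ideal (fractional) share of tiles for each node.
    exact = [num_tiles * p / total for p in powers]
    # Round every share down first; shares are non-negative, so int() floors.
    counts = [int(x) for x in exact]
    leftover = num_tiles - sum(counts)
    # Hand the remaining tiles to the nodes with the largest fractional parts.
    order = sorted(range(len(powers)),
                   key=lambda i: exact[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Three nodes with assumed relative powers 1 : 2 : 3 sharing 12 tiles.
    print(proportional_tile_counts(12, [1.0, 2.0, 3.0]))
```

With these assumed powers, the fastest node receives half of the tiles, so all three nodes would finish at roughly the same time if communication were free. A topology aware mapping would additionally decide which node gets which tiles so that dependent tiles land on nodes that are directly connected.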
