Journal: IEEE Transactions on Parallel and Distributed Systems

Optimizing Communications of Dynamic Data Redistribution on Symmetrical Matrices in Parallelizing Compilers

Abstract

Dynamic data redistribution enhances data locality and algorithm performance by reducing interprocessor communication in many parallel scientific applications on distributed-memory multicomputers. Because redistribution is performed at runtime, there is a performance tradeoff between the efficiency of the new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present a processor replacement scheme that minimizes the cost of interprocessor data exchange at runtime. The main idea is to develop a replacement function that reorders the logical processors in the destination phase. From the replacement function, a realigned sequence of destination processors is derived and then used to perform data decomposition in the receiving phase. Combined with transposition schemes for local matrices and for compressed CRS vectors, this can eliminate interprocessor communication at runtime. A significant advantage of this approach is that, in special cases, data can be realigned without any interprocessor communication. The second contribution is that the complicated generation of communication sets can be simplified by applying local matrix transposition; consequently, the indexing cost can be reduced significantly. The proposed techniques apply to both dense and sparse applications. A generalized symmetric redistribution algorithm is also presented. Theoretical analysis shows that up to (p-1)/p of the data transmission cost can be saved. In general cases, the symmetric redistribution algorithm reduces communication overhead by 1/p compared with the traditional method. Experimental results also show that the proposed techniques provide superior performance in most data redistribution instances.
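
The role of local transposition can be illustrated with the simplest special case. The sketch below is not the paper's replacement function or its algorithm. It assumes that p denotes the number of processors, that p divides the matrix order n, and that the redistribution goes from a row-block to a column-block layout; the helper names (row_block, column_block, local_transpose, elements_to_send) are hypothetical. Under those assumptions, each processor's local transpose of its row block already equals the column block it needs, so no data has to move, whereas a conventional redistribution would send (p-1)/p of each processor's elements.

```python
def row_block(matrix, rank, p):
    """Rows owned by processor `rank` under a row-block distribution."""
    size = len(matrix) // p  # assumption: p divides n
    return [row[:] for row in matrix[rank * size:(rank + 1) * size]]


def column_block(matrix, rank, p):
    """Columns owned by processor `rank` under a column-block distribution."""
    size = len(matrix) // p
    return [row[rank * size:(rank + 1) * size] for row in matrix]


def local_transpose(block):
    """Transpose a locally held block; needs no communication."""
    return [list(col) for col in zip(*block)]


def elements_to_send(n, p):
    """Elements one processor sends in a conventional row-to-column
    redistribution: everything it owns except the diagonal sub-block."""
    owned = n * n // p
    kept = (n // p) ** 2
    return owned - kept


n, p = 6, 3
# Symmetric test matrix: entry (i, j) depends only on the unordered pair {i, j}.
A = [[min(i, j) * n + max(i, j) for j in range(n)] for i in range(n)]

# For a symmetric matrix, the transposed row block is the target column block.
all_local = all(
    local_transpose(row_block(A, r, p)) == column_block(A, r, p)
    for r in range(p)
)
print("redistribution by local transpose only:", all_local)
print("elements sent per processor otherwise:", elements_to_send(n, p))
```

With n = 6 and p = 3, each processor owns 12 elements and a conventional exchange would send 8 of them, which is the (p-1)/p = 2/3 fraction that local transposition avoids in this special case.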