Topology-aware task mapping for reducing communication contention on large parallel machines

Abstract

Communication latencies constitute a significant factor in the performance of parallel applications. With techniques such as wormhole routing, the variation in no-load latencies became insignificant: the no-load latencies for far-away processors were not significantly higher than those for nearby processors, and the difference was too small to matter. Contention in the network is then left as the major factor affecting latencies. With networks such as fat-trees or hypercubes, in which the number of wires grows as P log P, even this is not a very significant factor. However, for torus and grid networks now being used in large machines such as BlueGene/L and the Cray XT3, such contention becomes an issue. We quantify the effect of this contention with benchmarks that vary the number of hops traveled by each communicated byte. We then demonstrate a process mapping strategy that minimizes the impact of topology by heuristically minimizing the total number of hop-bytes communicated. This strategy and its variants are implemented in the adaptive runtime system of Charm++ and Adaptive MPI, so they are available to a broad class of applications.
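
The hop-bytes metric named in the abstract can be illustrated with a minimal sketch. It assumes a torus with shortest-path routing, and it uses a simple greedy placement as a stand-in, not the authors' heuristic or any Charm++ API. The names `torus_hops`, `hop_bytes`, and `greedy_map` are hypothetical, and the sketch assumes there are at least as many processors as tasks.

```python
from itertools import product

def torus_hops(a, b, dims):
    """Shortest hop count between coordinates a and b on a torus of size dims."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

def hop_bytes(comm, placement, dims):
    """Total hop-bytes: bytes sent per task pair times hops between their processors."""
    return sum(nbytes * torus_hops(placement[s], placement[t], dims)
               for (s, t), nbytes in comm.items())

def greedy_map(comm, dims):
    """Illustrative greedy placement (an assumption, not the paper's method).

    Tasks are placed heaviest communicators first; each task goes to the free
    processor that adds the fewest hop-bytes to its already placed partners.
    """
    tasks = sorted({t for pair in comm for t in pair})
    volume = {t: 0 for t in tasks}
    for (s, t), nbytes in comm.items():
        volume[s] += nbytes
        volume[t] += nbytes
    free = list(product(*(range(d) for d in dims)))
    placement = {}
    for task in sorted(tasks, key=lambda t: -volume[t]):
        def added_cost(proc):
            total = 0
            for (s, t), nbytes in comm.items():
                other = t if s == task else s if t == task else None
                if other in placement:
                    total += nbytes * torus_hops(proc, placement[other], dims)
            return total
        best = min(free, key=added_cost)
        free.remove(best)
        placement[task] = best
    return placement

# Four tasks communicating in a ring, mapped onto a 2 x 2 torus.
ring = {(0, 1): 5, (1, 2): 5, (2, 3): 5, (3, 0): 5}
mapping = greedy_map(ring, (2, 2))
total = hop_bytes(ring, mapping, (2, 2))
```

Here `comm` maps a task pair to the bytes exchanged and `placement` maps a task to processor coordinates. A lower `hop_bytes` total means each byte crosses fewer links, which is the quantity the abstract says the mapping strategy tries to reduce.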

Bibliographic details

Source
2006 | p. 10 | 1 page
Conference location
Authors
Agarwal, T.; Sharma, A.; Kalé, L. V.
Author affiliation

Conference organizer
Original format: PDF
Language of text
CLC classification: Industrial technology
Keywords
grid computing; message passing; parallel machines; BlueGene/L; Charm++; Cray XT3; adaptive MPI; adaptive runtime system; communication latency; grid network; hop-bytes; parallel machine; process mapping; topology-aware task mapping; torus network; wormhole routing;

Similar literature

1. Optimal Group Paging Frequency for Machine-to-Machine Communications in LTE Networks With Contention Resolution [J]. Zhan Wen, Sun Xinghua, Li Yitong. IEEE Internet of Things Journal, 2019, no. 6
2. Topology-Aware and Dependence-Aware Scheduling and Memory Allocation for Task-Parallel Languages [J]. Drebes Andi, Pop Antoniu, Heydemann Karine. ACM Transactions on Architecture and Code Optimization, 2014, no. 3
3. QTMS: A quadratic time complexity topology-aware process mapping method for large-scale parallel applications on shared HPC system [J]. Yan Baicheng, Xiao Limin, Qin Guangjun. Parallel Computing, 2020, issue Juna
4. Topology-aware task mapping for reducing communication contention on large parallel machines [C]. Agarwal T., Sharma A., Kalé L. V. IEEE International Parallel and Distributed Processing Symposium, 2006
5. Performance analysis and acceleration of nuclear physics application on high-performance computing platforms using GPGPUs and topology-aware mapping techniques [D]. Oryspayev, Dossay. 2016
6. The Robot Selection Problem for Mini-Parallel Kinematic Machines: A Task-Driven Approach to the Selection Attributes Identification [O]. Cinzia Amici, Nicola Pellegrini, Monica Tiboni. 2020
7. Topology-aware task mapping for reducing communication contention on large parallel machines [O]. Tarun Agarwal, Amit Sharma, Laxmikant V. Kalé. 2006
