Unveiling the Interplay Between Global Link Arrangements and Network Management Algorithms on Dragonfly Networks

Abstract

Network messaging delay historically constitutes a large portion of the wall-clock time for High Performance Computing (HPC) applications, as these applications run on many nodes and involve intensive communication among their tasks. Dragonfly network topology has emerged as a promising solution for building exascale HPC systems owing to its low network diameter and large bisection bandwidth. Dragonfly includes local links that form groups and global links that connect these groups via high bandwidth optical links. Many aspects of the dragonfly network design are yet to be explored, such as the performance impact of the connectivity of the global links, i.e., global link arrangements, the bandwidth of the local and global links, or the job allocation algorithm. This paper first introduces a packet-level simulation framework to model the performance of HPC applications in detail. The proposed framework is able to simulate known MPI (message passing interface) routines as well as applications with custom-defined communication patterns for a given job placement algorithm and network topology. Using this simulation framework, we investigate the coupling between global link bandwidth and arrangements, communication pattern and intensity, job allocation and task mapping algorithms, and routing mechanisms in dragonfly topologies. We demonstrate that by choosing the right combination of system settings and workload allocation algorithms, communication overhead can be decreased by up to 44%. We also show that circulant arrangement provides up to 15% higher bisection bandwidth compared to the other arrangements, but for realistic workloads, the performance impact of link arrangements is less than 3%.
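The structure described in the abstract (local links forming groups, global links connecting the groups) and its low network diameter can be illustrated with a small graph model. The following is a minimal sketch, not the paper's simulation framework: the function names `build_dragonfly` and `diameter`, the parameters `a` (routers per group) and `h` (global links per router), the assumption that routers within a group are fully connected, and the simple rule used to assign global links to routers are all illustrative assumptions. The assignment rule is not necessarily one of the arrangements the paper evaluates.

```python
from collections import deque
from itertools import combinations


def build_dragonfly(a, h):
    """Build a maximum-size dragonfly with g = a*h + 1 groups (assumed model).

    Routers are identified as (group, index). Returns a dict of adjacency sets.
    """
    g = a * h + 1
    adj = {(grp, r): set() for grp in range(g) for r in range(a)}

    # Local links: assume every pair of routers in a group is connected.
    for grp in range(g):
        for r1, r2 in combinations(range(a), 2):
            adj[(grp, r1)].add((grp, r2))
            adj[(grp, r2)].add((grp, r1))

    # Global links: exactly one link per pair of groups. Hypothetical rule:
    # the k-th "other" group, counted in increasing order, attaches to
    # router k // h, so each router ends up with h global links.
    def port_router(src, dst):
        k = dst if dst < src else dst - 1
        return (src, k // h)

    for i, j in combinations(range(g), 2):
        u, v = port_router(i, j), port_router(j, i)
        adj[u].add(v)
        adj[v].add(u)
    return adj


def diameter(adj):
    """Longest shortest path in hops, via breadth-first search from each router."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


adj = build_dragonfly(a=4, h=2)   # 9 groups of 4 routers
print(len(adj))                   # 36
print(diameter(adj))              # 3
```

Under these assumptions any two routers are at most three hops apart (local, global, local), regardless of how many groups there are. Changing `port_router` changes which router in each group terminates each global link, which is the kind of design choice the abstract calls a global link arrangement.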

Bibliographic details

Source
IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing | 2017 | 568p | 10 pages
Authors
Fulya Kaplan; Ozan Tuncer; Vitus J. Leung; Scott K. Hemmert; Ayse K. Coskun
Original format: PDF
Chinese Library Classification: TP301-53
Keywords
Routing; Bandwidth; Network topology; Ports (Computers); Resource management; Optical fiber communication; Topology;

Similar literature

1. The management of international manufacturing networks: a missing link towards total management of global networks [J]. Cheng Yang, Farooq Sami, Johansen John. Production Planning & Control, 2019, No. 1a4
2. Measurement and computer modeling of temporary arrangements of polygonal actin structures in trabecular meshwork cells which consist of cross-linked actin networks and polygonal actin arrangements [J]. Zheng Y., Currie L., Pollock N. Journal of Ocular Pharmacology and Therapeutics: The Official Journal of the Association for Ocular Pharmacology and Therapeutics, 2014, No. 2a3
3. Cross-Layer Rate Control in Wireless Networks with Lossy Links: Leaky-Pipe Flow, Effective Network Utility Maximization and Hop-by-Hop Algorithms [J]. Gao Q., Zhang J., Hanly S.V. Wireless Communications, IEEE Transactions on, 2009, No. 6
4. Unveiling the Interplay Between Global Link Arrangements and Network Management Algorithms on Dragonfly Networks [C]. Fulya Kaplan, Ozan Tuncer, Vitus J. Leung. IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2017
5. Incorporating link correlations in models and algorithms for localization in wireless sensor networks [D]. Agrawal, Piyush. 2012
6. A Novel Clustering Algorithm for Mobile Ad Hoc Networks Based on Determination of Virtual Links Weight to Increase Network Stability [O]. Abbas Karimi, Abbas Afsharfarnia, Faraneh Zarafshan
7. The management of international manufacturing networks: a missing link towards total management of global networks [O]. Yang Cheng, Sami Farooq, John Johansen. 2019
