Source: Conference on Lightwave Technology

Extending collective operations with application semantics for improving multi-cluster performance



Abstract

We identify two ways of increasing the performance of allreduce-style collective operations in a multi-cluster with large WAN latencies: (i) hiding latency in system noise, and (ii) conditional-allreduce, in which knowledge about the application is used to reduce the number of WAN messages. In our multi-cluster, system noise was not large enough to hide the WAN latency. However, the latency could be hidden using conditional-allreduce, since on many iterations only cluster-local values were needed, and many of the values needed from other clusters were prefetched. A speedup of 2.4 was achieved for a microbenchmark. Prefetching introduced a small overhead in the cluster with the slowest hosts.
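
The abstract does not spell out which application knowledge conditional-allreduce uses, so the following is only a toy model under stated assumptions, not the authors' implementation. It assumes the application needs just one boolean per iteration ("does the global maximum exceed a threshold?"), so a cluster whose own maximum already exceeds the threshold can answer from cluster-local values alone. The function names, the threshold rule, and the message-counting scheme are all hypothetical; prefetching is not modeled.

```python
# Toy model (NOT the paper's implementation): count WAN messages for a plain
# allreduce versus a "conditional" variant that uses an assumed
# application-level rule to skip the wide-area exchange.
import random


def plain_allreduce_max(local_maxima):
    """Every cluster receives the partial result of every other cluster."""
    n = len(local_maxima)
    wan_messages = n * (n - 1)  # one message per remote partial result received
    return max(local_maxima), wan_messages


def conditional_exceeds(local_maxima, threshold):
    """Answer 'does the global max exceed threshold?' for each cluster.

    Assumed application semantics: callers only need this boolean, so a
    cluster whose own maximum already exceeds the threshold answers without
    WAN traffic. Every other cluster receives all remote partial results.
    """
    n = len(local_maxima)
    wan_messages = 0
    answers = []
    for own in local_maxima:
        if own > threshold:
            answers.append(True)  # decided from cluster-local values only
        else:
            wan_messages += n - 1  # must see the remote partial results
            answers.append(max(local_maxima) > threshold)
    return answers, wan_messages


def simulate(iterations=1000, clusters=3, threshold=0.5, seed=0):
    """Return total WAN messages for (plain, conditional) over random inputs."""
    rng = random.Random(seed)
    plain_total = cond_total = 0
    for _ in range(iterations):
        local_maxima = [rng.random() for _ in range(clusters)]
        global_max, sent = plain_allreduce_max(local_maxima)
        plain_total += sent
        answers, sent = conditional_exceeds(local_maxima, threshold)
        cond_total += sent
        # Both variants must agree on the application-level answer.
        assert all(a == (global_max > threshold) for a in answers)
    return plain_total, cond_total
```

Calling `simulate()` returns the two message totals. In this model the conditional variant never sends more messages than the plain one, and the saving grows with the fraction of iterations that can be decided locally.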
