
Efficient Implementation of Allreduce on BlueGene/L Collective Network


Abstract

BlueGene/L is currently in pole position on the Top500 list. In its full configuration, the system will use 65,536 compute nodes. Application scalability is a crucial issue for a system of this size. On BlueGene/L, scalability is made possible through the efficient exploitation of special communication. In addition to the general-purpose MPICH2 implementation, the BlueGene/L system software provides its own optimized versions of the collective communication routines. The collective network is a natural platform for reduction operations because of its built-in arithmetic units. Unfortunately, the ALUs of the collective network can handle only fixed-point operands, so exploiting that network efficiently for floating-point reductions is a challenging task. In this paper we present our experiences with implementing an efficient collective network algorithm for Allreduce sums of floating-point numbers.
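To make the fixed-point constraint concrete, the following Python sketch shows one generic way to compute a floating-point sum using only integer reductions. It is an illustration under stated assumptions, not the algorithm described in the paper: the abstract does not give the method, and the two-phase structure (an integer max over exponents, then an integer sum over scaled mantissas) is a common approach assumed here. The names `integer_max_reduce`, `integer_sum_reduce`, `allreduce_float_sum`, and `MANTISSA_BITS` are hypothetical, and the two reduce helpers are plain Python stand-ins for what network hardware would do.

```python
import math

# Assumption: number of significant bits kept for each scaled operand.
MANTISSA_BITS = 53


def integer_max_reduce(values):
    """Hypothetical stand-in for an integer max reduction on the network."""
    return max(values)


def integer_sum_reduce(values):
    """Hypothetical stand-in for an integer sum reduction on the network."""
    return sum(values)


def allreduce_float_sum(contributions):
    """Sum floats using only integer reductions (illustrative sketch)."""
    # Phase 1: agree on the largest binary exponent among all contributions.
    exponents = [math.frexp(x)[1] for x in contributions]
    shared_exp = integer_max_reduce(exponents)

    # Phase 2: express every value as an integer multiple of
    # 2**(shared_exp - MANTISSA_BITS), truncating bits below that unit,
    # then add the integers.
    scaled = [int(math.ldexp(x, MANTISSA_BITS - shared_exp))
              for x in contributions]
    total = integer_sum_reduce(scaled)

    # Convert the integer total back to a floating-point value.
    return math.ldexp(total, shared_exp - MANTISSA_BITS)


values = [1.5, 2.25, -0.75, 8.0]
result = allreduce_float_sum(values)
```

Because the second phase adds integers, the result in this sketch does not depend on the order in which contributions are combined. The cost is two reductions instead of one, plus the loss of low-order bits from contributions much smaller than the largest one.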
