MITIGATING COMMUNICATION BOTTLENECKS DURING PARAMETER EXCHANGE IN DATA-PARALLEL DNN TRAINING

Abstract

Technologies are disclosed herein for dynamically generating communication primitives, by packing directed spanning trees, for use in model parameter synchronization during data-parallel DNN training. An interconnect topology for communication between GPUs in a computing system is determined. A quantity of directed spanning trees is then generated from the interconnect topology and packed for transmitting data between the GPUs. The directed spanning trees define which connections between GPUs are to be used for the transmission and the amount of data to be transmitted on each connection. Program code is generated for implementing the data transfer defined by the directed spanning trees. When the program code is executed, the directed spanning trees are used to pipeline the transmission of chunks of data, such as model parameters used during data-parallel DNN training, between the GPUs. The program code can also determine an optimal chunk size for the data to be transferred between the GPUs.
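
The flow described in the abstract (topology, packed directed spanning trees, pipelined chunks) can be illustrated with a small sketch. This is not the patented method: the abstract does not say how the trees are packed, so a simple greedy depth-first heuristic is swapped in here, and the names `find_spanning_tree`, `pack_trees`, `tree_depth`, `pipelined_steps`, and the four-GPU ring `RING` are all hypothetical. The step count assumes every link forwards one chunk per time step and that a GPU can send on all of its links at once.

```python
def find_spanning_tree(capacity, nodes, root):
    """Depth-first search for a directed spanning tree rooted at `root`.

    Only links with remaining capacity are used. Returns {node: parent}
    (the root maps to None), or None if some node cannot be reached.
    """
    parent = {root: None}
    stack = [root]
    while stack:
        u = stack[-1]
        for v in nodes:
            if v not in parent and capacity.get((u, v), 0) > 0:
                parent[v] = u
                stack.append(v)
                break
        else:
            stack.pop()  # no unvisited neighbor reachable from u
    return parent if len(parent) == len(nodes) else None


def pack_trees(links, root):
    """Greedily pack unit-weight spanning trees until no more fit.

    `links` maps a directed link (src, dst) to an integer capacity.
    This heuristic is an assumption for illustration and is not
    guaranteed to find the maximum number of trees.
    """
    capacity = dict(links)
    nodes = sorted({n for link in links for n in link})
    trees = []
    while True:
        tree = find_spanning_tree(capacity, nodes, root)
        if tree is None:
            return trees
        for child, par in tree.items():
            if par is not None:
                capacity[(par, child)] -= 1  # consume one unit of the link
        trees.append(tree)


def tree_depth(tree):
    """Largest number of hops from the root to any node in the tree."""
    def hops(node):
        count = 0
        while tree[node] is not None:
            node = tree[node]
            count += 1
        return count
    return max(hops(n) for n in tree)


def pipelined_steps(tree, num_chunks):
    """Steps to broadcast `num_chunks` chunks down a tree when chunks are
    pipelined: the first chunk needs `depth` hops, and each later chunk
    arrives one step after the previous one."""
    return tree_depth(tree) + num_chunks - 1


# Hypothetical topology: four GPUs in a bidirectional ring, one unit per link.
RING = {(0, 1): 1, (1, 0): 1, (1, 2): 1, (2, 1): 1,
        (2, 3): 1, (3, 2): 1, (3, 0): 1, (0, 3): 1}

trees = pack_trees(RING, root=0)
steps = [pipelined_steps(t, num_chunks=4) for t in trees]
```

On this ring the heuristic finds two trees that share no link, one running clockwise and one counter-clockwise, so half of the data could be sent down each. Splitting a buffer into more chunks shortens the time during which downstream links sit idle but adds per-chunk overhead, which is presumably why the abstract mentions determining a chunk size.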