MITIGATING COMMUNICATION BOTTLENECKS DURING PARAMETER EXCHANGE IN DATA-PARALLEL DNN TRAINING
Abstract
Technologies are disclosed herein for dynamically generating communication primitives for use in model parameter synchronization during data-parallel DNN training by packing directed spanning trees. An interconnect topology for communication between GPUs in a computing system is determined. A quantity of directed spanning trees for transmitting data between the GPUs is generated from the interconnect topology and packed. The directed spanning trees define the connections between GPUs that are to be utilized for the transmission and the amount of data to be transmitted on each connection. Program code is generated for implementing the data transfer defined by the directed spanning trees. When the program code is executed, the directed spanning trees are used to pipeline the transmission of chunks of data, such as model parameters used during data-parallel DNN training, between the GPUs. The program code can also determine an optimal chunk size for data to be transferred between the GPUs.
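The pipelining idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: it builds a single breadth-first spanning tree instead of packing several weighted trees, and every name in it (`bfs_spanning_tree`, `pipeline_schedule`, `pipelined_time`, `best_chunk_size`) is hypothetical. The cost model, a fixed per-chunk latency `alpha` plus `chunk / bandwidth` per hop, is likewise an assumption made only to show why a chunk size that is neither too small nor too large can be preferable.

```python
from collections import deque


def bfs_spanning_tree(links, root):
    """Return {gpu: parent} for one directed spanning tree rooted at `root`.

    `links` maps each GPU to the GPUs it can send to directly
    (a hypothetical stand-in for the detected interconnect topology).
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in links.get(u, ()):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def depth_of(parent, gpu):
    """Number of hops from the root to `gpu` in the tree."""
    hops = 0
    while parent[gpu] is not None:
        gpu = parent[gpu]
        hops += 1
    return hops


def pipeline_schedule(parent, num_chunks):
    """Step at which each chunk arrives at each non-root GPU.

    Assumes every link forwards one chunk per step, so chunk c
    reaches a GPU at depth d at step c + d (steps counted from 1).
    """
    return {
        (gpu, c): c + depth_of(parent, gpu)
        for gpu in parent
        if parent[gpu] is not None
        for c in range(num_chunks)
    }


def pipelined_time(size, chunk, depth, alpha, bandwidth):
    """Assumed cost model: (chunks + depth - 1) steps of (alpha + chunk / bandwidth)."""
    num_chunks = -(-size // chunk)  # ceiling division
    return (num_chunks + depth - 1) * (alpha + chunk / bandwidth)


def best_chunk_size(size, depth, alpha, bandwidth, candidates):
    """Pick the candidate chunk size with the lowest modeled transfer time."""
    return min(
        candidates,
        key=lambda c: pipelined_time(size, c, depth, alpha, bandwidth),
    )


# Hypothetical 4-GPU topology: GPU 0 links to 1 and 2, which both link to 3.
links = {0: [1, 2], 1: [3], 2: [3], 3: []}
tree = bfs_spanning_tree(links, root=0)
schedule = pipeline_schedule(tree, num_chunks=3)
```

Under these assumptions, the last chunk arrives after `num_chunks + depth - 1` steps rather than `num_chunks * depth`, which is the benefit of pipelining chunks down the tree. A fuller treatment would split the data across several packed trees in proportion to their weights.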