Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
application program interfaces; memory architecture; parallel architectures; shared memory systems; MPI-and-ULT; asynchronous MPI communication-with-user-level thread; data sharing; high-performance conjugate gradient benchmark; hybrid MPI-and-Thread runtime system; inter-node communication; intra-node parallelism; large scale SMP cluster; shared-memory architecture; threading libraries; Computational modeling; Context; Instruction sets; Kernel; Message systems; Runtime; Switches; MPI+X; Message Passing Interface; Ove;
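Overlapping communication with computation in parameter servers for scalable DL training
Overlapping communication and computation for the GPU/CPU heterogeneous parallel spatial domain decomposition MOC method
Maximizing communication-computation overlap through automatic parallelization and run-time tuning of non-blocking collective operations
MPI+ULT: Overlapping communication and computation with user-level threads
An efficient parallel all-pairs computation framework using computation-communication overlap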
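Retraction notice: Overlapping signal sequences control nuclear localization and endoplasmic reticulum retention of GRP58, Biochemical and Biophysical Research Communications 377 (2) (2008) 407–412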
Efficient parallel computing on workstation clusters using user-level communication networks