Conference: Recent Advances in Parallel Virtual Machine and Message Passing Interface

MPI Support for Multi-core Architectures: Optimized Shared Memory Collectives


Abstract

With local core counts on the rise, taking advantage of shared memory to optimize collective operations can improve performance. We study several on-host, shared-memory-optimized algorithms for MPI_Bcast, MPI_Reduce, and MPI_Allreduce, using tree-based and reduce-scatter algorithms. For small-data operations, where synchronization costs are relatively large, fan-in/fan-out algorithms generally perform best. For large messages, data manipulation constitutes the largest cost, and reduce-scatter algorithms are best for reductions. These optimizations improve performance by up to a factor of three. Memory and cache sharing effects require deliberate process layout and careful radix selection for tree-based methods.
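The two algorithm families named in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's implementation: it uses no MPI and no real shared memory, each "rank" is modeled as a Python list, and the function names (`tree_depth`, `fan_in_reduce`, `reduce_scatter_allreduce`) are hypothetical. It shows why radix matters for tree-based methods (a larger radix means fewer synchronization levels but more work per leader) and how a reduce-scatter reduction splits the data manipulation evenly across ranks.

```python
from typing import List


def tree_depth(nprocs: int, radix: int) -> int:
    """Number of fan-in levels of a radix-k tree over nprocs ranks (radix >= 2)."""
    depth, reach = 0, 1
    while reach < nprocs:
        reach *= radix
        depth += 1
    return depth


def fan_in_reduce(values: List[int], radix: int) -> int:
    """Fan-in sum: at each level, a leader combines up to `radix` partial results."""
    level = list(values)
    while len(level) > 1:
        # Each slice models one group of ranks reporting to its leader.
        level = [sum(level[i:i + radix]) for i in range(0, len(level), radix)]
    return level[0]


def reduce_scatter_allreduce(buffers: List[List[int]]) -> List[List[int]]:
    """Allreduce (sum) modeled as reduce-scatter followed by allgather.

    buffers[r] is the input vector of simulated rank r; all have equal length.
    """
    p = len(buffers)
    n = len(buffers[0])
    # Rank r owns the element range [bounds[r], bounds[r + 1]).
    bounds = [r * n // p for r in range(p + 1)]

    # Reduce-scatter phase: every rank reduces only its own chunk,
    # reading that chunk from all ranks' buffers.
    owned = []
    for r in range(p):
        lo, hi = bounds[r], bounds[r + 1]
        owned.append([sum(buf[i] for buf in buffers) for i in range(lo, hi)])

    # Allgather phase: every rank collects all reduced chunks.
    full = [x for chunk in owned for x in chunk]
    return [list(full) for _ in range(p)]
```

For eight ranks, `tree_depth(8, 2)` gives three levels while `tree_depth(8, 4)` gives two, which is the trade-off behind radix selection. In the reduce-scatter version each simulated rank touches only about 1/p of the output elements, which is why this style suits large messages where data manipulation dominates.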
