Journal: Parallel Processing Letters

HYBRID-PARALLEL SPARSE MATRIX-VECTOR MULTIPLICATION WITH EXPLICIT COMMUNICATION OVERLAP ON CURRENT MULTICORE-BASED SYSTEMS


Abstract

We evaluate optimized parallel sparse matrix-vector operations for several representative application areas on widespread multicore-based cluster configurations. First, the single-socket baseline performance is analyzed and modeled with respect to basic architectural properties of standard multicore chips. Beyond the single node, the performance of parallel sparse matrix-vector operations is often limited by communication overhead. Starting from the observation that nonblocking MPI calls are not able to hide communication cost with standard MPI implementations, we demonstrate that explicit overlap of communication and computation can be achieved by using a dedicated communication thread, which may run on a virtual core. Moreover, we identify performance benefits of hybrid MPI/OpenMP programming due to improved load balancing, even without explicit communication overlap. We compare performance results for pure MPI, the widely used "vector-like" hybrid programming strategies, and explicit overlap on a modern multicore-based cluster and a Cray XE6 system.
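The idea of explicit overlap mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses Python's standard `threading` module instead of MPI/OpenMP, and the names `spmv_csr`, `overlapped_spmv`, and `fetch_halo` are hypothetical. It also assumes that the matrix rows owned by a process are split into a local part (columns whose vector entries the process already holds) and a remote part (columns whose entries must be received from other processes). A dedicated thread stands in for the communication thread while the main thread computes the local part.

```python
import threading


def spmv_csr(row_ptr, col_idx, vals, x, y):
    """Accumulate y += A @ x for a matrix A stored in CSR form."""
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] += acc


def overlapped_spmv(local, remote, x_local, fetch_halo):
    """Compute y = A_local @ x_local + A_remote @ x_halo.

    `local` and `remote` are (row_ptr, col_idx, vals) CSR triples.
    `fetch_halo` is a hypothetical stand-in for the halo exchange that a
    real code would perform with MPI inside the communication thread.
    """
    n_rows = len(local[0]) - 1
    y = [0.0] * n_rows
    halo = {}

    def communicate():
        # Runs in the dedicated communication thread.
        halo["x"] = fetch_halo()

    comm_thread = threading.Thread(target=communicate)
    comm_thread.start()
    spmv_csr(*local, x_local, y)     # local part needs no remote data
    comm_thread.join()               # halo values must have arrived by now
    spmv_csr(*remote, halo["x"], y)  # add the non-local contributions
    return y


# Toy row block with full rows [2, 0, 1, 0] and [0, 3, 0, 4];
# columns 0-1 are treated as local, columns 2-3 as remote.
local = ([0, 1, 2], [0, 1], [2.0, 3.0])
remote = ([0, 1, 2], [0, 1], [1.0, 4.0])
x_local = [1.0, 2.0]
result = overlapped_spmv(local, remote, x_local, lambda: [3.0, 4.0])
print(result)
```

The sketch only shows the control structure (start communication, compute the local part, wait, compute the remote part). In CPython the global interpreter lock prevents the two threads from computing in parallel, so no speedup should be expected from this toy version.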
