Journal: Concurrency, practice and experience

Cooperative and out-of-core execution of the irregular wavefront propagation pattern on hybrid machines with Intel® Xeon Phi™



Abstract

The Irregular Wavefront Propagation Pattern (IWPP) is a core computing structure in several image analysis operations. Efficient implementation of IWPP on the Intel Xeon Phi is difficult because of its irregular data access and computation characteristics. The traditional IWPP algorithm relies on atomic instructions, which are not available in the SIMD set of the Intel Phi. To overcome this limitation, we have proposed a new IWPP algorithm that can take advantage of the non-atomic SIMD instructions supported on the Intel Xeon Phi. We have also developed and evaluated methods to use the CPU and the Intel Phi cooperatively for parallel execution of the IWPP algorithms. Our new cooperative IWPP version is also able to handle large out-of-core images that would not fit into the memory of the accelerator. The new IWPP algorithm is used to implement the Morphological Reconstruction and Fill Holes operations, which are commonly found in image analysis applications. The vectorization implemented with the new IWPP has attained improvements of up to about 5× on top of the original IWPP and significant gains as compared to the state-of-the-art CPU and GPU versions. The new version running on an Intel Phi is 6.21× and 3.14× faster than running on a 16-core CPU and on a GPU, respectively. Finally, the cooperative execution using two Intel Phi devices and a multi-core CPU has reached performance gains of 2.14× as compared to the execution using a single Intel Xeon Phi.
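
To illustrate the kind of computation the abstract refers to, the following is a minimal sequential sketch of a queue-driven wavefront propagation applied to grayscale Morphological Reconstruction. It is not the authors' algorithm: the function name `reconstruct`, the 4-connected neighborhood, the list-of-lists image layout, and the choice to seed the queue with every pixel are assumptions made here for brevity. The conditional neighbor update inside the loop is the step that a parallel version has to make safe against concurrent writes; this sketch does not reproduce the paper's vectorized, non-atomic scheme for doing so.

```python
from collections import deque


def reconstruct(marker, mask):
    """Sequential wavefront-style morphological reconstruction (illustrative).

    marker and mask are equally sized 2D lists of numbers. The result is
    clamped so that it never exceeds mask at any pixel.
    """
    rows, cols = len(mask), len(mask[0])
    # Clamp the marker under the mask before propagating.
    out = [[min(marker[r][c], mask[r][c]) for c in range(cols)]
           for r in range(rows)]
    # Assumption: seed the wavefront with every pixel (simple, not optimal).
    queue = deque((r, c) for r in range(rows) for c in range(cols))
    neighbors = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity (assumed)

    while queue:
        r, c = queue.popleft()
        for dr, dc in neighbors:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Value this pixel can push into its neighbor, capped by mask.
                candidate = min(out[r][c], mask[nr][nc])
                if candidate > out[nr][nc]:
                    # In a parallel version this read-compare-write is the
                    # update that must tolerate concurrent writers.
                    out[nr][nc] = candidate
                    queue.append((nr, nc))  # neighbor joins the wavefront
    return out


if __name__ == "__main__":
    mask = [[3, 4, 2, 0, 6]]
    marker = [[0, 4, 0, 0, 0]]
    print(reconstruct(marker, mask))
```

Pixel values only ever increase and are bounded by the mask, so the queue eventually drains regardless of the order in which pixels are processed. That order independence is what makes the pattern a candidate for parallel execution.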
