An OpenMP Implementation of the TVD-Hopmoc Method Based on a Synchronization Mechanism Using Locks Between Adjacent Threads on Xeon Phi (TM) Accelerators

Abstract

This work studies the 1-D TVD-Hopmoc method executed in shared-memory manycore environments. In particular, it examines barrier costs on Intel® Xeon Phi™ (KNC and KNL) accelerators when using the OpenMP standard. The paper employs an explicit synchronization mechanism to reduce spin and thread scheduling times in an OpenMP implementation of the 1-D TVD-Hopmoc method. We define an array that represents the threads, and the new scheme synchronizes only adjacent threads. Moreover, the new approach reduces the OpenMP scheduling time by employing an explicit work-sharing strategy: at the beginning of the process, the array that represents the computational mesh of the numerical method is partitioned among the threads explicitly, instead of leaving this task to the OpenMP API. The new scheme thereby diminishes the OpenMP spin time by replacing OpenMP barriers with an explicit synchronization mechanism in which a thread waits only for its two adjacent threads. The results of the new approach are compared with those of a basic parallel implementation of the 1-D TVD-Hopmoc method. Numerical simulations show that the new approach achieves promising performance gains in shared-memory manycore environments.
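
The scheme described in the abstract (explicit partitioning of the mesh plus waiting only on the two adjacent threads) can be illustrated with a minimal C/OpenMP sketch. This is not the authors' code, and several things in it are assumptions: the 3-point averaging stencil is a placeholder for the TVD-Hopmoc update, the function names are hypothetical, and per-thread step counters accessed with OpenMP atomics stand in for the locks named in the title.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Placeholder 3-point stencil on cells [lo, hi); NOT the TVD-Hopmoc update. */
static void stencil(const double *src, double *dst, int lo, int hi) {
    for (int i = lo; i < hi; ++i)
        dst[i] = 0.25 * src[i - 1] + 0.5 * src[i] + 0.25 * src[i + 1];
}

/* Spin until the given neighbor has completed at least `step` time steps. */
static void wait_for(int *done, int neighbor, int step) {
    int seen;
    do {
        #pragma omp atomic read seq_cst
        seen = done[neighbor];
    } while (seen < step);
}

/* Hypothetical driver: advances u (n cells, fixed end values) by `steps`
 * time steps with no per-step barrier. done[t] is the "array that
 * represents threads": the number of steps thread t has finished. */
void advance_neighbor_sync(double *u, int n, int steps, int nthreads) {
    if (n < 3 || steps <= 0) return;
    int m = n - 2;                       /* interior cells */
    if (nthreads > m) nthreads = m;      /* every thread owns >= 1 cell */
    if (nthreads < 1) nthreads = 1;
    double *buf[2] = { u, malloc((size_t)n * sizeof *u) };
    int *done = calloc((size_t)nthreads, sizeof *done);
    assert(buf[1] != NULL && done != NULL);
    memcpy(buf[1], u, (size_t)n * sizeof *u);

    #pragma omp parallel num_threads(nthreads)
    {
        int t = omp_get_thread_num();
        int T = omp_get_num_threads();
        /* Explicit work sharing: a fixed contiguous block per thread. */
        int lo = 1 + (int)((long)t * m / T);
        int hi = 1 + (int)((long)(t + 1) * m / T);
        for (int s = 0; s < steps; ++s) {
            /* Wait only for the two adjacent threads to finish step s-1. */
            if (t > 0)     wait_for(done, t - 1, s);
            if (t < T - 1) wait_for(done, t + 1, s);
            stencil(buf[s % 2], buf[(s + 1) % 2], lo, hi);
            #pragma omp atomic write seq_cst
            done[t] = s + 1;             /* publish progress */
        }
    }
    if (steps % 2) memcpy(u, buf[1], (size_t)n * sizeof *u);
    free(buf[1]);
    free(done);
}

/* Single-threaded reference with the same stencil, for comparison. */
void advance_serial(double *u, int n, int steps) {
    if (n < 3 || steps <= 0) return;
    double *tmp = malloc((size_t)n * sizeof *u);
    assert(tmp != NULL);
    memcpy(tmp, u, (size_t)n * sizeof *u);
    double *src = u, *dst = tmp;
    for (int s = 0; s < steps; ++s) {
        stencil(src, dst, 1, n - 1);
        double *swap = src; src = dst; dst = swap;
    }
    if (src != u) memcpy(u, src, (size_t)n * sizeof *u);
    free(tmp);
}
```

Built with an OpenMP-enabled compiler (for example `gcc -fopenmp`), a call such as `advance_neighbor_sync(u, n, steps, 4)` should produce the same values as `advance_serial(u, n, steps)`. With double buffering, one condition covers both hazards: a thread may start step `s` only once both neighbors have finished step `s - 1`, so it never reads stale boundary cells and never overwrites cells a neighbor is still reading.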

Bibliographic details

Source
International Conference on Computational Science | 2018 | pp. 701-707 | 7 pages
Authors
Frederico L. Cabral; Carla Osthoff; Gabriel P. Costa; Sanderson L. Gonzaga de Oliveira; Diego Brandao; Mauricio Kischinhevsky
Original format: PDF

Similar literature

1. A parallel Non-Local means denoising algorithm implementation with OpenMP and OpenCL on Intel Xeon Phi Coprocessor [J]. Zhu Huming, Wu Yanfei, Li Pei. Journal of computational science. 2016, no. pta3
2. An efficient MPI/OpenMP parallelization of the Hartree-Fock-Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture [J]. Mironov Vladimir, Moskovsky Alexander, DMello Michael. Experimental Mechanics. 2019, no. 1
3. Task based Cholesky decomposition on Xeon Phi architectures using OpenMP [J]. Joseph Dorris, Asim YarKhan, Jakub Kurzak. International Journal of Computational Science and Engineering. 2018, no. 3
4. An OpenMP Implementation of the TVD-Hopmoc Method Based on a Synchronization Mechanism Using Locks Between Adjacent Threads on Xeon Phi (TM) Accelerators [C]. Frederico L. Cabral, Carla Osthoff, Gabriel P. Costa. International Conference on Computational Science. 2018
5. An Experimental Evaluation of the OpenMP Thread Mapping for LU Factorisation on Xeon Phi Coprocessor and on Hybrid CPU-MIC Platform [O]. Beata Bylina, Jaroslaw Bylina. 2018
