Shared-Memory Parallel Probabilistic Graphical Modeling Optimization: Comparison of Threads, OpenMP, and Data-Parallel Primitives

Abstract

This work examines the performance characteristics of multiple shared-memory implementations of a probabilistic graphical modeling (PGM) optimization code, which forms the basis for an advanced, state-of-the-art image segmentation method. The work is motivated by the need to accelerate scientific image analysis pipelines used in experimental science, such as at x-ray light sources, and by the need for platform-portable codes that perform well across many different computational architectures. The primary focus and main contribution of this work is an in-depth study of the shared-memory parallel performance of different implementations, including those using alternative parallelization approaches such as C11-threads, OpenMP, and data-parallel primitives (DPPs). Our results show that, for this complex data-intensive algorithm, the DPP implementation exhibits better runtime performance but less favorable scaling characteristics than its C11-threads and OpenMP counterparts. Based on a set of experiments that collect hardware performance counters on multiple platforms, the runtime performance difference appears to be due primarily to algorithmic efficiency gains: reformulating the solution from its traditional C11-threads and OpenMP expression into data-parallel primitives results in significantly fewer instructions being executed. This study is the first of its type to use hardware counters to compare methods based on VTK-m data-parallel primitives with those based on more traditional OpenMP or threads-based parallelism. It is timely, as there is increasing awareness of the need for platform portability in light of increasing node-level parallelism and increasing device heterogeneity.

Bibliographic details

Source
International Conference ISC High Performance: International Conference on High Performance Computing | 2020 | pp. 127-145 | 19 pages
Authors
Talita Perciano; Colleen Heinemann; David Camp; Brenton Lessley; E. Wes Bethel
Original format: PDF
Keywords
Probabilistic graphical models; Modeling optimization; Markov random fields; Image segmentation; Computer vision; Data parallel primitives; Shared-memory parallel; Platform portability

Similar literature

1. A comparison of the shared-memory parallel programming models OpenMP, OpenACC and Kokkos in the context of implicit solvers for high-order FEM [J]. Eichstadt Jan, Vymazal Martin, Moxey David. Computer Physics Communications. 2020
2. Comparative Evaluation and Case Studies of Shared-Memory and Data-Parallel Execution Patterns [J]. Xiaodong Zhang, Lin Sun. Scientific Programming. 1999, no. 1
3. High-Scalability Parallelization of a Molecular Modeling Application: Performance and Productivity Comparison Between OpenMP and MPI Implementations [J]. Russell Brown, Ilya Sharapov. International Journal of Parallel Programming. 2007, no. 5
4. DPP-PMRF: Rethinking Optimization for a Probabilistic Graphical Model Using Data-Parallel Primitives [C]. Brenton Lessley, Talita Perciano, Colleen Heinemann, . 2018
5. Exploration of parallelism for probabilistic graphical models [D]. Xia, Yinglong. 2010
6. Learned graphical models for probabilistic planning provide a new class of movement primitives [O]. Elmar A. Rückert, Gerhard Neumann, Marc Toussaint. 2012
7. Shared-Memory Parallel Probabilistic Graphical Modeling Optimization: Comparison of Threads, OpenMP, and Data-Parallel Primitives [O]. Talita Perciano, Colleen Heinemann, David Camp. 2020
