Parallelisation of equation-based simulation programs on heterogeneous computing systems


Abstract

Numerical solutions of equation-based simulations require computationally intensive tasks such as evaluation of model equations, linear algebra operations and solution of systems of linear equations. The focus in this work is on parallel evaluation of model equations on shared memory systems such as general purpose processors (multi-core CPUs and manycore devices), streaming processors (Graphics Processing Units and Field Programmable Gate Arrays) and heterogeneous systems. The current approaches for evaluation of model equations are reviewed and their capabilities and shortcomings analysed. Since stream computing differs from traditional computing in that the system processes a sequential stream of elements, equations must be transformed into a data structure suitable for both types. The postfix notation expression stacks are recognised as a platform- and programming-language-independent method to describe, store in computer memory and evaluate general systems of differential and algebraic equations of any size. Each mathematical operation and its operands are described by a specially designed data structure, and every equation is transformed into an array of these structures (a Compute Stack). Compute Stacks are evaluated by a stack machine using a Last In First Out queue. The stack machine is implemented in the DAE Tools modelling software in the C99 language using two Application Programming Interfaces (APIs)/frameworks for parallelism. The Open Multi-Processing (OpenMP) API is used for parallelisation on general purpose processors, and the Open Computing Language (OpenCL) framework is used for parallelisation on streaming processors and heterogeneous systems. The performance of the sequential Compute Stack approach is compared to the direct C++ implementation and to the previous approach that uses evaluation trees. The new approach is 45% slower than the C++ implementation and more than five times faster than the previous one.
The OpenMP and OpenCL implementations are tested on three medium-scale models using a multi-core CPU, a discrete GPU, an integrated GPU and heterogeneous computing setups. Execution times are compared and analysed and the advantages of the OpenCL implementation running on a discrete GPU and heterogeneous systems are discussed. It is found that the evaluation of model equations using the parallel OpenCL implementation running on a discrete GPU is up to twelve times faster than the sequential version while the overall simulation speed-up gained is more than three times.
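
To make the Compute Stack idea concrete, the following is a minimal sketch in Python of a postfix stack machine of the kind the abstract describes. It is not the DAE Tools implementation (which the abstract states is written in C99): the names `Op`, `Item`, `evaluate` and `evaluate_all`, the operation set, and the item layout are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from enum import Enum, auto


class Op(Enum):
    """Hypothetical operation codes; the real item layout is not given in the abstract."""
    CONST = auto()  # push a constant
    VAR = auto()    # push the variable value at a given index
    ADD = auto()
    SUB = auto()
    MUL = auto()
    DIV = auto()
    SIN = auto()


@dataclass(frozen=True)
class Item:
    """One entry of a compute stack: an operation plus its operand data."""
    op: Op
    value: float = 0.0  # used by CONST
    index: int = 0      # used by VAR


BINARY = {
    Op.ADD: lambda a, b: a + b,
    Op.SUB: lambda a, b: a - b,
    Op.MUL: lambda a, b: a * b,
    Op.DIV: lambda a, b: a / b,
}
UNARY = {Op.SIN: math.sin}


def evaluate(compute_stack, values):
    """Evaluate one equation given in postfix order using a LIFO value stack."""
    stack = []
    for item in compute_stack:
        if item.op is Op.CONST:
            stack.append(item.value)
        elif item.op is Op.VAR:
            stack.append(values[item.index])
        elif item.op in UNARY:
            stack.append(UNARY[item.op](stack.pop()))
        else:
            rhs = stack.pop()  # operands come off in reverse order
            lhs = stack.pop()
            stack.append(BINARY[item.op](lhs, rhs))
    if len(stack) != 1:
        raise ValueError("malformed compute stack")
    return stack[0]


def evaluate_all(equations, values):
    """Evaluate every equation; each iteration is independent of the others."""
    return [evaluate(eq, values) for eq in equations]


# Residual of x0 * x1 - 2 = 0 written in postfix: x0 x1 * 2 -
residual = [
    Item(Op.VAR, index=0),
    Item(Op.VAR, index=1),
    Item(Op.MUL),
    Item(Op.CONST, value=2.0),
    Item(Op.SUB),
]

print(evaluate(residual, [3.0, 4.0]))
```

Because each equation's stack is evaluated independently of the others, the loop in `evaluate_all` is the natural unit of parallel work. In an OpenMP or OpenCL setting one would expect it to map to a parallel loop or to one work-item per equation, though the abstract does not spell out the exact mapping.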

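Bibliographic details

  • Journal: PeerJ Computer Science
  • Author: Dragan D. Nikolić
  • Year: 2018
  • Total pages: 32
  • Original format: PDF
  • Keywords: modelling; simulation; heterogeneous computing; parallel computing; differential-algebraic equations; equation-based; streaming processors; OpenCL; OpenMP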
