Published in: IEEE Transactions on Parallel and Distributed Systems

Accelerating ODE-Based Simulation of General and Heterogeneous Biophysical Models Using a GPU



Abstract

Flint is a simulator that numerically integrates heterogeneous biophysical models described by a large set of ordinary differential equations (ODEs). It uses an internal bytecode representation of simulation-related expressions to handle a wide variety of general-purpose biophysical models. We propose two methods for accelerating Flint with a graphics processing unit (GPU). The first method interprets multiple bytecodes in parallel on the GPU and parallelizes the simulation automatically using a level scheduling algorithm. We implement an interpreter of the Flint bytecode suited to the GPU; it reduces both the number of memory accesses and the number of divergent branches to achieve higher performance. The second method translates a model, via the internal bytecode, into source code for both the CPU and the GPU. Because bytecode unification reduces the size of the generated code, the generated source also compiles faster. For large models in which tens of thousands or more expressions can be evaluated simultaneously, the translated code running on the GPU achieves computational performance up to 2.7 times higher than that of the code running on a CPU. For small models, the CPU is faster than the GPU. The translated code therefore decides dynamically whether to run on the CPU or the GPU by profiling the first few iterations of the simulation.
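The abstract names level scheduling without describing it. As an illustration only, the following Python sketch shows the general idea under stated assumptions: expressions are grouped into levels so that every expression in a level depends only on expressions in earlier levels, which means all expressions within one level could be evaluated in parallel. The function name `level_schedule`, the dictionary-based dependency format, and the example expressions are hypothetical and are not taken from Flint.

```python
from collections import defaultdict


def level_schedule(deps):
    """Group expressions into dependency levels (hypothetical helper, not Flint's API).

    deps maps each expression name to the set of expression names it reads.
    Returns a list of levels; each level is a sorted list of names whose
    dependencies all appear in earlier levels.
    """
    indegree = {name: 0 for name in deps}
    users = defaultdict(list)  # dependency name -> expressions that read it
    for name, needed in deps.items():
        for dep in needed:
            if dep not in deps:
                raise KeyError(f"unknown dependency: {dep}")
            indegree[name] += 1
            users[dep].append(name)

    # Level 0 holds every expression with no dependencies at all.
    frontier = sorted(name for name, count in indegree.items() if count == 0)
    levels = []
    scheduled = 0
    while frontier:
        levels.append(frontier)
        scheduled += len(frontier)
        next_frontier = []
        for name in frontier:
            for user in users[name]:
                indegree[user] -= 1
                if indegree[user] == 0:  # all inputs are now available
                    next_frontier.append(user)
        frontier = sorted(next_frontier)

    if scheduled != len(deps):
        raise ValueError("cyclic dependency between expressions")
    return levels


# Assumed toy model: c reads a and b, d reads c, e reads a.
example = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}, "e": {"a"}}
levels = level_schedule(example)
print(levels)  # [['a', 'b'], ['c', 'e'], ['d']]
```

In this sketch the width of each level is the amount of parallelism available at that step. That is consistent with the abstract's observation that the GPU pays off only when tens of thousands or more expressions can be evaluated simultaneously.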
